Creating and Modifying DataFrames

A DataFrame is the core two-dimensional data structure in pandas, similar to a table in a database or a spreadsheet. This page covers creating DataFrames from lists, dictionaries, NumPy arrays, and files, and modifying them by adding and deleting rows and columns.


Creating DataFrames

1. From Lists and Dictionaries

Using Lists

import pandas as pd
 
# Create a DataFrame from a list of lists
data = [["Anika", 25, "Delhi"], ["Rahul", 30, "Mumbai"], ["Sneha", 22, "Bangalore"]]
columns = ["Name", "Age", "City"]
df = pd.DataFrame(data, columns=columns)
print(df)

Output:

    Name  Age       City
0  Anika   25      Delhi
1  Rahul   30     Mumbai
2  Sneha   22  Bangalore

Using Dictionaries

# Create a DataFrame from a dictionary
data = {
    "Name": ["Anika", "Rahul", "Sneha"],
    "Age": [25, 30, 22],
    "City": ["Delhi", "Mumbai", "Bangalore"]
}
df = pd.DataFrame(data)
print(df)
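A related pattern, sketched here only as an illustration: when each record arrives as its own dictionary (for example, rows parsed from JSON), a list of dictionaries can be passed to the constructor directly. The names `records` and `df_records` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical records: one dictionary per row
records = [
    {"Name": "Anika", "Age": 25, "City": "Delhi"},
    {"Name": "Rahul", "Age": 30, "City": "Mumbai"},
    {"Name": "Sneha", "Age": 22},  # "City" is deliberately missing here
]
df_records = pd.DataFrame(records)

# Dictionary keys become columns; a missing key is filled with NaN
print(df_records)
print(df_records["City"].isna().tolist())
```

With a dictionary of lists, all lists must have the same length; with a list of dictionaries, missing keys are tolerated and show up as NaN.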

2. From NumPy Arrays

import numpy as np
 
# Create a DataFrame from a NumPy array
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df = pd.DataFrame(data, columns=["Column1", "Column2", "Column3"])
print(df)

Output:

   Column1  Column2  Column3
0        1        2        3
1        4        5        6
2        7        8        9
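The row labels 0, 1, 2 above are the default index. As a small sketch, the `index` parameter of the constructor can supply custom labels instead; the labels `row_a`, `row_b`, and `row_c` are made up for this example.

```python
import numpy as np
import pandas as pd

# Same 3x3 values as the example above
values = np.arange(1, 10).reshape(3, 3)

df_np = pd.DataFrame(
    values,
    columns=["Column1", "Column2", "Column3"],
    index=["row_a", "row_b", "row_c"],  # hypothetical row labels
)

# Look up a single value by row label and column name
print(df_np.loc["row_b", "Column2"])  # 5
print(df_np.shape)                    # (3, 3)
```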

3. From CSV and Excel Files

Reading from a CSV File

# Read data from a CSV file
df = pd.read_csv("data.csv")
print(df.head())  # Display the first 5 rows
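To try `read_csv` without a file on disk, one option is to wrap CSV text in `io.StringIO`, which `read_csv` accepts in place of a path. The CSV content below is a hypothetical stand-in for `data.csv`.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for data.csv
csv_text = """Name,Age,City
Anika,25,Delhi
Rahul,30,Mumbai
Sneha,22,Bangalore
"""

# read_csv accepts a file path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv.head(2))  # first 2 rows only
print(df_csv.dtypes)   # column types inferred from the text
```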

Reading from an Excel File

# Read data from an Excel file (reading .xlsx files requires the openpyxl package)
df = pd.read_excel("data.xlsx")
print(df.head())

Modifying DataFrames

1. Adding Rows

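# Append a new row to the Name/Age/City DataFrame created earlier.
# DataFrame.append was removed in pandas 2.0; use pd.concat instead.
new_row = {"Name": "Amit", "Age": 28, "City": "Pune"}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)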
print(df)

Output:

    Name  Age       City
0  Anika   25      Delhi
1  Rahul   30     Mumbai
2  Sneha   22  Bangalore
3   Amit   28       Pune
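When several rows need to be added, one reasonable approach is to collect them into a single DataFrame and concatenate once, since every concatenation builds a new DataFrame. The sketch below rebuilds the sample data so it runs on its own; the second added row is invented for illustration.

```python
import pandas as pd

df_rows = pd.DataFrame({
    "Name": ["Anika", "Rahul", "Sneha"],
    "Age": [25, 30, 22],
    "City": ["Delhi", "Mumbai", "Bangalore"],
})

# Hypothetical extra rows, collected first and concatenated in one step
extra = pd.DataFrame([
    {"Name": "Amit", "Age": 28, "City": "Pune"},
    {"Name": "Priya", "Age": 27, "City": "Chennai"},
])

# ignore_index=True renumbers the rows 0..4 instead of repeating 0 and 1
df_rows = pd.concat([df_rows, extra], ignore_index=True)
print(df_rows)
```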

2. Adding Columns

# Add a new column to the DataFrame
df["Country"] = ["India", "India", "India", "India"]
print(df)

Output:

    Name  Age       City Country
0  Anika   25      Delhi   India
1  Rahul   30     Mumbai   India
2  Sneha   22  Bangalore   India
3   Amit   28       Pune   India
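When every row gets the same value, a single scalar can be assigned and pandas repeats it for each row. A column can also be derived from existing columns. A minimal sketch, with the `AgeNextYear` column name chosen only for this example:

```python
import pandas as pd

df_cols = pd.DataFrame({
    "Name": ["Anika", "Rahul", "Sneha"],
    "Age": [25, 30, 22],
})

# A scalar is broadcast to every row
df_cols["Country"] = "India"

# A new column computed from an existing one (hypothetical column name)
df_cols["AgeNextYear"] = df_cols["Age"] + 1

print(df_cols)
```

Assigning a list instead requires its length to match the number of rows, otherwise pandas raises a ValueError.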

3. Deleting Rows

# Delete a row by its index label (here, the row labeled 1)
df = df.drop(index=1)
print(df)

Output:

    Name  Age       City Country
0  Anika   25      Delhi   India
2  Sneha   22  Bangalore   India
3   Amit   28       Pune   India
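Note that the remaining rows keep their original labels (0, 2, 3), which leaves a gap. If consecutive labels are preferred, `reset_index(drop=True)` renumbers them. A self-contained sketch on the same sample data:

```python
import pandas as pd

df_del = pd.DataFrame({
    "Name": ["Anika", "Rahul", "Sneha", "Amit"],
    "Age": [25, 30, 22, 28],
    "City": ["Delhi", "Mumbai", "Bangalore", "Pune"],
})

# Drop the row labeled 1; the remaining labels are 0, 2, 3
df_del = df_del.drop(index=1)

# Renumber the rows 0, 1, 2; drop=True discards the old labels
df_del = df_del.reset_index(drop=True)
print(df_del)
```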

4. Deleting Columns

# Delete a column
df = df.drop(columns=["Country"])
print(df)

Output:

    Name  Age       City
0  Anika   25      Delhi
2  Sneha   22  Bangalore
3   Amit   28       Pune
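Two variations may be useful here, shown as a sketch rather than a recommendation: `drop` with `errors="ignore"` skips column names that do not exist, and `pop` removes a column in place while returning it. The column name `Zip` is a deliberately nonexistent column used for illustration.

```python
import pandas as pd

df_c = pd.DataFrame({
    "Name": ["Anika", "Rahul", "Sneha"],
    "Age": [25, 30, 22],
    "City": ["Delhi", "Mumbai", "Bangalore"],
    "Country": ["India", "India", "India"],
})

# "Zip" does not exist; errors="ignore" avoids a KeyError
trimmed = df_c.drop(columns=["Country", "Zip"], errors="ignore")
print(list(trimmed.columns))

# pop removes the column from df_c itself and returns it as a Series
ages = df_c.pop("Age")
print(ages.tolist())
print(list(df_c.columns))
```

`drop` returns a new DataFrame and leaves the original unchanged, whereas `pop` modifies the DataFrame it is called on.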

Try It Yourself

Problem 1: Create a DataFrame from a Dictionary

Create a DataFrame with columns Product, Price, and Stock. Add sample data for 3 products and display the DataFrame.

import pandas as pd
 
data = {
    "Product": ["Laptop", "Phone", "Tablet"],
    "Price": [80000, 30000, 20000],
    "Stock": [50, 150, 100]
}
df = pd.DataFrame(data)
print(df)

Problem 2: Modify a DataFrame

Using the DataFrame created in Problem 1:

  1. Add a new column Discount with values [10, 15, 5].
  2. Remove the row corresponding to Tablet.

# Add a new column
df["Discount"] = [10, 15, 5]
print("After Adding Discount Column:", df, sep="\n")
 
# Remove the row for 'Tablet'
df = df[df["Product"] != "Tablet"]
print("After Removing Tablet Row:", df, sep="\n")
