Creating and Modifying DataFrames
A DataFrame is the core two-dimensional data structure in pandas, similar to a table in a database or a sheet in a spreadsheet. This page covers creating DataFrames from different sources and modifying them.
Creating DataFrames
1. From Lists and Dictionaries
Using Lists
import pandas as pd
# Create a DataFrame from a list of lists
data = [["Anika", 25, "Delhi"], ["Rahul", 30, "Mumbai"], ["Sneha", 22, "Bangalore"]]
columns = ["Name", "Age", "City"]
df = pd.DataFrame(data, columns=columns)
print(df)
Output:
    Name  Age       City
0  Anika   25      Delhi
1  Rahul   30     Mumbai
2  Sneha   22  Bangalore
Using Dictionaries
# Create a DataFrame from a dictionary
data = {
    "Name": ["Anika", "Rahul", "Sneha"],
    "Age": [25, 30, 22],
    "City": ["Delhi", "Mumbai", "Bangalore"],
}
df = pd.DataFrame(data)
print(df)
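A related case is a list of dictionaries, where each dictionary is one row instead of one column. The sketch below is illustrative only: the variable names records and df_records are hypothetical, and the third record deliberately omits City to show how pandas handles a missing key.

```python
import pandas as pd

# Hypothetical records: each dictionary is one row, and its keys become column names.
records = [
    {"Name": "Anika", "Age": 25, "City": "Delhi"},
    {"Name": "Rahul", "Age": 30, "City": "Mumbai"},
    {"Name": "Sneha", "Age": 22},  # "City" is deliberately left out
]

df_records = pd.DataFrame(records)

# A key missing from a record becomes NaN in that row instead of raising an error.
print(df_records)
```

This layout tends to suit data that arrives row by row, such as parsed JSON objects, while the dictionary-of-lists form above suits data that is already organized by column.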
2. From NumPy Arrays
import numpy as np
# Create a DataFrame from a NumPy array
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df = pd.DataFrame(data, columns=["Column1", "Column2", "Column3"])
print(df)
Output:
   Column1  Column2  Column3
0        1        2        3
1        4        5        6
2        7        8        9
3. From CSV and Excel Files
Reading from a CSV File
# Read data from a CSV file
df = pd.read_csv("data.csv")
print(df.head()) # Display the first 5 rows
Reading from an Excel File
# Read data from an Excel file (.xlsx files require the openpyxl package)
df = pd.read_excel("data.xlsx")
print(df.head())
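The file examples above assume that data.csv and data.xlsx already exist on disk. To try read_csv without creating a file, one option is to pass it an in-memory text buffer. This is a minimal sketch: the CSV content and the names csv_text and df_csv are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for "data.csv".
csv_text = (
    "Name,Age,City\n"
    "Anika,25,Delhi\n"
    "Rahul,30,Mumbai\n"
    "Sneha,22,Bangalore\n"
)

# read_csv accepts a file path or any file-like object, such as StringIO.
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv.head())
print(df_csv.dtypes)  # Age is parsed as an integer column
```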
Modifying DataFrames
1. Adding Rows
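# Append a new row to the Name/Age/City DataFrame created earlier.
# DataFrame.append was removed in pandas 2.0, so use pd.concat instead.
new_row = {"Name": "Amit", "Age": 28, "City": "Pune"}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)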
print(df)
Output:
    Name  Age       City
0  Anika   25      Delhi
1  Rahul   30     Mumbai
2  Sneha   22  Bangalore
3   Amit   28       Pune
2. Adding Columns
# Add a new column to the DataFrame
df["Country"] = ["India", "India", "India", "India"]
print(df)
Output:
    Name  Age       City  Country
0  Anika   25      Delhi    India
1  Rahul   30     Mumbai    India
2  Sneha   22  Bangalore    India
3   Amit   28       Pune    India
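Assigning a list requires one value per row. Two common variations are assigning a single scalar, which pandas repeats for every row, and computing a column from an existing one. The sketch below uses a separate, hypothetical DataFrame named people, and AgeNextYear is an invented column used only for illustration.

```python
import pandas as pd

# Hypothetical data mirroring the example above.
people = pd.DataFrame({
    "Name": ["Anika", "Rahul", "Sneha", "Amit"],
    "Age": [25, 30, 22, 28],
})

# A scalar is broadcast to every row, so the list of repeated values is not needed.
people["Country"] = "India"

# A new column can be computed from an existing one in a single vectorized step.
people["AgeNextYear"] = people["Age"] + 1

print(people)
```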
3. Deleting Rows
# Delete the row whose index label is 1
df = df.drop(index=1)
print(df)
Output:
    Name  Age       City  Country
0  Anika   25      Delhi    India
2  Sneha   22  Bangalore    India
3   Amit   28       Pune    India
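Note that the index in the output above now skips from 0 to 2, because drop removes the row without renumbering the rest. If a continuous index is wanted, reset_index(drop=True) is one way to get it. The sketch below uses a hypothetical DataFrame named cities so that it runs on its own.

```python
import pandas as pd

# Hypothetical data mirroring the example above.
cities = pd.DataFrame({
    "Name": ["Anika", "Rahul", "Sneha", "Amit"],
    "City": ["Delhi", "Mumbai", "Bangalore", "Pune"],
})

# Dropping index label 1 leaves a gap in the index.
trimmed = cities.drop(index=1)
print(trimmed.index.tolist())  # [0, 2, 3]

# drop=True discards the old index instead of keeping it as a new column.
renumbered = trimmed.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2]
```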
4. Deleting Columns
# Delete a column
df = df.drop(columns=["Country"])
print(df)
Output:
    Name  Age       City
0  Anika   25      Delhi
2  Sneha   22  Bangalore
3   Amit   28       Pune
Try It Yourself
Problem 1: Create a DataFrame from a Dictionary
Create a DataFrame with columns Product, Price, and Stock. Add sample data for 3 products and display the DataFrame.
import pandas as pd
data = {
    "Product": ["Laptop", "Phone", "Tablet"],
    "Price": [80000, 30000, 20000],
    "Stock": [50, 150, 100],
}
df = pd.DataFrame(data)
print(df)
Problem 2: Modify a DataFrame
Using the DataFrame created in Problem 1:
- Add a new column Discount with values [10, 15, 5].
- Remove the row corresponding to Tablet.
# Add a new column
df["Discount"] = [10, 15, 5]
print("After Adding Discount Column:\n", df, sep="")  # sep="" keeps the header aligned
# Remove the row for 'Tablet'
df = df[df["Product"] != "Tablet"]
print("After Removing Tablet Row:\n", df, sep="")  # sep="" keeps the header aligned