Exporting and Importing Data in Pandas
Data import and export are key features of Pandas, making it simple to interact with various file formats. Whether you’re handling a small dataset or processing large files, Pandas provides efficient methods for data manipulation.
Importing Data
Pandas allows you to import data from a variety of file formats. Here are some commonly used methods:
1. Reading CSV Files
import pandas as pd
# Reading a CSV file
df = pd.read_csv("data.csv")
print(df.head()) # Display the first 5 rows
2. Reading Excel Files
# Reading an Excel file
df = pd.read_excel("data.xlsx", sheet_name="Sheet1")
print(df.head())
3. Reading JSON Files
# Reading a JSON file
df = pd.read_json("data.json")
print(df.head())
4. Reading SQL Databases
import sqlite3
# Connect to an SQLite database
conn = sqlite3.connect("data.db")
# Query data from a table
df = pd.read_sql_query("SELECT * FROM table_name", conn)
print(df.head())
Exporting Data
Pandas also provides methods to export data to different formats:
1. Writing to CSV Files
# Writing to a CSV file
df.to_csv("output.csv", index=False)
2. Writing to Excel Files
# Writing to an Excel file
df.to_excel("output.xlsx", sheet_name="Sheet1", index=False)
3. Writing to JSON Files
# Writing to a JSON file
df.to_json("output.json")
4. Writing to SQL Databases
# Writing to an SQLite database
df.to_sql("table_name", conn, if_exists="replace", index=False)
Handling Large Datasets
When working with large datasets, consider the following techniques to optimize performance:
1. Reading Data in Chunks
# Read CSV file in chunks
chunk_size = 1000
for chunk in pd.read_csv("large_data.csv", chunksize=chunk_size):
print(chunk.head())
2. Specifying Columns to Read
# Read specific columns
df = pd.read_csv("large_data.csv", usecols=["col1", "col2"])
3. Using Data Types for Optimization
# Specify data types for better memory usage
df = pd.read_csv("large_data.csv", dtype={"col1": "int32", "col2": "float32"})
Try It Yourself
Problem 1: Import a CSV File
Write a program to read a CSV file containing student names and grades, and display the top 5 rows.
Show Code
import pandas as pd
# Read the CSV file
df = pd.read_csv("students.csv")
print(df.head())
Problem 2: Export Data to JSON
Write a program to create a DataFrame of employee details and export it to a JSON file.
Show Code
import pandas as pd
# Create a DataFrame
data = {
"Name": ["Rhea", "Niyati", "Asif"],
"Age": [25, 30, 28],
"Department": ["HR", "Finance", "IT"]
}
df = pd.DataFrame(data)
# Export to JSON
df.to_json("employees.json", orient="records", indent=4)
print("Data exported to employees.json")