Python ModulesPandas TutorialWorking with Time Series

Working with Time Series in Pandas

Pandas provides robust tools for working with time series data, making it ideal for analyzing temporal patterns and trends. This page covers creating time-based indices, resampling, frequency conversion, and handling date/time data.


Creating Time-Based Indices

Time-based indices allow you to organize and analyze data with timestamps effectively.

Example: Creating a Time-Based Index

import pandas as pd
 
# Create a time-based index
time_index = pd.date_range(start="2023-01-01", periods=5, freq="D")
data = [100, 200, 150, 300, 400]
df = pd.DataFrame(data, index=time_index, columns=["Sales"])
print(df)

Output:

            Sales
2023-01-01    100
2023-01-02    200
2023-01-03    150
2023-01-04    300
2023-01-05    400

Resampling and Frequency Conversion

Resampling involves converting a time series from one frequency to another (e.g., daily to monthly). It is useful for aggregating or interpolating data.

Example: Downsampling

# Downsample to monthly frequency (sum of daily sales)
monthly_sales = df.resample("M").sum()
print(monthly_sales)

Output:

            Sales
2023-01-31   1150

Example: Upsampling

# Upsample to hourly frequency (fill missing values with forward fill)
hourly_sales = df.resample("H").ffill()
print(hourly_sales.head())

Output (partial):

                     Sales
2023-01-01 00:00:00    100
2023-01-01 01:00:00    100
2023-01-01 02:00:00    100
2023-01-01 03:00:00    100
2023-01-01 04:00:00    100

Handling Date/Time Data

Pandas makes it easy to parse and manipulate date/time data.

Parsing Date Strings

# Parse a column of date strings
data = {"Date": ["2023-01-01", "2023-01-02", "2023-01-03"], "Sales": [100, 200, 150]}
df = pd.DataFrame(data)
 
# Convert 'Date' column to datetime
df["Date"] = pd.to_datetime(df["Date"])
print(df.dtypes)

Output:

Date     datetime64[ns]
Sales             int64
dtype: object

Extracting Components

# Extract year, month, and day
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
df["Day"] = df["Date"].dt.day
print(df)

Output:

        Date  Sales  Year  Month  Day
0 2023-01-01    100  2023      1    1
1 2023-01-02    200  2023      1    2
2 2023-01-03    150  2023      1    3

Filtering by Date

# Filter rows by date range
filtered_df = df[(df["Date"] >= "2023-01-02") & (df["Date"] <= "2023-01-03")]
print(filtered_df)

Output:

        Date  Sales  Year  Month  Day
1 2023-01-02    200  2023      1    2
2 2023-01-03    150  2023      1    3

Try It Yourself

Problem 1: Create a Time Series

Create a DataFrame with daily temperatures for a week starting from “2023-01-01”. Set the date as the index and calculate the weekly average temperature.

Show Code
import pandas as pd
 
# Create time series data
dates = pd.date_range(start="2023-01-01", periods=7, freq="D")
temperatures = [30, 32, 31, 29, 28, 27, 26]
df = pd.DataFrame(temperatures, index=dates, columns=["Temperature"])
 
# Calculate weekly average
weekly_avg = df.resample("W").mean()
print(weekly_avg)

Problem 2: Filter Time Series

Create a DataFrame with monthly sales data for the year 2023. Filter the data to show sales for Q2 (April to June).

Show Code
import pandas as pd
 
# Create monthly sales data
months = pd.date_range(start="2023-01-01", periods=12, freq="M")
sales = [500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600]
df = pd.DataFrame(sales, index=months, columns=["Sales"])
 
# Filter for Q2
q2_sales = df["2023-04-01":"2023-06-30"]
print(q2_sales)

Pyground

Play with Python!

Output: