Scatter Plots in Matplotlib
Scatter plots are used to visualize relationships between two numerical variables. Each point on the scatter plot represents a single observation, with its position determined by its x and y values.
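As a quick check of the "one point per observation" idea, the sketch below draws five points and then inspects what was drawn. It uses the object returned by scatter() (a PathCollection) and its get_offsets() method. The data values are illustrative, and the Agg backend line is an assumption made only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt

# Five observations, each described by an x value and a y value
x = [5, 10, 15, 20, 25]
y = [7, 14, 21, 28, 35]

fig, ax = plt.subplots()
points = ax.scatter(x, y)

# get_offsets() returns one (x, y) row per plotted point
offsets = points.get_offsets()
print(offsets.shape)  # (5, 2): five observations, two coordinates each
```

Each row of offsets corresponds to one marker on the plot, so the number of rows always matches the number of observations passed in.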
Creating Scatter Plots
To create a scatter plot, use the scatter() function in Matplotlib.
Example: Basic Scatter Plot
import matplotlib.pyplot as plt
# Data
x = [5, 10, 15, 20, 25]
y = [7, 14, 21, 28, 35]
# Create scatter plot
plt.scatter(x, y, color='blue')
# Add title and labels
plt.title("Basic Scatter Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
# Display the plot
plt.show()
Customizing Scatter Plots
The scatter() function accepts keyword arguments that control how the points look. The most commonly used ones are:
Parameter | Description | Example Value |
---|---|---|
color | Color of the points | 'red' |
s | Size of the points (marker area, in points squared) | 100 |
marker | Shape of the marker | 'o', 's' |
edgecolor | Color of the marker’s edge | 'black' |
alpha | Transparency of the points (0 to 1) | 0.5 |
Example: Customized Scatter Plot
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
# Create scatter plot
plt.scatter(x, y, color='green', s=100, marker='s', edgecolor='black', alpha=0.7)
# Add title and labels
plt.title("Customized Scatter Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
# Display the plot
plt.show()
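The s parameter from the table above does not have to be a single value. As a minimal sketch, the snippet below passes one size per point, and also colors each point by a number using the c and cmap parameters of scatter(), which are not covered in the table. The sizes and values lists are hypothetical illustration data, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
sizes = [20, 60, 100, 140, 180]      # hypothetical: one marker area per point
values = [0.1, 0.3, 0.5, 0.7, 0.9]   # hypothetical: one number per point, mapped to a color

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=sizes, c=values, cmap="viridis",
                    edgecolors="black", alpha=0.7)

# The colorbar shows how the numbers in `values` map to colors
fig.colorbar(points, ax=ax, label="Value")
ax.set_title("Per-Point Sizes and Colors")
```

Calling plt.show() afterwards (with an interactive backend) displays the figure. This form can be handy when a third or fourth variable needs to appear on the same plot.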
Plotting Multiple Datasets
You can plot multiple datasets on the same scatter plot to compare trends.
Example: Scatter Plot with Multiple Datasets
import matplotlib.pyplot as plt
# Data
x1, y1 = [1, 2, 3, 4], [2, 4, 6, 8]
x2, y2 = [1, 2, 3, 4], [3, 6, 9, 12]
# Create scatter plot
plt.scatter(x1, y1, label="Dataset 1", color='blue', s=50)
plt.scatter(x2, y2, label="Dataset 2", color='red', s=100)
# Add title, labels, and legend
plt.title("Scatter Plot with Multiple Datasets")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
# Display the plot
plt.show()
Practical Examples
Example 1: Student Scores
import matplotlib.pyplot as plt
# Data
subjects = ["Math", "Science", "English", "History"]
score1 = [90, 85, 88, 75] # Student A
score2 = [80, 88, 84, 90] # Student B
# Create scatter plot
plt.scatter(subjects, score1, label="Student A", color='blue', s=80)
plt.scatter(subjects, score2, label="Student B", color='green', s=80)
# Add title, labels, and legend
plt.title("Student Scores Across Subjects")
plt.xlabel("Subjects")
plt.ylabel("Scores")
plt.legend()
# Display the plot
plt.show()
Example 2: House Prices
import matplotlib.pyplot as plt
# Data
area = [500, 700, 1000, 1200, 1500] # Area in square feet
prices = [300000, 400000, 600000, 750000, 1000000] # Prices in USD
# Create scatter plot
plt.scatter(area, prices, color='orange', s=100, marker='^')
# Add title and labels
plt.title("House Prices vs. Area")
plt.xlabel("Area (sq ft)")
plt.ylabel("Price (USD)")
# Display the plot
plt.show()
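The house price plot suggests that price rises with area. One possible way to put a number on that relationship is sketched below, using NumPy's corrcoef() and polyfit() on the same data and drawing the fitted line over the points. NumPy and the straight-line fit are additions for illustration, not part of the original example, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt

area = np.array([500, 700, 1000, 1200, 1500])                 # Area in square feet
prices = np.array([300000, 400000, 600000, 750000, 1000000])  # Prices in USD

# Pearson correlation coefficient between area and price
r = np.corrcoef(area, prices)[0, 1]

# Least-squares straight line: price is approximately slope * area + intercept
slope, intercept = np.polyfit(area, prices, 1)

fig, ax = plt.subplots()
ax.scatter(area, prices, color='orange', s=100, marker='^', label="Houses")
ax.plot(area, slope * area + intercept, color='gray', label="Least-squares line")
ax.set_title("House Prices vs. Area")
ax.set_xlabel("Area (sq ft)")
ax.set_ylabel("Price (USD)")
ax.legend()

print(f"correlation: {r:.3f}, slope: {slope:.1f} USD per sq ft")
```

A correlation close to 1 indicates a strong positive linear relationship, which matches what the scatter plot shows visually. With only five points, the fitted line should be read as a rough summary rather than a reliable model.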
Try It Yourself
Problem 1: Visualize Sales Data
Create a scatter plot to visualize the sales data for two products (Product A and Product B) across four regions (North, South, East, West).
import matplotlib.pyplot as plt
# Data
regions = ["North", "South", "East", "West"]
product_a = [100, 150, 200, 180]
product_b = [120, 140, 220, 160]
# Create scatter plot
plt.scatter(regions, product_a, label="Product A", color='blue', s=80)
plt.scatter(regions, product_b, label="Product B", color='red', s=80)
# Add title, labels, and legend
plt.title("Sales Data Across Regions")
plt.xlabel("Regions")
plt.ylabel("Sales (Units)")
plt.legend()
# Display the plot
plt.show()
Problem 2: Compare Population Growth
Create a scatter plot comparing the populations of two countries at ten-year intervals from 1970 to 2010.
import matplotlib.pyplot as plt
# Data
decades = ["1970", "1980", "1990", "2000", "2010"]
country_a = [50, 55, 60, 70, 80]
country_b = [40, 45, 55, 65, 75]
# Create scatter plot
plt.scatter(decades, country_a, label="Country A", color='green', s=100)
plt.scatter(decades, country_b, label="Country B", color='purple', s=100)
# Add title, labels, and legend
plt.title("Population Growth Comparison")
plt.xlabel("Decades")
plt.ylabel("Population (Millions)")
plt.legend()
# Display the plot
plt.show()
Scatter plots are an excellent way to visualize the relationship between two variables. Experiment with various customization options to create clear and informative visualizations.