Boxplots in Matplotlib

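Boxplots, also known as box-and-whisker plots, summarize the distribution of a dataset using five statistics: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. The box spans Q1 to Q3, the line inside it marks the median, and the whiskers extend toward the minimum and maximum. By default, Matplotlib stops each whisker at the most extreme data point within 1.5 times the interquartile range (IQR = Q3 - Q1) of the box and draws anything beyond that as an individual outlier point. This makes boxplots helpful for identifying outliers and understanding data variability.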
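To make the summary statistics concrete, here is a minimal sketch that computes them by hand with Python's standard library. It assumes quartiles are found by linear interpolation between data points (the "inclusive" method of statistics.quantiles) and that outliers are points more than 1.5 x IQR beyond the quartiles, which is Matplotlib's default whisker rule. The variable names are illustrative.

```python
import statistics

data = [7, 8, 5, 6, 9, 10, 6, 8, 7, 5]

# Quartiles by linear interpolation between data points
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences sit 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points inside the fences;
# anything outside the fences counts as an outlier
whisker_low = min(x for x in data if x >= lower_fence)
whisker_high = max(x for x in data if x <= upper_fence)
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Q1:", q1, "Median:", median, "Q3:", q3)
print("Whiskers:", whisker_low, "to", whisker_high)
print("Outliers:", outliers)
```

For this dataset the fences fall at 3 and 11, so every value lies inside them and the whiskers reach the true minimum and maximum.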


Creating Boxplots

The boxplot() function in Matplotlib creates boxplots. It accepts a single sequence of values, or a list of sequences to draw several boxes in one call.

Example: Basic Boxplot

import matplotlib.pyplot as plt
 
# Data
data = [7, 8, 5, 6, 9, 10, 6, 8, 7, 5]
 
# Create boxplot
plt.boxplot(data)
 
# Add title and labels
plt.title("Basic Boxplot")
plt.ylabel("Values")
 
# Display the plot
plt.show()
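The call to plt.boxplot() also returns a dictionary of the drawn artists (with keys such as "medians", "whiskers", and "fliers"), which can be used to read back what the plot shows. The sketch below is illustrative only: the value 25 is a hypothetical outlier added to the sample data, and the Agg backend is selected so the script also runs without a display (drop that line to use an interactive window).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt

# Sample data plus one hypothetical extreme value (25)
data = [7, 8, 5, 6, 9, 10, 6, 8, 7, 5, 25]

# boxplot() returns a dict mapping component names to lists of artists
result = plt.boxplot(data)

# The median line is horizontal, so both of its y-values equal the median
median_value = result["medians"][0].get_ydata()[0]

# Points beyond the whiskers are stored as "fliers"
outlier_values = list(result["fliers"][0].get_ydata())

# Each whisker line runs from the box edge to the whisker end
whisker_ends = [line.get_ydata()[1] for line in result["whiskers"]]

print("Median:", median_value)
print("Outliers:", outlier_values)
print("Whisker ends:", whisker_ends)

plt.close()
```

Here the median is 7, the whiskers end at 5 and 10, and 25 is the only point drawn as an outlier.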

Customizing Boxplots

Matplotlib provides several parameters to customize boxplots:

Parameter     | Description                                           | Example Value
--------------|-------------------------------------------------------|-------------------
notch         | Draw a notch to represent confidence intervals        | True
vert          | Orientation of the boxplot (vertical/horizontal)      | True, False
patch_artist  | Fill the box with color                               | True
boxprops      | Properties of the box                                 | dict(color='blue')
whiskerprops  | Properties of the whiskers                            | dict(color='red')

Example: Customized Boxplot

import matplotlib.pyplot as plt

# Data
data = [7, 8, 5, 6, 9, 10, 6, 8, 7, 5]

# Create customized boxplot: notched, filled light blue, with green whiskers
plt.boxplot(
    data,
    notch=True,
    patch_artist=True,
    boxprops=dict(facecolor='lightblue'),
    whiskerprops=dict(color='green'),
)
 
# Add title and labels
plt.title("Customized Boxplot")
plt.ylabel("Values")
 
# Display the plot
plt.show()
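The notch drawn by notch=True marks a confidence interval around the median. As a rough sketch of where its bounds come from, the example below uses matplotlib.cbook.boxplot_stats(), the helper that computes boxplot statistics, and compares its "cilo" and "cihi" values with the Gaussian-based approximation median ± 1.57 × IQR / sqrt(n) described in that function's documentation. This assumes the default settings (no bootstrapping).

```python
import math
from matplotlib import cbook

data = [7, 8, 5, 6, 9, 10, 6, 8, 7, 5]

# boxplot_stats returns one dict of statistics per dataset
stats = cbook.boxplot_stats(data)[0]

# Half-width of the notch under the default (non-bootstrap) approximation
half_width = 1.57 * stats["iqr"] / math.sqrt(len(data))

print("Median:", stats["med"])
print("Notch range (boxplot_stats):", stats["cilo"], stats["cihi"])
print("Notch range (formula):", stats["med"] - half_width, stats["med"] + half_width)
print("Box range (Q1 to Q3):", stats["q1"], stats["q3"])
```

For this data the notch spans roughly 6.01 to 7.99, just inside the box (6 to 8). With small samples or a wide IQR the notch can extend past the box edges, which gives the box a folded-looking shape, so notches are usually more informative on larger datasets.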

Boxplots with Multiple Datasets

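Boxplots can also be used to compare multiple datasets. Passing a list of datasets to boxplot() draws one box per dataset, side by side on a shared axis, so their medians and spreads can be compared directly.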

Example: Multiple Boxplots

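import matplotlib.pyplot as plt

# Data
data1 = [7, 8, 5, 6, 9, 10, 6, 8, 7, 5]
data2 = [6, 7, 8, 5, 6, 7, 6, 5, 7, 6]

# Create boxplots
# Note: Matplotlib 3.9 renamed the labels parameter to tick_labels;
# on 3.9 or later, use tick_labels=["Dataset 1", "Dataset 2"] instead.
plt.boxplot(
    [data1, data2],
    labels=["Dataset 1", "Dataset 2"],
    patch_artist=True,
    boxprops=dict(facecolor='lightblue'),
)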
 
# Add title and labels
plt.title("Boxplots for Multiple Datasets")
plt.ylabel("Values")
 
# Display the plot
plt.show()
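Passing facecolor through boxprops gives every box the same color. When comparing datasets it can help to color each box differently. One way to sketch this is to loop over the "boxes" entry of the dictionary that boxplot() returns. The color choices and output filename below are illustrative, the Agg backend is an assumption so the script runs without a display, and plt.xticks() is used for the tick labels so the example does not depend on the name of the label parameter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt

data1 = [7, 8, 5, 6, 9, 10, 6, 8, 7, 5]
data2 = [6, 7, 8, 5, 6, 7, 6, 5, 7, 6]
colors = ["lightblue", "lightgreen"]  # illustrative, one color per dataset

# patch_artist=True makes each box a filled patch that can be recolored
result = plt.boxplot([data1, data2], patch_artist=True)

# Color each box individually
for box, color in zip(result["boxes"], colors):
    box.set_facecolor(color)

# Boxes are drawn at x positions 1, 2, ... by default
plt.xticks([1, 2], ["Dataset 1", "Dataset 2"])
plt.title("Boxplots for Multiple Datasets")
plt.ylabel("Values")

# Save to a file instead of opening a window (hypothetical filename)
plt.savefig("boxplots_colored.png")
plt.close()
```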

Practical Examples

Example 1: Exam Scores

import matplotlib.pyplot as plt

# Data
scores = [55, 65, 70, 75, 80, 85, 90, 95, 100]

# Create a notched boxplot filled in light green
plt.boxplot(
    scores,
    notch=True,
    patch_artist=True,
    boxprops=dict(facecolor='lightgreen'),
)
 
# Add title and labels
plt.title("Exam Scores Distribution")
plt.ylabel("Scores")
 
# Display the plot
plt.show()

Example 2: Monthly Rainfall

import matplotlib.pyplot as plt

# Data: one rainfall value per month
rainfall = [100, 120, 85, 90, 150, 130, 110, 140, 95, 105, 125, 115]

# Create a notched boxplot filled in light coral
plt.boxplot(
    rainfall,
    notch=True,
    patch_artist=True,
    boxprops=dict(facecolor='lightcoral'),
)
 
# Add title and labels
plt.title("Monthly Rainfall Distribution")
plt.ylabel("Rainfall (mm)")
 
# Display the plot
plt.show()

Try It Yourself

Problem 1: Analyze Heights of Students

Create a boxplot to analyze the height distribution of students in your class.

Solution:

import matplotlib.pyplot as plt

# Data
heights = [150, 160, 165, 170, 155, 180, 175, 165, 158, 162]

# Create a notched boxplot filled in sky blue
plt.boxplot(
    heights,
    notch=True,
    patch_artist=True,
    boxprops=dict(facecolor='skyblue'),
)
 
# Add title and labels
plt.title("Height Distribution")
plt.ylabel("Height (cm)")
 
# Display the plot
plt.show()

Problem 2: Compare Sales Data

Visualize the distribution of sales data for two products using boxplots.

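Solution:

import matplotlib.pyplot as plt

# Data
product_a = [200, 300, 400, 150, 250, 350, 450, 300, 220, 310]
product_b = [180, 280, 380, 130, 230, 330, 430, 290, 200, 300]

# Create boxplots
# Note: Matplotlib 3.9 renamed the labels parameter to tick_labels;
# on 3.9 or later, use tick_labels=["Product A", "Product B"] instead.
plt.boxplot(
    [product_a, product_b],
    labels=["Product A", "Product B"],
    notch=True,
    patch_artist=True,
    boxprops=dict(facecolor='lightyellow'),
)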
 
# Add title and labels
plt.title("Sales Data Distribution")
plt.ylabel("Sales (Units)")
 
# Display the plot
plt.show()

Boxplots are powerful tools for visualizing data variability and identifying outliers. Use the customization options to create meaningful and visually appealing boxplots.

