
Attributes and Methods of Pandas Data Structures

Pandas provides several attributes and methods for both Series and DataFrame objects. These tools help you understand and manipulate data efficiently.


Commonly Used Attributes

Attributes for Both Series and DataFrame

Attribute    Description
index        Returns the index (row labels) of the object.
columns      Returns the column labels (DataFrame only).
dtypes       Returns the data type of each column (for a Series, its single dtype).
shape        Returns the dimensions as a tuple: (rows, columns) for a DataFrame, (length,) for a Series.
size         Returns the total number of elements.
values       Returns the underlying data as a NumPy array.
ndim         Returns the number of dimensions (1 for a Series, 2 for a DataFrame).
head()       A method rather than an attribute: returns the first n rows (default: 5).
tail()       A method rather than an attribute: returns the last n rows (default: 5).

Examples

Example 1: Checking Attributes of a DataFrame

import pandas as pd
 
# Sample DataFrame
data = {
    'Name': ['Anika', 'Rahul', 'Sneha'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Bangalore']
}
df = pd.DataFrame(data)
 
print("Index:", df.index)
print("Columns:", df.columns)
print("Data Types:\n", df.dtypes)
print("Shape:", df.shape)
print("Total Elements:", df.size)
print("Data Values:\n", df.values)

Output:

Index: RangeIndex(start=0, stop=3, step=1)
Columns: Index(['Name', 'Age', 'City'], dtype='object')
Data Types:
 Name    object
Age      int64
City    object
dtype: object
Shape: (3, 3)
Total Elements: 9
Data Values:
 [['Anika' 25 'Delhi']
 ['Rahul' 30 'Mumbai']
 ['Sneha' 22 'Bangalore']]
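The same attributes can be checked on a Series. The following is a minimal sketch; the `ages` Series and its labels are illustrative data, not part of the example above.

```python
import pandas as pd

# Hypothetical Series of ages labelled by name (illustrative data only)
ages = pd.Series([25, 30, 22], index=['Anika', 'Rahul', 'Sneha'], name='Age')

print("Index:", ages.index.tolist())
print("Dtype:", ages.dtype)        # a Series holds a single dtype
print("Shape:", ages.shape)        # one-element tuple: (length,)
print("Total Elements:", ages.size)
print("Dimensions:", ages.ndim)    # 1 for a Series

# A Series has no column labels, so there is no columns attribute
print("Has columns:", hasattr(ages, 'columns'))

# head() is a method: here it returns the first 2 entries
print(ages.head(2))
```

Note that shape is a one-element tuple for a Series, while a DataFrame reports both rows and columns.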

Commonly Used Methods

Methods for Data Exploration

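Method       Description
info()       Prints a concise summary of the DataFrame (index, columns, non-null counts, dtypes).
describe()   Generates summary statistics; by default only numeric columns are included.
isnull()     Returns a Boolean mask that is True where a value is missing.
notnull()    Returns a Boolean mask that is True where a value is present.
count()      Returns the number of non-null values (per column for a DataFrame).
unique()     Returns the unique values in a Series.
nunique()    Returns the number of unique values (missing values are excluded by default).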

Methods for Data Manipulation

Method         Description
sort_values()  Sorts by the values in one or more columns, or by the values of a Series.
sort_index()   Sorts by index labels.
drop()         Removes the specified rows or columns.
fillna()       Replaces missing values with the specified values.
astype()       Converts data to the specified type.
apply()        Applies a function to each element of a Series, or to each column or row of a DataFrame.
groupby()      Groups data by a column or index level for aggregation.

Examples

Example 2: Exploring Data

# info() prints its summary directly and returns None, so it is not wrapped in print()
df.info()
# Summary statistics for the numeric columns
print(df.describe())
 
# Check for missing values
print(df.isnull())
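The remaining exploration methods can be tried on data that actually contains a missing value. This is a small sketch; the `people` DataFrame is hypothetical and exists only for illustration.

```python
import pandas as pd

# Hypothetical data with one missing age and a repeated city
people = pd.DataFrame({
    'Name': ['Amit', 'Sonal', 'Neha', 'Ravi'],
    'Age': [31.0, None, 28.0, 28.0],
    'City': ['Pune', 'Chennai', 'Pune', 'Delhi']
})

# isnull() is element-wise; sum() then counts the True values per column
missing_per_column = people.isnull().sum()
print(missing_per_column)

# count() returns the number of non-null entries in each column
print(people.count())

# unique() works on a Series and keeps the order of first appearance
print(people['City'].unique())

# nunique() counts distinct values and skips missing ones by default
print(people['City'].nunique())
print(people['Age'].nunique())
```

Chaining isnull() with sum() is a common way to get a per-column count of missing values instead of a full Boolean table.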

Example 3: Manipulating Data

# Create a new DataFrame with missing ages, then fill them
new_data = {
    'Name': ['Amit', 'Sonal', 'Neha'],
    'Age': [None, 28, None],
    'City': ['Pune', 'Chennai', 'Hyderabad']
}
new_df = pd.DataFrame(new_data)
 
# Fill missing ages with 25
data_filled = new_df.fillna({'Age': 25})
print(data_filled)
 
# Sorting by Age
data_sorted = data_filled.sort_values(by='Age')
print(data_sorted)
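The other manipulation methods from the table (astype(), apply(), drop(), groupby(), and sort_index()) could be used along these lines. This is a sketch under assumed data; the `staff` DataFrame and the `NameLength` column are hypothetical names introduced for illustration.

```python
import pandas as pd

# Hypothetical data for illustration
staff = pd.DataFrame({
    'Name': ['Amit', 'Sonal', 'Neha', 'Ravi'],
    'Age': [25.0, 28.0, 25.0, 31.0],
    'City': ['Pune', 'Chennai', 'Pune', 'Chennai']
})

# astype(): convert the float ages to integers
staff['Age'] = staff['Age'].astype(int)

# apply() on a Series calls the function once per element
staff['NameLength'] = staff['Name'].apply(len)

# drop(): returns a new DataFrame without the given column
without_city = staff.drop(columns=['City'])

# groupby(): mean age per city
mean_age_by_city = staff.groupby('City')['Age'].mean()

# sort_index(): order rows by index label, here in descending order
reversed_rows = staff.sort_index(ascending=False)

print(without_city)
print(mean_age_by_city)
print(reversed_rows)
```

None of these calls modify `staff` in place; each returns a new object, which is why the results are assigned to new names or back to a column.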

Try It Yourself

Problem 1: Check Attributes

Create a DataFrame for student marks in Math, Science, and English. Use attributes like shape, dtypes, and values to explore the data.

Solution:
import pandas as pd
 
data = {
    'Math': [88, 92, 79],
    'Science': [85, 90, 84],
    'English': [91, 89, 76]
}
df = pd.DataFrame(data)
 
print("Shape:", df.shape)
print("Data Types:\n", df.dtypes)
print("Values:\n", df.values)

Problem 2: Manipulate DataFrame

Using the DataFrame from Problem 1:

  1. Add a new column Total that sums the marks for each student.
  2. Sort the DataFrame by the Total column, highest total first.

Solution:
# Add a Total column
df['Total'] = df['Math'] + df['Science'] + df['English']
print("With Total Column:\n", df)
 
# Sort by Total
df_sorted = df.sort_values(by='Total', ascending=False)
print("Sorted by Total:\n", df_sorted)
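As an alternative sketch, the total can be computed with sum(axis=1) rather than by adding the columns one by one, which scales better when there are many subjects. The names `marks` and `ranked` are illustrative, and the reset_index() step is an optional addition not required by the problem.

```python
import pandas as pd

marks = pd.DataFrame({
    'Math': [88, 92, 79],
    'Science': [85, 90, 84],
    'English': [91, 89, 76]
})

# Sum across the columns of each row (axis=1)
marks['Total'] = marks[['Math', 'Science', 'English']].sum(axis=1)

# Highest total first; reset_index(drop=True) renumbers the rows 0, 1, 2
ranked = marks.sort_values(by='Total', ascending=False).reset_index(drop=True)
print(ranked)
```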
