Attributes and Methods of Pandas Data Structures

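Pandas exposes attributes and methods on both Series and DataFrame objects. Attributes describe an object (its labels, data types, and dimensions) and are accessed without parentheses; methods are called with parentheses and perform an operation such as summarizing, sorting, or filling values.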

Commonly Used Attributes

Attributes for Both Series and DataFrame

Attribute	Description
`index`	Returns the index (row labels) of the object.
`columns`	Returns the column labels (DataFrame only).
`dtypes`	Returns the data type of each column (for a Series, its single data type).
`shape`	Returns the dimensions as a tuple: (rows, columns) for a DataFrame, (rows,) for a Series.
`size`	Returns the total number of elements.
`values`	Returns the underlying data as a NumPy array (the pandas documentation recommends `to_numpy()` for this).
`ndim`	Returns the number of dimensions (1 for a Series, 2 for a DataFrame).
`head()`	A method rather than an attribute: returns the first n rows (default: 5).
`tail()`	A method rather than an attribute: returns the last n rows (default: 5).
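
The same attributes behave slightly differently on a Series. The following is a minimal sketch, assuming a small Series named `ages` that is made up for illustration:

```python
import pandas as pd

# Hypothetical sample Series (illustrative data only)
ages = pd.Series([25, 30, 22], index=['Anika', 'Rahul', 'Sneha'], name='Age')

print("Index:", ages.index)        # row labels
print("dtype:", ages.dtype)        # a Series holds a single data type
print("Shape:", ages.shape)        # one-element tuple, because there is no column axis
print("Size:", ages.size)          # total number of elements
print("ndim:", ages.ndim)          # number of dimensions
print("Values:", ages.to_numpy())  # underlying data as a NumPy array
print(ages.head(2))                # method call: first 2 rows
```

A Series has no `columns` attribute, and its `shape` has a single entry.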

Examples

Example 1: Checking Attributes of a DataFrame

import pandas as pd
 
# Sample DataFrame
data = {
    'Name': ['Anika', 'Rahul', 'Sneha'],
    'Age': [25, 30, 22],
    'City': ['Delhi', 'Mumbai', 'Bangalore']
}
df = pd.DataFrame(data)
 
print("Index:", df.index)
print("Columns:", df.columns)
print("Data Types:\n", df.dtypes)
print("Shape:", df.shape)
print("Total Elements:", df.size)
print("Data Values:\n", df.values)

Output:

Index: RangeIndex(start=0, stop=3, step=1)
Columns: Index(['Name', 'Age', 'City'], dtype='object')
Data Types:
 Name    object
Age      int64
City    object
dtype: object
Shape: (3, 3)
Total Elements: 9
Data Values:
 [['Anika' 25 'Delhi']
 ['Rahul' 30 'Mumbai']
 ['Sneha' 22 'Bangalore']]

Commonly Used Methods

Methods for Data Exploration

Method	Description
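`info()`	Prints a concise summary of the DataFrame: index, columns, data types, non-null counts, and memory usage.
`describe()`	Generates summary statistics (count, mean, standard deviation, min, quartiles, max) for numeric columns by default.
`isnull()`	Returns a boolean mask that is True where a value is missing.
`notnull()`	Returns a boolean mask that is True where a value is present.
`count()`	Returns the count of non-null elements.
`unique()`	Returns the unique values in a Series.
`nunique()`	Returns the number of unique values (missing values are excluded by default).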
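
The counting methods in this table can be sketched as follows. This is a minimal illustration, assuming a made-up DataFrame with one missing city; the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with one missing value in 'City'
df = pd.DataFrame({
    'Name': ['Anika', 'Rahul', 'Sneha', 'Amit'],
    'City': ['Delhi', 'Mumbai', 'Delhi', None],
})

# Summing a boolean mask counts the True entries, i.e. the missing values
missing_per_column = df.isnull().sum()

# Non-null entries per column
non_null_counts = df.count()

# Distinct values in one column; the missing value appears in the result
unique_cities = df['City'].unique()

# Number of distinct values; missing values are not counted by default
n_unique_cities = df['City'].nunique()

print(missing_per_column)
print(non_null_counts)
print(unique_cities)
print(n_unique_cities)
```

Chaining `isnull()` with `sum()` is a common way to get one missing-value count per column instead of a full table of True/False values.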

Methods for Data Manipulation

Method	Description
`sort_values()`	Sorts by values in a column or Series.
`sort_index()`	Sorts by index labels.
`drop()`	Removes specified rows or columns.
`fillna()`	Replaces missing values with specified values.
`astype()`	Converts data to a specified type.
`apply()`	Applies a function to each element of a Series, or to each column (or row) of a DataFrame.
`groupby()`	Groups data by a column or index for aggregation.
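
A short sketch of several manipulation methods from this table follows. The DataFrame and variable names are made up for illustration, and the ages are deliberately stored as strings so that `astype()` has something to convert.

```python
import pandas as pd

# Hypothetical sample data; 'Age' is stored as strings on purpose
df = pd.DataFrame({
    'Name': ['Anika', 'Rahul', 'Sneha', 'Amit'],
    'Age': ['25', '30', '22', '30'],
    'City': ['Delhi', 'Mumbai', 'Delhi', 'Pune'],
})

# astype: convert the string ages to integers
df['Age'] = df['Age'].astype(int)

# apply on a Series: call len() on each name
name_lengths = df['Name'].apply(len)

# drop: returns a new DataFrame without the given column or row
without_city = df.drop(columns=['City'])
without_first_row = df.drop(index=[0])

# groupby: mean age for each city
mean_age_by_city = df.groupby('City')['Age'].mean()

# sort_index: order rows by index label, here descending
by_index_desc = df.sort_index(ascending=False)

print(name_lengths)
print(without_city)
print(mean_age_by_city)
print(by_index_desc)
```

Each of these calls returns a new object rather than changing `df` in place, which is why the results are assigned to new names.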

Examples

Example 2: Exploring Data

# Print a concise summary; info() prints directly and returns None
df.info()

# Summary statistics for the numeric columns
print(df.describe())

# Boolean mask: True where a value is missing
print(df.isnull())

Example 3: Manipulating Data

# Create a new DataFrame that has missing ages
new_data = {
    'Name': ['Amit', 'Sonal', 'Neha'],
    'Age': [None, 28, None],
    'City': ['Pune', 'Chennai', 'Hyderabad']
}
new_df = pd.DataFrame(new_data)
 
# Fill missing ages with 25
data_filled = new_df.fillna({'Age': 25})
print(data_filled)
 
# Sorting by Age
data_sorted = data_filled.sort_values(by='Age')
print(data_sorted)

Try It Yourself

Problem 1: Check Attributes

Create a DataFrame for student marks in Math, Science, and English. Use attributes like shape, dtypes, and values to explore the data.

import pandas as pd
 
data = {
    'Math': [88, 92, 79],
    'Science': [85, 90, 84],
    'English': [91, 89, 76]
}
df = pd.DataFrame(data)
 
print("Shape:", df.shape)
print("Data Types:\n", df.dtypes)
print("Values:\n", df.values)

Problem 2: Manipulate DataFrame

Using the DataFrame from Problem 1:

Add a new column Total that sums the marks for each student.
Sort the DataFrame by the Total column.

# Add a Total column
df['Total'] = df['Math'] + df['Science'] + df['English']
print("With Total Column:\n", df)
 
# Sort by Total
df_sorted = df.sort_values(by='Total', ascending=False)
print("Sorted by Total:\n", df_sorted)
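
As an alternative to adding the columns by name, the row totals can be computed with `sum(axis=1)`. This is a sketch only; the names `marks`, `subjects`, and `ranked` are hypothetical, and the data repeats the marks from Problem 1.

```python
import pandas as pd

# Same marks as in Problem 1
marks = pd.DataFrame({
    'Math': [88, 92, 79],
    'Science': [85, 90, 84],
    'English': [91, 89, 76]
})

# Sum across the subject columns for each row (axis=1 means across columns)
subjects = ['Math', 'Science', 'English']
marks['Total'] = marks[subjects].sum(axis=1)

# Highest total first
ranked = marks.sort_values(by='Total', ascending=False)
print(ranked)
```

Listing the subject columns explicitly keeps the sum correct if the code is run again after `Total` has already been added.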
