In pandas, Series and DataFrame are the two main data structures used for handling data.
They help you organize, analyze, and manipulate structured datasets efficiently.
First, import pandas:
import pandas as pd
1. Series
A Series is a one-dimensional labeled array.
It is similar to a single column in a table.
Creating a Series
data = pd.Series([10, 20, 30, 40])
print(data)
By default, pandas assigns index values starting from 0.
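As a small sketch of what that default index looks like, the snippet below inspects the index and values of the same Series. The variable names `index_labels` and `values` are chosen here for illustration.

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40])

# With no index argument, pandas builds a RangeIndex: 0, 1, 2, 3.
index_labels = list(data.index)
values = data.tolist()

print(index_labels)
print(values)
```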
Creating a Series with Custom Index
data = pd.Series([10, 20, 30], index=["A", "B", "C"])
print(data)
Accessing Values
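data["A"]       # access by label
data.iloc[0]    # access by position (plain data[0] is deprecated when the index holds labels)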
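One way to keep label-based and position-based access unambiguous is to use the explicit `.loc` and `.iloc` accessors. The following is a minimal sketch; the variable name `scores` is an illustrative choice.

```python
import pandas as pd

# Values 10, 20, 30 labeled "A", "B", "C".
scores = pd.Series([10, 20, 30], index=["A", "B", "C"])

by_label = scores.loc["B"]        # label-based lookup
by_position = scores.iloc[1]      # position-based lookup (second element)
subset = scores.loc[["A", "C"]]   # a list of labels returns a smaller Series

print(by_label, by_position)
print(subset)
```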
Series from Dictionary
data = pd.Series({
    "Ali": 85,
    "Sara": 90,
    "Ahmed": 78
})
print(data)
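When a Series is built from a dictionary, the keys become the index labels and the values become the data. The sketch below illustrates this with the same marks, plus a simple filter and an aggregate. The names `marks`, `top`, and `average` are assumptions made for the example.

```python
import pandas as pd

marks = pd.Series({"Ali": 85, "Sara": 90, "Ahmed": 78})

labels = list(marks.index)   # dictionary keys become the index
top = marks[marks >= 85]     # boolean filtering keeps the matching labels
average = marks.mean()       # aggregate over all values

print(labels)
print(top)
print(average)
```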
2. DataFrame
A DataFrame is a two-dimensional data structure.
It is like a table with rows and columns.
Creating a DataFrame
data = {
    "Name": ["Ali", "Sara", "Ahmed"],
    "Age": [25, 28, 30],
    "Salary": [50000, 60000, 70000]
}
df = pd.DataFrame(data)
print(df)
Viewing DataFrame Information
First 5 rows (the default for head()):
df.head()
Basic information:
df.info()
Statistical summary:
df.describe()
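The inspection methods above return objects you can work with further. The sketch below rebuilds the same table so it runs on its own; the variable names `first_two` and `summary` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Ahmed"],
    "Age": [25, 28, 30],
    "Salary": [50000, 60000, 70000],
})

n_rows, n_cols = df.shape   # (rows, columns)
first_two = df.head(2)      # head() accepts the number of rows to show
summary = df.describe()     # statistics for the numeric columns only

print(n_rows, n_cols)
print(first_two)
print(summary)
```

Note that `describe()` skips the text column "Name" by default, so the summary covers only "Age" and "Salary".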
Selecting Data in DataFrame
Select a single column:
df["Name"]
Select multiple columns:
df[["Name", "Salary"]]
Select a row by its index label:
df.loc[0]
Filter data using condition:
df[df["Age"] > 26]
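Conditions can be combined, and rows and columns can be selected in one step. The following is a rough sketch on the same table; the salary threshold and the variable names are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Ahmed"],
    "Age": [25, 28, 30],
    "Salary": [50000, 60000, 70000],
})

# Combine conditions with & (and) or | (or); each condition needs parentheses.
mid_earners = df[(df["Age"] > 26) & (df["Salary"] < 70000)]

# .loc takes a row condition and a column label together.
names_over_26 = df.loc[df["Age"] > 26, "Name"]

print(mid_earners)
print(names_over_26)
```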
Key Differences Between Series and DataFrame
Series:
- One-dimensional
- Represents a single column
- Has index and values
DataFrame:
- Two-dimensional
- Contains multiple columns
- Each column is a Series
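These differences can be checked directly. The sketch below pulls a column and a row out of the example table and confirms that both are Series; the variable names `age` and `row` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Ahmed"],
    "Age": [25, 28, 30],
    "Salary": [50000, 60000, 70000],
})

age = df["Age"]    # a single column is a Series
row = df.loc[0]    # a single row is also returned as a Series

print(type(age).__name__, age.ndim)   # one-dimensional
print(type(df).__name__, df.ndim)     # two-dimensional
print(list(row.index))                # the row's labels are the column names
```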
Why Series and DataFrame Matter
These structures allow you to:
- Store structured data
- Analyze real-world datasets
- Perform filtering and grouping
- Handle missing values
- Prepare data for visualization and machine learning
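To make a few of these tasks concrete, here is a minimal sketch that handles a missing value and then groups the data. The dataset is hypothetical: the "Department" column, the extra row, and the missing salary are invented for illustration and are not part of the earlier example.

```python
import pandas as pd

# Hypothetical data: "Department", "Hina", and the missing salary are invented.
staff = pd.DataFrame({
    "Name": ["Ali", "Sara", "Ahmed", "Hina"],
    "Department": ["IT", "HR", "IT", "HR"],
    "Salary": [50000.0, 60000.0, float("nan"), 80000.0],
})

# Count missing salaries, then fill them with the mean of the known values.
missing_count = int(staff["Salary"].isna().sum())
filled = staff.fillna({"Salary": staff["Salary"].mean()})

# Group by department and average the salaries.
avg_by_dept = filled.groupby("Department")["Salary"].mean()

print(missing_count)
print(avg_by_dept)
```

Filling with the mean is only one possible choice. Depending on the dataset, dropping the rows with `dropna()` or filling with another value may be more appropriate.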
Key Takeaway
Series and DataFrame are the foundation of data analysis in pandas.
Understanding these two structures is essential for working with real-world datasets efficiently.