Introduction to Pandas

Pandas is a powerful Python library used for data manipulation and analysis. It is one of the most important tools in Data Analytics and Data Science.

Pandas makes it easy to work with structured data such as tables, spreadsheets, and CSV files.

Why Use Pandas?

  • Handles structured data efficiently
  • Works with CSV, Excel, SQL, and more
  • Provides powerful data filtering and transformation tools
  • Built on top of NumPy
  • Widely used in industry

Installing Pandas

If pandas is not already installed, install it with pip:

pip install pandas

Import Pandas:

import pandas as pd

pd is the conventional alias for pandas, and the examples on this page use it.

Core Data Structures in Pandas

Pandas mainly uses two data structures:

1. Series

A Series is a one-dimensional labeled array.

import pandas as pd

data = pd.Series([10, 20, 30, 40])
print(data)
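To show what "labeled" means in practice, here is a minimal sketch that gives the same values an explicit index. The labels "a" to "d" and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical labels "a".."d", chosen only for illustration.
scores = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

# Look up a value by its label rather than by its position.
second = scores["b"]

# Arithmetic and aggregations apply to every value at once.
doubled = scores * 2
total = scores.sum()

print(second)
print(doubled)
print(total)
```

If no index is passed, pandas falls back to integer labels starting at 0, which is what the earlier example prints.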

2. DataFrame

A DataFrame is a two-dimensional table with rows and columns.

data = {
    "Name": ["Ali", "Sara", "Ahmed"],
    "Age": [25, 28, 30],
    "Salary": [50000, 60000, 70000]
}

df = pd.DataFrame(data)
print(df)
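As a quick sketch of how the two structures relate, the snippet below rebuilds the same table and inspects it. The variable names n_rows, n_cols, column_names, and age_column are illustrative only.

```python
import pandas as pd

data = {
    "Name": ["Ali", "Sara", "Ahmed"],
    "Age": [25, 28, 30],
    "Salary": [50000, 60000, 70000],
}
df = pd.DataFrame(data)

# shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape

# columns holds the column labels, in the order of the dict keys.
column_names = list(df.columns)

# Each column of a DataFrame is itself a Series.
age_column = df["Age"]

print(n_rows, n_cols)
print(column_names)
print(type(age_column).__name__)
```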

Loading Data into Pandas

From a CSV file:

df = pd.read_csv("data.csv")

From an Excel file (reading .xlsx files requires the openpyxl package to be installed):

df = pd.read_excel("data.xlsx")
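The file names above assume those files exist on disk. To try loading without any file, the sketch below reads CSV text from memory. The contents of csv_text are made up to mirror the earlier table.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a real file such as "data.csv".
csv_text = (
    "Name,Age,Salary\n"
    "Ali,25,50000\n"
    "Sara,28,60000\n"
    "Ahmed,30,70000\n"
)

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# usecols limits loading to the columns you actually need.
subset = pd.read_csv(io.StringIO(csv_text), usecols=["Name", "Age"])

print(df.shape)
print(list(subset.columns))
```

With a real file, replace io.StringIO(csv_text) with the path to the file.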

Viewing Data

First 5 rows (the default; pass a number, such as df.head(10), to change it):

df.head()

Last 5 rows:

df.tail()

Check column names, data types, and non-null counts:

df.info()

Statistical summary of the numeric columns (count, mean, standard deviation, min, quartiles, max):

df.describe()
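The viewing methods can be tried together on the small table from earlier. This is a minimal sketch, and the variable names first_two, last_one, and summary are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Ahmed"],
    "Age": [25, 28, 30],
    "Salary": [50000, 60000, 70000],
})

# head and tail take an optional row count.
first_two = df.head(2)
last_one = df.tail(1)

# info prints column names, dtypes, and non-null counts.
df.info()

# describe returns a DataFrame; by default it covers numeric columns only.
summary = df.describe()

print(first_two)
print(last_one)
print(summary.loc["mean"])
```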

Selecting Data

Select a single column (returns a Series):

df["Age"]

Select multiple columns (returns a DataFrame; note the double brackets):

df[["Name", "Salary"]]

Filter rows by a condition (here, rows where Age is greater than 26):

df[df["Age"] > 26]
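Filters can be combined, and rows and columns can be selected in one step. The sketch below reuses the earlier table. The salary threshold and the variable names are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Ahmed"],
    "Age": [25, 28, 30],
    "Salary": [50000, 60000, 70000],
})

# A comparison produces a True/False Series that selects matching rows.
older = df[df["Age"] > 26]

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
older_high = df[(df["Age"] > 26) & (df["Salary"] >= 70000)]

# loc selects rows by condition and columns by name in one step.
names = df.loc[df["Age"] > 26, "Name"]

print(older)
print(older_high)
print(names)
```

Python's and / or keywords do not work element-wise on a Series, which is why the & and | operators are used here.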

Why Pandas is Important in Data Analytics

Pandas helps you:

  • Clean data
  • Filter and sort records
  • Handle missing values
  • Group and summarize data
  • Prepare data for visualization and machine learning
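Several of these tasks can be sketched on a small table. The data below is hypothetical: the "Department" column, the extra row, and the missing salary are invented for illustration. Filling a gap with the column mean is only one possible choice.

```python
import pandas as pd

# Hypothetical data with one missing salary (None becomes NaN).
df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Ahmed", "Zara"],
    "Department": ["Sales", "IT", "IT", "Sales"],
    "Salary": [50000.0, 60000.0, None, 76000.0],
})

# Handle missing values: count them, then fill with the column mean.
missing_count = df["Salary"].isna().sum()
filled = df.assign(Salary=df["Salary"].fillna(df["Salary"].mean()))

# Sort records: highest salary first.
sorted_df = filled.sort_values("Salary", ascending=False)

# Group and summarize: average salary per department.
avg_by_dept = filled.groupby("Department")["Salary"].mean()

print(missing_count)
print(sorted_df)
print(avg_by_dept)
```

Here assign returns a new DataFrame and leaves df unchanged, so the original data stays available for comparison.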

Key Takeaway

Pandas is the backbone of data analysis in Python. Mastering DataFrames and Series will allow you to efficiently load, clean, analyze, and transform real-world datasets.
