Classification Models

Classification models are machine learning algorithms that predict which category (class) an input belongs to.

If the output is a label (Yes/No, Spam/Not Spam, Pass/Fail), classification is used.

Examples:

Email spam detection
Disease diagnosis
Customer churn prediction
Fraud detection

What is Classification?

Classification predicts a categorical outcome.

Instead of predicting a number, it predicts a class label.

Example:

Input → Study Hours
Output → Pass or Fail

The model learns patterns to decide which category the input belongs to.

Types of Classification

1. Binary Classification

Only two classes.

Examples:

Spam / Not Spam
Yes / No
Fraud / Not Fraud

2. Multi-Class Classification

More than two classes.

Examples:

Grade A / B / C
Cat / Dog / Horse
Product categories

3. Multi-Label Classification

One input can belong to multiple categories.

Example:

Movie genres → Action and Comedy
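
As a rough sketch of how multi-label targets are often represented, each input can be given one 0/1 slot per possible label. The genre list, the movie titles, and the helper name to_indicator are made up for illustration; scikit-learn's MultiLabelBinarizer produces a similar encoding.

```python
# Hypothetical genre list and movies, used only for illustration.
GENRES = ["Action", "Comedy", "Drama"]


def to_indicator(labels, classes=GENRES):
    """Return a 0/1 list with one slot per class."""
    label_set = set(labels)
    return [1 if c in label_set else 0 for c in classes]


movies = {
    "Movie A": ["Action", "Comedy"],  # belongs to two categories at once
    "Movie B": ["Drama"],
}

encoded = {title: to_indicator(genres) for title, genres in movies.items()}
for title, row in encoded.items():
    print(title, row)
```

A binary or multi-class target would have exactly one slot set per row; a multi-label target may have several.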

Common Classification Algorithms

1. Logistic Regression

Used mainly for binary classification, though it can be extended to multi-class problems.

Outputs a probability between 0 and 1.

Simple and effective.
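
To illustrate how a score becomes a probability and then a label, the sketch below passes a linear score through the sigmoid function. The weight, bias, and 0.5 threshold are made-up values chosen for illustration, not fitted parameters.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_pass(hours, weight=1.5, bias=-5.0, threshold=0.5):
    """Return (probability, label) for a hypothetical pass/fail model.

    weight, bias, and threshold are illustrative assumptions.
    """
    probability = sigmoid(weight * hours + bias)
    label = 1 if probability >= threshold else 0
    return probability, label


for hours in (2, 5):
    probability, label = predict_pass(hours)
    print(f"hours={hours}: probability={probability:.3f}, label={label}")
```

In a trained model the weight and bias are learned from data; only the last step (probability to label) is a fixed rule.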

2. K-Nearest Neighbors (KNN)

Classifies a data point by the majority class among its k nearest neighbors in the training data.

Works well for small datasets.
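
The idea can be sketched in a few lines of plain Python. This is a minimal version for a single numeric feature; the data, the Pass/Fail labels, and the function name knn_predict are assumptions for illustration, and a library implementation such as scikit-learn's KNeighborsClassifier handles many features and larger datasets.

```python
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Sort training points by absolute distance to the query and keep the closest k
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training data: study hours and outcome
hours = [1, 2, 3, 4, 5, 6]
labels = ["Fail", "Fail", "Fail", "Pass", "Pass", "Pass"]

print(knn_predict(hours, labels, query=1.5))
print(knn_predict(hours, labels, query=5.5))
```

Using an odd k for two classes avoids tied votes.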

3. Decision Tree

Creates a tree-like structure of decisions.

Easy to understand and interpret.

4. Random Forest

A collection of multiple decision trees whose predictions are combined, typically by majority vote.

Usually more accurate and stable than a single tree.
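
The voting step can be sketched as follows. The three "trees" here are hand-written threshold rules invented for illustration; a real random forest learns each tree from a random sample of the training data, which this sketch does not attempt.

```python
from collections import Counter


# Three hypothetical "trees", each reduced to a single threshold rule
def tree_a(hours):
    return "Pass" if hours >= 3 else "Fail"


def tree_b(hours):
    return "Pass" if hours >= 4 else "Fail"


def tree_c(hours):
    return "Pass" if hours >= 5 else "Fail"


def forest_predict(hours, trees=(tree_a, tree_b, tree_c)):
    """Return the label chosen by the majority of trees."""
    votes = Counter(tree(hours) for tree in trees)
    return votes.most_common(1)[0][0]


print(forest_predict(4))  # two of three trees vote Pass
print(forest_predict(3))  # two of three trees vote Fail
```

Because the final answer depends on several trees, one tree with an unusual threshold has less influence on the result.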

5. Support Vector Machine (SVM)

Finds the best boundary (hyperplane) between classes.

Effective for high-dimensional data.

6. Naive Bayes

Based on probability and Bayes' theorem, with the "naive" assumption that features are independent given the class.

Commonly used for text classification.
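
A small worked calculation shows Bayes' theorem applied to a single word in an email. All probabilities below are invented numbers for illustration; a real Naive Bayes model estimates them from training data and combines many words.

```python
# Hypothetical probabilities, chosen only for illustration
p_spam = 0.4                 # prior: fraction of emails that are spam
p_not_spam = 1 - p_spam
p_word_given_spam = 0.5      # chance the word appears in a spam email
p_word_given_not_spam = 0.1  # chance the word appears in a non-spam email

# Bayes' theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_word = p_word_given_spam * p_spam + p_word_given_not_spam * p_not_spam
p_spam_given_word = p_word_given_spam * p_spam / p_word

print(f"P(spam | word) = {p_spam_given_word:.3f}")
```

With these numbers, seeing the word raises the spam probability from the prior of 0.4 to roughly 0.77.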

7. Neural Networks

Used for complex classification problems.

Common in deep learning tasks.

Evaluation Metrics for Classification

1. Accuracy

Percentage of correct predictions.

Accuracy = Correct Predictions / Total Predictions
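
The formula translates directly into code. The label lists below are made-up values for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical true labels and predictions
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]

print(accuracy(y_true, y_pred))  # 4 of 5 predictions are correct
```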

2. Confusion Matrix

Shows:

True Positive (TP)
True Negative (TN)
False Positive (FP)
False Negative (FN)

Helps you understand which kinds of errors the model makes.
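
The four counts can be tallied with a short loop. This is a minimal sketch for the binary case, assuming class 1 is the positive class; the label lists and the function name confusion_counts are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, and FN for a binary problem."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical true labels and predictions
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

print(confusion_counts(y_true, y_pred))
```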

3. Precision

Of all predicted positives, the fraction that are actually positive.

Precision = TP / (TP + FP)

4. Recall

Of all actual positives, the fraction that the model correctly predicts.

Recall = TP / (TP + FN)

5. F1-Score

The harmonic mean of precision and recall, which balances the two.

F1 = 2 × (Precision × Recall) / (Precision + Recall)
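
The three formulas can be computed from the confusion-matrix counts. The counts below are invented for illustration, and returning 0.0 when a denominator is zero is a convention assumed here, not something the formulas themselves define.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    # Assumption: fall back to 0.0 when a denominator would be zero
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(f"Precision: {precision:.2f}")
print(f"Recall:    {recall:.2f}")
print(f"F1:        {f1:.2f}")
```

Here precision is high but recall is low, and the F1 score lands between them, closer to the lower value.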

Example Using Scikit-Learn

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import numpy as np

# Sample data
X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([0, 0, 0, 1, 1, 1])

# Split data (random_state fixes the split so results are reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create and train the model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict on the test set and report accuracy
predictions = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, predictions))
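
As a possible follow-up, the sketch below shows how the probabilities behind each prediction can be inspected with predict_proba. Fitting on all six points without a train/test split is a simplification assumed here to keep the example short, and the two new input values are arbitrary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same toy data as the example above
X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([0, 0, 0, 1, 1, 1])

# Assumption: fit on all six points to keep the sketch short
model = LogisticRegression()
model.fit(X, y)

# Arbitrary new inputs to classify
new_points = np.array([[1], [6]])
probabilities = model.predict_proba(new_points)  # columns follow model.classes_
labels = model.predict(new_points)

for point, probs, label in zip(new_points.ravel(), probabilities, labels):
    print(f"x={point}: P(class 0)={probs[0]:.2f}, P(class 1)={probs[1]:.2f}, predicted={label}")
```

Each row of probabilities sums to 1, and predict returns the class with the higher probability.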

When to Use Classification Models

Use classification when:

The target variable is categorical
You need a system that makes discrete decisions
You want to sort data into categories

Real-World Applications

Spam detection
Credit risk analysis
Medical diagnosis
Image recognition
Sentiment analysis

Key Takeaway

Classification models predict categories instead of numbers.

They are widely used in decision-making systems and help automate tasks such as spam filtering, medical diagnosis, and fraud detection.
