Classification models are machine learning algorithms that predict categories, also called classes.
If the output is a label (Yes/No, Spam/Not Spam, Pass/Fail), the task is a classification problem.
Examples:
Email spam detection
Disease diagnosis
Customer churn prediction
Fraud detection
What is Classification?
Classification predicts a categorical outcome.
Instead of predicting a number, it predicts a class label.
Example:
Input → Study Hours
Output → Pass or Fail
The model learns patterns from labeled training data and uses them to decide which category a new input belongs to.
Types of Classification
1. Binary Classification
Only two classes.
Examples:
Spam / Not Spam
Yes / No
Fraud / Not Fraud
2. Multi-Class Classification
More than two classes.
Examples:
Grade A / B / C
Cat / Dog / Horse
Product categories
3. Multi-Label Classification
One input can belong to multiple categories.
Example:
Movie genres → Action and Comedy
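One way to see the difference between the three types is in the shape of the target values. The sketch below uses made-up labels (the variable names and sample data are illustrative assumptions) and scikit-learn's MultiLabelBinarizer to encode the multi-label case:

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

# Binary: one label per sample, two possible values
y_binary = np.array([0, 1, 1, 0])  # 0 = Not Spam, 1 = Spam

# Multi-class: one label per sample, more than two possible values
y_multiclass = np.array(["Cat", "Dog", "Horse", "Dog"])

# Multi-label: each sample can carry several labels at once
movie_genres = [["Action", "Comedy"], ["Drama"], ["Action"]]
mlb = MultiLabelBinarizer()
y_multilabel = mlb.fit_transform(movie_genres)  # one 0/1 column per genre

print(mlb.classes_)   # genre names in sorted order
print(y_multilabel)   # one row per movie, one column per genre
```

In the multi-label case the target is a matrix rather than a single column, because each genre is effectively its own Yes/No question.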
Common Classification Algorithms
1. Logistic Regression
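Used mainly for binary classification; it can also be extended to multi-class problems.
Outputs a probability between 0 and 1, which is turned into a class label with a threshold (commonly 0.5).
Simple and effective.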
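To illustrate how a probability between 0 and 1 is produced, here is a minimal sketch of the sigmoid function applied to the study-hours example. The weight and bias values are made up for illustration; a trained model would learn them from data.

```python
import math

def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_pass_probability(study_hours, weight=1.5, bias=-5.0):
    # weight and bias are hypothetical values, not learned from real data
    return sigmoid(weight * study_hours + bias)

for hours in (1, 3, 5):
    p = predict_pass_probability(hours)
    label = "Pass" if p >= 0.5 else "Fail"  # 0.5 is a common default threshold
    print(f"{hours} h -> P(pass) = {p:.2f} -> {label}")
```

Moving the threshold away from 0.5 trades false positives against false negatives, which is often done when one type of error is more costly.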
2. K-Nearest Neighbors (KNN)
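Classifies a data point by majority vote among its k nearest neighbors in the training data.
Works well for small datasets, since every prediction compares the input against the stored training points.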
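The neighbor-voting idea is small enough to write by hand. This is a simplified sketch in plain Python, not how a library implements it; the function name and toy data are assumptions for illustration.

```python
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Squared Euclidean distance is enough for ranking neighbors
    distances = [
        (sum((a - b) ** 2 for a, b in zip(point, query)), label)
        for point, label in zip(train_points, train_labels)
    ]
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data: (study hours, hours of sleep) -> Pass/Fail
points = [(1, 5), (2, 6), (3, 5), (6, 7), (7, 8), (8, 7)]
labels = ["Fail", "Fail", "Fail", "Pass", "Pass", "Pass"]

print(knn_predict(points, labels, (2, 5)))  # the 3 nearest points are all "Fail"
print(knn_predict(points, labels, (7, 7)))  # the 3 nearest points are all "Pass"
```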
3. Decision Tree
Creates a tree-like structure of decisions.
Easy to understand and interpret.
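The interpretability of a tree can be seen by printing its learned rules. A minimal sketch with scikit-learn, using made-up study-hours data (the feature name is an illustrative assumption):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[1], [2], [3], [4], [5], [6]]  # study hours
y = [0, 0, 0, 1, 1, 1]              # 0 = Fail, 1 = Pass

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# Print the learned rules as readable if/else text
rules = export_text(tree, feature_names=["study_hours"])
print(rules)
print(tree.predict([[2], [5]]))
```

The printed rules read like a flowchart, which is why decision trees are often chosen when a prediction has to be explained.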
4. Random Forest
A collection of multiple decision trees whose predictions are combined by voting.
Usually more accurate and stable than a single tree.
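A rough way to compare a single tree with a forest is to train both on the same data. The sketch below uses a synthetic dataset from make_classification, so the numbers say nothing about any real problem, and the forest is not guaranteed to score higher on every split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data for illustration only
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

single_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

tree_accuracy = single_tree.score(X_test, y_test)
forest_accuracy = forest.score(X_test, y_test)
print("Single tree accuracy:", tree_accuracy)
print("Random forest accuracy:", forest_accuracy)
```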
5. Support Vector Machine (SVM)
Finds the boundary (hyperplane) that separates the classes with the largest margin.
Effective for high-dimensional data.
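With a linear kernel, the learned hyperplane can be inspected directly. A minimal sketch on made-up two-dimensional points; note that coef_ is only available in scikit-learn's SVC when kernel="linear".

```python
from sklearn.svm import SVC

# Two well-separated groups of toy points
X = [[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]]
y = [0, 0, 0, 1, 1, 1]

svm = SVC(kernel="linear")
svm.fit(X, y)

# For a linear kernel the boundary is w . x + b = 0
print("w =", svm.coef_[0], "b =", svm.intercept_[0])
# Support vectors are the training points closest to the boundary
print("Support vectors:", svm.support_vectors_)

svm_predictions = svm.predict([[2, 2], [5, 4]])
print(svm_predictions)
```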
6. Naive Bayes
Based on Bayes' theorem, with the "naive" assumption that features are independent of each other given the class.
Commonly used for text classification.
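For text classification, a typical setup counts words and feeds the counts to a Naive Bayes model. The sketch below uses four made-up messages, far too few for a real spam filter, just to show the steps.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up training set
messages = [
    "win a free prize now",
    "free money claim your prize",
    "meeting at noon tomorrow",
    "lunch with the team tomorrow",
]
labels = ["spam", "spam", "not spam", "not spam"]

# Turn each message into a vector of word counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)

nb = MultinomialNB()
nb.fit(X, labels)

new_messages = ["claim your free prize", "team meeting tomorrow"]
nb_predictions = nb.predict(vectorizer.transform(new_messages))
print(nb_predictions)
```

The new messages must go through the same fitted vectorizer (transform, not fit_transform) so that their columns line up with the training vocabulary.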
7. Neural Networks
Used for complex classification problems.
Common in deep learning tasks.
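As a small-scale illustration, scikit-learn's MLPClassifier trains a basic feed-forward network. The sketch below fits one on the synthetic "two moons" dataset, whose classes cannot be separated by a straight line; the layer sizes and iteration limit are arbitrary choices, not tuned values.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data with a curved class boundary
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Two hidden layers of 16 units each (arbitrary sizes for illustration)
mlp = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)

mlp_accuracy = mlp.score(X_test, y_test)
print("Test accuracy:", mlp_accuracy)
```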
Evaluation Metrics for Classification
1. Accuracy
Fraction of predictions that are correct, often reported as a percentage.
Accuracy = Correct Predictions / Total Predictions
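The formula can be checked by hand on a small set of made-up labels:

```python
# Made-up actual and predicted labels (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Count positions where the prediction matches the actual label
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

print(f"{correct} of {len(y_true)} correct -> accuracy = {accuracy:.2f}")
# prints: 6 of 8 correct -> accuracy = 0.75
```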
2. Confusion Matrix
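Shows how many predictions fall into each of four groups:
True Positive (TP): actual positive, predicted positive
True Negative (TN): actual negative, predicted negative
False Positive (FP): actual negative, predicted positive
False Negative (FN): actual positive, predicted negative
Helps you see which kinds of errors the model makes.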
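In scikit-learn, the four counts can be read from confusion_matrix. A minimal sketch on made-up labels:

```python
from sklearn.metrics import confusion_matrix

# Made-up actual and predicted labels (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# With labels=[0, 1], rows are actual classes and columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
# prints: TP=3 TN=3 FP=1 FN=1
```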
3. Precision
Of all the inputs predicted as positive, the fraction that are actually positive.
Precision = TP / (TP + FP)
4. Recall
Of all the inputs that are actually positive, the fraction the model correctly predicts as positive.
Recall = TP / (TP + FN)
5. F1-Score
Harmonic mean of precision and recall; it is high only when both are high.
F1 = 2 × (Precision × Recall) / (Precision + Recall)
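The three formulas above can be computed directly from the confusion-matrix counts. This is a sketch with hypothetical counts; returning 0.0 when a denominator is zero is a convention chosen here, not part of the formulas.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    # Assumption: treat an undefined ratio (zero denominator) as 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical fraud detector: 8 frauds caught, 2 false alarms, 12 frauds missed
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=12)
print(f"Precision={precision:.2f} Recall={recall:.2f} F1={f1:.2f}")
# prints: Precision=0.80 Recall=0.40 F1=0.53
```

Here most of the alerts are correct (high precision), but more than half of the actual frauds are missed (low recall), and the F1-Score reflects that imbalance.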
Example Using Scikit-Learn
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import numpy as np

# Sample data
X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([0, 0, 0, 1, 1, 1])
# Split data (stratify keeps both classes in the training set; random_state makes the split repeatable)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)
# Create model
model = LogisticRegression()
# Train model
model.fit(X_train, y_train)
# Predict
predictions = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, predictions))
When to Use Classification Models
Use classification when:
The target variable is categorical
You need a system that makes discrete decisions (approve/reject, flag/ignore)
You want to sort data into predefined categories
Real-World Applications
Spam detection
Credit risk analysis
Medical diagnosis
Image recognition
Sentiment analysis
Key Takeaway
Classification models predict categories instead of numbers.
They are widely used in decision-making systems and help automate classification tasks in real-world applications.