Activation functions are a key component of neural networks. They introduce non-linearity into the model, allowing it to learn complex patterns and relationships in data. Without activation functions, a stack of layers collapses into a single linear transformation, so the network behaves like a simple linear model no matter how many layers it has and fails to capture real-world complexities.
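
To see why a network without activation functions behaves like a linear model, here is a minimal sketch. The weight matrices and inputs are hypothetical values chosen for illustration, and bias terms are left out to keep it short.

```python
import numpy as np

# Hypothetical weights: 2 inputs -> 2 hidden units -> 1 output (no biases).
W1 = np.array([[1.0, -1.0],
               [0.5,  2.0]])
W2 = np.array([[1.0],
               [1.0]])
x = np.array([[1.0, 0.0],
              [0.0, 1.0]])  # batch of 2 samples

# Two linear layers with no activation in between...
two_layers = (x @ W1) @ W2
# ...give the same result as one linear layer with weights W1 @ W2.
one_layer = x @ (W1 @ W2)
layers_collapse = np.allclose(two_layers, one_layer)

# With a ReLU between the layers, the result is no longer that linear map.
with_relu = np.maximum(0, x @ W1) @ W2
relu_differs = not np.allclose(with_relu, one_layer)

print(layers_collapse, relu_differs)
```

Here the first sample's hidden value of -1 is clipped to 0 by ReLU, which is what breaks the equivalence with the single linear layer.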

Why Activation Functions are Important

Enable neural networks to learn non-linear patterns
Help models make meaningful predictions
Control how signals pass from one layer to another
Improve model performance and convergence during training

1. ReLU (Rectified Linear Unit)
ReLU is one of the most commonly used activation functions in deep learning, especially in hidden layers.

Formula
f(x) = max(0, x)

How it Works

If the input is positive, it returns the same value
If the input is negative, it returns 0

Advantages

Simple and computationally efficient
Helps reduce the vanishing gradient problem, since its gradient is 1 for all positive inputs
Speeds up training of deep networks

Limitations

Can suffer from “dead neurons”: a neuron whose input stays negative outputs 0, receives zero gradient, and stops updating
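
The dead-neuron limitation can be illustrated with the gradient of ReLU. This is a minimal sketch: `relu_grad` and the `pre_activations` values are hypothetical, and the gradient at exactly 0 is set to 0 by convention.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def relu_grad(x):
    # Derivative of max(0, x): 1 for positive inputs, 0 otherwise.
    # The value at exactly x = 0 is a convention; 0 is assumed here.
    return (x > 0).astype(float)

# Hypothetical pre-activations of one neuron across a batch of four samples.
pre_activations = np.array([-3.0, -1.5, -0.2, -4.0])

outputs = relu(pre_activations)
grads = relu_grad(pre_activations)

print(outputs)  # all zeros: the neuron is inactive on every sample
print(grads)    # all zeros: no gradient flows back, so its weights do not update
```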

Example

import numpy as np

def relu(x):
    # Element-wise max(0, x): negatives become 0, positives pass through
    return np.maximum(0, x)

print(relu(np.array([-2, -1, 0, 1, 2])))  # [0 0 0 1 2]

2. Sigmoid Function
Sigmoid is widely used in the output layer for binary classification problems.

Formula
f(x) = 1 / (1 + e^(-x))

How it Works

Maps input values to a range between 0 and 1
Useful when the output should be read as a probability, such as in output layers

Advantages

Smooth and differentiable
Suitable for binary outputs

Limitations

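Can cause the vanishing gradient problem, because its gradient is at most 0.25 and approaches 0 for inputs far from zero
Often converges more slowly than ReLU in deep networks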
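
The vanishing gradient limitation follows from the derivative of sigmoid, f'(x) = f(x) * (1 - f(x)). The sketch below evaluates it at a few points. The helper `sigmoid_grad` and the 10-layer depth are assumptions for illustration, and the depth estimate ignores the effect of the weights.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s), which peaks at 0.25 when x = 0.
    s = sigmoid(x)
    return s * (1 - s)

xs = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
grads = sigmoid_grad(xs)
print(grads)  # largest at x = 0, close to 0 at x = -10 and x = 10

# Best case for the activation part of the chain rule through a
# hypothetical stack of 10 sigmoid layers (weights ignored).
depth = 10
best_case = 0.25 ** depth
print(best_case)
```

Even in the best case, the activation derivatives alone shrink the gradient by a factor of 4 per layer, which is one reason sigmoid is rarely used in deep hidden layers.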

Example

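import numpy as np

def sigmoid(x):
    # Squashes any real input into the range (0, 1)
    return 1 / (1 + np.exp(-x))

print(sigmoid(np.array([-2, 0, 2])))  # approximately [0.119, 0.5, 0.881]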

3. Tanh (Hyperbolic Tangent)
Tanh is similar to sigmoid but outputs values between -1 and 1.

Formula
f(x) = tanh(x)

How it Works

Produces outputs centered around zero
Useful in hidden layers
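
The similarity between tanh and sigmoid can be made precise: tanh(x) = 2 * sigmoid(2x) - 1, so tanh is a rescaled and shifted sigmoid. The sketch below checks this numerically and compares the mean output of each function on inputs that are symmetric around zero. The sample points are an arbitrary choice.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Seven evenly spaced inputs, symmetric around zero.
xs = np.linspace(-3, 3, 7)

# tanh is sigmoid stretched to the range (-1, 1).
rescaled_sigmoid = 2 * sigmoid(2 * xs) - 1
matches = np.allclose(np.tanh(xs), rescaled_sigmoid)

# For symmetric inputs, tanh outputs average to about 0,
# while sigmoid outputs average to about 0.5.
tanh_mean = np.tanh(xs).mean()
sigmoid_mean = sigmoid(xs).mean()

print(matches, tanh_mean, sigmoid_mean)
```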

Advantages

Zero-centered output
Often performs better than sigmoid in hidden layers

Limitations

Still suffers from the vanishing gradient problem for inputs far from zero
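
The derivative of tanh is 1 - tanh(x)^2, which equals 1 at x = 0 and shrinks toward 0 as the input moves away from zero. A minimal sketch, with `tanh_grad` as a hypothetical helper name:

```python
import numpy as np

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    return 1 - np.tanh(x) ** 2

xs = np.array([-5.0, 0.0, 5.0])
grads = tanh_grad(xs)
print(grads)  # 1 at x = 0, close to 0 at x = -5 and x = 5
```

The peak gradient of 1 is larger than sigmoid's peak of 0.25, but the gradient still vanishes once the unit saturates.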

Example

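import numpy as np

def tanh(x):
    # Squashes any real input into the range (-1, 1)
    return np.tanh(x)

print(tanh(np.array([-2, 0, 2])))  # approximately [-0.964, 0, 0.964]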

Comparison of Activation Functions

ReLU: Fast, widely used, best for hidden layers
Sigmoid: Outputs probabilities, best for binary classification output
Tanh: Zero-centered, better than sigmoid for hidden layers

When to Use Which Function

Use ReLU in hidden layers for most deep learning models
Use Sigmoid in the output layer for binary classification
Use Tanh when you need zero-centered outputs
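
These guidelines can be sketched as a forward pass through a small binary classifier with ReLU in the hidden layer and sigmoid at the output. The layer sizes, the random weights, and the `forward` function are assumptions for illustration, not a trained model.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward(x, W1, b1, W2, b2):
    hidden = relu(x @ W1 + b1)         # ReLU in the hidden layer
    return sigmoid(hidden @ W2 + b2)   # sigmoid maps the output into (0, 1)

rng = np.random.default_rng(42)

# Hypothetical sizes: 3 input features -> 4 hidden units -> 1 output.
W1 = rng.normal(size=(3, 4))
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))
b2 = np.zeros(1)

x = rng.normal(size=(5, 3))  # batch of 5 samples
probs = forward(x, W1, b1, W2, b2)

print(probs.shape)  # one value per sample, readable as a probability
```

Swapping `relu` for `np.tanh` in the hidden layer would give the zero-centered variant mentioned above.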

Applications in Deep Learning

Image classification and computer vision tasks
Natural language processing models
Speech recognition systems
Neural networks for prediction and classification

Lesson Summary
In this lesson, you learned about activation functions and their role in neural networks. You explored ReLU, Sigmoid, and Tanh functions, their formulas, advantages, limitations, and use cases. Activation functions are essential for enabling neural networks to learn complex patterns and make accurate predictions.
