An array is a data structure used to store multiple values in a single variable.
In Data Analytics, arrays are mainly used through the NumPy library because NumPy arrays are faster and more memory-efficient than normal Python lists for numerical work.
What is an Array?
An array is a collection of elements of the same data type, arranged in one or more dimensions (for example, rows and columns).
Example of a simple list:
numbers = [1, 2, 3, 4]
Example of a NumPy array:
import numpy as np
arr = np.array([1, 2, 3, 4])
Why Use NumPy Arrays Instead of Lists?
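- Faster performance, because operations run in compiled code instead of Python loops
- Less memory usage, because elements of one data type are stored compactly
- Supports mathematical operations directly on whole arrays
- Works efficiently with large datasets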
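To make the difference from lists concrete, here is a minimal sketch. The variable names and the sample values are made up for illustration only.

```python
import numpy as np

prices = [10, 20, 30, 40]          # hypothetical sample data
prices_arr = np.array(prices)

# A list needs a loop or comprehension for element-wise math.
doubled_list = [p * 2 for p in prices]

# An array applies the operation to every element at once.
doubled_arr = prices_arr * 2

# Multiplying a list by 2 repeats it instead of doubling the values.
repeated_list = prices * 2

print(doubled_list)    # [20, 40, 60, 80]
print(doubled_arr)     # [20 40 60 80]
print(repeated_list)   # [10, 20, 30, 40, 10, 20, 30, 40]
```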
Types of Arrays
1D Array (Single Dimension)
arr = np.array([10, 20, 30, 40])
Shape: (4,)
2D Array (Matrix)
arr2 = np.array([[1, 2, 3],
                 [4, 5, 6]])
Shape: (2, 3)
2 rows and 3 columns
3D Array
arr3 = np.array([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]]
])
Shape: (2, 2, 2)
3D arrays are common in image processing, for example a color image stored as height, width, and color channels.
Array Properties
Check shape:
arr.shape
Check number of dimensions:
arr.ndim
Check size (total elements):
arr.size
Check data type:
arr.dtype
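As a quick check, the properties above can be printed for the 2D and 3D arrays from the earlier examples. This is only a sketch. The exact integer dtype depends on the platform and NumPy version, so it is not shown as a fixed value.

```python
import numpy as np

arr2 = np.array([[1, 2, 3],
                 [4, 5, 6]])
arr3 = np.array([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
])

print(arr2.shape, arr2.ndim, arr2.size)   # (2, 3) 2 6
print(arr3.shape, arr3.ndim, arr3.size)   # (2, 2, 2) 3 8

# The dtype is an integer type; its width (e.g. int64) is platform dependent.
print(arr2.dtype)
```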
Accessing Elements
Access by index:
arr[0]      # First element (indexing starts at 0)
arr2[1, 2]  # Second row, third column (row index 1, column index 2)
Slicing:
arr[1:3]    # Elements at index 1 and 2 (the stop index is excluded)
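The indexing and slicing rules can be tried out as follows. This sketch reuses the 1D and 2D arrays from the earlier examples and adds a few variable names of its own for illustration.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])
arr2 = np.array([[1, 2, 3],
                 [4, 5, 6]])

first = arr[0]            # 10
last = arr[-1]            # 40, negative indices count from the end
middle = arr[1:3]         # [20 30], the stop index 3 is excluded
element = arr2[1, 2]      # 6, row index 1 and column index 2
first_row = arr2[0]       # [1 2 3]
second_col = arr2[:, 1]   # [2 5], every row of column index 1

print(first, last, middle, element, first_row, second_col)
```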
Array Operations
Addition:
arr + 5
Multiplication:
arr * 2
Array + Array:
arr + arr
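The operations above work element by element. A small sketch follows, where the second array `other` is a made-up example and not part of the original text.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])
other = np.array([1, 2, 3, 4])   # hypothetical second array of the same shape

plus_five = arr + 5        # [15 25 35 45]
times_two = arr * 2        # [20 40 60 80]
self_sum = arr + arr       # [20 40 60 80]
pair_sum = arr + other     # [11 22 33 44]

print(plus_five, times_two, self_sum, pair_sum)
```

Adding two arrays this way requires compatible shapes. Here both have shape (4,).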
Special Array Creation Functions
Zeros:
np.zeros((2, 3))
Ones:
np.ones((3, 3))
Range:
np.arange(0, 10)  # Integers 0 through 9 (the stop value is excluded)
Random numbers:
np.random.rand(3, 3)
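The creation functions above can be inspected like this. The variable names are chosen only for this sketch, and the random values differ on every run, so only their shape and range are noted.

```python
import numpy as np

zeros = np.zeros((2, 3))        # 2 rows, 3 columns of 0.0 (float64 by default)
ones = np.ones((3, 3))          # 3x3 array of 1.0
counts = np.arange(0, 10)       # 0, 1, ..., 9
randoms = np.random.rand(3, 3)  # uniform random floats in [0, 1)

print(zeros.shape, zeros.dtype)   # (2, 3) float64
print(ones.sum())                 # 9.0
print(counts)                     # [0 1 2 3 4 5 6 7 8 9]
print(randoms.shape)              # (3, 3)
```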
Why Arrays Matter in Data Analytics
Arrays allow you to:
- Perform fast calculations
- Handle structured data
- Apply mathematical models
- Work with large datasets efficiently
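As a small illustration of these points, the sketch below summarizes a tiny table of made-up sales figures. The data, the row and column meanings, and the variable names are all assumptions for this example.

```python
import numpy as np

# Hypothetical data: each row is a day, each column is a product.
sales = np.array([[10, 20, 30],
                  [40, 50, 60]])

product_means = sales.mean(axis=0)   # mean per column: 25.0, 35.0, 45.0
daily_totals = sales.sum(axis=1)     # total per row: 60, 150
best_sale = sales.max()              # largest single value: 60

print(product_means, daily_totals, best_sale)
```

The `axis` argument selects the direction of the calculation: `axis=0` works down the rows and gives one result per column, while `axis=1` works across the columns and gives one result per row.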
Key Takeaway
Arrays are the backbone of numerical computing in Python.
Understanding NumPy arrays is essential for Data Analytics, Machine Learning, and Scientific Computing.