The heart of NumPy is the `ndarray` (N-dimensional array) object: a homogeneous, multi-dimensional container whose items all share the same data type. For example, you can have a 1-D array (similar to a list), a 2-D array (like a matrix), or arrays with even higher dimensions.
```python
import numpy as np

# Create a 1-D array
one_d_array = np.array([1, 2, 3, 4, 5])
print("1-D Array:", one_d_array)

# Create a 2-D array
two_d_array = np.array([[1, 2, 3], [4, 5, 6]])
print("2-D Array:", two_d_array)
```
`ndarray` objects have several useful attributes: `shape` returns a tuple giving the size of each dimension, `dtype` returns the data type of the array elements, and `ndim` returns the number of dimensions.
```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
print("Shape:", arr.shape)                # (2, 3): 2 rows, 3 columns
print("Data Type:", arr.dtype)            # platform default integer type, commonly int64
print("Number of Dimensions:", arr.ndim)  # 2
```
NumPy provides a wide range of functions for data manipulation. You can perform element-wise operations, reshape arrays, and concatenate multiple arrays.
NumPy makes it easy to compute statistical measures such as the mean, median, and standard deviation of an array.
NumPy also provides linear algebra routines, such as matrix multiplication and the computation of eigenvalues and eigenvectors.
```python
import numpy as np

# Create two arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Element-wise addition
result = arr1 + arr2
print("Element-wise addition:", result)        # [5 7 9]

# Element-wise multiplication
result = arr1 * arr2
print("Element-wise multiplication:", result)  # [ 4 10 18]
```
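Reshaping and concatenation, mentioned above as data-manipulation operations, can be sketched in the same way. The array contents and variable names below are illustrative assumptions, not taken from the original examples.

```python
import numpy as np

# Reshape a 1-D array of 6 elements into a 2 x 3 array
flat = np.arange(6)                 # [0 1 2 3 4 5]
grid = flat.reshape(2, 3)
print("Reshaped:\n", grid)

# Concatenate along rows (axis=0) and along columns (axis=1)
extra_row = np.array([[6, 7, 8]])
stacked_rows = np.concatenate([grid, extra_row], axis=0)
stacked_cols = np.concatenate([grid, grid], axis=1)
print("Row-wise shape:", stacked_rows.shape)     # (3, 3)
print("Column-wise shape:", stacked_cols.shape)  # (2, 6)
```

Note that `reshape` requires the total number of elements to stay the same, and `concatenate` requires the arrays to match in every dimension except the one being joined.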
```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
mean = np.mean(arr)
median = np.median(arr)
# np.std computes the population standard deviation (ddof=0) by default
std_dev = np.std(arr)
print("Mean:", mean)
print("Median:", median)
print("Standard Deviation:", std_dev)
```
```python
import numpy as np

# Create two matrices
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])

# Matrix multiplication (for 2-D arrays, equivalent to matrix1 @ matrix2)
result = np.dot(matrix1, matrix2)
print("Matrix multiplication:", result)
```
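Eigenvalues and eigenvectors, the other linear algebra operation mentioned earlier, can be computed with `np.linalg.eig`. The following is a minimal sketch; the matrix is an arbitrary symmetric example chosen for illustration.

```python
import numpy as np

# A small symmetric matrix, chosen only for illustration
matrix = np.array([[2.0, 1.0], [1.0, 2.0]])

# eigenvalues[i] pairs with the column eigenvectors[:, i]
eigenvalues, eigenvectors = np.linalg.eig(matrix)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)

# Verify the defining property A v = lambda v for each pair
checks = []
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    checks.append(np.allclose(matrix @ v, eigenvalues[i] * v))
print("All pairs satisfy A v = lambda v:", all(checks))
```

The order in which eigenvalues are returned is not guaranteed, so sort them explicitly if your code depends on the order.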
NumPy arrays can consume a significant amount of memory, especially when dealing with large datasets. It’s important to be aware of how much memory your arrays use and to release arrays you no longer need.
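One way to keep an eye on memory is the `nbytes` attribute, which reports the size of an array's data buffer. The sketch below uses an arbitrary array size; whether a smaller dtype such as `float32` is precise enough for your data is an assumption you would need to check.

```python
import numpy as np

# One million float64 values: 8 bytes each
big = np.zeros((1000, 1000), dtype=np.float64)
float64_bytes = big.nbytes
print("float64 bytes:", float64_bytes)   # 8000000

# A smaller dtype halves the footprint, at the cost of precision
smaller = big.astype(np.float32)
float32_bytes = smaller.nbytes
print("float32 bytes:", float32_bytes)   # 4000000

# Drop the reference to the large array once it is no longer needed;
# the memory can be reclaimed when no other references or views remain
del big
```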
Performing operations on arrays with incompatible data types can lead to unexpected results or errors. Always check that the data types of your arrays are appropriate for the operations you want to perform.
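As an illustration of how data types can surprise you, the following sketch mixes floats into an integer array. The values are made up for the example.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int64)

# Assigning a float into an integer array silently drops the fraction
ints[0] = 2.7
print(ints)  # [2 2 3]

# An in-place operation whose result cannot be cast back safely raises an error
try:
    ints += 0.5
    in_place_failed = False
except TypeError as exc:
    in_place_failed = True
    print("In-place add failed:", type(exc).__name__)

# The out-of-place version creates a new float array instead
as_float = ints + 0.5
print(as_float.dtype)  # float64
```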
Incorrect indexing can attempt to access elements outside the bounds of the array, which raises an `IndexError`.
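A short sketch of this behavior, using a made-up three-element array, also shows that slices behave differently from single integer indices:

```python
import numpy as np

arr = np.array([10, 20, 30])

# Negative indices count from the end and are valid
last = arr[-1]
print(last)  # 30

# An integer index past the end raises IndexError
try:
    arr[3]
    out_of_bounds_caught = False
except IndexError as exc:
    out_of_bounds_caught = True
    print("IndexError:", exc)

# Slices, by contrast, are clipped to the array bounds without an error
tail = arr[1:10]
print(tail)  # [20 30]
```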
Vectorized operations are typically much faster than traditional Python loops because they are implemented in highly optimized C code. Whenever possible, use NumPy’s built-in functions for element-wise operations.
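To make the contrast concrete, here is the same computation written as a Python loop and as a vectorized expression. The formula and array size are arbitrary, and no timings are claimed here; measure on your own data with a tool such as `timeit`.

```python
import numpy as np

values = np.arange(1_000, dtype=np.float64)

# Loop version: one Python-level iteration per element
result_loop = np.empty_like(values)
for i in range(len(values)):
    result_loop[i] = values[i] ** 2 + 1.0

# Vectorized version: a single expression over the whole array
result_vec = values ** 2 + 1.0

same = np.allclose(result_loop, result_vec)
print("Same result:", same)  # True
```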
If you know the size of the array you need in advance, pre-allocate it instead of appending elements one by one. This can significantly improve performance.
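The difference between the two patterns can be sketched as follows. The size `n` and the values being stored are placeholders for illustration.

```python
import numpy as np

n = 1_000

# Slow pattern: np.append returns a new, copied array on every call
grown = np.array([], dtype=np.float64)
for i in range(n):
    grown = np.append(grown, i * 0.5)

# Pre-allocated pattern: allocate once, then fill in place
filled = np.empty(n, dtype=np.float64)
for i in range(n):
    filled[i] = i * 0.5

identical = np.array_equal(grown, filled)
print("Same contents:", identical)  # True
```

Because `np.empty` leaves the buffer uninitialized, every element must be written before the array is read; `np.zeros` is a safer choice when some entries may be left unset.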
Before performing operations on arrays, check that their data types are compatible. You can use the `astype()` method to convert the data type if necessary.
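A minimal sketch of checking and converting data types with `astype()`, using made-up string data:

```python
import numpy as np

# Numbers that arrived as text, for example from a CSV file
text_values = np.array(["1.5", "2.5", "4.0"])
print(text_values.dtype.kind)  # U (Unicode string)

# Convert to floats before doing arithmetic
numbers = text_values.astype(np.float64)
total = numbers.sum()
print(total)  # 8.0

# Converting floats to integers truncates toward zero
whole = numbers.astype(np.int64)
print(whole)  # [1 2 4]
```

`astype()` returns a new array and leaves the original unchanged, so assign the result to a name if you want to keep it.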
NumPy is a powerful library for data analysis in Python. By understanding its core concepts and typical usage scenarios, and by being aware of common pitfalls and best practices, you can use NumPy effectively in real-world data analysis tasks. Whether you’re a beginner or an experienced data scientist, NumPy will be a valuable addition to your toolkit.