NumPy Cheatsheet: A Comprehensive Guide

NumPy (Numerical Python) is a fundamental library in the Python ecosystem for scientific computing. It provides a powerful N-dimensional array object and a large collection of high-level mathematical functions that operate on these arrays. This cheatsheet covers NumPy's key concepts, usage, and best practices so that you can look up common operations quickly.

Table of Contents

  1. Fundamental Concepts
  2. Usage Methods
  3. Common Practices
  4. Best Practices
  5. Conclusion

Fundamental Concepts

N-Dimensional Arrays

The core of NumPy is the ndarray (N-dimensional array) object. An ndarray is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.

import numpy as np

# Create a 1-D array
arr_1d = np.array([1, 2, 3, 4, 5])
print("1-D array:", arr_1d)

# Create a 2-D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2-D array:\n", arr_2d)

Shape and Dimensions

The shape of an array is a tuple that gives the size of each dimension of the array. The ndim attribute gives the number of dimensions of the array.

print("Shape of 1-D array:", arr_1d.shape)
print("Dimensions of 1-D array:", arr_1d.ndim)
print("Shape of 2-D array:", arr_2d.shape)
print("Dimensions of 2-D array:", arr_2d.ndim)
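
The relationship between shape, ndim, and the total element count is easier to see with more than two dimensions. The following is a minimal sketch; the array name `cube` is illustrative and not part of the examples above.

```python
import numpy as np

# A 3-D array: 2 blocks, each with 3 rows and 4 columns
cube = np.zeros((2, 3, 4))

print(cube.shape)  # (2, 3, 4)
print(cube.ndim)   # 3
print(cube.size)   # 24, the product of the shape entries
print(len(cube))   # 2, the length of the first dimension only
```

Note that `len()` reports only the first dimension, so `size` is the safer choice when the total number of elements is needed.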

Data Types

NumPy arrays can hold different data types, such as integers and floating-point numbers. You can specify the data type when creating an array.

int_arr = np.array([1, 2, 3], dtype=np.int32)
float_arr = np.array([1.1, 2.2, 3.3], dtype=np.float64)
print("Integer array data type:", int_arr.dtype)
print("Floating-point array data type:", float_arr.dtype)
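
Converting an existing array to another data type is a common follow-up step. The sketch below uses the `astype` method; the values and the name `prices` are made up for illustration. One detail worth knowing is that casting floats to integers truncates toward zero rather than rounding.

```python
import numpy as np

# Hypothetical example values chosen for illustration
prices = np.array([1.9, 2.5, -3.7], dtype=np.float64)

# astype returns a new array; float-to-int casting truncates toward zero
truncated = prices.astype(np.int32)
print("Truncated:", truncated)  # values 1, 2, -3

# itemsize reports the number of bytes used per element
print("Bytes per float64 element:", prices.itemsize)   # 8
print("Bytes per int32 element:", truncated.itemsize)  # 4
```

Choosing a smaller data type can halve memory use, at the cost of range or precision.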

Usage Methods

Array Creation

There are multiple ways to create NumPy arrays.

From Python Lists

list_data = [1, 2, 3]
arr_from_list = np.array(list_data)
print("Array from list:", arr_from_list)

Using Built-in Functions

# Create an array of zeros
zeros_arr = np.zeros((2, 3))
print("Array of zeros:\n", zeros_arr)

# Create an array of ones
ones_arr = np.ones((3, 2))
print("Array of ones:\n", ones_arr)

# Create an array with a range of values
range_arr = np.arange(0, 10, 2)
print("Array with a range:", range_arr)
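
A few other creation functions come up often enough to be worth listing here. This is a short sketch rather than a complete catalogue, and the variable names are illustrative.

```python
import numpy as np

# Five evenly spaced values from 0 to 1, endpoint included
linspace_arr = np.linspace(0, 1, 5)
print("Evenly spaced values:", linspace_arr)

# A 2x2 array filled with a constant value
full_arr = np.full((2, 2), 7)
print("Constant array:\n", full_arr)

# A 3x3 identity matrix
identity = np.eye(3)
print("Identity matrix:\n", identity)
```

Unlike `np.arange`, which takes a step size and excludes the stop value, `np.linspace` takes the number of points and includes the endpoint by default.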

Array Indexing and Slicing

Indexing and slicing NumPy arrays works much like indexing Python lists, except that a single pair of brackets accepts one index or slice per dimension, separated by commas.

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Access an element
element = arr[1, 2]
print("Element at position (1, 2):", element)

# Slice a sub-array
sub_arr = arr[0:2, 1:3]
print("Sub-array:\n", sub_arr)
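
One difference from Python lists deserves a closer look: a basic slice of an array is a view onto the same data, not a copy. The sketch below illustrates this along with negative and whole-column indexing; the name `grid` is illustrative.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Negative indices count from the end, as with lists
last_row = grid[-1]          # [7 8 9]

# A single column: every row, column index 0
first_col = grid[:, 0]       # [1 4 7]

# Basic slices are views: writing through one changes the original
view = grid[0:2, 1:3]
view[0, 0] = 99
print(grid[0, 1])            # 99

# Call copy() when an independent array is needed
independent = grid[0:2, 1:3].copy()
independent[0, 0] = -1
print(grid[0, 1])            # still 99
```

Views avoid copying data, but they also mean that an edit to a slice can silently change the array it came from.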

Array Operations

NumPy arrays support a wide range of mathematical operations.

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Element-wise addition
add_result = a + b
print("Element-wise addition:", add_result)

# Element-wise multiplication
mul_result = a * b
print("Element-wise multiplication:", mul_result)
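
Arithmetic also works between arrays of different but compatible shapes, a behavior NumPy calls broadcasting. The sketch below shows two simple cases, plus the `@` operator for matrix multiplication, which is easy to confuse with `*`. The array names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,)

# Broadcasting: the scalar and the row are stretched to match the matrix
scaled = matrix * 2          # [[2 4 6], [8 10 12]]
shifted = matrix + row       # [[11 22 33], [14 25 36]]

# The @ operator is matrix multiplication, not element-wise
product = matrix @ matrix.T  # shape (2, 2)
print(product)               # [[14 32], [32 77]]
```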

Common Practices

Reshaping Arrays

You can change the shape of an array without changing its data.

arr = np.arange(9)
reshaped_arr = arr.reshape((3, 3))
print("Reshaped array:\n", reshaped_arr)
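
Reshaping only works when the new shape holds exactly the same number of elements. The following sketch shows how `-1` lets NumPy infer one dimension, what happens when the sizes do not match, and how `flatten` and `ravel` differ. The names are illustrative.

```python
import numpy as np

values = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size (12 / 3 = 4)
table = values.reshape(3, -1)
print(table.shape)  # (3, 4)

# The element count must match; 12 elements cannot fill a 5x5 array
try:
    values.reshape(5, 5)
except ValueError as err:
    print("Reshape failed:", err)

# flatten() always returns a copy; ravel() returns a view when it can
flat_copy = table.flatten()
flat_view = table.ravel()
print(np.shares_memory(flat_view, values))  # True
print(np.shares_memory(flat_copy, values))  # False
```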

Aggregation Functions

NumPy provides many aggregation functions, such as sum, mean, max, and min.

arr = np.array([[1, 2, 3], [4, 5, 6]])
print("Sum of all elements:", np.sum(arr))
print("Mean of all elements:", np.mean(arr))
print("Maximum value in the array:", np.max(arr))
print("Minimum value in the array:", np.min(arr))
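
By default these functions reduce over every element, but the `axis` argument restricts the reduction to one dimension. A brief sketch, with an illustrative array named `scores`:

```python
import numpy as np

scores = np.array([[1, 2, 3], [4, 5, 6]])

# axis=0 collapses the rows, giving one result per column
col_sums = np.sum(scores, axis=0)    # [5 7 9]

# axis=1 collapses the columns, giving one result per row
row_means = np.mean(scores, axis=1)  # [2. 5.]

# argmax returns the index of the maximum in the flattened array
flat_index = np.argmax(scores)       # 5

# unravel_index converts it back to (row, column) coordinates
position = np.unravel_index(flat_index, scores.shape)  # (1, 2)
print(col_sums, row_means, position)
```

A useful way to remember the convention: the axis you name is the one that disappears from the result's shape.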

Boolean Indexing

You can use boolean arrays to index and select elements from an array.

arr = np.array([1, 2, 3, 4, 5])
bool_index = arr > 3
print("Boolean index:", bool_index)
print("Elements greater than 3:", arr[bool_index])
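
Masks can be combined and used for assignment as well as selection. The sketch below assumes the same kind of small integer array as above; the name `readings` and the "high"/"low" labels are illustrative.

```python
import numpy as np

readings = np.array([1, 2, 3, 4, 5])

# Combine conditions with & (and) and | (or); the parentheses are required
in_range = readings[(readings > 1) & (readings < 5)]   # [2 3 4]

# ~ negates a mask
not_large = readings[~(readings > 3)]                  # [1 2 3]

# np.where picks between two values element by element
labels = np.where(readings > 3, "high", "low")

# Boolean masks also work on the left-hand side of an assignment
clipped = readings.copy()
clipped[clipped > 3] = 3                               # [1 2 3 3 3]
print(in_range, not_large, labels, clipped)
```

Python's `and` and `or` keywords do not work on arrays here; they raise an error because the truth value of a multi-element array is ambiguous.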

Best Practices

Memory Efficiency

When working with large datasets, it's important to manage memory efficiently. Avoid creating unnecessary copies of arrays.

# Update the array in place rather than allocating a new one (arr = arr + 1 would create a copy)
arr = np.array([1, 2, 3])
arr += 1  # This modifies the original array in-place
print("Modified array:", arr)
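
It helps to be able to check whether an operation copied the data or not. The sketch below uses `np.shares_memory` for that check and the `out=` parameter of a ufunc to reuse an existing buffer; the variable names are illustrative.

```python
import numpy as np

data = np.arange(5, dtype=np.float64)

# out= writes the result into an existing buffer instead of allocating one
np.multiply(data, 2, out=data)
print(data)  # values 0, 2, 4, 6, 8

# Slices share memory with the source; np.shares_memory confirms it
window = data[1:4]
print(np.shares_memory(window, data))   # True

# Indexing with a list of integers always produces a copy
picked = data[[1, 2, 3]]
print(np.shares_memory(picked, data))   # False

# nbytes shows how much memory the elements occupy
print(data.nbytes)  # 5 elements * 8 bytes = 40
```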

Vectorization

Vectorization means applying an operation to an entire array at once. Because the looping then happens inside NumPy's compiled code rather than in the Python interpreter, it is typically much faster than an explicit Python loop.

import time

# Using a loop
arr = np.arange(1000000)
start_time = time.perf_counter()  # perf_counter is the clock intended for timing short intervals
for i in range(len(arr)):
    arr[i] = arr[i] * 2
end_time = time.perf_counter()
print("Time taken with loop:", end_time - start_time)

# Using vectorization
arr = np.arange(1000000)
start_time = time.perf_counter()
arr = arr * 2
end_time = time.perf_counter()
print("Time taken with vectorization:", end_time - start_time)
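
A single timed run can be noisy. One possible refinement, sketched below, wraps each approach in a function, checks that both give the same result, and takes the best of several runs with the standard library's `timeit` module. The function names and the smaller array size are choices made for this illustration.

```python
import timeit
import numpy as np

values = np.arange(100_000)

def double_with_loop(arr):
    # Element-by-element Python loop
    result = np.empty_like(arr)
    for i in range(len(arr)):
        result[i] = arr[i] * 2
    return result

def double_vectorized(arr):
    # Single array expression executed in compiled code
    return arr * 2

# Both approaches must give the same answer before timing means anything
assert np.array_equal(double_with_loop(values), double_vectorized(values))

# Take the fastest of three runs to reduce the effect of background noise
loop_time = min(timeit.repeat(lambda: double_with_loop(values), number=1, repeat=3))
vec_time = min(timeit.repeat(lambda: double_vectorized(values), number=1, repeat=3))
print(f"Loop: {loop_time:.4f} s, vectorized: {vec_time:.6f} s")
```

The exact numbers depend on the machine, so no timings are shown here.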

Code Readability

Use meaningful variable names and comments to make your NumPy code more understandable.

# Create an array representing the heights of students
student_heights = np.array([170, 175, 168, 182])
# Calculate the average height
average_height = np.mean(student_heights)
print("Average student height:", average_height)

Conclusion

NumPy is an indispensable library for scientific computing in Python. With its powerful ndarray object, efficient array operations, and a rich set of functions, it enables users to perform complex numerical tasks with ease. By mastering the fundamental concepts, usage methods, common practices, and best practices outlined in this cheatsheet, you can write efficient and effective NumPy code. Whether you are dealing with data analysis, machine learning, or any other numerical task, NumPy will be a reliable tool in your Python toolkit.
