In NumPy, an array’s data is stored in memory. A contiguous region refers to a block of memory where the elements of the array are stored in a sequential manner. There are two types of contiguousness:
You can check the contiguousness of an array using the flags
attribute of a NumPy array.
import numpy as np
# Create a C - contiguous array
c_arr = np.arange(12).reshape(3, 4)
print("C - contiguous array flags:")
print(c_arr.flags)
# Create a Fortran - contiguous array
f_arr = np.asfortranarray(c_arr)
print("\nFortran - contiguous array flags:")
print(f_arr.flags)
The contiguousness of an array affects the performance of operations that involve memory access. For example, when performing operations that iterate over rows, a C - contiguous array will generally be faster, while operations that iterate over columns will be faster on a Fortran - contiguous array.
c_arr = np.array([[1, 2, 3], [4, 5, 6]])
print(c_arr.flags['C_CONTIGUOUS']) # True
np.asfortranarray
function to create a Fortran - contiguous array from an existing array.original_arr = np.array([[1, 2, 3], [4, 5, 6]])
f_arr = np.asfortranarray(original_arr)
print(f_arr.flags['F_CONTIGUOUS']) # True
You can use the flags
attribute of a NumPy array to check its contiguousness.
arr = np.array([[1, 2], [3, 4]])
print("C - contiguous:", arr.flags['C_CONTIGUOUS'])
print("Fortran - contiguous:", arr.flags['F_CONTIGUOUS'])
Reshaping an array can sometimes change its contiguousness. However, if the reshaping operation is simple (e.g., transposing a 2D array), the array may still retain its contiguousness properties.
arr = np.arange(12).reshape(3, 4)
print("Original array C - contiguous:", arr.flags['C_CONTIGUOUS'])
transposed_arr = arr.T
print("Transposed array Fortran - contiguous:", transposed_arr.flags['F_CONTIGUOUS'])
When creating a view of an array, the contiguousness may be affected. A view shares the same underlying data, and operations on the view can change the memory access pattern.
arr = np.arange(10)
view = arr[::2]
print("Original array C - contiguous:", arr.flags['C_CONTIGUOUS'])
print("View C - contiguous:", view.flags['C_CONTIGUOUS'])
If you are performing operations that involve iterating over rows, use a C - contiguous array. If you are iterating over columns, use a Fortran - contiguous array.
# Example of row - wise operation on C - contiguous array
c_arr = np.arange(12).reshape(3, 4)
for row in c_arr:
# Do something with each row
pass
# Example of column - wise operation on Fortran - contiguous array
f_arr = np.asfortranarray(c_arr)
for col in f_arr.T:
# Do something with each column
pass
When working with arrays, try to minimize unnecessary copying of data. Views can be a great way to avoid copying, but be aware of how they affect contiguousness.
Understanding NumPy contiguous regions of arrays is crucial for optimizing the performance of your code. By being aware of the memory layout of your arrays, you can choose the appropriate contiguousness for your operations, leading to faster and more efficient code. Whether you are performing simple array manipulations or complex scientific computations, keeping an eye on contiguousness can make a significant difference in your program’s performance.