np.array()
np.zeros() and np.ones()
np.arange()
np.linspace()
np.reshape()
np.dot()
np.sum()
np.mean()
np.std()
np.random.rand()
np.array()
The np.array() function creates a NumPy array from a Python list or tuple. It is the most basic way to initialize a NumPy array.
Use it when you have existing data in a Python list and want to convert it into a NumPy array for further numerical operations.
import numpy as np
# Create a Python list
python_list = [1, 2, 3, 4, 5]
# Convert the list to a NumPy array
numpy_array = np.array(python_list)
print(numpy_array)
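As a further sketch, np.array() also accepts nested lists and an explicit dtype. The variable names below are illustrative, not part of the original example.

```python
import numpy as np

# A nested list becomes a 2-D array; shape and dtype are inferred from the data
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape)  # (2, 3)

# Arithmetic applies element by element, unlike with a plain Python list
doubled = matrix * 2
print(doubled)

# dtype can be set explicitly when the inferred type is not what you want
as_float = np.array([1, 2, 3], dtype=float)
print(as_float.dtype)  # float64
```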
np.zeros() and np.ones()
np.zeros() creates an array filled with zeros, and np.ones() creates an array filled with ones. Both take the shape of the array as an argument.
Use them when you need to initialize an array with a specific shape and fill it with a constant value (either 0 or 1) as a starting point for further calculations.
import numpy as np
# Create a 2x3 array of zeros
zeros_array = np.zeros((2, 3))
print(zeros_array)
# Create a 3x2 array of ones
ones_array = np.ones((3, 2))
print(ones_array)
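The "starting point for further calculations" idea might look like the sketch below, which uses a zero array as a counter and scales an array of ones to another constant. The sample values and names are made up for illustration.

```python
import numpy as np

# Both functions return float64 arrays unless a dtype is given
counts = np.zeros(4, dtype=int)

# Use the zero array as an accumulator (the sample values are illustrative)
for value in [0, 2, 2, 3]:
    counts[value] += 1
print(counts)

# Scale an array of ones to get any other constant fill value
sevens = np.ones((2, 2)) * 7
print(sevens)
```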
np.arange()
np.arange() is similar to the built-in Python range() function, but it returns a NumPy array. It generates evenly spaced values within a given interval; the stop value itself is excluded.
Use it when you need to create an array of sequential numbers with a specific step size.
import numpy as np
# Create an array from 0 to 9
arange_array = np.arange(10)
print(arange_array)
# Create an array from 2 up to (but not including) 10 with a step of 2
arange_array_step = np.arange(2, 10, 2)
print(arange_array_step)
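Like range(), np.arange() leaves out the stop value. A small sketch of that behavior, with illustrative variable names:

```python
import numpy as np

# The stop value is excluded: this yields 2, 4, 6, 8
stop_excluded = np.arange(2, 10, 2)
print(stop_excluded)

# Push the stop past 10 to include it: this yields 2, 4, 6, 8, 10
stop_included = np.arange(2, 11, 2)
print(stop_included)

# A negative step counts down: this yields 5, 4, 3, 2, 1
countdown = np.arange(5, 0, -1)
print(countdown)
```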
np.linspace()
np.linspace() creates an array of evenly spaced numbers over a specified interval. The main difference from np.arange() is that you specify the number of elements in the array instead of the step size.
Use it when you need to generate a fixed number of evenly spaced points between two values, which is useful for plotting and interpolation.
import numpy as np
# Create an array of 5 evenly spaced numbers between 0 and 1
linspace_array = np.linspace(0, 1, 5)
print(linspace_array)
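A short sketch of how the endpoint parameter changes the output of np.linspace(). The variable names are illustrative.

```python
import numpy as np

# Default: the stop value is included -> 0.0, 0.25, 0.5, 0.75, 1.0
with_endpoint = np.linspace(0, 1, 5)
print(with_endpoint)

# endpoint=False leaves the stop value out -> 0.0, 0.2, 0.4, 0.6, 0.8
without_endpoint = np.linspace(0, 1, 5, endpoint=False)
print(without_endpoint)

# retstep=True also returns the spacing between consecutive points
points, step = np.linspace(0, 1, 5, retstep=True)
print(step)
```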
By default the stop value is included in the result; to exclude it, set the endpoint parameter to False.
np.reshape()
np.reshape() changes the shape of an existing NumPy array without changing its data.
Use it when you need to transform a 1-D array into a multi-dimensional array or vice versa, or change the dimensions of a multi-dimensional array.
import numpy as np
# Create a 1-D array
one_d_array = np.arange(6)
# Reshape the 1-D array into a 2x3 array
reshaped_array = np.reshape(one_d_array, (2, 3))
print(reshaped_array)
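The sketch below shows two related behaviors: letting NumPy infer one dimension with -1, and the ValueError raised when the requested shape does not match the number of elements. The variable names are illustrative.

```python
import numpy as np

values = np.arange(6)

# -1 asks NumPy to infer that dimension from the total number of elements
three_rows = np.reshape(values, (3, -1))
print(three_rows.shape)  # (3, 2)

# 6 elements cannot fill a 4x2 array, so this raises ValueError
try:
    np.reshape(values, (4, 2))
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("Reshape failed:", error)
```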
If the new shape is incompatible with the number of elements in the array, a ValueError will be raised.
np.dot()
np.dot() computes the dot product of two arrays. If the arrays are 1-D, it computes the scalar (inner) product. If they are 2-D, it performs matrix multiplication.
It is used in linear algebra operations, such as solving systems of linear equations and neural network calculations.
import numpy as np
# Create two 1-D arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Compute the dot product
dot_product = np.dot(a, b)
print(dot_product)
# Create two 2-D arrays
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Perform matrix multiplication
matrix_product = np.dot(A, B)
print(matrix_product)
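A sketch of the shape rule for 2-D inputs: the number of columns in the first array must equal the number of rows in the second. The array C is a made-up example.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# For 2-D arrays, the @ operator gives the same result as np.dot
same_result = np.array_equal(np.dot(A, B), A @ B)
print(same_result)  # True

# The inner dimensions must match: (2, 3) times (2, 3) is not defined
C = np.ones((2, 3))
try:
    np.dot(C, C)
    dot_failed = False
except ValueError as error:
    dot_failed = True
    print("Shapes not aligned:", error)

# Transposing the second operand gives (2, 3) times (3, 2), which is valid
product = np.dot(C, C.T)
print(product.shape)  # (2, 2)
```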
If the shapes of the two arrays are not aligned for the operation, a ValueError will be raised.
np.sum()
np.sum() calculates the sum of all elements in an array or along a specified axis.
Use it when you need the total of all values in an array or the sum of each row or column in a multi-dimensional array.
import numpy as np
# Create a 2-D array
array_2d = np.array([[1, 2], [3, 4]])
# Calculate the sum of all elements
total_sum = np.sum(array_2d)
print(total_sum)
# Calculate the sum of each row (axis=1 adds across the columns)
row_sum = np.sum(array_2d, axis=1)
print(row_sum)
# Calculate the sum of each column (axis=0 adds down the rows)
col_sum = np.sum(array_2d, axis=0)
print(col_sum)
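With a square array it is easy to mix up the two axes. The sketch below uses a non-square array (an illustrative choice) so the output shape shows which axis was collapsed.

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

# axis=0 collapses the rows: one sum per column -> 5, 7, 9
col_sum = np.sum(array_2d, axis=0)
print(col_sum.shape)  # (3,)

# axis=1 collapses the columns: one sum per row -> 6, 15
row_sum = np.sum(array_2d, axis=1)
print(row_sum.shape)  # (2,)

# keepdims=True keeps the summed axis with length 1
row_sum_2d = np.sum(array_2d, axis=1, keepdims=True)
print(row_sum_2d.shape)  # (2, 1)
```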
Misinterpreting the axis parameter is a common pitfall. Remember that in a 2-D array, axis=0 sums down each column and axis=1 sums across each row.
Make sure you understand how the axis parameter affects the summation operation.
np.mean()
np.mean() calculates the arithmetic mean of the elements in an array or along a specified axis.
Use it when you need the average value of a set of data points or the average of each row or column in a multi-dimensional array.
import numpy as np
# Create a 1-D array
one_d_array = np.array([1, 2, 3, 4, 5])
# Calculate the mean of the array
mean_value = np.mean(one_d_array)
print(mean_value)
# Create a 2-D array
two_d_array = np.array([[1, 2], [3, 4]])
# Calculate the mean of each row (axis=1)
row_mean = np.mean(two_d_array, axis=1)
print(row_mean)
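As a sanity check, the mean can be reproduced as sum divided by count. The sketch below also shows one typical use, subtracting each column's mean from the data; the names are illustrative.

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])

# The overall mean equals the total divided by the number of elements
manual_mean = np.sum(data) / data.size
print(manual_mean == np.mean(data))  # True

# Mean of each column (axis=0) -> 2.0, 3.0
col_mean = np.mean(data, axis=0)
print(col_mean)

# Subtract each column's mean from that column
centered = data - col_mean
print(centered)
```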
As with np.sum(), misinterpreting the axis parameter can lead to incorrect results.
np.std()
np.std() calculates the standard deviation of the elements in an array or along a specified axis. The standard deviation measures the amount of variation or dispersion in a set of values.
Use it when you need to analyze the spread of data in an array or compare the variability between different rows or columns in a multi-dimensional array.
import numpy as np
# Create a 1-D array
one_d_array = np.array([1, 2, 3, 4, 5])
# Calculate the standard deviation of the array
std_value = np.std(one_d_array)
print(std_value)
# Create a 2-D array
two_d_array = np.array([[1, 2], [3, 4]])
# Calculate the standard deviation of each column (axis=0)
col_std = np.std(two_d_array, axis=0)
print(col_std)
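One way to check a result by hand is to rebuild the standard deviation from its definition. The sketch below does that for a small array and also shows the ddof parameter, which switches from the default population formula (divide by n) to the sample formula (divide by n - 1).

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])

# Population standard deviation: square root of the mean squared deviation
manual_std = np.sqrt(np.mean((values - np.mean(values)) ** 2))
print(np.isclose(manual_std, np.std(values)))  # True

# ddof=1 divides by n - 1 instead of n (sample standard deviation)
sample_std = np.std(values, ddof=1)
print(sample_std > manual_std)  # True
```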
Misinterpreting the axis parameter can lead to wrong standard deviation calculations.
Choose the axis parameter carefully and double-check the results by hand for simple arrays.
np.random.rand()
np.random.rand() generates an array of random numbers from a uniform distribution over the interval [0, 1). The dimensions are passed as separate arguments rather than as a tuple.
Use it when you need to introduce randomness in your data, such as initializing weights in a neural network or simulating random events.
import numpy as np
# Create a 2x3 array of random numbers
random_array = np.random.rand(2, 3)
print(random_array)
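Random output makes results hard to reproduce. A minimal sketch of fixing the seed and of rescaling the values to another interval follows; the seed 42 and the bounds 5 and 10 are arbitrary choices for illustration.

```python
import numpy as np

# Seeding the global generator makes the "random" values repeatable
np.random.seed(42)
first = np.random.rand(2, 3)
np.random.seed(42)
second = np.random.rand(2, 3)
print(np.array_equal(first, second))  # True

# Rescale from [0, 1) to [low, high)
low, high = 5, 10
scaled = low + (high - low) * np.random.rand(4)
print(scaled)
```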
These 10 NumPy functions are essential tools for data scientists. They cover a wide range of operations, from array creation and manipulation to numerical calculations and random number generation. By understanding the core concepts, typical usage scenarios, common pitfalls, and best practices of these functions, you can use NumPy more effectively in your data science projects.