NumPy’s random number generation is based on a pseudorandom number generator (PRNG). A PRNG is an algorithm that produces a sequence of numbers whose properties approximate those of truly random numbers. The numpy.random module provides functions to generate random numbers from different probability distributions, such as the uniform, normal, and Poisson distributions.
When we talk about a random range, we usually mean generating random numbers between a specified minimum and maximum value. Examples include random floating-point numbers between 0 and 1, or random integers between 1 and 100.
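To make the idea of a range concrete, here is a minimal sketch of how a value drawn from [0, 1) can be shifted and scaled into an arbitrary range. It uses np.random.default_rng, NumPy's newer Generator interface, rather than the legacy functions used elsewhere in this article; the bounds low and high are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical bounds, chosen only for illustration
low, high = 2.0, 5.0

rng = np.random.default_rng()

# rng.random() draws floats from the half-open interval [0, 1)
unit_samples = rng.random(5)

# Scale by the width of the range, then shift by the lower bound
scaled_samples = low + (high - low) * unit_samples
print(scaled_samples)
```

This is the same arithmetic that a ranged uniform draw performs conceptually, so it can help when reasoning about which bound is included.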
The numpy.random.uniform function generates random floating-point numbers from a uniform distribution over a specified range. A uniform distribution means that all values within the range are equally likely to be selected. The lower bound is included in the range and the upper bound is excluded.
import numpy as np
# Generate 5 random floating-point numbers between 2 and 5
random_floats = np.random.uniform(2, 5, 5)
print(random_floats)
In the above code, the first argument 2 is the lower bound of the range, the second argument 5 is the upper bound of the range, and the third argument 5 is the number of random numbers to generate.
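The third argument does not have to be a single number. As a rough sketch, assuming the newer Generator interface (np.random.default_rng) instead of the legacy call above, the size can be given as a tuple to fill a multi-dimensional array, and keyword arguments make the role of each value explicit. The seed value 0 is arbitrary.

```python
import numpy as np

# Seed value is arbitrary; it only makes the run repeatable
rng = np.random.default_rng(seed=0)

# A 3x4 array of floats drawn uniformly from the range 2 to 5
grid = rng.uniform(low=2, high=5, size=(3, 4))

print(grid.shape)
print(grid)
```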
The numpy.random.randint function generates random integers from a discrete uniform distribution over a specified range.
import numpy as np
# Generate 3 random integers between 10 and 20 (inclusive)
random_ints = np.random.randint(10, 21, 3)
print(random_ints)
Note that the upper bound in randint is exclusive. So, to include the number 20 in the range, we pass 21 as the upper bound.
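If the off-by-one adjustment feels error-prone, one alternative is the integers method of the newer Generator interface, which accepts an endpoint flag. The following is a small sketch under that assumption, not the approach used in the snippet above; the seed and sample count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed

# Exclusive upper bound (the default): pass 21 to allow 20
exclusive = rng.integers(10, 21, size=1000)

# Inclusive upper bound: pass 20 together with endpoint=True
inclusive = rng.integers(10, 20, size=1000, endpoint=True)

print(exclusive.min(), exclusive.max())
print(inclusive.min(), inclusive.max())
```

Both calls draw from the same set of values, 10 through 20.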
Random range generation is often used for data sampling. For example, if you have a large dataset, you may want to select a random subset of it for testing or analysis.
import numpy as np
# Assume we have a dataset of 100 elements
dataset = np.arange(100)
# Select 10 random elements from the dataset (the same index may be drawn more than once)
sample_indices = np.random.randint(0, 100, 10)
sample = dataset[sample_indices]
print(sample)
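Because each index is drawn independently, the same element can appear in the sample more than once. A quick way to see this, sketched here with the newer Generator interface and an arbitrary seed, is to count the distinct indices that were drawn.

```python
import numpy as np

rng = np.random.default_rng(seed=2)  # arbitrary seed
dataset = np.arange(100)

# Draw 10 indices independently from 0..99
indices = rng.integers(0, len(dataset), size=10)

# Fewer than 10 distinct values means at least one index repeated
n_unique = np.unique(indices).size
print(f"{n_unique} distinct indices out of {indices.size}")
```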
Random numbers in a range can be used to simulate real-world random processes. For example, we can simulate the number of customers arriving at a store each day, where the daily count is between 0 and 50.
import numpy as np
# Simulate the number of customers arriving at a store for 7 days
customers_per_day = np.random.randint(0, 51, 7)
print(customers_per_day)
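The same idea extends to finer-grained simulations. As an illustrative sketch, suppose the store is open 12 hours a day (a hypothetical figure) and between 0 and 50 customers arrive in each hour; a two-dimensional draw gives one row per day, and summing across each row gives daily totals. The Generator interface and the seed are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(seed=3)  # arbitrary seed

days = 7
hours_open = 12  # hypothetical opening hours per day

# One row per day, one column per hour, each value in 0..50
hourly_customers = rng.integers(0, 51, size=(days, hours_open))

# Total customers for each day
daily_totals = hourly_customers.sum(axis=1)
print(daily_totals)
```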
When you need reproducible results, it is important to set the random seed. The random seed initializes the PRNG, and if you use the same seed, you will get the same sequence of random numbers.
import numpy as np
# Set the random seed
np.random.seed(42)
# Generate random numbers
random_numbers = np.random.randint(1, 10, 5)
print(random_numbers)
# Set the same seed again
np.random.seed(42)
# Generate random numbers again
random_numbers_again = np.random.randint(1, 10, 5)
print(random_numbers_again)
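np.random.seed sets a single global state shared by every caller of the legacy functions. An alternative, sketched below, is to create separate Generator objects with np.random.default_rng, each seeded explicitly; two generators built from the same seed yield the same sequence. The seed 42 simply mirrors the example above.

```python
import numpy as np

# Two independent generators created from the same seed
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.integers(1, 10, size=5)
second = rng_b.integers(1, 10, size=5)

# The two sequences match element for element
print(np.array_equal(first, second))  # prints True
```

Passing a generator object around explicitly keeps reproducibility local to the code that needs it, rather than depending on global state.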
When using random ranges for data sampling, avoid over-sampling (selecting the same element multiple times when that is not intended) and under-sampling (not covering the full range of the data). To avoid over-sampling, use np.random.choice with replace=False, which samples without replacement.
import numpy as np
dataset = np.arange(10)
sample = np.random.choice(dataset, 5, replace=False)
print(sample)
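A related pattern is splitting a dataset into two non-overlapping parts, for instance a test subset and a training subset. One way to sketch this, assuming the newer Generator interface, is to shuffle once with permutation and slice the result; the split size of 3 and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=4)  # arbitrary seed
dataset = np.arange(10)

# Shuffle a copy of the data; every element appears exactly once
shuffled = rng.permutation(dataset)

# Slice the shuffled data into two disjoint parts
test_part = shuffled[:3]
train_part = shuffled[3:]

print(test_part)
print(train_part)
```

Since both parts come from one shuffle, no element is selected twice and none is left out.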
NumPy’s random range functionality provides a convenient and efficient way to generate random numbers within specified ranges. With uniform for floating-point values, randint for integers, and choice for sampling, you can cover data sampling, simulation, and other numerical tasks. Remember to set the random seed for reproducibility and to avoid over-sampling and under-sampling when sampling data.