How can I use a seed inside a loop to get the same random samples every time the code is run?

Question:

I want to generate data using random numbers and then draw random samples with replacement from the generated data. The problem is that np.random.seed(10) only fixes the random numbers used for the generated data; it does not fix the random samples drawn inside the loop. Every time I run the code I get the same generated data but different random samples, and I would like to get the same random samples so that the results are reproducible. The code is the following:

import numpy as np
import random

np.random.seed(10)

data = list(np.random.binomial(size=215, n=1, p=0.3))

sample_mean = []

for i in range(1000):

    sample = random.choices(data, k=215)
    mean = np.mean(sample)
    sample_mean.append(mean)

print(np.mean(sample_mean))

np.mean(sample_mean) should return the same value every time the code is run, but it does not.

I tried typing random.seed(i) inside the loop but it didn’t work.

Asked By: Various Listen


Answers:

Your random.choices(data, k=215) call comes from Python's built-in random module, which keeps its own generator state, separate from the one inside numpy.random. Seeding NumPy therefore does not seed it.

Since you are already using NumPy, the simplest fix is to use np.random.choice, which samples with replacement by default and draws from the generator you seeded:

import numpy as np

np.random.seed(10)

data = np.random.binomial(size=215, n=1, p=0.3)

sample_mean = []

for i in range(1000):
    sample = np.random.choice(data,size=215)
    mean = np.mean(sample)
    sample_mean.append(mean)


print(np.mean(sample_mean))

PS: calling list on data is not necessary and will slow your code down.
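
As a side note that goes beyond the answer above: newer NumPy code often uses a local Generator from np.random.default_rng instead of the global np.random.seed. The sketch below is one possible way to write the same experiment that way. The function name bootstrap_mean and its parameters are made up for illustration, and the numbers it produces will differ from the np.random.seed(10) version because a different generator is used.

```python
import numpy as np


def bootstrap_mean(seed=10, n_obs=215, n_resamples=1000):
    """Return the mean of the bootstrap sample means, using one seeded Generator.

    Hypothetical helper for illustration; not part of the original question.
    """
    rng = np.random.default_rng(seed)  # local generator, global state untouched
    data = rng.binomial(n=1, p=0.3, size=n_obs)
    # Draw every resample at once: one row per bootstrap sample, with replacement.
    samples = rng.choice(data, size=(n_resamples, n_obs), replace=True)
    return samples.mean(axis=1).mean()


first = bootstrap_mean()
second = bootstrap_mean()
print(first, second)  # the two values are identical because the seed is the same
```

Because both the data and the resampling use the same rng object, a single seed controls the whole run, and there is no second generator left to forget about.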

Answered By: Ahmed AEK

Fixing the seed for np.random doesn't fix the seed for random, so adding one line that seeds both will give you reproducible results:

import numpy as np
import random

np.random.seed(10)
random.seed(10)

data = list(np.random.binomial(size=215, n=1, p=0.3))

sample_mean = []

for i in range(1000):
    sample = random.choices(data, k=215)
    mean = np.mean(sample)
    sample_mean.append(mean)

print(np.mean(sample_mean))

Or, alternatively, you can use np.random.choice (with size=215) instead of random.choices.
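
If you would rather not touch the global state of the random module, a dedicated random.Random instance should work as well. This is only a sketch under that assumption; the function name resample_mean and its py_seed parameter are hypothetical and exist only to make the check repeatable.

```python
import random

import numpy as np


def resample_mean(py_seed):
    """Run the experiment with NumPy seeded to 10 and a private random.Random.

    Hypothetical helper for illustration; not part of the original question.
    """
    np.random.seed(10)  # fixes only the generated data
    data = np.random.binomial(size=215, n=1, p=0.3).tolist()
    py_rng = random.Random(py_seed)  # separate generator used only for resampling
    means = [np.mean(py_rng.choices(data, k=215)) for _ in range(1000)]
    return float(np.mean(means))


# Same seed for both generators on each call, so the two results match.
print(resample_mean(10) == resample_mean(10))
```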

Answered By: ShlomiF