Which seeds have to be set where to realize 100% reproducibility of training results in tensorflow?

Question:

In a general tensorflow setup like

model = construct_model()
with tf.Session() as sess:
    train_model(sess)

Where construct_model() contains the model definition including random initialization of weights (tf.truncated_normal) and train_model(sess) executes the training of the model –

Which seeds do I have to set where to ensure 100% reproducibility between repeated runs of the code snippet above? The documentation for tf.random.set_random_seed is concise, but it left me a bit confused. I tried:

tf.set_random_seed(1234)
model = construct_model()
with tf.Session() as sess:
    train_model(sess)

But got different results each time.

Asked By: Oblomov


Answers:

One possible reason is that the model-construction code uses the numpy.random module somewhere. If so, you need to set the seed for numpy as well.
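As a quick sketch (independent of TensorFlow) of why seeding numpy restores reproducibility:

```python
import numpy as np

# Without a fixed seed, each run draws different values (e.g. initial weights).
np.random.seed(1234)
first = np.random.randn(3)

# Re-seeding with the same value replays the exact same sequence.
np.random.seed(1234)
second = np.random.randn(3)

print(np.array_equal(first, second))  # → True
```

Any numpy-based initialization in construct_model() is only repeatable if this seed is set before the model is built.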

Answered By: Jiren Jin

The best solution which works as of today with GPU is to install tensorflow-determinism with the following:

pip install tensorflow-determinism

Then add the following near the top of your script, before any TensorFlow ops run:

import os
os.environ['TF_DETERMINISTIC_OPS'] = '1'

import tensorflow as tf

source: https://github.com/NVIDIA/tensorflow-determinism

Answered By: eugen

What has worked for me is following this answer with a few modifications:

import os
import random

import numpy as np
import tensorflow as tf

# Setting seed value
# from https://stackoverflow.com/a/52897216
# generated randomly by running `random.randint(0, 100)` once
SEED = 75
# 1. Set the `PYTHONHASHSEED` environment variable at a fixed value
# (note: to affect hash randomization it must be set before the Python process starts)
os.environ['PYTHONHASHSEED'] = str(SEED)
# 2. Set the `python` built-in pseudo-random generator at a fixed value
random.seed(SEED)
# 3. Set the `numpy` pseudo-random generator at a fixed value
np.random.seed(SEED)
# 4. Set the `tensorflow` pseudo-random generator at a fixed value
tf.random.set_seed(SEED)

I was not able to figure out how to set the session seed (step 5), but it didn’t seem like it was necessary.

I am running Google Colab Pro on a high-RAM TPU, and my training results (the graph of the loss function) have been exactly the same three times in a row with this method.
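One detail worth stressing about steps 1–4 above: the seeds must be re-set before every run, not just once. A minimal sketch with Python's built-in generator (the SEED value is arbitrary, as in the answer):

```python
import random

SEED = 75  # any fixed integer works

random.seed(SEED)
run_a = [random.random() for _ in range(3)]

# A second "run" is only reproducible if the seed is set again first.
random.seed(SEED)
run_b = [random.random() for _ in range(3)]

print(run_a == run_b)  # → True
```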

Answered By: Pro Q
A compact TF 2.x / Keras variant combining the steps above:

import os
import random

import numpy as np
import tensorflow as tf
from tensorflow import keras

SEED = 42

os.environ["TF_DETERMINISTIC_OPS"] = "1"
os.environ['PYTHONHASHSEED'] = str(SEED)
keras.utils.set_random_seed(SEED)  # seeds Python's random, NumPy and TensorFlow in one call
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)