TensorFlow: How to train for a length of time instead of epochs?


Prior research:
Most relevant tensorflow article
How can I calculate the time spent for overall training a model in Tensorflow (for all epochs)?
Show Estimated remaining time to train a model Tensorflow with large epochs


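from tensorflow import keras
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, TensorBoard
from tensorflow.keras.layers import Dense, Embedding, LSTM
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical

# One-hot encode the target words
y = to_categorical(self.ydata, num_classes=self.vocab_size)
model = Sequential()
model.add(Embedding(self.vocab_size, 10, input_length=1))
model.add(LSTM(1000, return_sequences=True))
model.add(Dense(1000, activation="relu"))
model.add(Dense(self.vocab_size, activation="softmax"))
keras.utils.plot_model(model, show_layer_names=True)
# Callbacks: save the best model, lower the learning rate on plateau, log to TensorBoard
checkpoint = ModelCheckpoint(modelFilePath, monitor='loss', verbose=1, save_best_only=True, mode='auto')
reduce = ReduceLROnPlateau(monitor='loss', factor=0.2, patience=3, min_lr=0.0001, verbose=1)
tensorboard_Visualization = TensorBoard(log_dir=logdirPath)
# `lr` is a deprecated alias; current releases expect `learning_rate`
model.compile(loss="categorical_crossentropy", optimizer=Adam(learning_rate=0.001))
history = model.fit(self.Xdata, y, epochs=epochs, batch_size=64, callbacks=[checkpoint, reduce, tensorboard_Visualization]).history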

Inspiration from:

  1. https://www.analyticsvidhya.com/blog/2021/08/predict-the-next-word-of-your-text-using-long-short-term-memory-lstm/
  2. https://towardsdatascience.com/building-a-next-word-predictor-in-tensorflow-e7e681d4f03f

This code takes a list of one-word "questions" and "answers" to train. Impressive background knowledge if you guessed the model’s goals before reading this. Anyways, this code works. I’m looking only to enhance it at this point.

How can I train a model for a set amount of time? The time an epoch takes varies based on what text I feed this AI. It changes a lot, generally around 10 seconds to 4 minutes. I could use that to approximate epochs from time, but if another way exists, I would appreciate a more concrete idea from TensorFlow’s resources.

I really want a usable answer. Please add some code to your explanation, especially some useful docs would be a plus. I hope you like the question and upvote it!


Asked By: Pranit Shah



If it’s imperative to define a timeout the way you formulated the problem, then take a look at this answer
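For reference, a time limit can also be sketched as a custom Keras callback. This is a minimal sketch and not necessarily what the linked answer does: the names `TimeBudget`, `make_time_limit_callback` and `TimeLimit` are made up for illustration, and it relies on Keras checking `model.stop_training` after each batch, as the built-in `TerminateOnNaN` callback does.

```python
import time


class TimeBudget:
    """Tracks whether a wall-clock budget (in seconds) has been used up."""

    def __init__(self, max_seconds, clock=time.monotonic):
        self.max_seconds = max_seconds
        self._clock = clock  # injectable so the logic can be checked without waiting
        self._start = None

    def start(self):
        self._start = self._clock()

    def elapsed(self):
        return self._clock() - self._start

    def exhausted(self):
        return self.elapsed() >= self.max_seconds


def make_time_limit_callback(max_seconds):
    """Build a Keras callback that requests a stop once max_seconds have passed."""
    import tensorflow as tf  # imported lazily so TimeBudget works without TensorFlow

    class TimeLimit(tf.keras.callbacks.Callback):  # hypothetical name
        def __init__(self):
            super().__init__()
            self.budget = TimeBudget(max_seconds)

        def on_train_begin(self, logs=None):
            self.budget.start()

        def on_train_batch_end(self, batch, logs=None):
            # Checking after every batch keeps the overshoot to roughly one batch
            if self.budget.exhausted():
                self.model.stop_training = True

    return TimeLimit()
```

To use it, pass a generous `epochs` value so that the time limit is what actually ends training, for example `model.fit(..., epochs=10_000, callbacks=[make_time_limit_callback(600), checkpoint])` for roughly ten minutes.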

However, accuracy will vary greatly with the text you feed it, anywhere from extremely poor to overfitted, so you’d end up spending more time just verifying the result. A better fit for your problem would be a custom EarlyStopping.

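from tensorflow.keras.callbacks import EarlyStopping

# 'val_accuracy' is only logged with metrics=['accuracy'] in compile() and validation data in fit()
custom_early_stopping = EarlyStopping(
    monitor='val_accuracy',
    patience=8,
    min_delta=0.001,
    mode='max'
)

history = model.fit(
    # rest of parameters
    callbacks=[custom_early_stopping]  # and rest of callbacks
)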

In this example, validation accuracy is the metric monitored to decide when to stop training. I don’t think you’d have any use for a model trained to low accuracy, or for continuing to train after it has reached good accuracy just because you set a time.

patience=8 means training is terminated after 8 consecutive epochs with no improvement. min_delta=0.001 means the validation accuracy has to improve by at least 0.001 for it to count as an improvement. mode='max' means training stops when the monitored quantity has stopped increasing.
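To make the interaction of these parameters concrete, here is a simplified pure-Python sketch of the stopping rule. It is an illustration only, not Keras’s actual implementation; the function name `stopping_epoch` and the sample accuracy values are made up.

```python
def stopping_epoch(val_accuracies, patience=8, min_delta=0.001):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("-inf")
    wait = 0  # epochs since the last real improvement
    for epoch, acc in enumerate(val_accuracies):
        if acc - min_delta > best:  # mode='max': only a gain above min_delta counts
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Gains smaller than min_delta (0.6005 vs. 0.60) do not reset the counter.
history = [0.50, 0.60, 0.6005, 0.6003, 0.6001]
print(stopping_epoch(history, patience=3))
```

With `patience=3`, the three epochs after the 0.60 peak all fail to improve by more than `min_delta`, so the sketch reports a stop at the last epoch in the list.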

Answered By: Attersson