What does the standard Keras model output mean? What is epoch and loss in Keras?

Question:

I have just built my first model using Keras and this is the output. It looks like the standard output you get after building any Keras artificial neural network. Even after looking in the documentation, I do not fully understand what the epoch is and what the loss is which is printed in the output.

What is epoch and loss in Keras?

(I know it’s probably an extremely basic question, but I couldn’t seem to locate the answer online, and if the answer is really that hard to glean from the documentation I thought others would have the same question and thus decided to post it here.)

Epoch 1/20
1213/1213 [==============================] - 0s - loss: 0.1760     
Epoch 2/20
1213/1213 [==============================] - 0s - loss: 0.1840     
Epoch 3/20
1213/1213 [==============================] - 0s - loss: 0.1816     
Epoch 4/20
1213/1213 [==============================] - 0s - loss: 0.1915     
Epoch 5/20
1213/1213 [==============================] - 0s - loss: 0.1928     
Epoch 6/20
1213/1213 [==============================] - 0s - loss: 0.1964     
Epoch 7/20
1213/1213 [==============================] - 0s - loss: 0.1948     
Epoch 8/20
1213/1213 [==============================] - 0s - loss: 0.1971     
Epoch 9/20
1213/1213 [==============================] - 0s - loss: 0.1899     
Epoch 10/20
1213/1213 [==============================] - 0s - loss: 0.1957     
Epoch 11/20
1213/1213 [==============================] - 0s - loss: 0.1923     
Epoch 12/20
1213/1213 [==============================] - 0s - loss: 0.1910     
Epoch 13/20
1213/1213 [==============================] - 0s - loss: 0.2104     
Epoch 14/20
1213/1213 [==============================] - 0s - loss: 0.1976     
Epoch 15/20
1213/1213 [==============================] - 0s - loss: 0.1979     
Epoch 16/20
1213/1213 [==============================] - 0s - loss: 0.2036     
Epoch 17/20
1213/1213 [==============================] - 0s - loss: 0.2019     
Epoch 18/20
1213/1213 [==============================] - 0s - loss: 0.1978     
Epoch 19/20
1213/1213 [==============================] - 0s - loss: 0.1954     
Epoch 20/20
1213/1213 [==============================] - 0s - loss: 0.1949
Asked By: pr338

||

Answers:

One epoch ends when your model had run the data through all nodes in your network and ready to update the weights to reach optimal loss value. That is, smaller is better. In your case, as there are higher loss scores on higher epoch, it “seems” the model is better on first epoch.

I said “seems” since we can’t actually tell for sure yet as the model has not been tested using proper cross validation method i.e. it is evaluated only against its training data.

Ways to improve your model:

  • Use cross validation in your Keras model in order to find out how the model actually perform, does it generalize well when predicting new data it has never seen before?
  • Adjust your learning rate, structure of neural network model, number of hidden units / layers, init, optimizer, and activator parameters used in your model among myriad other things.

Combining sklearn’s GridSearchCV with Keras can automate this process.

Answered By: jaycode

Just to answer the questions more specifically, here’s a definition of epoch and loss:

Epoch: A full pass over all of your training data.

For example, in your view above, you have 1213 observations. So an epoch concludes when it has finished a training pass over all 1213 of your observations.

Loss: A scalar value that we attempt to minimize during our training of the model. The lower the loss, the closer our predictions are to the true labels.

This is usually Mean Squared Error (MSE) as David Maust said above, or often in Keras, Categorical Cross Entropy


What you’d expect to see from running fit on your Keras model, is a decrease in loss over n number of epochs. Your training run is rather abnormal, as your loss is actually increasing. This could be due to a learning rate that is too large, which is causing you to overshoot optima.

As jaycode mentioned, you will want to look at your model’s performance on unseen data, as this is the general use case of Machine Learning.

As such, you should include a list of metrics in your compile method, which could look like:

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

As well as run your model on validation during the fit method, such as:

model.fit(data, labels, validation_split=0.2)

There’s a lot more to explain, but hopefully this gets you started.

Answered By: Lucas Ramadan