Keras LSTM: None value in output shape

Question:

This is my data X_train, prepared for an LSTM, with shape (7000, 2, 200):

[[[0.500858   0.         0.5074856  ... 1.         0.4911533  0.        ]
  [0.4897923  0.         0.48860878 ... 0.         0.49446714 1.        ]]

 [[0.52411383 0.         0.52482396 ... 0.         0.48860878 1.        ]
  [0.4899698  0.         0.48819458 ... 1.         0.4968341  1.        ]]

 ...

 [[0.6124623  1.         0.6118705  ... 1.         0.6328777  0.        ]
  [0.6320492  0.         0.63512635 ... 1.         0.6960175  0.        ]]

 [[0.6118113  1.         0.6126989  ... 0.         0.63512635 1.        ]
  [0.63530385 1.         0.63595474 ... 1.         0.69808865 0.        ]]]

I create my sequential model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

model = Sequential()
model.add(LSTM(units = 50, activation = 'relu', return_sequences = True, input_shape = (X_train.shape[1], 200)))
model.add(Dropout(0.2))
model.add(Dense(1, activation = 'linear'))
model.compile(loss = 'mean_squared_error', optimizer = 'adam')

Then I fit my model:

history = model.fit(
    X_train, 
    Y_train, 
    epochs = 20, 
    batch_size = 200, 
    validation_data = (X_test, Y_test), 
    verbose = 1, 
    shuffle = False,
)
model.summary()

And at the end I can see something like this:

 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_16 (LSTM)              (None, 2, 50)             50200     
                                                                 
 dropout_10 (Dropout)        (None, 2, 50)             0         
                                                                 
 dense_10 (Dense)            (None, 2, 1)              51  

Why does it say that the output shape has None as its first element? Is it a problem, or should it be like this? What does it affect, and how can I change it?

I would appreciate any help, thanks!

Asked By: Adrian Kurzeja


Answers:

The first dimension in TensorFlow is always reserved for the batch size. Your model doesn't know your batch size in advance, so it makes it None.

To go into more detail, suppose your dataset has 1000 samples and your batch size is 32. Then 1000 / 32 = 31.25; taking the floor gives 31 full batches of size 32. But 31 batches of 32 only cover 32 * 31 = 992 samples, and 1000 - 992 = 8, so there is one final batch of size 8. Because the batches are not all the same size, the model cannot bake a fixed value into that dimension when it is built. Instead it leaves the batch dimension unspecified (dynamic), and the actual size is filled in by whatever batch is fed to it at run time. That is why you see None there.
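To make the arithmetic concrete, here is a minimal sketch (using the same hypothetical 1000-sample, batch-size-32 numbers as above) of how an epoch splits into batches:

import math

# Hypothetical example numbers from the paragraph above: 1000 samples, batch size 32.
num_samples = 1000
batch_size = 32

full_batches = num_samples // batch_size                  # 31 full batches of 32 samples
leftover = num_samples % batch_size                       # 8 samples remain
batches_per_epoch = math.ceil(num_samples / batch_size)   # 32 batches per epoch in total

print(full_batches, leftover, batches_per_epoch)          # -> 31 8 32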

The None value can't be changed, and it doesn't need to be: it stays dynamic in TensorFlow precisely so the model can accept any batch size, including the smaller final batch. So you only ever set the shapes that come after it, which in your case is (2, 200). The 7000 is your dataset's total number of samples, not something the model needs at build time, and since the total is usually not evenly divisible by the batch size, the model has to leave the batch dimension as None and resolve it batch by batch.
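As a quick sanity check, here is a small sketch that rebuilds the question's model and feeds it dummy random batches of several sizes; the batch dimension shown as None in the summary simply takes whatever size you pass in (the sizes 1, 8 and 200 are arbitrary choices for illustration):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

# Rebuild the model from the question; only the per-sample shape (2, 200) is fixed.
model = Sequential()
model.add(LSTM(units = 50, activation = 'relu', return_sequences = True, input_shape = (2, 200)))
model.add(Dropout(0.2))
model.add(Dense(1, activation = 'linear'))

# The same model accepts batches of any size, because the leading
# None dimension is resolved per batch at run time.
for n in (1, 8, 200):
    out = model.predict(np.random.rand(n, 2, 200), verbose = 0)
    print(out.shape)  # (1, 2, 1), then (8, 2, 1), then (200, 2, 1)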

Answered By: Mohammad Ahmed