What is the input dimension for a LSTM in Keras?
Question:
I'm trying to do deep learning with an LSTM in Keras.
I use a number of signals as input (nb_sig), which may vary between trainings, each with a fixed number of samples (nb_sample).
I would like to do parameter identification, so my output layer has the size of my number of parameters (nb_param).
So I created my training set of size (nb_sig x nb_sample) and the labels of size (nb_sig x nb_param).
My issue is that I cannot find the correct input dimensions for the model.
I tried this:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM
nb_sample = 500
nb_sig = 100 # number that may change during the training
nb_param = 10
train = np.random.rand(nb_sig, nb_sample)
label = np.random.rand(nb_sig, nb_param)
print(train.shape, label.shape)
DLmodel = Sequential()
DLmodel.add(LSTM(units=nb_sample, return_sequences=True, input_shape=(None, nb_sample), activation='tanh'))
DLmodel.add(Dense(nb_param, activation="linear", kernel_initializer="uniform"))
DLmodel.compile(loss='mean_squared_error', optimizer='RMSprop', metrics=['accuracy', 'mse'], run_eagerly=True)
print(DLmodel.summary())
DLmodel.fit(train, label, epochs=10, batch_size=nb_sig)
but I get this error message:
Traceback (most recent call last):
File "C:UsersmaximeDesktopSESAMEPycharmProjectsLargeScale_2022_09_07di3.py", line 22, in <module>
DLmodel.fit(train, label, epochs=10, batch_size=nb_sig)
File "C:Python310libsite-packageskerasutilstraceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:Python310libsite-packageskerasengineinput_spec.py", line 232, in assert_input_compatibility
raise ValueError(
ValueError: Exception encountered when calling layer "sequential" " f"(type Sequential).
Input 0 of layer "lstm" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (100, 500)
Call arguments received by layer "sequential" " f"(type Sequential):
• inputs=tf.Tensor(shape=(100, 500), dtype=float32)
• training=True
• mask=None
I don't understand what I'm supposed to put as input_shape for the LSTM layer, and since the number of signals I use during training will change, this is not clear to me.
Answers:
The input to the LSTM should be 3D, with the first dimension being the sample size (in your case 500). Assuming an input of shape (500, x, y), input_shape should be (x, y).
As per the Keras documentation, the LSTM layer takes a three-dimensional tensor as input, and requires one dimension dedicated to timesteps. Since you are using the default parameter time_major=False, the input should be in the form [batch, timesteps, feature].
This related question may help you understand LSTM input shapes better.
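A minimal sketch of that reshape (assuming each of your nb_sig signals is a univariate series of nb_sample timesteps, so the feature dimension is 1):
import numpy as np

nb_sig, nb_sample = 100, 500
train = np.random.rand(nb_sig, nb_sample)       # 2D: (batch, timesteps)
train_3d = train.reshape(nb_sig, nb_sample, 1)  # 3D: (batch, timesteps, features)
print(train_3d.shape)                           # (100, 500, 1)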
The input to LSTM has to be in the following format:
[sample, timestep, n_features]
or in your notation
[nb_sig x nb_sample x n_features]
Therefore you need to reshape the training data to that format. Instead you have:
so I created my training set of size (nb_sig x nb_sample)
As you can see, that is a 2D input, not 3D. You are missing the third dimension, which in your case seems to be the number of features.
Simply add one extra dimension to the array using numpy.expand_dims()
https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html
and you should be good to go 🙂 (assuming you have a univariate time series).
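Putting it together with your example, here is a sketch (not your exact model: return_sequences is set to False so the LSTM emits one vector per signal, matching the (nb_sig, nb_param) labels, and input_shape is (nb_sample, 1), which leaves the batch dimension unspecified, so a varying number of signals between trainings is fine):
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM

nb_sample = 500  # timesteps per signal
nb_sig = 100     # number of signals; may change between trainings
nb_param = 10

train = np.random.rand(nb_sig, nb_sample)
label = np.random.rand(nb_sig, nb_param)

# (nb_sig, nb_sample) -> (nb_sig, nb_sample, 1): add the feature dimension
train = np.expand_dims(train, axis=-1)

DLmodel = Sequential()
# input_shape = (timesteps, features); the batch size is omitted
DLmodel.add(LSTM(units=nb_sample, return_sequences=False,
                 input_shape=(nb_sample, 1), activation='tanh'))
DLmodel.add(Dense(nb_param, activation="linear"))
DLmodel.compile(loss='mean_squared_error', optimizer='RMSprop', metrics=['mse'])
DLmodel.fit(train, label, epochs=10, batch_size=nb_sig)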