Input 0 of layer "conv2d_5" is incompatible with the layer: expected min_ndim=4, found ndim=2. Full shape received: (None, 2)

Question:

I am trying to use a CNN on multivariate time series instead of the most common usage on images. The number of features is between 90 and 120, depending on which ones I need to consider and experiment with. This is my code:

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

X_train_s = X_train_s.reshape((X_train_s.shape[0], X_train_s.shape[1],1))
X_test_s = X_test_s.reshape((X_test_s.shape[0], X_test_s.shape[1],1))

batch_size = 1024
length = 120
n_features = X_train_s.shape[1]

generator = TimeseriesGenerator(X_train_s, pd.DataFrame.to_numpy(Y_train[['TARGET_KEEP_LONG', 
                                                                          'TARGET_KEEP_SHORT']]), 
                                                                 length=length, 
                                                                 batch_size=batch_size)

validation_generator = TimeseriesGenerator(X_test_s, pd.DataFrame.to_numpy(Y_test[['TARGET_KEEP_LONG', 'TARGET_KEEP_SHORT']]), length=length, batch_size=batch_size)


early_stop = EarlyStopping(monitor = 'val_accuracy', mode = 'max', verbose = 1, patience = 20)

CNN_model = Sequential()
   
CNN_model.add(
    Conv2D(
        filters=64,
        kernel_size=(1, 5),
        strides=1,
        activation="relu",
        padding="valid",
        input_shape=(length, n_features, 1),
        use_bias=True,
    )
)
CNN_model.add(MaxPooling2D(pool_size=(1, 2)))
CNN_model.add(
    Conv2D(
        filters=64,
        kernel_size=(1, 5),
        strides=1,
        activation="relu",
        padding="valid",
        use_bias=True,
    )
)
[... code continuation ...]

In other words, I take the features as one dimension and a certain number of rows as the other dimension. But I get this error

"ValueError: Input 0 of layer "conv2d_5" is incompatible with the layer: expected min_ndim=4, found ndim=2. Full shape received: (None, 2)"

which refers to the first CNN layer.
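A minimal sketch that reproduces the same error with made-up shapes (Conv2D requires 4D input of the form (batch, height, width, channels), so any 2D input fails its min_ndim=4 check):

import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Conv2D(filters=8, kernel_size=(1, 5))
try:
    layer(np.zeros((32, 2), dtype="float32"))  # 2D input, like (None, 2)
except ValueError as e:
    print(e)  # expected min_ndim=4, found ndim=2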

Asked By: fede72bari


Answers:

Data loading

I have made a simple class (below) that demonstrates a reasonable approach to loading the data. Mind you, I am not that familiar with TensorFlow, as I mainly use PyTorch, so the code might not be optimized.

You are probably best off defining a custom generator if an existing one can't be used for this. After reading the comments, I noticed that you don't want to precompute all the values ahead of time; this approach avoids that, because we only keep the underlying data in self.data and create new tensors from it on demand.

import tensorflow as tf
import numpy as np

v = np.array([[12055., 11430., 10966., 12055., 11430., 10966.],
              [11430., 10966., 10725., 11430., 10966., 10725.],
              [10966., 10725., 10672., 10966., 10725., 10672.]])
q = tf.constant(v)

class MyData:

    def __init__(self, data, windows_size):
        # Keep only the underlying data; windows are created lazily.
        self.data = data
        self.windows_size = windows_size
        self._dataset = tf.data.Dataset.from_generator(
            self._generator,
            output_types=tf.float32,
            output_shapes=(self.windows_size, self.data.shape[1]))

    def _generator(self):
        # Yield one sliding window of rows at a time.
        for i in range(self.data.shape[0] - self.windows_size + 1):
            yield self.data[i:i + self.windows_size]

    def __len__(self):
        # Number of windows that fit into the series.
        return self.data.shape[0] - self.windows_size + 1

    def get_dataset(self):
        return self._dataset

# Example usage:
test = MyData(q, 2)
it = iter(test.get_dataset())

for data in it:
    print(data.shape)

This produces tensors whose first dimension is windows_size. The code maps [N, DATA] -> [W, DATA], where N is the length of the time series and W is the reduced window size; I added part of the example code from the previous link.
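From here, a possible follow-up (just a sketch; in practice you would pair each window with its label) is to batch the dataset with the standard tf.data methods before passing it to model.fit:

batched = test.get_dataset().batch(2)
for batch in batched:
    print(batch.shape)  # (batch, windows_size, n_features)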

Model design

Multiple design decisions can be made for the model design.

Firstly, you can treat it as an embedding problem (Embedding layer) and then reshape the output for use with your 2D convolutions.
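A minimal sketch of this first approach, assuming integer-coded inputs (the vocabulary size, sequence length, and embedding width below are made-up values, not taken from the question):

import tensorflow as tf

# Assumed toy sizes for illustration only
vocab_size, seq_len, embed_dim = 1000, 120, 16

emb_model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim, input_length=seq_len),
    tf.keras.layers.Reshape((seq_len, embed_dim, 1)),  # add a channel axis for Conv2D
    tf.keras.layers.Conv2D(32, (1, 5), activation="relu", padding="valid"),
])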

The second approach is to reshape the data directly into something resembling 2D images, as sketched below. Note that this approach will be bad if the sequence length changes between examples: you cannot batch the training without modifying the network (adding extra layers to process images depending on their size is not straightforward).
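A rough sketch of this second approach (the array sizes are assumptions): stack fixed-length windows of the series and append a trailing channel axis, so that each window looks like a single-channel image.

import numpy as np

series = np.random.rand(1000, 90)   # (time_steps, n_features), assumed sizes
window = 120
windows = np.stack([series[i:i + window]
                    for i in range(len(series) - window + 1)])
images = windows[..., np.newaxis]   # (n_windows, window, n_features, 1)
print(images.shape)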

Lastly, there are already tutorials that do this kind of thing with time series features, such as the one shown below:

def basic_conv2D(n_filters=10, fsize=5, window_size=5, n_features=2):
    new_model = tf.keras.Sequential()
    new_model.add(tf.keras.layers.Conv2D(n_filters, (1, fsize), padding="same",
                                         activation="relu",
                                         input_shape=(window_size, n_features, 1)))
    new_model.add(tf.keras.layers.Flatten())
    new_model.add(tf.keras.layers.Dense(1000, activation="relu"))
    new_model.add(tf.keras.layers.Dense(100))
    new_model.add(tf.keras.layers.Dense(1))
    new_model.compile(optimizer="adam", loss="mean_squared_error")
    return new_model

m2 = basic_conv2D(n_filters=24, fsize=2, window_size=window_size,
                  n_features=data_train_wide.shape[2])
m2.summary()
Answered By: Warkaz

After days of attempts and looking at posts that gave some indirect insights, I found the trouble, and I can share the solution for

  • using 2D CNN models with time series instead of images
  • avoiding memory troubles when preparing the dataset with TimeseriesGenerator

As expected, the trouble was in preparing the dataset with the proper shape. The main bug in my code was this:

X_train_s = X_train_s.reshape((X_train_s.shape[0], X_train_s.shape[1],1))
X_test_s = X_test_s.reshape((X_test_s.shape[0], X_test_s.shape[1],1))

which should be replaced with this (I also changed the names of the arrays, just to keep the originals untouched):

X_train_s_CNN = X_train_s.reshape(*X_train_s.shape, 1)
X_test_s_CNN = X_test_s.reshape(*X_test_s.shape, 1)
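To see why this fixes the shape, here is a minimal check with toy sizes (the arrays below are assumptions, not the real dataset): with the extra channel axis on each row, TimeseriesGenerator windows the series into the 4D batches that Conv2D expects.

import numpy as np
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

X = np.random.rand(50, 8)          # 50 time steps, 8 features (toy sizes)
y = np.random.rand(50, 2)
X_cnn = X.reshape(*X.shape, 1)     # (50, 8, 1)
gen = TimeseriesGenerator(X_cnn, y, length=10, batch_size=4)
xb, yb = gen[0]
print(xb.shape)                    # (4, 10, 8, 1) = (batch, length, n_features, 1)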

Here is the full working code:

import pandas as pd
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Dense, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.layers import BatchNormalization

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

X_train_s_CNN = X_train_s.reshape(*X_train_s.shape, 1)
X_test_s_CNN = X_test_s.reshape(*X_test_s.shape, 1)


batch_size = 64
length = 300
n_features = X_train_s.shape[1]

generator = TimeseriesGenerator(X_train_s_CNN, pd.DataFrame.to_numpy(Y_train[['TARGET_KEEP_LONG', 
                                                                          'TARGET_KEEP_SHORT']]), 
                                                                    length=length, 
                                                                    batch_size=batch_size)

validation_generator = TimeseriesGenerator(X_test_s_CNN, pd.DataFrame.to_numpy(Y_test[['TARGET_KEEP_LONG', 
                                                                                   'TARGET_KEEP_SHORT']]), 
                                                                           length=length, 
                                                                           batch_size=batch_size)


early_stop = EarlyStopping(monitor = 'val_accuracy', mode = 'max', verbose = 1, patience = 10)

CNN_model = Sequential()


CNN_model.add(
    Conv2D(
        filters=64,
        kernel_size=(2,2),
        strides=1,
        activation="relu",
        padding="same",
        input_shape=(length, n_features, 1),
        use_bias=True,
    )
)
CNN_model.add(BatchNormalization())
CNN_model.add(MaxPooling2D(pool_size=(2, 2)))
#CNN_model.add(Dropout(0.2))

CNN_model.add(
    Conv2D(
        filters=128,
        kernel_size=(2,2),
        strides=1,
        activation="relu",
        padding="same"
    )
)
CNN_model.add(BatchNormalization())
CNN_model.add(MaxPooling2D(pool_size=(2, 2)))
#CNN_model.add(Dropout(0.3))

# CNN_model.add(
#    Conv2D(
#        filters=256,
#        kernel_size=(2,2),
#        strides=1,
#        activation="relu",
#        padding="same"
#    )
# )
# CNN_model.add(
#    Conv2D(
#        filters=256,
#        kernel_size=(2,2),
#        strides=1,
#        activation="relu",
#        padding="same"
#    )
# )
# CNN_model.add(BatchNormalization())
# CNN_model.add(MaxPooling2D(pool_size=(2, 2)))
# CNN_model.add(Dropout(0.3))


CNN_model.add(Flatten())
# CNN_model.add(Dense(units=4096, activation="relu", ))
# CNN_model.add(BatchNormalization())
# #CNN_model.add(Dropout(0.5))
# CNN_model.add(Dense(units=128, activation="relu", ))
# CNN_model.add(BatchNormalization())
# # CNN_model.add(Dropout(0.5))
CNN_model.add(Dense(units=2, activation="softmax"))


CNN_model.compile(
    optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
)



CNN_model.fit(
    generator, steps_per_epoch=1, 
    validation_data=validation_generator,
    epochs=200,
    callbacks=[early_stop],  # use the EarlyStopping callback defined above
)

The commented-out parts are the variants that I have tested. Regretfully, in this specific case, the results are very unstable in terms of accuracy and val_accuracy. Most disturbing is that the accuracy shows really erratic behavior; it is not clear why.

Answered By: fede72bari