Shapes for training data and labels

Question:

I am trying to train a convolutional neural network using Conv1D and sparse_categorical_crossentropy as a loss function but I keep having problems related to the shapes of the data.

Here is the network:

model = tf.keras.models.Sequential([
  tf.keras.layers.Conv1D(16, 24, input_shape=(1000, 4), activation="relu"),
  tf.keras.layers.MaxPooling1D(pool_size=24, strides=24),
  tf.keras.layers.Conv1D(8, 8, activation="relu"),
  tf.keras.layers.MaxPooling1D(pool_size=8, strides=8),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(64, activation="relu"),
  tf.keras.layers.Dense(64, activation="relu"),
  tf.keras.layers.Dense(4, activation="softmax")
])

model.compile(optimizer=tf.keras.optimizers.Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

The input data has shape (1, 1000, 4): a single sample of 1000 time steps with 4 channels, i.e. 4 signals of 1000 floats each, between which I am trying to discriminate.

The training consistently fails with various error messages depending on the shape of the labels.

model.fit(data, labels, epochs=1000)

For example, if I use

labels = np.array([0, 1, 2, 3])

or

labels = np.array([[0], [1], [2], [3]])

I obtain

ValueError: Data cardinality is ambiguous:
  x sizes: 1
  y sizes: 4
Make sure all arrays contain the same number of samples.

whereas, if I use

labels = np.array([[0, 1, 2, 3]])

I obtain

logits and labels must have the same first dimension, got logits shape [1,4] and labels shape [4].

I have tried many things based on the documentation and the examples I have found, but I cannot get it to work.

Thank you very much!

It no longer crashes with some other loss functions, such as binary_crossentropy, but the minimisation does not converge (unless the learning rate is tiny) and the end result does not match the training data.

Asked By: zaphod


Answers:

Change sparse_categorical_crossentropy to categorical_crossentropy, and make sure the data has shape (batch_size, 1000, 4) and the labels have shape (batch_size, 4), since the model outputs 4 values per sample.

data = tf.random.uniform([32, 1000, 4])
labels = tf.random.uniform([32, 4])

model.fit(data, labels, epochs=10, batch_size=8)

Here I created 32 training samples of shape (1000, 4), matching the model's input, and labels with 4 values for each of the 32 samples, then trained the model with batch size 8.

Data shape (32, 1000, 4)
Labels shape (32, 4)
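Note that the tf.random.uniform labels above only demonstrate the shapes; for real training, categorical_crossentropy expects each label row to be a one-hot vector. A minimal NumPy sketch of that conversion (the class indices here are made-up examples, with the 4 classes from the question):

```python
import numpy as np

# Convert integer class indices to one-hot rows of shape
# (batch_size, num_classes), as categorical_crossentropy expects.
int_labels = np.array([0, 1, 2, 3, 1, 0])   # example class indices
num_classes = 4
one_hot = np.eye(num_classes, dtype="float32")[int_labels]

print(one_hot.shape)  # (6, 4)
```

Each row has a single 1 in the column of the true class, so the rows are valid probability distributions.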

Epoch 1/10
4/4 [==============================] - 1s 11ms/step - loss: 2.8993 - accuracy: 0.0938
Epoch 2/10
4/4 [==============================] - 0s 9ms/step - loss: 2.8917 - accuracy: 0.2812
Epoch 3/10
4/4 [==============================] - 0s 9ms/step - loss: 2.8971 - accuracy: 0.2812
Epoch 4/10
4/4 [==============================] - 0s 10ms/step - loss: 2.9205 - accuracy: 0.2812
Epoch 5/10
4/4 [==============================] - 0s 9ms/step - loss: 2.9841 - accuracy: 0.2500
Epoch 6/10
4/4 [==============================] - 0s 9ms/step - loss: 3.1142 - accuracy: 0.2812
Epoch 7/10
4/4 [==============================] - 0s 9ms/step - loss: 3.2714 - accuracy: 0.2812
Epoch 8/10
4/4 [==============================] - 0s 9ms/step - loss: 3.3365 - accuracy: 0.2812
Epoch 9/10
4/4 [==============================] - 0s 9ms/step - loss: 3.1053 - accuracy: 0.2812
Epoch 10/10
4/4 [==============================] - 0s 9ms/step - loss: 3.0414 - accuracy: 0.2500
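Alternatively, if you want to keep sparse_categorical_crossentropy, pass one integer class index per sample, so the labels have shape (batch_size,) while the logits have shape (batch_size, num_classes). That is why labels of shape [4] clashed with logits of shape [1, 4] in the question. A NumPy sketch of the loss it computes (the logit values here are made up):

```python
import numpy as np

# sparse_categorical_crossentropy pairs logits of shape (N, num_classes)
# with integer labels of shape (N,) -- one class index per sample.
logits = np.array([[0.1, 0.2, 0.3, 0.4]])   # N=1 sample, 4 classes
labels = np.array([2])                      # shape (1,), not (4,)

# Softmax over the class axis, then the negative log-likelihood of the
# true class, averaged over the samples.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss = -np.log(probs[np.arange(len(labels)), labels]).mean()
print(round(loss, 4))  # ~1.3426
```

So with a single training sample of shape (1, 1000, 4), the matching labels array would be something like np.array([2]), not np.array([0, 1, 2, 3]).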
Answered By: kev_ta