TensorFlow Keras model with [nan nan] output

Question:

I am trying to build an expectation model for my (19502, 3) data using a Keras Sequential model. I used two hidden layers (the first one with the input shape) and an output layer. I am still confused about the input shape of these layers. Did I add the layers correctly?
The model gives me a [nan nan] prediction. Can anyone tell me what is wrong with my code?
Another question: I assumed the input shape for the first hidden layer must match the shape of X_train, which is (14626,). However, when I used input_shape=(14626,) I got the error below.
Any help would be appreciated!
Thank you
Alex

ValueError                                Traceback (most recent call last)
Input In [44], in <cell line: 39>()
     34     print(layer.output_shape)      
     36 model.compile(loss='categorical_crossentropy',
     37               optimizer='sgd',
     38               metrics=['accuracy'])
---> 39 model.fit(X_train, y_train, epochs=5, batch_size=32)
     40 loss_and_metrics = model.evaluate(X_test, y_test, batch_size=128)
     41 classes = model.predict(X_test, batch_size=128)

File ~\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py:67, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     65 except Exception as e:  # pylint: disable=broad-except
     66   filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67   raise e.with_traceback(filtered_tb) from None
     68 finally:
     69   del filtered_tb

File ~\AppData\Local\Temp\__autograph_generated_filerazklnil.py:15, in outer_factory.<locals>.inner_factory.<locals>.tf__train_function(iterator)
     13 try:
     14     do_return = True
---> 15     retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16 except:
     17     do_return = False

ValueError: in user code:

    File "C:\Users\alex\Anaconda3\lib\site-packages\keras\engine\training.py", line 1051, in train_function  *
        return step_function(self, iterator)
    File "C:\Users\alex\Anaconda3\lib\site-packages\keras\engine\training.py", line 1040, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\Users\alex\Anaconda3\lib\site-packages\keras\engine\training.py", line 1030, in run_step  **
        outputs = model.train_step(data)
    File "C:\Users\alex\Anaconda3\lib\site-packages\keras\engine\training.py", line 889, in train_step
        y_pred = self(x, training=True)
    File "C:\Users\alex\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "C:\Users\alex\Anaconda3\lib\site-packages\keras\engine\input_spec.py", line 248, in assert_input_compatibility
        raise ValueError(

    ValueError: Exception encountered when calling layer "sequential_37" (type Sequential).
    
    Input 0 of layer "dense_95" is incompatible with the layer: expected axis -1 of input shape to have value 14626, but received input with shape (None, 1)
    
    Call arguments received by layer "sequential_37" (type Sequential):
      • inputs=tf.Tensor(shape=(None, 1), dtype=float32)
      • training=True
      • mask=None

Here is my code

from IPython import get_ipython
get_ipython().magic('reset -sf') 
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

model = Sequential()
from tensorflow.keras.layers import Dense

data_file_path = 'F:/Dl_project/dss_project/data_price.csv'
my_data = pd.read_csv(data_file_path)
my_data.columns = ["length","weight","cost"]
X = my_data['cost']
y = my_data.drop(columns=['cost'])
predictionData = ([[80]])

X_train, X_test, y_train, y_test = train_test_split(X, y,test_size=0.25, random_state=40)
#X_train= X_train.values.reshape(-1, 1)
#y_train= y_train.values.reshape(-1, 1)
#X_test = X_test.values.reshape(-1, 1)
print(X_train.shape)# = (14626,)
print(y_train.shape)# = (14626,2)

model.add(Dense(units=64,input_shape=(1,), activation='relu')) #hidden layer 1 with input
model.add(Dense(units=64, activation='relu'))#hidden layer 2
model.add(Dense(units=2, activation='softmax')) #output layer 

for layer in model.layers:
    print(layer.output_shape)      

model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32)
loss_and_metrics = model.evaluate(X_test, y_test, batch_size=128)
classes = model.predict(X_test, batch_size=128)
print(classes)
print(model.predict(predictionData),'\n')
Asked By: Alex Moh

||

Answers:

Your training and test data should have the shape (number_of_data_points, number_of_features), but your data only has the shape (number_of_data_points,). Since you have only one feature, you just need to apply NumPy's expand_dims function to your training and test x-values. This gives the right shape: in your case, the training set becomes 14626 data points with 1 feature each.

X_train = np.expand_dims(X_train, axis=-1)
X_test = np.expand_dims(X_test, axis=-1)

Now you can pass the data into the fit function and check the results.
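
As a quick check of the reshaping, here is a minimal sketch using a placeholder array of zeros in place of the real cost column (only the length 14626 is taken from the question; the values are made up). It prints the shape before and after expand_dims and confirms that reshape(-1, 1), which appears commented out in the question, gives the same layout.

```python
import numpy as np

# Stand-in for the 'cost' column: 14626 scalar samples (placeholder values).
X_train = np.zeros(14626, dtype="float32")
print(X_train.shape)  # (14626,)

# Add a trailing feature axis so each sample is a vector of length 1.
X_train_2d = np.expand_dims(X_train, axis=-1)
print(X_train_2d.shape)  # (14626, 1)

# reshape(-1, 1) produces the same layout.
assert X_train_2d.shape == X_train.reshape(-1, 1).shape
```

If X_train is a pandas Series, as in the question, np.expand_dims should hand back a plain NumPy array, which fit accepts.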

Answered By: Blindschleiche

You are passing 14626 independent data points to your model, each of which is a scalar value. In that case the shape of X_train should be (14626, 1), which matches the input_shape you specified for your first layer (X_test needs the same trailing dimension). This is necessary because the first dimension (the batch dimension) corresponds to the number of samples you pass to the model. The second dimension is your feature dimension, which in your case has size 1.

The input shape contains everything except the batch dimension (which is the first dimension). In your case you pass 14626 data points with dimensionality 1 to your model. Therefore input_shape=(14626,) is wrong, because it describes how many samples you want to pass through your model, whereas input_shape=(1,) describes the feature dimensions of a single sample and is therefore correct.
(As a side note: you have specified a batch_size of 32, so your model will actually receive chunks of data with shape (32, 1).)
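
To make the difference between the batch dimension and input_shape concrete, here is a small NumPy-only sketch. It uses placeholder zeros instead of the real data and mimics batching with plain slicing (fit also shuffles by default, which is ignored here).

```python
import numpy as np

n_samples, n_features, batch_size = 14626, 1, 32

# Placeholder data in the layout Keras expects: (samples, features).
X = np.zeros((n_samples, n_features), dtype="float32")

# input_shape describes one sample, i.e. the shape without the batch axis.
input_shape = X.shape[1:]
print(input_shape)  # (1,)

# Split into consecutive batches, roughly as fit() would without shuffling.
batches = [X[i:i + batch_size] for i in range(0, n_samples, batch_size)]
print(len(batches))       # 458
print(batches[0].shape)   # (32, 1)
print(batches[-1].shape)  # (2, 1), the leftover samples
```

The number of samples only affects how many batches there are; the layer itself only ever sees the trailing (1,) part.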

Second, it looks like you want to infer the length and weight of an item from its cost, right? If so, this is a regression task, not a classification task. You should therefore not use a softmax-activated output, because softmax forces the two output values to sum to one. Instead, you can use a linear output (activation=None).
You should also use mse or mae as the loss instead of categorical_crossentropy, because the latter is meant for fitting categorical probability distributions.
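
The effect of softmax on regression targets can be seen in a small NumPy sketch. The raw outputs and targets below are made-up numbers for illustration, and softmax is written out by hand rather than taken from Keras.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

# Hypothetical raw outputs of the last layer for two samples.
raw = np.array([[80.0, 12.5], [3.0, 1.0]])

# Softmax turns every row into probabilities that sum to 1, ...
probs = softmax(raw)
print(probs.sum(axis=-1))  # each row sums to 1 (up to rounding)

# ... so it can never reproduce targets such as (length, weight) = (78, 13).
# A linear output leaves the values unchanged and can be scored with MSE.
targets = np.array([[78.0, 13.0], [2.5, 1.5]])
mse = np.mean((raw - targets) ** 2)
print(mse)  # 1.1875
```

In the Keras model this would correspond to Dense(units=2) without an activation as the last layer and loss='mse' in compile.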

Lastly, double-check whether your data is normalized. If the values you pass to the model and expect as output are raw values such as actual prices (24.99, for example), you should normalize your data first. Otherwise your training will be very unstable, and a diverging loss may well be what produces the nan predictions.
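
One common way to normalize is standardization (subtract the mean, divide by the standard deviation). Here is a minimal sketch with made-up cost values; the statistics are computed on the training split only and reused for the test split. The same idea applies to the targets.

```python
import numpy as np

# Hypothetical cost values; statistics come from the training split only.
X_train = np.array([[24.99], [199.0], [1050.0], [12.5]], dtype="float64")
X_test = np.array([[80.0]], dtype="float64")

mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std  # reuse the training statistics

print(X_train_scaled.mean(axis=0))  # approximately 0
print(X_train_scaled.std(axis=0))   # approximately 1

# If the targets are scaled the same way, predictions are mapped back with
# y = y_scaled * y_std + y_mean
```

Any value you later pass to predict (such as the 80 in the question) has to go through the same scaling first.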

Answered By: Chillston