Keras to predict number based on graph, with no accuracy at all

Question

I'm new to the neural network world and made an attempt to write a prediction algorithm with TensorFlow/Keras. This code just tries to predict a Roc value from the Alt and Temp values, based on a graph.

(Not able to show the graph here though.)

After a lot of attempts I got some accuracy, about 0.2 to 0.5. Not great, but at least I had something to work with. After a while it dropped to 0, and however I tweak it, it doesn't give me any accuracy at all.
Any idea why I won't get any accuracy?

#import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import sklearn.model_selection

#Data collection
factor = 10
data = pd.read_csv("roc_6800_ibf.csv", sep=",")
data = data.apply(pd.to_numeric, errors='coerce')
data = (data / factor) + 5

predict = "Roc"

x = np.array(data.drop([predict], axis=1))
y = np.array(data[predict])

x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y, test_size=0.2)

x_shape = int(x.ndim)
y_shape = int(y.ndim)

#Model

model = keras.Sequential([
    keras.layers.Dense(units=2, input_shape=(2,), activation="relu"),
    keras.layers.Dense(4, activation="relu"),
    keras.layers.Dense(1, activation="relu")
])

model.compile(optimizer="adam", loss="MeanSquaredError", metrics=["accuracy"])

model.fit(x_train, y_train, epochs=20, batch_size=10, verbose=1)

results = model.evaluate(x_test, y_test)

print("- - - - - - - - - - - - - - - - - - - - - - - -")
print(results)

#Prediction

def dataPredict(inputvalues, outputvalues):
    print("- - - - - - - - - - - - - - - - - - - - - - - -")
    test_q = np.array([inputvalues])
    test_a = outputvalues
    prediction = model.predict((test_q / factor) + 5)

    print("Prediction " + str((prediction[0] - 5) * factor))
    print("Actual " + str(test_a[0]))
    print("Input " + str(test_q))


dataPredict([5.5,20.0],[3.6])
dataPredict([6.8,30.0],[0.4])

My input data is about 80 rows of points that I read off the graph myself, and it looks like this. I want to use Alt and Temp to get Roc.

Updated the dataset, 72 rows:

Alt,Temp,Roc
-1.0,-40.0,9.6
0.0,-40.0,9.6
1.0,-40.0,9.6
2.0,-40.0,9.6
3.0,-40.0,9.6
4.0,-40.0,9.6
5.0,-40.0,9.6
6.0,-40.0,9.6
7.0,-40.0,8.1
8.0,-40.0,7.9
7.5,-40.0,9.1
-1.0,0.0,9.6
0.0,0.0,9.6
1.0,0.0,9.6
2.0,0.0,9.6
2.1,0.0,9.6
3.0,0.0,9.0
4.0,0.0,8.0
5.0,0.0,6.6
6.0,0.0,5.5
7.0,0.0,4.2
8.0,0.0,3.2
-1.0,20.0,9.6
0.0,20.0,9.6
0.5,20.0,9.0
1.0,20.0,8.6
2.0,20.0,7.8
3.0,20.0,6.2
4.0,20.0,5.2
5.0,20.0,4.0
6.0,20.0,2.9
7.0,20.0,1.8
8.0,20.0,0.5
-1.0,40.0,7.5
0.0,40.0,6.8
1.0,40.0,5.6
2.0,40.0,4.2
3.0,40.0,3.2
4.0,40.0,2.2
5.0,40.0,1.0
-1.0,50.0,5.4
0.0,50.0,4.2
-0.5,-40.0,9.5
0.5,-40.0,9.5
1.5,-40.0,9.5
2.5,-40.0,9.5
3.5,-40.0,9.5
4.5,-40.0,9.5
5.5,-40.0,9.5
6.5,-40.0,9.1
7.5,-40.0,8.1
-0.5,-10.0,9.5
0.5,-10.0,9.5
1.5,-10.0,9.5
2.5,-10.0,9.5
3.5,-10.0,9.5
4.5,-10.0,8.3
5.5,-10.0,7.1
6.5,-10.0,6.0
7.5,-10.0,5.0
-0.5,30.0,8.4
0.5,30.0,7.6
1.5,30.0,6.4
2.5,30.0,5.5
3.5,30.0,4.2
4.5,30.0,3.1
5.5,30.0,1.9
6.5,30.0,0.8
7.5,30.0,-0.5
5.2,10.0,5.3
6.8,10.0,4.0

I have tried tweaking the dataset (input data) in the code to make all numbers positive and divided them by 10. That gave me the best result so far, but then it suddenly dropped to 0:

Epoch 20/20
6/6 [==============================] - 0s 2ms/step - loss: 32.5049 - accuracy: 0.0000e+00

Asked By: Bengt B

Answer 1

Alright, so I tried a few ML models on your dataset (TL;DR: XGBoost worked best in this case).

Now that I've had a look at the dataset: your accuracy comes out as 0 because this is a regression task. Your output is a continuous number, not a class label such as 0 or 1, so the predicted value will almost never match the target the way an accuracy metric expects. Hence the 0 accuracy. A better way to evaluate this kind of task is with error metrics like MAE, MSE, RMSE, or MAPE, and as an accuracy-like score you can use R squared.
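
To make this concrete, here is a minimal sketch in plain Python. The target and prediction values are made up for illustration (they are not output from any model in this thread), and the helper names are hypothetical. It treats accuracy as "fraction of exact matches", which is a simplification of what a classification accuracy metric counts, and compares it with MAE, RMSE, and R squared on the same predictions.

```python
import math

# Illustrative values only: hypothetical targets and predictions
y_true = [9.6, 8.0, 5.5, 3.2, 0.5]
y_pred = [9.4, 8.3, 5.1, 3.6, 0.9]


def exact_match_accuracy(actual, predicted):
    """Fraction of predictions that equal the target exactly."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


accuracy = exact_match_accuracy(y_true, y_pred)
mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
r2_value = r_squared(y_true, y_pred)

print("accuracy:", accuracy)
print("MAE:", mae_value)
print("RMSE:", rmse_value)
print("R squared:", r2_value)
```

Every prediction here is within 0.4 of its target, yet none matches exactly, so the accuracy is 0 while the error metrics and R squared show a close fit.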

Anyway here’s the code:

import pandas as pd
import numpy as np
import seaborn as sns
import collections
import xgboost
from sklearn.linear_model import LinearRegression

df = pd.read_csv("sample_data_1.csv") # Your dataset

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(df[['Alt','Temp']], df['Roc'], test_size=0.3)

First I fitted a linear model on your data, because both the data and the relationship looked fairly simple:

lin_model = LinearRegression()
lin_model.fit(x_train, y_train)
preds = lin_model.predict(x_test)

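from sklearn.metrics import r2_score
# r2_score expects (y_true, y_pred); the score reported below was computed with the arguments swapped
"Accuracy is " + str(r2_score(y_test, preds))
Output (as originally reported, from r2_score(preds, y_test)): 'Accuracy is 0.6826956688194117'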

As you can see, the linear model got a fairly low score, but now it's certain that the inputs are related to the output in some fashion.
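
One caveat when reading R squared scores: the metric is not symmetric in its arguments, and scikit-learn's r2_score takes the true values first, as r2_score(y_true, y_pred). The plain-Python sketch below uses made-up numbers and a hypothetical helper to show how much the two argument orders can differ.

```python
def r_squared(actual, predicted):
    """Coefficient of determination with `actual` as the reference values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative values only
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 2.0, 3.0, 3.0]

correct_order = r_squared(y_true, y_pred)  # true values as the reference
swapped_order = r_squared(y_pred, y_true)  # predictions as the reference

print("true first:", correct_order)
print("pred first:", swapped_order)
```

The residuals are identical in both calls; only the variance used as the baseline changes, which is why the order matters.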

Next I tried a Keras model similar to yours. The code is below:

import tensorflow as tf
import tensorflow.keras.layers as layers

model = tf.keras.Sequential([
    layers.Dense(1000, activation = 'relu', input_shape = (2, )),
    layers.Dropout(0.2),
    layers.Dense(500, activation = 'relu'),
    layers.Dropout(0.2),
    layers.Dense(1, activation = 'relu')
])

model.compile(optimizer = 'adam', loss = 'mape', metrics=['mape','mae','mse'])
model.fit(x_train, y_train, epochs = 100, batch_size = 16)
model.evaluate(x_test, y_test)
Output: 1/1 [==============================] - 0s 130ms/step - loss: 53.3907 - mape: 53.3907 - mae: 2.6886 - mse: 15.3293

The results here are really poor, as the loss (MAPE) is a little over 50%, but if you look at the Mean Absolute Error, it's not that large in magnitude.
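
A likely contributor to the gap between the two numbers is that MAPE divides each error by the actual value, so targets near zero (the dataset contains Roc values such as 0.5 and -0.5) dominate the average. The sketch below uses made-up predictions with the same absolute error everywhere to illustrate the effect; it is not a reproduction of the run above, and the helper names are hypothetical.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)


# Two groups of targets: large values and values close to zero
large_targets = [9.6, 8.0, 6.6]
small_targets = [1.0, 0.5, -0.5]

# Hypothetical predictions that are off by exactly 0.5 in every case
offset = 0.5
large_preds = [t + offset for t in large_targets]
small_preds = [t + offset for t in small_targets]

mae_large = mae(large_targets, large_preds)
mae_small = mae(small_targets, small_preds)
mape_large = mape(large_targets, large_preds)
mape_small = mape(small_targets, small_preds)

print("MAE  large / small:", mae_large, mae_small)
print("MAPE large / small:", mape_large, mape_small)
```

Both groups have the same MAE, but the MAPE of the near-zero group is more than ten times higher, so a handful of small targets can push the overall percentage error up sharply.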

This suggests that the model could have performed better if the data had been scaled first, for example with MinMaxScaler() from scikit-learn's preprocessing module. (You can try that.)
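
As a rough sketch of what that scaling does, the NumPy code below applies the same per-column formula MinMaxScaler uses with its default range of 0 to 1, (value - min) / (max - min). The five rows are copied from the dataset in the question; the scale and unscale helpers are hypothetical names, and in a real pipeline the minimum and maximum would be computed on the training split only.

```python
import numpy as np

# A few (Alt, Temp, Roc) rows copied from the dataset in the question
rows = np.array([
    [-1.0, -40.0, 9.6],
    [4.0, 0.0, 8.0],
    [8.0, 20.0, 0.5],
    [0.0, 50.0, 4.2],
    [7.5, 30.0, -0.5],
])
x, y = rows[:, :2], rows[:, 2]

# Per-column range of the inputs and range of the target
x_min, x_max = x.min(axis=0), x.max(axis=0)
y_min, y_max = y.min(), y.max()


def scale(values, lo, hi):
    """Map values linearly so that lo becomes 0 and hi becomes 1."""
    return (values - lo) / (hi - lo)


def unscale(scaled, lo, hi):
    """Invert scale(): map values in [0, 1] back to the original units."""
    return scaled * (hi - lo) + lo


x_scaled = scale(x, x_min, x_max)
y_scaled = scale(y, y_min, y_max)

# A model trained on y_scaled predicts in scaled units; map results back like this
y_restored = unscale(y_scaled, y_min, y_max)

print(x_scaled)
print(y_scaled)
```

Scaling the target as well keeps it non-negative, which matters for a final layer with a relu activation, since relu cannot output values below zero such as the Roc of -0.5 in the dataset.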

Finally I implemented an XGBoost model, which performed much better than the rest:

xgb_clf = xgboost.XGBRegressor(
    learning_rate=0.3,
    max_depth=6,
    n_estimators=1000
)
xgb_clf.fit(x_train, y_train)
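preds = xgb_clf.predict(x_test)
# r2_score expects (y_true, y_pred); the score reported below was computed with the arguments swapped
"Accuracy is " + str(r2_score(y_test, preds))
Output (as originally reported, from r2_score(preds, y_test)): 'Accuracy is 0.8968514145069562'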

Almost 90%. Keeping in mind the rudimentary state of the data and the minimal preprocessing, the XGBoost model could probably gain another 5 to 6% if proper preprocessing and augmentation are used.

Cheers!

Answered By: Gautam Chettiar
