For loop using Pandas dataframe to predict price doesn't append

Question:

I’m currently using a TensorFlow model that I’ve made to predict the next X prices of a curve, using a for loop that calls the append() function of the pandas DataFrame.

The model is a time series one, so on each iteration I calculate the "next date" from the last row of the DataFrame and the predicted price from its most recent rows. I then append a new row containing the "next date" and the predicted price to the DataFrame, so that the following iteration can predict the next price.

The problem is that the DataFrame doesn’t get appended to.

Here’s the code, if anyone knows. Also, if this is not the way it should be done, don’t hesitate to correct me: I’m new to both TensorFlow and pandas.

last_data = pd.read_excel("Nickel.xlsx")
print('Old dataset before loop : ', last_data)
for i in range(10):
        new_df = last_data.filter(['Valeur'])
        last_60_days = new_df[-60+(-i):].values
        last_60_days_scaled = scaler.transform(last_60_days)
        X_test = []
        X_test.append(last_60_days_scaled)
        X_test = np.array(X_test)
        X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
        pred_price = model.predict(X_test)
        pred_price = scaler.inverse_transform(pred_price)
        #print('Prix predit : ', pred_price)
        dernieredate = last_data['Date'].iloc[-1]
        datecorrect = pd.to_datetime(dernieredate)
        print('Old date : ', datecorrect)
        nextdate = datecorrect + pd.to_timedelta(1,unit='d')
        print('New date : ', nextdate)
        last_data.append([nextdate, pred_price])
print('New dataset final after loop : ', last_data)

Here’s the log :

Old dataset before loop :             Date  Valeur
0    2002-09-16    6770
1    2002-09-17    6550
2    2002-09-18    6590
3    2002-09-19    6610
4    2002-09-20    6580
...         ...     ...
4995 2022-11-14   27000
4996 2022-11-15   29595
4997 2022-11-16   28550
4998 2022-11-17   26050
4999 2022-11-18   24800

[5000 rows x 2 columns]
1/1 [==============================] - 0s 20ms/step
Prix predit :  [[26672.488]]
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 22ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 21ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 20ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 22ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 21ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 22ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 21ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 20ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
1/1 [==============================] - 0s 22ms/step
Old date :  2022-11-18 00:00:00
New date :  2022-11-19 00:00:00
New dataset final after loop :             Date  Valeur
0    2002-09-16    6770
1    2002-09-17    6550
2    2002-09-18    6590
3    2002-09-19    6610
4    2002-09-20    6580
...         ...     ...
4995 2022-11-14   27000
4996 2022-11-15   29595
4997 2022-11-16   28550
4998 2022-11-17   26050
4999 2022-11-18   24800

[5000 rows x 2 columns]

Thank you a lot!

Asked By: Adam Ben kahla

||

Answers:

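DataFrame.append() returns a new DataFrame instead of modifying the original in place, and it needs a row whose keys match the column names. Try changing:

last_data.append([nextdate, pred_price])

to:

last_data = last_data.append({'Date': nextdate, 'Valeur': pred_price[0, 0]}, ignore_index=True)

or, since DataFrame.append() was deprecated in pandas 1.4.0 and removed in pandas 2.0:

new_row = pd.DataFrame({'Date': [nextdate], 'Valeur': [pred_price[0, 0]]})
last_data = pd.concat([last_data, new_row], ignore_index=True)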
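
As a minimal, self-contained sketch of why the reassignment matters: the two-row DataFrame and the price 25000.0 below are made-up stand-ins, not data or predictions from the question.

```python
import pandas as pd

# Tiny stand-in for the "Nickel.xlsx" data (values are illustrative only).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2022-11-17", "2022-11-18"]),
    "Valeur": [26050.0, 24800.0],
})

next_date = df["Date"].iloc[-1] + pd.Timedelta(days=1)
new_row = pd.DataFrame({"Date": [next_date], "Valeur": [25000.0]})

# The result is discarded here, so df is unchanged.
pd.concat([df, new_row], ignore_index=True)
rows_without_assignment = len(df)

# Assigning the result back is what actually grows df.
df = pd.concat([df, new_row], ignore_index=True)
rows_with_assignment = len(df)

print(rows_without_assignment, rows_with_assignment)
```

The same applies to DataFrame.append() on pandas versions that still have it: both calls return a new object and leave the original untouched.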
Answered By: gtomer

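Thanks a lot, @9769953!

As he said, DataFrame.append() does not work like Python’s list.append(): it returns a new DataFrame instead of modifying the original in place. The solution was to build the new row as a dict from nextdate and pred_price and to assign the result of append() to a variable: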

last_data = pd.read_excel("Nickel.xlsx")
print('Old dataset before loop : ', last_data)
for i in range(10):
        new_df = last_data.filter(['Valeur'])
        last_60_days = new_df[-60+(-i):].values
        last_60_days_scaled = scaler.transform(last_60_days)
        X_test = []
        X_test.append(last_60_days_scaled)
        X_test = np.array(X_test)
        X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
        pred_price = model.predict(X_test)
        pred_price = scaler.inverse_transform(pred_price)
        #print('Prix predit : ', pred_price)
        dernieredate = last_data['Date'].iloc[-1]
        datecorrect = pd.to_datetime(dernieredate)
        print('Old date : ', datecorrect)
        nextdate = datecorrect + pd.to_timedelta(1,unit='d')
        print('New date : ', nextdate)
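        # Build the new row; pred_price has shape (1, 1), so take the scalar
        Data_Temp = {'Date': nextdate, 'Valeur': pred_price[0, 0]}
        # append() returns a new DataFrame: assign it back to last_data
        # so that the next iteration sees the new row
        last_data = last_data.append(Data_Temp, ignore_index=True)

print('New dataset final after loop : ', last_data)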
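
For pandas 2.0 and later, where DataFrame.append() no longer exists, the loop can be sketched with pd.concat(). This is only an illustration under stated assumptions: predict_next is a hypothetical stand-in (a simple mean) for the scaler.transform / model.predict / scaler.inverse_transform steps, the window length of 60 is taken from the variable name last_60_days, and the history is synthetic.

```python
import numpy as np
import pandas as pd

WINDOW = 60  # assumed window length, from "last_60_days" in the question


def predict_next(window: np.ndarray) -> float:
    """Hypothetical stand-in for the scaler + TensorFlow model prediction."""
    return float(window.mean())


def forecast(history: pd.DataFrame, steps: int) -> pd.DataFrame:
    """Append `steps` predicted rows, feeding each prediction back in."""
    data = history.copy()
    for _ in range(steps):
        # Always use the most recent WINDOW values, including predictions.
        window = data["Valeur"].to_numpy()[-WINDOW:]
        next_price = predict_next(window)
        next_date = pd.to_datetime(data["Date"].iloc[-1]) + pd.Timedelta(days=1)
        new_row = pd.DataFrame({"Date": [next_date], "Valeur": [next_price]})
        # pd.concat returns a new DataFrame, so assign it back.
        data = pd.concat([data, new_row], ignore_index=True)
    return data


# Synthetic history (not the real Nickel.xlsx data).
history = pd.DataFrame({
    "Date": pd.date_range("2022-01-01", periods=100, freq="D"),
    "Valeur": np.linspace(20000.0, 25000.0, 100),
})

result = forecast(history, steps=10)
print(result.tail(10))
```

Because the frame grows by one row per iteration, slicing with [-WINDOW:] is enough to slide the window forward; no offset based on the loop counter is needed.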
Answered By: Adam Ben kahla

pd.DataFrame.append has been deprecated since pandas 1.4.0 (and was removed in pandas 2.0); pd.concat() is the replacement and is more flexible. Since you say that append() does not update the DataFrame, see the examples below.

Sample: simple code that initializes the variables and simulates the shape of the data in the DataFrame.

import tensorflow as tf
import pandas as pd


# Raw string so that the backslashes in the Windows path are not treated as escapes
variables = pd.read_excel(r'F:\temp\20220305\Book 2.xlsx', index_col=None, header=[0])
print( 'variables: ' )
print( variables )
print( tf.constant(variables).numpy() )         # (6, 6)
print( 'variables.append: ' )
temp = tf.constant(variables.append( pd.DataFrame([[0, 0, 0, 0, 0, 2], [0, 0, 0, 0, 2, 0], [0, 0, 0, 2, 0, 0]]) ) ).numpy()
temp = pd.DataFrame( temp )
print( temp )   
print( 'variables.append: ' )
print( tf.constant(temp.append( pd.DataFrame([[0, 0, 0, 0, 0, 0, 0, 4]]) ) ).numpy() )  


print( 'pd.concat: ' )
temp = tf.constant( pd.concat([ variables, pd.DataFrame([0, 0, 0, 0, 0, 2]) ], axis=1) ).numpy()
print( temp )
print( 'pd.concat: ' )
print( tf.constant( pd.concat([ pd.DataFrame(temp), pd.DataFrame([0, 0, 0, 0, 0, 3]) ], axis=1) ).numpy() )

Sample: create a dataset from the DataFrame input, with fixed-size image and label tensors that match each other.

import tensorflow_io as tfio

list_label = []
list_Image = []
list_file_actual = []

for Index, Image, Label in variables.values:
    print( Label )
    list_label.append( Label )

    image = tf.io.read_file( Image )
    image = tfio.experimental.image.decode_tiff(image, index=0)
    list_file_actual.append(image)      # keep the full-size image
    image = tf.image.resize(image, [32,32], method='nearest')
    list_Image.append(image)

# Convert the lists to tensors once, after the loop (the shapes hard-code 33 rows)
list_label = tf.cast( list_label, dtype=tf.int32 )
list_label = tf.constant( list_label, shape=( 33, 1, 1 ) )
list_Image = tf.cast( list_Image, dtype=tf.int32 )
list_Image = tf.constant( list_Image, shape=( 33, 1, 32, 32, 4 ) )

dataset = tf.data.Dataset.from_tensor_slices(( list_Image, list_label ))

Output: Pandas read input from Book 2.xlsx

   0  1  2  3  4  5
0  1  0  0  0  0  0
1  0  1  0  0  0  0
2  0  0  1  0  0  0
3  0  0  0  1  0  0
4  0  0  0  0  1  0
5  0  0  0  0  0  1

[[1 0 0 0 0 0]
 [0 1 0 0 0 0]
 [0 0 1 0 0 0]
 [0 0 0 1 0 0]
 [0 0 0 0 1 0]
 [0 0 0 0 0 1]]

Output: DataFrame.append() adding rows : tf.constant(variables.append( pd.DataFrame([[0, 0, 0, 0, 0, 2], [0, 0, 0, 0, 2, 0], [0, 0, 0, 2, 0, 0]]) ) ).numpy()

variables.append:

   0  1  2  3  4  5
0  1  0  0  0  0  0
1  0  1  0  0  0  0
2  0  0  1  0  0  0
3  0  0  0  1  0  0
4  0  0  0  0  1  0
5  0  0  0  0  0  1
6  0  0  0  0  0  2
7  0  0  0  0  2  0
8  0  0  0  2  0  0

Output: DataFrame.append() adding a wider row (the new columns are filled with NaN for the existing rows) : tf.constant(temp.append( pd.DataFrame([[0, 0, 0, 0, 0, 0, 0, 4]]) ) ).numpy()

variables.append:

[[ 1.  0.  0.  0.  0.  0. nan nan]
 [ 0.  1.  0.  0.  0.  0. nan nan]
 [ 0.  0.  1.  0.  0.  0. nan nan]
 [ 0.  0.  0.  1.  0.  0. nan nan]
 [ 0.  0.  0.  0.  1.  0. nan nan]
 [ 0.  0.  0.  0.  0.  1. nan nan]
 [ 0.  0.  0.  0.  0.  2. nan nan]
 [ 0.  0.  0.  0.  2.  0. nan nan]
 [ 0.  0.  0.  2.  0.  0. nan nan]
 [ 0.  0.  0.  0.  0.  0.  0.  4.]]

Output: pd.concat() with axis=1, adding a column : tf.constant( pd.concat([ variables, pd.DataFrame([0, 0, 0, 0, 0, 2]) ], axis=1) ).numpy()

pd.concat:
[[1 0 0 0 0 0 0]
 [0 1 0 0 0 0 0]
 [0 0 1 0 0 0 0]
 [0 0 0 1 0 0 0]
 [0 0 0 0 1 0 0]
 [0 0 0 0 0 1 2]]

Output: pd.concat() with axis=1, adding a second column : tf.constant( pd.concat([ pd.DataFrame(temp), pd.DataFrame([0, 0, 0, 0, 0, 3]) ], axis=1) ).numpy()

pd.concat:
[[1 0 0 0 0 0 0 0]
 [0 1 0 0 0 0 0 0]
 [0 0 1 0 0 0 0 0]
 [0 0 0 1 0 0 0 0]
 [0 0 0 0 1 0 0 0]
 [0 0 0 0 0 1 2 3]]
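
The row-wise and column-wise behaviour shown above can also be reproduced with pd.concat() alone, without TensorFlow or the Excel file. This is a sketch that assumes a 6x6 identity matrix in place of Book 2.xlsx, which matches the printed input.

```python
import numpy as np
import pandas as pd

# Stand-in for the Excel input: a 6x6 identity matrix.
variables = pd.DataFrame(np.eye(6, dtype=int))

# Row-wise (default axis=0): three new rows with the same six columns.
extra_rows = pd.DataFrame([[0, 0, 0, 0, 0, 2],
                           [0, 0, 0, 0, 2, 0],
                           [0, 0, 0, 2, 0, 0]])
stacked = pd.concat([variables, extra_rows], ignore_index=True)

# Row-wise with a wider row: columns 6 and 7 are created and
# filled with NaN for every row that did not have them.
wide_row = pd.DataFrame([[0, 0, 0, 0, 0, 0, 0, 4]])
padded = pd.concat([stacked, wide_row], ignore_index=True)

# Column-wise (axis=1): one new column aligned on the row index.
extra_col = pd.DataFrame([0, 0, 0, 0, 0, 2])
widened = pd.concat([variables, extra_col], axis=1, ignore_index=True)

print(stacked.shape, padded.shape, widened.shape)
```

Passing ignore_index=True renumbers the rows (or, with axis=1, the columns) so that the result does not carry duplicate labels.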
Answered By: Jirayu Kaewprateep