ValueError: At least one array required as input when splitting data into train and test sets

Question:

I am working on a finance forecasting project. I have finished the preprocessing, but when I try to construct the train and test sets I get an error.

I constructed my DataFrame as follows:

alldata = pd.DataFrame({'Date':date,
                        'S&P 500 Price':normalised_snp,
                        'S&P 500 Open': normalised_snpopen,
                        '10 Year Bond Price': normalised_tybp,
                        '10 Year Bond Open': normalised_tybpopen,
                        '2 Year Bond Price': normalised_twybp,
                        '2 Year Bond Open': normalised_twybpopen,
                        'US Inflation' : normalised_USInflation,
                        'US GDP' : normalised_USGDP,
                        'US Insterest' : normalised_USInterest,
                        'Global Inflation Rate' : normalised_GlobalInflation,
                        'Global GDP' : normalised_GlobalGDP})

It looks like this:

        Date  S&P 500 Price  ...  Global Inflation Rate  Global GDP
0 2006-01-03       0.143754  ...               0.588237         0.0
1 2006-01-04       0.144885  ...               0.588237         0.0
2 2006-01-05       0.144890  ...               0.588237         0.0
3 2006-01-06       0.147795  ...               0.588237         0.0
4 2006-01-09       0.148936  ...               0.588237         0.0
5 2006-01-10       0.148824  ...               0.588237         0.0
6 2006-01-11       0.149914  ...               0.588237         0.0
7 2006-01-12       0.147943  ...               0.588237         0.0
8 2006-01-13       0.148319  ...               0.588237         0.0
9 2006-01-17       0.147208  ...               0.588237         0.0

Then I tried to construct the train and test sets from it like this:

X= alldata['S&P 500 Price'].to_numpy()
y= alldata.drop(columns=['Date','S&P 500 Price','10 Year Bond Open','2 Year Bond Open']).to_numpy()
print(y)
X_train,X_test,y_train,y_test = train_test_split(test_size=0.25,random_state=0)
print(X_test.shape)
print(X_train.shape)

But I got this error:

ValueError: At least one array required as input

I couldn’t find my mistake. Is there a solution for this?

Asked By: samet.bnc

||

Answers:

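You need to pass the arrays you want to split (here X and y) to train_test_split as positional arguments. Called with only test_size and random_state, it has nothing to split, which is exactly what the error message says:

X_train,X_test,y_train,y_test = train_test_split(X, y, test_size=0.25,random_state=0)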
Answered By: Treeco
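
For reference, here is a minimal sketch that reproduces the error and then applies the fix above. The `demo` DataFrame is a hypothetical stand-in for `alldata`: it holds random values and only a few of the original columns, so the numbers mean nothing and only the call pattern matters.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for `alldata` (random values, not the real data).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Date": pd.date_range("2006-01-03", periods=8, freq="B"),
    "S&P 500 Price": rng.random(8),
    "S&P 500 Open": rng.random(8),
    "US GDP": rng.random(8),
})

# Same assignment pattern as in the question.
X = demo["S&P 500 Price"].to_numpy()
y = demo.drop(columns=["Date", "S&P 500 Price"]).to_numpy()

# Calling train_test_split without any arrays reproduces the error.
try:
    train_test_split(test_size=0.25, random_state=0)
except ValueError as err:
    error_message = str(err)
    print(error_message)

# Passing the arrays positionally resolves it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
print(X_train.shape, X_test.shape)  # (6,) (2,)
print(y_train.shape, y_test.shape)  # (6, 2) (2, 2)
```

One further observation, offered as a guess about intent: if the S&P 500 price is the value being forecast, the assignments of X and y in the question look swapped, since by convention X holds the feature columns and y the target.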