How to perform scikit-learn's train-test split for a 2D input?


This is a beginner-level question about scikit-learn's train_test_split function.

I am trying to feed 2 inputs to the input layer of my neural network, but I am not able to get the input matrix's dimensions correct. What change should I make to get this working?

X1 and X2 are my inputs and y is my output. For example, for the inputs X1 = 3.14 and X2 = -1.0, y should be equal to 0.0. I want to train my network on samples like this.

As of now I am getting an error saying:

ValueError: Found input variables with inconsistent numbers of samples: [2, 126]


import numpy as np
from sklearn.model_selection import train_test_split

X1 = np.arange(0,4*np.pi,0.1)   # start,stop,step
X2 = np.cos(X1)

y = np.sin(X1)

X = [X1, X2]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.4)

From here I will go on to build a deep NN using Keras. The code continues like this:

from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(10, input_dim=2, activation='relu'))
Asked By: Formal_that



Your X1 and X2 are one-dimensional arrays, not column vectors:
X1.shape is (126,)

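When you created X, you put the two arrays into a list as two rows, so X is treated as having shape (2, 126): 2 samples with 126 features each, while y has 126 samples. That mismatch is what the error message reports.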

But the input X needs shape (126, 2): your features should be in columns, with X1 as the first column and X2 as the second.
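To see the difference between the two layouts, here is a small sketch using the arrays from the question. The names rows and cols are only illustrative and do not appear in the original code.

```python
import numpy as np

X1 = np.arange(0, 4 * np.pi, 0.1)   # 126 values
X2 = np.cos(X1)

# Illustrative names: "rows" has one feature per row, "cols" one sample per row.
rows = np.array([X1, X2])
cols = rows.T

print(rows.shape)  # (2, 126)
print(cols.shape)  # (126, 2)
```

The first dimension is what scikit-learn counts as the number of samples, so only the second layout lines up with a y of length 126.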

You can simply transpose the X array in your case. Use this line instead:

X = np.array([X1, X2]).T
Answered By: Jeka Golovachev
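For completeness, a minimal end-to-end sketch of the split with the corrected input might look like the following. It uses np.column_stack, which gives the same result as the transpose above. The random_state=0 argument is an assumption added here only to make the shuffle reproducible and is not part of the original question.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X1 = np.arange(0, 4 * np.pi, 0.1)   # start, stop, step
X2 = np.cos(X1)
y = np.sin(X1)

# One sample per row, one feature per column: shape (126, 2).
X = np.column_stack((X1, X2))

# random_state is an assumption, used only for a repeatable shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```

Both X_train and X_test keep 2 columns, which matches input_dim=2 on the first Dense layer in the question.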