SVM bad input shape
Question:
I created a dataset and split it into train and test sets.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.20)
When I tried to implement an SVM classifier with the code below:
from sklearn.svm import SVC
svc_classifier = SVC(kernel='rbf')
svc_classifier.fit(X_train, y_train)
X_train.shape and y_train.shape are both (160, 2).
When I run the last line, I get the error ValueError: bad input shape (160, 2). I know my training and testing samples must be the same size, but I'm wondering if there is a way to deal with this problem. Thank you!
Answers:
This is the code that you want:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Covariance matrix shared by all four Gaussian blobs (despite the name "std")
std = [[0.5, 0], [0, 0.5]]

# Class 0: two blobs centred at (2, -2) and (-2, 2)
X1 = np.vstack((
    np.random.multivariate_normal([2, -2], std, size=200),
    np.random.multivariate_normal([-2, 2], std, size=200)
))
y1 = np.zeros(X1.shape[0])

# Class 1: two blobs centred at (2, 2) and (-2, -2)
X2 = np.vstack((
    np.random.multivariate_normal([2, 2], std, size=200),
    np.random.multivariate_normal([-2, -2], std, size=200)
))
y2 = np.ones(X2.shape[0])

# Features have shape (800, 2); labels are a 1-D array of shape (800,)
X = np.vstack((X1, X2))
y = np.hstack((y1, y2))

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
svc_classifier = SVC(kernel='rbf', gamma='auto')
svc_classifier.fit(X_train, y_train)
In your original code, Y is just a variable name; it does not hold the labels. The error comes from the shape of y_train: fit expects a 1-D array with one class label per sample, i.e. shape (160,), not (160, 2). You need to create the labels separately.
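If the two columns of Y do encode the class in some way, it may be possible to derive a 1-D label vector from them instead of rebuilding the data. The sketch below assumes, purely for illustration, that Y is a one-hot encoding of two classes; the question does not say what Y actually contains, so the data here is a made-up stand-in and the argmax step only applies under that assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in for the asker's data: 200 samples, 2 features.
X = rng.normal(size=(200, 2))

# Assumption: Y is one-hot encoded, one column per class, shape (200, 2).
labels = (X[:, 0] * X[:, 1] > 0).astype(int)
Y = np.eye(2, dtype=int)[labels]

# Collapse each one-hot row to its class index, giving shape (200,).
y = Y.argmax(axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)

# Check the shapes before fitting: X is 2-D, y must be 1-D.
print(X_train.shape, y_train.shape)

svc_classifier = SVC(kernel='rbf', gamma='auto')
svc_classifier.fit(X_train, y_train)
predictions = svc_classifier.predict(X_test)
```

If Y is not one-hot (for example, if it is a second copy of the features), there is nothing to recover and the labels have to be built separately, as in the answer above.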