How to predict a new instance using a Random Forest model?

Question:

I’m trying to build a Random Forest model using SKLearn:

import numpy as np
import seaborn as sns

data = sns.load_dataset('diamonds')

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()

data["cut"] = le.fit_transform(data["cut"])
data["color"] = le.fit_transform(data["color"])
data["clarity"] = le.fit_transform(data["clarity"])

X = data[['carat', 'depth', 'table', 'x', 'y', 'z', 'clarity', 'cut', 'color']]
y = data[['price']]

And the model:

from sklearn.ensemble import RandomForestRegressor

regr = RandomForestRegressor(n_estimators = 50, max_depth = 10, random_state = 101)
regr.fit(X, y)

How do I predict the price of a diamond with the following features?

novo = np.array([[0.3, 64.0, 55.0, 4.25, 4.28, 2.73, 2.0, 1.0, 6.0]])
novo

The following code:

regr.predict(novo)

Outputs the warning:

/usr/local/lib/python3.8/dist-packages/sklearn/base.py:450: UserWarning: X does not have valid feature names, but RandomForestRegressor was fitted with feature names
  warnings.warn(

array([454.27157199])

What’s the right syntax for predict?

Asked By: unstuck


Answers:

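The warning indicates that your model was trained on a DataFrame, which carries column names, but is predicting on a plain NumPy array, which does not. You can either pass predict a DataFrame with the same column names used in training, or, as long as you keep the features in the same order as in training, ignore the warning; the prediction itself is the same either way.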
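As a minimal sketch of the DataFrame route: the snippet below wraps the new sample in a DataFrame that reuses the training columns before calling predict. To stay self-contained it trains on a small synthetic table with the same column names rather than the real diamonds data (an assumption made only for illustration), and novo_df is a name introduced here.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

features = ['carat', 'depth', 'table', 'x', 'y', 'z', 'clarity', 'cut', 'color']

# Synthetic stand-in for the diamonds data (assumption: illustration only)
rng = np.random.default_rng(101)
X = pd.DataFrame(rng.random((200, len(features))), columns=features)
y = 1000 * X['carat'] + rng.random(200)  # 1-D target

regr = RandomForestRegressor(n_estimators=50, max_depth=10, random_state=101)
regr.fit(X, y)

novo = np.array([[0.3, 64.0, 55.0, 4.25, 4.28, 2.73, 2.0, 1.0, 6.0]])

# Reuse the training columns so names and order match what the model saw
novo_df = pd.DataFrame(novo, columns=X.columns)

pred_df = regr.predict(novo_df)  # DataFrame input: no feature-name warning
pred_arr = regr.predict(novo)    # array input: may warn, same values

print(pred_df, pred_arr)
```

Both calls should return the same number, since the warning concerns only the missing column names and not the values. Building the DataFrame from X.columns is one way to avoid accidentally reordering features.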

Answered By: eschibli