How to predict a new instance using a Random Forest model?
Question:
I’m trying to build a Random Forest model using SKLearn:
import numpy as np
import seaborn as sns
data = sns.load_dataset('diamonds')
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
data["cut"] = le.fit_transform(data["cut"])
data["color"] = le.fit_transform(data["color"])
data["clarity"] = le.fit_transform(data["clarity"])
X = data[['carat', 'depth', 'table', 'x', 'y', 'z', 'clarity', 'cut', 'color']]
y = data[['price']]
And the model:
from sklearn.ensemble import RandomForestRegressor
regr = RandomForestRegressor(n_estimators = 50, max_depth = 10, random_state = 101)
regr.fit(X, y)
How to predict the price of a diamond with the following features:
novo = np.array([[0.3, 64.0, 55.0, 4.25, 4.28, 2.73, 2.0, 1.0, 6.0]])
novo
The following code:
regr.predict(novo)
Outputs the warning:
/usr/local/lib/python3.8/dist-packages/sklearn/base.py:450: UserWarning: X does not have valid feature names, but RandomForestRegressor was fitted with feature names
warnings.warn(
array([454.27157199])
What’s the right syntax for predict?
Answers:
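The warning indicates that your model was trained on a pandas DataFrame (which carries column names) but is now predicting from a NumPy array (which has none). The prediction itself is still computed: scikit-learn matches array columns to features by position, so as long as the values in the array are in the same order as the training columns, you can safely ignore the warning. To avoid the warning altogether, pass predict a DataFrame with the same column names, in the same order, as the one used in fit.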
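As a minimal sketch of the DataFrame approach: the snippet below wraps the new instance in a DataFrame that reuses the training column names, then checks that the result matches the plain-array prediction. To keep it runnable without downloading the diamonds dataset, it trains on random synthetic data, which is an assumption made purely for illustration; with the real data you would reuse the X and regr from the question.

```python
import warnings

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

feature_names = ['carat', 'depth', 'table', 'x', 'y', 'z', 'clarity', 'cut', 'color']

# Assumption: synthetic stand-in for the diamonds data, for illustration only.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((200, len(feature_names))), columns=feature_names)
y = rng.random(200) * 1000

regr = RandomForestRegressor(n_estimators=50, max_depth=10, random_state=101)
regr.fit(X, y)

# The new instance, with values in the same order as the training columns.
novo = np.array([[0.3, 64.0, 55.0, 4.25, 4.28, 2.73, 2.0, 1.0, 6.0]])

# Wrap it in a DataFrame that reuses the training column names.
novo_df = pd.DataFrame(novo, columns=X.columns)
pred_df = regr.predict(novo_df)

# Predicting from the bare array gives the same number; only the warning differs.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UserWarning)
    pred_arr = regr.predict(novo)

print(pred_df, pred_arr)
```

Both calls return the same value because the column order is identical; the DataFrame version simply lets scikit-learn verify the feature names instead of warning about them.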