# NotFittedError (instance is not fitted yet) after invoking cross_validate

## Question:

This is my minimal reproducible example:

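```
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_validate

x = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [6, 7]
])
y = [1, 0, 0, 1]
model = GaussianNB()
scores = cross_validate(model, x, y, cv=2, scoring="accuracy")
model.predict([[8, 9]])  # raises NotFittedError
```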

What I intended to do is instantiate a Gaussian Naive Bayes classifier and use `sklearn.model_selection.cross_validate` to cross-validate my model (I am using `cross_validate` instead of `cross_val_score` since in my real project I need precision, recall and F1 as well).

I have read in the docs that `cross_validate` does "evaluate metric(s) by cross-validation and also record fit/score times."

I expected that my `model` would have been fitted on the `x` (features) and `y` (labels) data, but when I invoke `model.predict(.)` I get:

sklearn.exceptions.NotFittedError: This GaussianNB instance is not fitted yet. Call ‘fit’ with appropriate arguments before using this estimator.

Of course, it tells me to invoke `model.fit(x, y)` before "using the estimator" (that is, before invoking `model.predict(.)`).

Shouldn't the model have been fitted `cv=2` times when I invoke `cross_validate(...)`?

## Answers:

A close look at the `cross_validate` documentation reveals that it includes an argument:

`return_estimator` : bool, default=False. Whether to return the estimators fitted on each split.

So, by default, it will not return any fitted estimator (hence it cannot be used to `predict`).
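To see why the original `model` object stays unfitted, it helps to know that `cross_validate` fits clones of the estimator on each split rather than the object you pass in. The following is a minimal sketch of that behaviour, reusing the question's data; the `model_is_fitted` variable is just an illustrative name, and `check_is_fitted` is used here only as a convenient way to probe the state:

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.utils.validation import check_is_fitted

x = np.array([[1, 2], [3, 4], [5, 6], [6, 7]])
y = [1, 0, 0, 1]

model = GaussianNB()
# The estimator is cloned for every split; `model` itself is never fitted.
cross_validate(model, x, y, cv=2, scoring="accuracy")

# check_is_fitted raises NotFittedError for an estimator without fitted attributes.
try:
    check_is_fitted(model)
    model_is_fitted = True
except NotFittedError:
    model_is_fitted = False

print(model_is_fitted)
```
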

In order to predict with the fitted estimator(s), you need to set the argument to `True`; but beware, you will **not** get a *single* fitted model, but a number of models equal to your `cv` parameter value (here 2):

```
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_validate

x = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [6, 7]
])
y = [1, 0, 0, 1]
model = GaussianNB()
scores = cross_validate(model, x, y, cv=2, scoring="accuracy", return_estimator=True)
print(scores)
# result (timings will vary):
# {'fit_time': array([0.00124454, 0.00095725]),
#  'score_time': array([0.00090432, 0.00054836]),
#  'estimator': [GaussianNB(), GaussianNB()],
#  'test_score': array([0.5, 0.5])}
```

So, in order to get predictions from each fitted model, you need:

```
scores['estimator'][0].predict([[8,9]])
# array([1])
scores['estimator'][1].predict([[8,9]])
# array([0])
```

This may look inconvenient, but it is like that by design: `cross_validate` is generally meant only to return the scores necessary for diagnosis and assessment, not to be used for fitting models which are to be used for predictions.
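If what you want is a single model for predictions, a common pattern (a sketch, not the only option) is to use `cross_validate` for assessment only and then call `fit` yourself on all the data. The example below assumes hypothetical toy data (two synthetic clusters generated with NumPy, not the question's four points) so that precision, recall and F1 are meaningful on each fold:

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: two clusters of 10 samples each (an assumption for illustration).
rng = np.random.default_rng(0)
x = np.vstack([
    rng.normal(0.0, 1.0, size=(10, 2)),
    rng.normal(5.0, 1.0, size=(10, 2)),
])
y = np.array([0] * 10 + [1] * 10)

model = GaussianNB()

# Step 1: assess the model. Passing several scorers yields keys such as
# 'test_accuracy' and 'test_f1'; `model` itself is left unfitted.
scores = cross_validate(
    model, x, y, cv=2,
    scoring=("accuracy", "precision", "recall", "f1"),
)
print(sorted(scores))

# Step 2: fit the final model on all the data, then predict with it.
model.fit(x, y)
prediction = model.predict([[8, 9]])
print(prediction)
```

The cross-validation scores then serve as an estimate of how a model trained this way performs, while the explicitly fitted `model` is the one used for predictions.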