AttributeError: 'Series' object has no attribute 'reshape'

Question:

I’m using scikit-learn’s linear regression algorithm.
While scaling the target variable Y with:

Ys = scaler.fit_transform(Y)

I got

ValueError: Expected 2D array, got 1D array instead:

After that I reshaped using:

Ys = scaler.fit_transform(Y.reshape(-1,1))

But got error again:

AttributeError: 'Series' object has no attribute 'reshape'

So I checked pandas.Series documentation page and it says:

reshape(*args, **kwargs) Deprecated since version 0.19.0.

Asked By: Hrvoje

Answers:

The solution was linked from the reshape method's entry on the documentation page.

Instead of Y.reshape(-1,1), you need to use:

Y.values.reshape(-1,1)
Answered By: Hrvoje

The solution is indeed to do:

Y.values.reshape(-1,1)

This extracts a numpy array with the values of your pandas Series object and then reshapes it to a 2D array.
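As a minimal sketch of what that reshape does, the shapes before and after can be checked directly. The sample values are made up for illustration, and the variable names values_1d and values_2d are not from the question:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical target values, used only for illustration.
Y = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

values_1d = Y.values                  # NumPy array, shape (5,)
values_2d = values_1d.reshape(-1, 1)  # single column, shape (5, 1)

scaler = StandardScaler()
Ys = scaler.fit_transform(values_2d)  # the scaler accepts the 2D array

print(values_1d.shape, values_2d.shape, Ys.shape)
```

In recent pandas versions, Y.to_numpy() should give the same array as Y.values here.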

The reason you need to do this is that pandas Series objects are one-dimensional by design. If you would like to stay within the pandas library, another solution is to convert the Series to a DataFrame, which is 2D:

import pandas as pd
from sklearn.preprocessing import StandardScaler

Y = pd.Series([1,2,3,1,2,3,4,32,2,3,42,3])

scaler = StandardScaler()

Ys = scaler.fit_transform(pd.DataFrame(Y))
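
If the scaled result should end up as a Series again, one possible follow-up is sketched below. It assumes default scikit-learn settings (fit_transform returning a NumPy array); the sample values and the name "target" are made up for illustration:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical target values, used only for illustration.
Y = pd.Series([1, 2, 3, 1, 2, 3], name="target")

Y_frame = Y.to_frame()                            # one-column DataFrame, shape (6, 1)
scaled = StandardScaler().fit_transform(Y_frame)  # 2D result, shape (6, 1)

# Flatten back to 1D and reuse the original index and name.
Ys = pd.Series(scaled.ravel(), index=Y.index, name=Y.name)

print(Y_frame.shape, scaled.shape, Ys.shape)
```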
Answered By: João Almeida

You cannot reshape a pandas Series directly, so you need to perform the operation on the underlying numpy array. As others have suggested, you can use y.values.reshape(-1, 1), but if you want to impress your friends, you can use:

y.values[Ellipsis, None]

Which is equivalent to:

y.values[..., None]

It basically means: keep all existing dimensions as they were, then add a new dimension at the end. Here’s a fully working example:

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

y = pd.Series(np.random.rand(5))
# Example contents of y (random, so your values will differ):
# 0    0.497165
# 1    0.818659
# 2    0.327064
# 3    0.772548
# 4    0.095715
# dtype: float64
scaler = StandardScaler()

scaler.fit_transform(y.values[Ellipsis, None])
# Output for the example values above:
# array([[-0.019],
#        [ 1.165],
#        [-0.645],
#        [ 0.995],
#        [-1.496]])
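
To check that the indexing trick and the reshape give the same result, here is a small sketch comparing them. The sample values are made up, and np.newaxis is included only because it is NumPy's alias for None:

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only for illustration.
y = pd.Series([0.5, 0.8, 0.3, 0.7, 0.1])

a = y.values[..., None]        # keep existing axes, append a new one
b = y.values[:, np.newaxis]    # np.newaxis is an alias for None
c = y.values.reshape(-1, 1)    # explicit reshape to one column

print(a.shape, b.shape, c.shape)
print(np.array_equal(a, b) and np.array_equal(a, c))
```

All three should have shape (5, 1) and hold identical values, so the choice between them is mostly a matter of readability.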
Answered By: Nicolas Gervais

Converting the Series to a DataFrame before passing it to MinMaxScaler worked on my end.

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# y is the pandas Series holding the target values
scaler = MinMaxScaler()

Y = scaler.fit_transform(pd.DataFrame(y))
Answered By: Frederico23