ValueError: operands could not be broadcast together with shapes – inverse_transform- Python
Question:
I know the ValueError question has been asked many times. I am still struggling to find an answer because I am using inverse_transform in my code.
Say I have an array a
a.shape
> (100,20)
and another array b
b.shape
> (100,3)
When I concatenate them with np.concatenate,
hat = np.concatenate((a, b), axis=1)
Now the shape of hat is
hat.shape
> (100,23)
After this, I tried to do this,
inversed_hat = scaler.inverse_transform(hat)
When I do this, I am getting an error:
ValueError: operands could not be broadcast together with shapes (100,23) (25,) (100,23)
Is this broadcast error coming from inverse_transform? Any suggestion will be helpful. Thanks in advance!
Answers:
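Although you didn't specify, I'm assuming you are using inverse_transform() from one of scikit-learn's scalers, such as StandardScaler. You need to fit the scaler first, on data with the same number of columns that you later pass to inverse_transform(). The example below uses MinMaxScaler, but the same applies to StandardScaler.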
import numpy as np
from sklearn.preprocessing import MinMaxScaler
In [1]: arr_a = np.random.randn(5*3).reshape((5, 3))
In [2]: arr_b = np.random.randn(5*2).reshape((5, 2))
In [3]: arr = np.concatenate((arr_a, arr_b), axis=1)
In [4]: scaler = MinMaxScaler(feature_range=(0, 1)).fit(arr)
In [5]: scaler.inverse_transform(arr)
Out[5]:
array([[ 0.19981115, 0.34855509, -1.02999482, -1.61848816, -0.26005923],
[-0.81813499, 0.09873672, 1.53824716, -0.61643731, -0.70210801],
[-0.45077786, 0.31584348, 0.98219019, -1.51364126, 0.69791054],
[ 0.43664741, -0.16763207, -0.26148908, -2.13395823, 0.48079204],
[-0.37367434, -0.16067958, -3.20451107, -0.76465428, 1.09761543]])
In [6]: new_arr = scaler.inverse_transform(arr)
In [7]: new_arr.shape == arr.shape
Out[7]: True
It seems you are using a pre-fit scaler object from sklearn.preprocessing.
If so, the data used to fit it had shape (x, 25), whereas the array you are passing has shape (x, 23), and that mismatch is the reason you are getting this error.
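One way to check this, as a minimal sketch: compare the length of the scaler's fitted per-column statistics with the number of columns in the array you pass in. The 25-column training array here is random stand-in data, since the data the scaler was actually fitted on was not shown, and MinMaxScaler is only an assumption about which scaler was used.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Stand-in for the (unknown) data the scaler was originally fitted on: 25 columns.
train = rng.normal(size=(100, 25))
scaler = MinMaxScaler().fit(train)

# Stand-in for the concatenated array from the question: 20 + 3 = 23 columns.
hat = np.concatenate(
    (rng.normal(size=(100, 20)), rng.normal(size=(100, 3))), axis=1
)

# scale_ holds one entry per column seen during fit().
n_fitted = scaler.scale_.shape[0]
n_given = hat.shape[1]
print("columns seen at fit time:", n_fitted)
print("columns passed now:", n_given)

# With mismatched column counts, inverse_transform raises a ValueError.
try:
    scaler.inverse_transform(hat)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

If the two counts differ, the scaler and the array do not belong together, whatever the row count is.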
The problem here is that the scaler holds the statistics of your 25-column df, but you have since changed the df to 23 columns, so inverse_transform cannot be applied to it.
To fix the problem, fit the scaler on the original 23-column dataframe, and then call inverse_transform on your desired 23-column dataframe.
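A rough sketch of that fix, assuming the 23 columns being inverse-transformed are the same 23 columns the scaler was fitted on. The array named original is hypothetical random data standing in for the real dataframe, and MinMaxScaler is again an assumption.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Hypothetical 23-column original data (20 + 3 columns, as in the question).
original = rng.normal(size=(100, 23))

# Fit on the same 23 columns that will later be inverse-transformed.
scaler = MinMaxScaler(feature_range=(0, 1)).fit(original)
scaled = scaler.transform(original)

# Rebuild hat from two pieces that live in the scaled space.
a, b = scaled[:, :20], scaled[:, 20:]
hat = np.concatenate((a, b), axis=1)

# The column counts now match, so no broadcast error occurs.
inversed_hat = scaler.inverse_transform(hat)
round_trip_ok = np.allclose(inversed_hat, original)
print(inversed_hat.shape, round_trip_ok)
```

The column order matters as well as the count: each column of hat is rescaled with the statistics of the column at the same position in the fitted data.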
More info:
The scaler object keeps track of the information needed to perform the inverse transformation. When you fit a scaler to a dataset using the fit() method, the scaler computes the statistics of the data (such as mean and variance for StandardScaler, or minimum and maximum for MinMaxScaler) and stores them in its internal state. inverse_transform applies these per-column statistics, so its input must have the same number of columns as the data passed to fit().
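To make the stored state concrete, here is a small sketch that inspects the fitted attributes on hypothetical random data with 23 columns. It also checks that, for a StandardScaler with default settings, inverse_transform matches multiplying by scale_ and adding mean_.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 23))  # hypothetical data, 23 columns

std = StandardScaler().fit(X)
mm = MinMaxScaler().fit(X)

# Each fitted attribute has one entry per column of the training data.
print(std.mean_.shape, std.var_.shape)
print(mm.data_min_.shape, mm.data_max_.shape)

# Undo the standardization by hand and compare with inverse_transform.
X_scaled = std.transform(X)
manual = X_scaled * std.scale_ + std.mean_
same = np.allclose(manual, std.inverse_transform(X_scaled))
print(same)
```

Because these arrays have length 23, NumPy can only broadcast them against an input that also has 23 columns, which is where the "operands could not be broadcast together" message comes from.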