lambda function to scale column in pandas dataframe returns: "'float' object has no attribute 'min'"
Question:
I am just getting started in Python and Machine Learning and have encountered an issue which I haven’t been able to fix myself or with any other online resource.
I am trying to scale a column in a pandas dataframe using a lambda function in the following way:
X['col1'] = X['col1'].apply(lambda x: (x - x.min()) / (x.max() - x.min()))
and get the following error message:
'float' object has no attribute 'min'
I have tried to convert the data type into integer and the following error is returned:
'int' object has no attribute 'min'
I believe I am getting something pretty basic wrong and hope someone can point me in the right direction.
Answers:
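I think apply is not necessary here, because a faster vectorized solution exists. Series.apply passes each element to the lambda one at a time, so x is a single float or int, which has no min method. Replace x with the column X['col1']: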
import pandas as pd
X = pd.DataFrame({'col1': [100, 10, 1, 20, 10, -20, 200]})
# min() and max() are computed once over the whole column, then broadcast
X['col2'] = (X['col1'] - X['col1'].min()) / (X['col1'].max() - X['col1'].min())
print(X)
   col1      col2
0   100  0.545455
1    10  0.136364
2     1  0.095455
3    20  0.181818
4    10  0.136364
5   -20  0.000000
6   200  1.000000
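To see why the original call fails, the sketch below reproduces the error and contrasts it with DataFrame.apply, which hands the lambda a whole column instead of a single value. The variable names error_message and scaled are only illustrative, and the exact wording of the message may vary with the column dtype.

```python
import pandas as pd

X = pd.DataFrame({'col1': [100, 10, 1, 20, 10, -20, 200]})

# Series.apply calls the lambda once per element, so x is a plain scalar
# and x.min() raises AttributeError, as in the question.
try:
    X['col1'].apply(lambda x: (x - x.min()) / (x.max() - x.min()))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# DataFrame.apply (default axis=0) passes each column as a Series,
# so the same lambda works when the column is selected with double brackets.
scaled = X[['col1']].apply(lambda x: (x - x.min()) / (x.max() - x.min()))
print(scaled)
```

The double brackets in X[['col1']] keep the selection a one-column DataFrame, which is what makes the lambda receive a Series.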
As @meW pointed out in the comments, another solution is to use MinMaxScaler:
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
# fit_transform expects 2-D input, hence the double brackets in X[['col1']]
X['col2'] = min_max_scaler.fit_transform(X[['col1']])
print(X)
   col1      col2
0   100  0.545455
1    10  0.136364
2     1  0.095455
3    20  0.181818
4    10  0.136364
5   -20  0.000000
6   200  1.000000
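MinMaxScaler works on two-dimensional input, so it can scale several columns in one call. The same column-wise behaviour can be sketched in plain pandas, since subtracting a Series of per-column minimums from a DataFrame aligns on the column labels. The column col3 and the names cols, col_min, and col_max are made up for this illustration.

```python
import pandas as pd

# 'col3' is a hypothetical second column, added only for illustration.
X = pd.DataFrame({
    'col1': [100, 10, 1, 20, 10, -20, 200],
    'col3': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
})

cols = ['col1', 'col3']
col_min = X[cols].min()  # Series indexed by column name
col_max = X[cols].max()

# DataFrame minus Series aligns on column labels, so each column
# is scaled with its own minimum and range.
scaled = (X[cols] - col_min) / (col_max - col_min)
print(scaled)
```

Note that with this plain formula a constant column has a range of zero and would come out as NaN, so a guard may be needed for real data.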
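Check the code below.
With DataFrame.apply, an if/else condition on the column name is needed so that only the target column is scaled (with axis=0 the lambda receives each column in turn).
Alternatively, use the assign method, which gives the same result.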
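import pandas as pd
df = pd.DataFrame({'salary': [10, 29, 76, 54, 32]})
# x is a whole column (Series) here, so x.min() and x.max() are valid
df.apply(lambda x: (x - x.min()) / (x.max() - x.min()) if x.name == 'salary' else x, axis=0)
# inside assign, x is the DataFrame itself
df.assign(salary=lambda x: (x['salary'] - x['salary'].min()) / (x['salary'].max() - x['salary'].min()))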
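One detail worth keeping in mind is that both apply and assign return a new DataFrame and leave df unchanged, so the result has to be stored to be used. A minimal sketch of that, where min_max, via_apply, and via_assign are illustrative names not taken from the answer:

```python
import pandas as pd

df = pd.DataFrame({'salary': [10, 29, 76, 54, 32]})

def min_max(s):
    """Scale a Series to the range [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())

# Both calls build a new DataFrame; keep the return value.
via_apply = df.apply(lambda x: min_max(x) if x.name == 'salary' else x, axis=0)
via_assign = df.assign(salary=lambda x: min_max(x['salary']))

print(via_apply)
print(via_assign)
print(df)  # the original frame still holds the unscaled salaries
```

To overwrite the column in place instead, the result can be assigned back, for example df['salary'] = min_max(df['salary']).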