Sklearn's MinMaxScaler only returns zeros

Question:

I am trying to scale some numbers to a range of 0 – 1 using preprocessing from sklearn. This is what I did:

from sklearn import preprocessing

data = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
data_scaled = min_max_scaler.fit_transform([data])
print(data_scaled)

But data_scaled only contains zeros. What am I doing wrong?

Asked By: Gizmo


Answers:

You’re putting your data into a list for some reason, but you shouldn’t:

data_scaled = min_max_scaler.fit_transform(data)
Answered By: John Zwinck

This is because data is an int32 or int64 and MinMaxScaler needs a float. Try this:

import numpy as np
from sklearn import preprocessing

data = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
data_scaled = min_max_scaler.fit_transform([np.float32(data)])
print(data_scaled)
Answered By: Cslayer20

I had the same problem when I tried scaling with MinMaxScaler from sklearn.preprocessing. The scaler returned zeros when I passed the data as a list (or NumPy array) of shape [1, n], which looks like the following:

data = [[44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]]

I changed the shape of the array to [n, 1]. In your case it would look like the following:

data = [[44.645], 
        [44.055], 
        [44.540], 
        [44.040], 
        [43.975], 
        [43.490], 
        [42.040], 
        [42.600], 
        [42.460], 
        [41.405]]

Then MinMaxScaler worked properly.
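As a rough sketch of the same idea without typing the nested list by hand, the flat list can be reshaped into a single column before fitting. The variable names (values, column, scaled_column) are illustrative and not from the original post:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

values = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]

# reshape(-1, 1) turns the flat list into one column: n samples, 1 feature
column = np.array(values).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_column = scaler.fit_transform(column)

print(scaled_column.shape)    # (10, 1)
print(scaled_column.ravel())  # flattened back to 1-D for display
```

With this layout the largest input (44.645) should map to roughly 1 and the smallest (41.405) to roughly 0.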

Answered By: Antonina

import numpy as np
from sklearn import preprocessing

data = np.array([44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405])
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
# Scale as a single column of shape (10, 1), then restore the (1, 10) row layout
data_scaled = min_max_scaler.fit_transform(data.reshape(-1, 1))
data = data_scaled.reshape(1, -1)
print(data)

The reason is that when you apply the fit_transform method of a StandardScaler object to an array of shape (1, n), you get all zeros: scalers work column by column, and here each column holds a single number, so you subtract from each number its own mean (which equals the number itself) and divide by its standard deviation. The same holds for MinMaxScaler, where each column's minimum equals its only value. To scale the array correctly, convert it to shape (n, 1).
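The per-column reasoning above can be checked with plain NumPy. The helper min_max_per_column below is hypothetical and is not scikit-learn's implementation; in particular, treating a zero range as 1 is an assumption made here so that constant columns come out as 0 instead of dividing by zero:

```python
import numpy as np

def min_max_per_column(x):
    """Hypothetical helper: scale each column of a 2-D array to [0, 1]."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    # Assumption: a column with a single distinct value has range 0;
    # use 1 instead so the result is 0 rather than a division by zero.
    col_range[col_range == 0.0] = 1.0
    return (x - col_min) / col_range

values = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]

as_row = np.array(values).reshape(1, -1)     # shape (1, 10): ten columns, one value each
as_column = np.array(values).reshape(-1, 1)  # shape (10, 1): one column, ten values

row_result = min_max_per_column(as_row)
column_result = min_max_per_column(as_column)

print(row_result)             # every entry is 0
print(column_result.ravel())  # values spread between 0 and 1
```

In the row layout every column's minimum is its only value, so the subtraction alone already produces zeros.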

See the correct answer of this link :

Answered By: alyssaeliyah

The right answer has already been given, but I solved my problem using numpy.vstack(<your array>). In your case you can write it like this:

import numpy as np
from sklearn import preprocessing

data = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
# np.vstack stacks the values into a single column of shape (10, 1)
data_scaled = min_max_scaler.fit_transform(np.vstack(data))
print(data_scaled)
# To get back to the original flat layout, use hstack
data_scaled = np.hstack(data_scaled)


Answered By: Lucas