Set maximum value (upper bound) in pandas DataFrame

Question:

I’m trying to set a maximum value of a pandas DataFrame column. For example:

import pandas as pd

my_dict = {'a': [10, 12, 15, 17, 19, 20]}
df = pd.DataFrame(my_dict)

df['a'].set_max(15)

would yield:

    a
0   10
1   12
2   15
3   15
4   15
5   15

But it doesn’t.

There are a million solutions to find the maximum value, but nothing to set the maximum value… at least that I can find.

I could iterate through the list, but I suspect there is a faster way to do it with pandas. My lists will be significantly longer, so I expect iteration to take considerably more time. I’d also like the solution to handle NaN.

Asked By: elPastor


Answers:

I suppose you can do:

maxVal = 15
# where keeps values where the condition holds and replaces the rest with maxVal;
# it returns a new Series and leaves df unchanged
df['a'].where(df['a'] <= maxVal, maxVal)

#0    10
#1    12
#2    15
#3    15
#4    15
#5    15
#Name: a, dtype: int64

Or:

# use .loc rather than chained indexing so the assignment reliably modifies df
df.loc[df['a'] >= maxVal, 'a'] = maxVal
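
One caveat regarding the NaN requirement from the question: comparisons against NaN evaluate to False, so the where form above replaces missing values with the cap, while mask with the inverted condition leaves them alone. A minimal sketch, assuming a hypothetical Series `s` with one missing value (not part of the original data):

```python
import numpy as np
import pandas as pd

max_val = 15
# Hypothetical data with a missing value, for illustration only.
s = pd.Series([10, np.nan, 17, 20], name='a')

# where keeps values where the condition is True; NaN <= 15 is False,
# so the NaN is replaced by max_val as well.
capped_where = s.where(s <= max_val, max_val)

# mask replaces values where the condition is True; NaN > 15 is False,
# so the NaN is left untouched.
capped_mask = s.mask(s > max_val, max_val)

print(capped_where.tolist())
print(capped_mask.tolist())
```

Which behavior is preferable depends on whether missing values should stay missing after capping.
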
Answered By: Psidom

You can use clip.

Apply to all columns of the data frame:

df.clip(upper=15)

Otherwise apply to selected columns as seen here:

df.clip(upper=pd.Series({'a': 15}), axis=1)
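
For a single column, Series.clip can also be called directly and the result assigned back. A minimal sketch, assuming a hypothetical frame with one missing value to check the NaN requirement from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value, for illustration only.
df = pd.DataFrame({'a': [10, 12, np.nan, 17, 19, 20]})

# clip returns a new Series; assign it back to update the column.
# Missing values are passed through unchanged.
df['a'] = df['a'].clip(upper=15)

print(df)
```
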
Answered By: tommy.carstensen

numpy.clip is a good, fast alternative.

df

    a
0  10
1  12
2  15
3  17
4  19
5  20

np.clip(df['a'], a_max=15, a_min=None)

0    10
1    12
2    15
3    15
4    15
5    15
Name: a, dtype: int64

# Or,
np.clip(df['a'].to_numpy(), a_max=15, a_min=None)
# array([10, 12, 15, 15, 15, 15])

From v0.21 onwards, you can also use DataFrame.clip_upper.

Note
This method (along with clip_lower) was deprecated in v0.24 and removed in pandas 1.0. On current versions, use clip(upper=...) or clip(lower=...) instead.

df.clip_upper(15)
# Or, for a specific column,
df['a'].clip_upper(15)

    a
0  10
1  12
2  15
3  15
4  15
5  15

In a similar vein, if you only want to set the lower bound, use DataFrame.clip_lower. These methods are also available on Series objects.
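
On pandas versions where clip_upper and clip_lower are unavailable, the same results can be obtained with clip and its upper and lower keywords. A minimal sketch using the data from the question:

```python
import pandas as pd

df = pd.DataFrame({'a': [10, 12, 15, 17, 19, 20]})

# Counterpart of df.clip_upper(15): cap every value at 15.
capped = df.clip(upper=15)

# Counterpart of df['a'].clip_lower(15): raise values below 15 up to 15.
floored = df['a'].clip(lower=15)

print(capped['a'].tolist())
print(floored.tolist())
```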

Answered By: cs95