Uncomfortable output of mode() in pandas DataFrame

Question:

I have a dataframe with several columns (the features).

>>> print(df)

   col1  col2
a     1     1
b     2     2
c     3     3
d     3     2

I would like to compute the mode of one of them. This is what happens:

>>> print(df['col1'].mode())

0    3
dtype: int64

I would like to output simply the value 3.
This behaviour seems quite strange, considering that the following very similar code works:

>>> print(df['col1'].mean())

2.25

So two questions: why does this happen? How can I obtain the pure mode value as it happens for the mean?

Asked By: Bernheart

||

Answers:

Because Series.mode() can return multiple values.

Consider the following DataFrame:

In [77]: df
Out[77]:
   col1  col2
a     1     1
b     2     2
c     3     3
d     3     2
e     2     3

In [78]: df['col1'].mode()
Out[78]:
0    2
1    3
dtype: int64

From docstring:

Empty if nothing occurs at least 2 times. Always returns Series
even if only one value.

If you want to choose the first value:

In [83]: df['col1'].mode().iloc[0]
Out[83]: 2

In [84]: df['col1'].mode()[0]
Out[84]: 2
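
If you need this in several places, the two lookups above can be wrapped in a small helper. This is only a sketch: the name first_mode is made up for illustration, and returning None when mode() comes back empty is my own assumption about what you would want in that case.

```python
import pandas as pd

# Same data as the five-row DataFrame shown above.
df = pd.DataFrame(
    {"col1": [1, 2, 3, 3, 2], "col2": [1, 2, 3, 2, 3]},
    index=list("abcde"),
)


def first_mode(series):
    """Hypothetical helper: first mode as a scalar, or None if mode() is empty."""
    modes = series.mode()  # always a Series, even for a single mode
    if modes.empty:
        return None
    return modes.iloc[0]  # positional access, independent of the index labels


result = first_mode(df["col1"])
print(result)
```

Using .iloc[0] rather than [0] makes it explicit that the access is positional, which is probably the safer habit.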

I agree that it's too cumbersome:

df['col1'].mode().values[0]

Answered By: alpha

mode() will return all values that tie for the most frequent value.

In order to support that functionality, it must return a collection, which takes the form of a DataFrame or Series.

For example, if you had a series:

[2, 2, 3, 3, 5, 5, 6]

Then the most frequent values occur twice. The result would then be the series [2, 3, 5], since each of these occurs twice.

If you want to collapse this into a single value, you can access the first value, compute the max(), min(), or whatever makes most sense for your application.
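
As a rough illustration of those options, here is the example series from above collapsed in three different ways. The variable names are mine; which reduction is appropriate depends on your application.

```python
import pandas as pd

s = pd.Series([2, 2, 3, 3, 5, 5, 6])

modes = s.mode()          # Series holding every value tied for most frequent
print(modes.tolist())

first = modes.iloc[0]     # first entry of the result
smallest = modes.min()    # smallest of the tied values
largest = modes.max()     # largest of the tied values
print(first, smallest, largest)
```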

Answered By: Ryan

A series has exactly one mean(), but it can have more than one mode().

For example, in the series

[2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 6, 7, 8]

the modes are 2, 3 and 4.

Because several values may be returned, the output has to be an indexed Series.

Answered By: Fatih