How to groupby().transform() to find mode in dataframe?

Question

I have a dataframe like:

lst = [["High", "A"], ["High", "A"], ["High", "B"],["Medium", "A"], ["Medium", "B"], ["Medium", "C"]]

df = pd.DataFrame(lst, columns =["Class", "Grade"])

I need to get the mode (majority vote) of "Grade" in each "Class".
If it’s a tie vote, assign "x".

Below is what I expect to get:

Class	Grade	Majority_vote
High	A	A
High	A	A
High	B	A
Medium	A	x
Medium	B	x
Medium	C	x

This is my code:

df['majority_vote'] = df.groupby(['Class'])['Grade'].transform(lambda x: x.mode()[0])

I think the code will return ‘nan’ if it’s a tie vote. Then, I will change ‘nan’ to ‘x’ later.

However, what I get is below:

Class	Grade	Majority_vote
High	A	A
High	A	A
High	B	A
Medium	A	A
Medium	B	A
Medium	C	A

At class "Medium", the code returns the 1st element ("A") instead of ‘nan’.

Any other method is appreciated.
Could you please help me? Thank you in advance.

Asked By: Meeow

||

Source

Answer 1

The issue with using x.mode()[0] is that pd.Series(['A', 'B', 'C']).mode() evaluates to ['A', 'B', 'C']. Meanwhile, pd.Series(['A', 'A', 'B']).mode() evaluates to ['A'].

Here is a function that will return the mode (if there is only one) and "x" if there is a tie (i.e., multiple modes).

import pandas as pd
lst = [["High", "A"], ["High", "A"], ["High", "B"],["Medium", "A"], ["Medium", "B"], ["Medium", "C"]]
df = pd.DataFrame(lst, columns=["Class", "Grade"])

def get_mode_or_x(series):
    mode = series.mode()
    if mode.size == 1:
        return mode[0]
    return "x"

df.loc[:, "majority_vote"] = df.groupby("Class")["Grade"].transform(get_mode_or_x)

index	Class	Grade	majority_vote
0	High	A	A
1	High	A	A
2	High	B	A
3	Medium	A	x
4	Medium	B	x
5	Medium	C	x

Answered By: jakub

How to groupby().transform() to find mode in dataframe?

Question:

Answers: