AttributeError: 'list' object has no attribute 'max'

Question:

I have a DF as such where I want to get the MAX number of Votes per person for an election year.

However, I also want to sum those per-year maximums for each person, so Mark Smith would have 70 votes (30 + 40) and John Key would have 80 votes (40 + 40). I’ve been trying to use the following code to get the max per group, but I’m getting this error: AttributeError: ‘list’ object has no attribute ‘max’
Can you point me in the right direction?
Thank you!

DF.loc[DF.groupby(['name', 'election_year'],['votes'].max())]

votes     name              election_year
20        Mark Smith        2020
30        Mark Smith        2020
40        Mark Smith        2022
20        John Key          2000
40        John Key          2000
40        John Key          2022
Asked By: sundaynightlive

Answers:

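The problem is the groupby syntax: the closing parenthesis is misplaced, so ['votes'].max() is evaluated as an argument to groupby instead of being applied to its result. Close groupby(...) first, then select ['votes'] and call .max():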

out = df.groupby(['name', 'election_year'])['votes'].max().reset_index()
# or, to keep the original rows and their index (this is the output shown below)
out = df.loc[df.groupby(['name', 'election_year'])['votes'].idxmax()].sort_index()
print(out)

   votes        name  election_year
1     30  Mark Smith           2020
2     40  Mark Smith           2022
4     40    John Key           2000
5     40    John Key           2022

To sum these per-year maximums for each person, you can do

out = out.groupby(['name'], as_index=False)['votes'].sum()
print(out)

         name  votes
0    John Key     80
1  Mark Smith     70
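
To check both steps end to end, here is a minimal self-contained sketch. It assumes the sample rows from the question are rebuilt by hand into a DataFrame; the names yearly_max and totals are illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data copied by hand from the question (an assumption about its exact shape)
df = pd.DataFrame({
    "votes": [20, 30, 40, 20, 40, 40],
    "name": ["Mark Smith", "Mark Smith", "Mark Smith",
             "John Key", "John Key", "John Key"],
    "election_year": [2020, 2020, 2022, 2000, 2000, 2022],
})

# Step 1: highest vote count per person per election year
yearly_max = df.groupby(["name", "election_year"], as_index=False)["votes"].max()

# Step 2: add up those yearly maximums per person
totals = yearly_max.groupby("name")["votes"].sum()

print(totals)
```

With this sample data, totals holds 80 for John Key and 70 for Mark Smith, matching the figures the question asks for.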
Answered By: Ynjxsjmh

The error occurs because ['votes'].max() calls .max() on a plain Python list, and lists have no such method. .max() has to be called on a pandas object instead, here the grouped ‘votes’ column: DF.groupby(['name', 'election_year'])['votes'].max().
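
The failure can be reproduced without pandas at all. A minimal sketch in plain Python, where the variable name message is only for illustration:

```python
# ['votes'] is an ordinary Python list, so asking it for .max fails
# before pandas ever gets to run the groupby call.
try:
    ['votes'].max()
except AttributeError as exc:
    message = str(exc)

print(message)
```

The printed message is the same text as the error in the question, which shows the problem lies in where the parenthesis closes, not in the data.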

To get the maximum number of votes per person for an election year and then the per-person total of those maximums, you can use the groupby() method twice. First, group the data by ‘name’ and ‘election_year’ and calculate the maximum number of votes for each group. Next, group that result by ‘name’ and sum the ‘votes’ column, which gives 70 for Mark Smith and 80 for John Key. (Summing the raw ‘votes’ column instead would count every row and give 90 and 100.) Finally, merge the two results together on ‘name’.

Here’s example code that does that:

import pandas as pd

# Group by 'name' and 'election_year' and take the maximum votes in each group
max_votes = DF.groupby(['name', 'election_year'], as_index=False)['votes'].max()

# Group the per-year maximums by 'name' and sum them
total_votes = max_votes.groupby('name', as_index=False)['votes'].sum()
total_votes = total_votes.rename(columns={'votes': 'total_votes'})

# Merge the two results on 'name', the only column they share
result = pd.merge(max_votes, total_votes, on='name')

# Print the result
print(result)

This should give you a dataframe with the maximum number of votes for each person in each election year, alongside each person’s total of those maximums.

Answered By: Moiz Ali Syed