AttributeError: 'list' object has no attribute 'max'
Question:
I have a DataFrame like the one below, and I want to get the maximum number of votes per person for each election year.
I also want to sum those maximums for each person, so Mark Smith would have 70 votes and John Key would have 80 votes. I’ve been trying to use the following code to get the maximum per group, but I’m getting the following error: AttributeError: ‘list’ object has no attribute ‘max’
Can you point me in the right direction?
Thank you!
DF.loc[DF.groupby(['name', 'election_year'],['votes'].max())]
votes        name  election_year
   20  Mark Smith           2020
   30  Mark Smith           2020
   40  Mark Smith           2022
   20    John Key           2000
   40    John Key           2000
   40    John Key           2022
Answers:
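The problem is the groupby syntax: the closing parenthesis is misplaced, so ['votes'].max() is called on a plain list instead of on the grouped column. Close groupby(...) first, then select the column: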
out = df.groupby(['name', 'election_year'])['votes'].max().reset_index()
# or to keep the index
out = df.loc[df.groupby(['name', 'election_year'])['votes'].idxmax()].sort_index()
print(out)
   votes        name  election_year
1     30  Mark Smith           2020
2     40  Mark Smith           2022
4     40    John Key           2000
5     40    John Key           2022
To sum those per-year maximums for each person, you can do:
out = out.groupby(['name'], as_index=False)['votes'].sum()
print(out)
         name  votes
0    John Key     80
1  Mark Smith     70
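For anyone who wants to run the two steps above end to end, here is a minimal, self-contained sketch. The DataFrame is rebuilt by hand from the sample rows in the question; the variable names max_per_year and totals are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "votes": [20, 30, 40, 20, 40, 40],
    "name": ["Mark Smith", "Mark Smith", "Mark Smith",
             "John Key", "John Key", "John Key"],
    "election_year": [2020, 2020, 2022, 2000, 2000, 2022],
})

# Step 1: highest vote count for each (name, election_year) pair
max_per_year = df.groupby(["name", "election_year"])["votes"].max().reset_index()

# Step 2: sum those per-year maximums for each person
totals = max_per_year.groupby("name", as_index=False)["votes"].sum()
print(totals)
```

With the sample data this yields 80 for John Key and 70 for Mark Smith, matching the figures in the question.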
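The error occurs because the closing parenthesis of groupby() is in the wrong place: in your code, ['votes'].max() is evaluated on its own, and ['votes'] is a plain Python list, which has no max() method. max() has to be called on the grouped column instead, after the groupby(...) call has been closed.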
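To see where the message comes from, the offending fragment can be reproduced without pandas at all. This is just an illustrative sketch; the variable name message is made up for the demonstration.

```python
# ['votes'] is an ordinary Python list, so attribute lookup for .max fails
try:
    ['votes'].max()
except AttributeError as exc:
    message = str(exc)

print(message)  # 'list' object has no attribute 'max'
```

In the question's code this expression sits inside the groupby(...) call as its second argument, so Python raises the error before pandas ever gets to group anything.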
To get the maximum number of votes per person for each election year and also the per-person total of those maximums, you can use the groupby() method twice. First, group the data by ‘name’ and ‘election_year’ and take the maximum of ‘votes’ in each group. Next, group that result by ‘name’ and sum the maximums (summing the original ‘votes’ column instead would give 90 and 100, not the 70 and 80 you expect). Finally, merge the two results on ‘name’.
Here’s example code that does that:
import pandas as pd
# Maximum votes for each (name, election_year) group, kept as regular columns
max_votes = DF.groupby(['name', 'election_year'], as_index=False)['votes'].max()
# Sum the per-year maximums for each person
total_votes = max_votes.groupby('name', as_index=False)['votes'].sum().rename(columns={'votes': 'total_votes'})
# Attach each person's total to their per-year rows
result = max_votes.merge(total_votes, on='name')
# Print the result
print(result)
This should give you a DataFrame with the maximum number of votes for each person in each election year, plus a total_votes column holding the sum of those maximums for each person.
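If you would rather avoid the merge, one possible variant is to broadcast the per-person total back onto the per-year rows with transform. This is a sketch under the same assumptions as before: the data is retyped from the question, and the names per_year and total_votes are illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "votes": [20, 30, 40, 20, 40, 40],
    "name": ["Mark Smith", "Mark Smith", "Mark Smith",
             "John Key", "John Key", "John Key"],
    "election_year": [2020, 2020, 2022, 2000, 2000, 2022],
})

# One row per (name, election_year) holding the maximum votes
per_year = df.groupby(["name", "election_year"], as_index=False)["votes"].max()

# transform("sum") returns a value for every row, so each row
# receives the total of the per-year maximums for its person
per_year["total_votes"] = per_year.groupby("name")["votes"].transform("sum")
print(per_year)
```

The result has one row per person and election year, with the same total repeated on each of a person's rows.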