How to avoid graphing duplicate rows in Pandas .plot()

Question

I have a Pandas DataFrame like the one below, with many rows for each of the 50 US states:

State Name	School District	Schools Per District
Alabama	Alabama District 1	21
Alabama	Alabama District 2	5
Alaska	Alaska District 1	3
Alaska	Alaska District 2	4

I want to use Pandas to graph the number of schools per state, and so far I have the following code:

school_data.plot(kind='bar', 
                     x="State Name", 
                     xlabel="State",
                     y="Schools Per District",
                     ylabel="Number of Schools",
                     rot=0,
                     width=10,
                     figsize=(15, 5),
                     title="Number of Schools per District vs. US State"
                     );

However, I believe the resulting graph draws one bar for every school district instead of summing the districts within each state, so it shows far too many bars.

How would I fix this so that there are only 50 bars on the graph, where each bar represents the total number of schools per state?

Asked By: FJJ

Answer 1

You can group by State Name, sum Schools Per District within each group, and then create the bar chart from the aggregated data:

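# Group the rows by 'State Name' and sum 'Schools Per District' within each state;
# reset_index() turns 'State Name' back into a regular column for plotting
grouped_data = school_data.groupby('State Name')['Schools Per District'].sum().reset_index()

# Plot the aggregated data: one bar per state
ax = grouped_data.plot(kind='bar',
                       x='State Name',
                       xlabel='State',
                       y='Schools Per District',
                       ylabel='Number of Schools',
                       rot=90,      # vertical labels so 50 state names do not overlap
                       width=0.8,   # fraction of each slot; values above 1 make bars overlap
                       figsize=(15, 5),
                       title='Total Number of Schools per US State'
                      )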

The resulting graph has only 50 bars, each representing the total number of schools in one state.
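To check the aggregation before plotting, here is a minimal sketch that rebuilds the four sample rows from the question and sums them per state. The helper names total_schools_by_state and plot_totals are hypothetical and used only for illustration. The sketch assumes a pandas version that accepts xlabel and ylabel in .plot(), as the code in the question already does.

```python
import pandas as pd

# Sample rows copied from the question; the real DataFrame would cover all 50 states.
school_data = pd.DataFrame({
    "State Name": ["Alabama", "Alabama", "Alaska", "Alaska"],
    "School District": ["Alabama District 1", "Alabama District 2",
                        "Alaska District 1", "Alaska District 2"],
    "Schools Per District": [21, 5, 3, 4],
})


def total_schools_by_state(df):
    """Return one row per state with its summed school count (hypothetical helper)."""
    # as_index=False keeps 'State Name' as a column, so no reset_index() is needed
    return df.groupby("State Name", as_index=False)["Schools Per District"].sum()


def plot_totals(totals):
    """Draw one bar per state; requires matplotlib (hypothetical helper)."""
    return totals.plot(
        kind="bar",
        x="State Name",
        y="Schools Per District",
        xlabel="State",
        ylabel="Number of Schools",
        rot=90,        # vertical labels so many state names stay readable
        legend=False,  # a single series does not need a legend
        figsize=(15, 5),
        title="Total Number of Schools per US State",
    )


totals = total_schools_by_state(school_data)
print(totals)
```

For the sample rows this gives two rows, Alabama with 26 schools and Alaska with 7. Calling plot_totals(totals) then draws one bar per row of the aggregated frame, so a full dataset would yield one bar per state.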

Answered By: Hasan Patel
