Having trouble creating a histogram with plt.hist, with the year on the x-axis in Python

Question:

Hi, I have a data frame containing students and their graduation years. Column 1 is the student name, such as Peter or Henry, and column 2 is the student’s graduation year, such as 2023 or 2024.

I tried to build a histogram to count the number of students in each graduation year, and display the year on the x-axis.

I tried this code:
```
import matplotlib.pyplot as plt
plt.figure()
plt.hist(df['Student Grad Year'])
```

But it doesn’t give the right result, and I’m not sure why the last two bars are connected. I also want to show the year value in the middle of each bar. Note that the values in the ‘Student Grad Year’ column are ints. Should they be converted to a datetime type first?


Asked By: Orchid9

||

Answers:

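plt.hist uses 10 equal-width bins by default, so the bin edges don’t line up with the integer years and the bars can end up unevenly spaced, with some touching. Since the data is discrete, you can use one bin per year that appears in the data set. Note that bins takes the bin boundaries rather than the bin centers, so the boundaries should sit at half years, e.g.: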

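import matplotlib.pyplot as plt
import numpy as np

plt.figure()
# Sorted unique years present in the data
years = np.unique(df['Student Grad Year'].values)
# Bin edges: half a year below the first year, then half a year above each year
bins = np.append(years[0] - 0.5, years + 0.5)
plt.hist(df['Student Grad Year'], bins=bins)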
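
To also get the year label in the middle of each bar, one option is to set the x ticks to the integer years. The sketch below is a self-contained variant of the snippet above and rests on a few assumptions: the sample data frame, the axis labels, and the output file name are all made up for illustration. It builds the edges with np.arange instead of np.unique, so a year with no students (2025 here) gets its own empty bin instead of being merged into a wider one. The int column is passed to hist as-is; no datetime conversion is needed for this.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's data frame.
df = pd.DataFrame({
    "Student Name": ["Peter", "Henry", "Anna", "Lea", "Omar", "Mia"],
    "Student Grad Year": [2023, 2023, 2024, 2026, 2026, 2026],
})

years = df["Student Grad Year"]

# One bin per calendar year between the first and last year in the data:
# edges at first - 0.5, first + 0.5, ..., last + 0.5.
first, last = int(years.min()), int(years.max())
bins = np.arange(first, last + 2) - 0.5

fig, ax = plt.subplots()
# counts is [2., 1., 0., 3.] for the sample data (2025 has no students).
counts, edges, _ = ax.hist(years, bins=bins, edgecolor="black")

# Each bar spans year - 0.5 to year + 0.5, so a tick at the integer year
# sits in the middle of its bar.
ax.set_xticks(np.arange(first, last + 1))
ax.set_xlabel("Graduation year")
ax.set_ylabel("Number of students")
fig.savefig("grad_year_hist.png")  # hypothetical output file name
```

If the years in the data are contiguous, the np.unique version gives the same edges, and adding plt.xticks(years) to it is enough to center the labels.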
Answered By: BernieD