How do I adjust the tick labels on a pandas histogram?

Question:

I am new to Python and plotting, and I am stuck trying to get the tick labels to appear under the bins.

My data is 5 rows for this example:

9.50
11.80
46.68
4.38
30.97

I added this to a data frame called df.

My code is:

    xLabels = ['0 to 15','15 to 30','30 to 45','45 to 60','60 to 75']

    histCurr = df.hist(grid=False, rwidth=0.75, bins=[0, 15, 30, 45, 60, 75], range=[0,75])

    histCurr = histCurr[0]
    for x in histCurr:

        x.spines['right'].set_visible(False)
        x.spines['top'].set_visible(False)
        x.spines['left'].set_visible(False)

        x.tick_params(axis="both", bottom="off", top="off", labelbottom="on", left="off", right="off", labelleft="off")
        x.set_xlim(0,75)
        x.set_xticklabels(xLabels, ha = "center")

The labels appear all squished left.


I have tried setting ha to right, left, and center, and none of these helped. I also tried passing a range to hist and setting an xlim, and that did not help either.

If I do not set xLabels (comment out the line where I have the x.set_xticklabels) and run this:

    labels = [item.get_text() for item in x.get_xticklabels()]
    labels

I get:

['0', '10', '20', '30', '40', '50', '60', '70', '80']

I found something online about changing the items in the list to the bin names but that was not exactly what I wanted either.

I would like the label for the bin to appear under the bin itself. Thank you for the help in advance!

Update:
I changed my code to this to help with the percentages over the bars and think I figured it out:
(source: https://towardsdatascience.com/advanced-histogram-using-python-bceae288e715)

    import numpy as np
    import matplotlib.pyplot as plt

    currDT = df[colNames[currLoc]]
    fig, ax = plt.subplots(figsize=(8, 8))
    counts, bins, patches = ax.hist(currDT, rwidth=0.75, bins=[0, 15, 30, 45, 60, 75])
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)
    ax.spines['left'].set_visible(False)
    # Booleans replace the "on"/"off" strings, which recent Matplotlib releases no longer interpret as True/False
    ax.tick_params(axis="both", bottom=False, top=False, labelbottom=True, left=False, right=False, labelleft=False)

    # Put one tick at the center of each bin, then label those ticks
    bin_x_centers = 0.5 * np.diff(bins) + bins[:-1]
    ax.set_xticks(bin_x_centers)
    ax.set_xticklabels(xLabels)

    # Write each bin's share of the total near the top of the plot, skipping empty bins
    bin_x_centers = bin_x_centers - 2
    bin_y_centers = ax.get_yticks()[-2]
    for i in range(len(bins) - 1):
        share = counts[i] / counts.sum()
        bin_label = "  {0:,.0f}%".format(share * 100) if share != 0 else ""
        ax.text(bin_x_centers[i], bin_y_centers, bin_label, rotation_mode='anchor')
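
The tick positions and percentage labels computed above can be checked without drawing anything. The following is only a sketch: np.histogram stands in for ax.hist, the five values are the sample rows from the top of the question, and the names values and edges are made up for the illustration.

```python
import numpy as np

# Sample rows from the question; `values` and `edges` are illustrative names
values = [9.50, 11.80, 46.68, 4.38, 30.97]
edges = [0, 15, 30, 45, 60, 75]

# np.histogram returns the per-bin counts and the bin edges, like ax.hist does
counts, bins = np.histogram(values, bins=edges)

# Center of each bin: left edge plus half the bin width
bin_x_centers = 0.5 * np.diff(bins) + bins[:-1]

# Share of all rows in each bin, formatted as a label (empty bins get no label)
percents = counts / counts.sum() * 100
bin_labels = ["{0:,.0f}%".format(p) if p != 0 else "" for p in percents]

print(bin_x_centers)
print(bin_labels)
```

Passing bin_x_centers to ax.set_xticks before calling ax.set_xticklabels is what ties each label to its bin. In the original code the labels were attached to the default ticks at 0, 10, 20 and so on, which is why they bunched up on the left.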
Asked By: pav


Answers:

Here’s another way:

import pandas as pd

# The outer parentheses let the method chain continue across lines
(pd.cut(df[0],
        bins=[0, 15, 30, 45, 60, 75],
        labels=['0 to 15', '15 to 30',
                '30 to 45', '45 to 60',
                '60 to 75'])
   .value_counts(sort=False)
   .plot.bar())
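
To see what this approach hands to the bar plot, here is a minimal sketch with the five values from the question. It assumes the data frame has a single column named 0, which is what pd.DataFrame produces from a plain list; the names values, edges, binned and counts are only for illustration.

```python
import pandas as pd

# Sample rows from the question, in a one-column frame whose column is named 0
values = [9.50, 11.80, 46.68, 4.38, 30.97]
df = pd.DataFrame(values)

edges = [0, 15, 30, 45, 60, 75]
labels = ['0 to 15', '15 to 30', '30 to 45', '45 to 60', '60 to 75']

# pd.cut replaces each value with the label of the bin it falls into
binned = pd.cut(df[0], bins=edges, labels=labels)

# sort=False keeps the bins in their defined order instead of ordering by count
counts = binned.value_counts(sort=False)
print(counts)
```

Calling counts.plot.bar() then draws one bar per label. Because the labels are the categories themselves, each one sits directly under its bar and no tick adjustment is needed.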


Answered By: Scott Boston

I came here looking for a pd.Series.plot.hist() approach but found it easier to just do it in Plotly:

import plotly.express as px

px.histogram(
    df, x="t_depth",  # `x` is the column name
    nbins=150
).update_layout(
    xaxis=dict(dtick=10), bargap=0.2
)
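
For anyone who does want to stay with pd.Series.plot.hist(), a rough sketch follows. It uses the five values and bin edges from the question rather than the t_depth column, and the names s, edges, labels and centers are placeholders. It is one way to do it, not the only one.

```python
import numpy as np
import pandas as pd

# Sample rows and bin edges from the question
s = pd.Series([9.50, 11.80, 46.68, 4.38, 30.97])
edges = [0, 15, 30, 45, 60, 75]
labels = ['0 to 15', '15 to 30', '30 to 45', '45 to 60', '60 to 75']

# plot.hist returns the Matplotlib Axes; rwidth is forwarded to Matplotlib's hist
ax = s.plot.hist(bins=edges, rwidth=0.75)

# Move the ticks to the middle of each bin before assigning the labels
centers = (np.asarray(edges[:-1]) + np.asarray(edges[1:])) / 2
ax.set_xticks(centers)
ax.set_xticklabels(labels)
```

In a script, matplotlib.pyplot.show() displays the figure; in a notebook it renders on its own.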


Answered By: Kermit