How to center labels in histogram plot

Question:

I have a numpy array results that looks like

[ 0.  2.  0.  0.  0.  0.  3.  0.  0.  0.  0.  0.  0.  0.  0.  2.  0.  0.
  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.
  0.  1.  1.  0.  0.  0.  0.  2.  0.  3.  1.  0.  0.  2.  2.  0.  0.  0.
  0.  0.  0.  0.  0.  1.  1.  0.  0.  0.  0.  0.  0.  2.  0.  0.  0.  0.
  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  3.  1.  0.  0.  0.  0.  0.
  0.  0.  0.  1.  0.  0.  0.  1.  2.  2.]

I would like to plot a histogram of it. I have tried

import matplotlib.pyplot as plt
plt.hist(results, bins=range(5))
plt.show()

This gives me a histogram with the x-axis labelled 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0.

I would like the x-axis to be labelled 0 1 2 3 instead with the labels in the center of each bar. How can you do that?

Asked By: graffe

Answers:

You can build a bar plot out of the output of np.histogram.

Consider this:

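import numpy as np
import matplotlib.pyplot as plt

his = np.histogram(results, bins=range(5))  # his[0]: counts, his[1]: bin edges
fig, ax = plt.subplots()
# Use the left bin edges (0, 1, 2, 3) as bar positions.
# plt.bar centers bars on x by default in matplotlib >= 2.0, so no tick offset is needed.
ax.bar(his[1][:-1], his[0])
ax.set_xticks(his[1][:-1])
plt.show()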

EDIT: in order to get the bars touching one another, one has to play with the width parameter.

fig, ax = plt.subplots()
# width=1 makes each bar span a full bin, so neighbouring bars touch.
ax.bar(his[1][:-1], his[0], width=1)
ax.set_xticks(his[1][:-1])
plt.show()

Answered By: Acorbe

The following alternative solution is compatible with plt.hist(). This has the advantage, for instance, that you can call it after a pandas.DataFrame.hist().

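import numpy as np
import matplotlib.pyplot as plt

def bins_labels(bins, **kwargs):
    # Put one tick at the center of each bin, labelled with the bin's left edge.
    bin_w = (max(bins) - min(bins)) / (len(bins) - 1)
    centers = np.arange(min(bins) + bin_w / 2, max(bins), bin_w)
    plt.xticks(centers, bins[:-1], **kwargs)  # one label per bin, not per edge
    plt.xlim(bins[0], bins[-1])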

(The last line is not strictly requested by the OP but it makes the output nicer)

This can be used as in:

import matplotlib.pyplot as plt
bins = range(5)
plt.hist(results, bins=bins)
bins_labels(bins, fontsize=20)
plt.show()

Result: success!
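
The helper above assumes that all bins have the same width. For uneven bins, the tick positions can instead be taken as the midpoint of each pair of neighbouring edges. Below is a minimal sketch of that idea, not part of the original answer: the name bin_centers and the sample data are made up for illustration, and the edges are assumed to come from np.histogram (plt.hist returns the same edges as its second value).

```python
import numpy as np

def bin_centers(edges):
    """Return the midpoint of each pair of neighbouring bin edges."""
    edges = np.asarray(edges, dtype=float)
    return (edges[:-1] + edges[1:]) / 2

# Hypothetical sample data and uneven bin edges, for illustration only.
sample = [0, 0, 1, 2, 2, 3, 5, 7]
counts, edges = np.histogram(sample, bins=[0, 1, 2, 4, 8])
centers = bin_centers(edges)  # candidate tick positions, one per bin
```

The centers could then be passed to plt.xticks(centers, labels) with one label per bin.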

Answered By: Pietro Battiston

The other answers just don’t do it for me. The benefit of using plt.bar over plt.hist is that bar can use align='center':

import numpy as np
import matplotlib.pyplot as plt

arr = np.array([ 0.,  2.,  0.,  0.,  0.,  0.,  3.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  2.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,
        0.,  0.,  0.,  0.,  2.,  0.,  3.,  1.,  0.,  0.,  2.,  2.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  2.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  3.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  1.,  0.,  0.,  0.,  1.,  2.,  2.])

labels, counts = np.unique(arr, return_counts=True)
plt.bar(labels, counts, align='center')
plt.gca().set_xticks(labels)
plt.show()
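
One caveat: np.unique only reports values that actually occur in the data, so a value with zero occurrences gets no entry and, with the code above, no tick. The following is a hedged sketch of one way around that, assuming the data are non-negative integers (possibly stored as floats). np.bincount is swapped in here as an alternative to np.unique, and the sample array is invented for the example.

```python
import numpy as np

sample = np.array([0., 0., 1., 3.])  # hypothetical data with no 2s

# np.unique skips the missing value 2 entirely.
labels, counts = np.unique(sample, return_counts=True)

# np.bincount needs non-negative integers, hence the cast.
# minlength=4 guarantees one slot for each of the values 0, 1, 2, 3.
full_counts = np.bincount(sample.astype(int), minlength=4)
full_labels = np.arange(len(full_counts))
```

Passing full_labels and full_counts to plt.bar would then give one bar position per value, including an explicit zero-height bar for the missing one.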

Answered By: Jarad

Here is a solution that only uses plt.hist().
Let’s break this down in two parts:

  1. Have the x-axis labelled 0 1 2 3.
  2. Have the labels in the center of each bar.

To have the x-axis labelled 0 1 2 3 without .5 values, you can use the function plt.xticks() and provide as argument the values that you want on the x axis. In your case, since you want 0 1 2 3, you can call plt.xticks(range(4)).

To have the labels in the center of each bar, you can pass the argument align='left' to the plt.hist() function. Below is your code, minimally modified to do that.

import matplotlib.pyplot as plt

results = [0,  2,  0,  0,  0,  0,  3,  0,  0,  0,  0,  0,  0,  0,  0,  2,  0,  0,
           0,  0,  0,  1,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0,  0,  0,  0,
           0,  1,  1,  0,  0,  0,  0,  2,  0,  3,  1,  0,  0,  2,  2,  0,  0,  0,
           0,  0,  0,  0,  0,  1,  1,  0,  0,  0,  0,  0,  0,  2,  0,  0,  0,  0,
           0,  1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  3,  1,  0,  0,  0,  0,  0,
           0,  0,  0,  1,  0,  0,  0,  1,  2,  2]

plt.hist(results, bins=range(5), align='left')
plt.xticks(range(4))
plt.show()
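
To see why align='left' works, the bar positions can be inspected directly: each bar ends up centered on the left edge of its bin. The snippet below is a small check under stated assumptions rather than part of the original answer. The sample list is made up, and the Agg backend is selected only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the check runs without a display
import matplotlib.pyplot as plt

sample = [0, 2, 0, 0, 3, 1, 0, 2]  # hypothetical sample data

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(sample, bins=range(5), align='left')

# Each patch is a rectangle; its horizontal center is x + width / 2.
bar_centers = [p.get_x() + p.get_width() / 2 for p in patches]
plt.close(fig)
```

With bins=range(5) the bin edges are 0, 1, 2, 3, 4, so the bar centers computed here should coincide with the left edges 0, 1, 2, 3.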

Answered By: ricpacca

As Jarad pointed out in his answer, a bar plot is a neat way to do it. Here’s a short way of drawing the bar plot using pandas.

import pandas as pd
import matplotlib.pyplot as plt

arr = [ 0.,  2.,  0.,  0.,  0.,  0.,  3.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  2.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,
        0.,  0.,  0.,  0.,  2.,  0.,  3.,  1.,  0.,  0.,  2.,  2.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  2.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  3.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  1.,  0.,  0.,  0.,  1.,  2.,  2.]

col = 'name'
pd.DataFrame({col : arr}).groupby(col).size().plot.bar()
plt.show()

Answered By: Igor Kołakowski

To center the labels on a matplotlib histogram of discrete values, it is enough to define the "bins" as a list of bin boundaries.

import matplotlib.pyplot as plt
# In a Jupyter notebook, also run the magic: %matplotlib inline

example_data = [0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1]

fig = plt.figure(figsize=(5,5))
ax1 = fig.add_subplot()
ax1_bars = [0,1]                           
ax1.hist( 
    example_data, 
    bins=[x for i in ax1_bars for x in (i-0.4,i+0.4)], 
    color='#404080')
ax1.set_xticks(ax1_bars)
ax1.set_xticklabels(['class 0 label','class 1 label'])
ax1.set_title("Example histogram")
ax1.set_yscale('log')
ax1.set_ylabel('quantity')

fig.tight_layout()
plt.show()

How does this work?

  • The histogram bins parameter can be a list defining the boundaries of the bins. For a class that can assume the values 0 or 1, those boundaries could be [ -0.5, 0.5, 0.5, 1.5 ], which loosely translates as: "bin 0" is from -0.5 to 0.5 and "bin 1" is from 0.5 to 1.5. Since the middle of each range is the discrete value, the label lands in the expected place. The code above uses ±0.4 instead of ±0.5, which leaves a small gap between the bars.

  • The expression [x for i in ax1_bars for x in (i-0.4,i+0.4)] is just a way to generate the list of boundaries for a list of values (ax1_bars).

  • The call ax1.set_xticks(ax1_bars) is important: it places ticks only at the discrete values.

  • The rest should be self-explanatory.
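
The boundary-generating expression described above can be sketched on its own, without any plotting. In the snippet below the function name discrete_bin_edges and the half_width parameter are hypothetical names introduced for illustration. The default of 0.4 mirrors the value used in the answer's code.

```python
import math

def discrete_bin_edges(values, half_width=0.4):
    """Return a flat list of (value - half_width, value + half_width) edges."""
    return [edge for v in values for edge in (v - half_width, v + half_width)]

edges = discrete_bin_edges([0, 1])  # same values as ax1_bars above

# Pair up (left, right) edges and take the midpoint of each pair.
centers = [(left + right) / 2 for left, right in zip(edges[0::2], edges[1::2])]

# The midpoints coincide with the discrete values themselves.
assert all(math.isclose(c, v, abs_tol=1e-12) for c, v in zip(centers, [0, 1]))
```

Note that with half_width below 0.5 the edge list also defines a narrow bin between neighbouring bars (here from 0.4 to 0.6). For integer data that bin stays empty, which is what produces the visible gap.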

Answered By: Lucas

Use numpy to have bins centered at your requested values:

import matplotlib.pyplot as plt
import numpy as np
plt.hist(results, bins=np.arange(-0.5, 5))
plt.show()
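
Note that np.arange(-0.5, 5) yields the edges -0.5, 0.5, ..., 4.5, i.e. five bins centered on 0 through 4, which is one more than data in the range 0 to 3 needs. As a hedged variation, the edges can be derived from the data itself. This assumes integer-valued data, and the sample array is hypothetical.

```python
import numpy as np

sample = np.array([0, 2, 0, 0, 3, 1, 0, 2])  # hypothetical integer data

# Start half a unit below the smallest value and step by 1
# until half a unit above the largest value.
edges = np.arange(sample.min() - 0.5, sample.max() + 1.5)
counts, _ = np.histogram(sample, bins=edges)
centers = edges[:-1] + 0.5  # each bin is centered on an integer value
```

The same edges array can be passed as bins to plt.hist, and centers can be passed to plt.xticks to get integer tick labels.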

Answered By: michael