How to center labels in histogram plot
Question:
I have a numpy array results that looks like
[ 0. 2. 0. 0. 0. 0. 3. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0.
0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.
0. 1. 1. 0. 0. 0. 0. 2. 0. 3. 1. 0. 0. 2. 2. 0. 0. 0.
0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0.
0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 1. 0. 0. 0. 0. 0.
0. 0. 0. 1. 0. 0. 0. 1. 2. 2.]
I would like to plot a histogram of it. I have tried
import matplotlib.pyplot as plt
plt.hist(results, bins=range(5))
plt.show()
This gives me a histogram with the x-axis labelled 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0.
I would like the x-axis to be labelled 0 1 2 3 instead, with the labels in the center of each bar. How can I do that?
Answers:
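You can build a bar plot out of np.histogram. Consider this:
import numpy as np
import matplotlib.pyplot as plt
# results is the array from the question
his = np.histogram(results, bins=range(5))
fig, ax = plt.subplots()
offset = .4
# his[1][:-1] are the left bin edges 0..3; align='edge' starts each bar there
ax.bar(his[1][:-1], his[0], align='edge')
ax.set_xticks(his[1][:-1] + offset)
ax.set_xticklabels(('0', '1', '2', '3'))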
EDIT: to get the bars touching one another, set the width parameter to the bin width.
fig, ax = plt.subplots()
offset = .5
ax.bar(his[1][:-1], his[0], width=1, align='edge')
ax.set_xticks(his[1][:-1] + offset)
ax.set_xticklabels(('0', '1', '2', '3'))
The following alternative solution is compatible with plt.hist() (which has the advantage, for instance, that you can call it after a pandas.DataFrame.hist()).
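import numpy as np
import matplotlib.pyplot as plt
def bins_labels(bins, **kwargs):
    bin_w = (max(bins) - min(bins)) / (len(bins) - 1)
    # one tick at the center of each bin, labelled with the bin's left edge
    plt.xticks(np.arange(min(bins)+bin_w/2, max(bins), bin_w), bins[:-1], **kwargs)
    plt.xlim(bins[0], bins[-1])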
(The last line is not strictly requested by the OP but it makes the output nicer)
This can be used as in:
import matplotlib.pyplot as plt
bins = range(5)
plt.hist(results, bins=bins)
bins_labels(bins, fontsize=20)
plt.show()
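The tick positions used by bins_labels are simply the bin centers, and they can be computed directly from the edges. Below is a minimal sketch using only NumPy; the helper name bin_centers is hypothetical and not part of the answer above. Unlike the equal-width formula, it would also work for bins of unequal width.

```python
import numpy as np

def bin_centers(edges):
    """Return the midpoint of each bin given its edges (hypothetical helper)."""
    edges = np.asarray(edges, dtype=float)
    # Average each left edge with the matching right edge.
    return (edges[:-1] + edges[1:]) / 2

centers = bin_centers(range(5))
print(centers.tolist())  # [0.5, 1.5, 2.5, 3.5]
```

These values could then be passed to plt.xticks() as the tick locations, with the left edges as the labels.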
The other answers just don’t do it for me. The benefit of using plt.bar over plt.hist is that bar can use align='center':
import numpy as np
import matplotlib.pyplot as plt
arr = np.array([ 0., 2., 0., 0., 0., 0., 3., 0., 0., 0., 0., 0., 0.,
0., 0., 2., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,
0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 1., 1.,
0., 0., 0., 0., 2., 0., 3., 1., 0., 0., 2., 2., 0.,
0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0.,
0., 0., 2., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 3., 1., 0., 0., 0., 0., 0., 0.,
0., 0., 1., 0., 0., 0., 1., 2., 2.])
labels, counts = np.unique(arr, return_counts=True)
plt.bar(labels, counts, align='center')
plt.gca().set_xticks(labels)
plt.show()
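One thing to keep in mind with np.unique is that a value that never occurs in the data gets no entry at all, so it will not get a tick either. If the data are known to be non-negative integers (an assumption, not something the answer above requires), np.bincount is one way to keep the zero counts. A small sketch with made-up data:

```python
import numpy as np

# Made-up data in which the value 2 never occurs.
arr = np.array([0., 3., 0., 1., 3.])

# bincount needs non-negative integers, hence the cast.
counts = np.bincount(arr.astype(int))
labels = np.arange(counts.size)

print(labels.tolist())  # [0, 1, 2, 3]
print(counts.tolist())  # [2, 1, 0, 2]
```

The resulting labels and counts can be passed to plt.bar(labels, counts, align='center') exactly as in the code above.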
Here is a solution that only uses plt.hist().
Let’s break this down in two parts:
- Have the x-axis labelled 0 1 2 3.
- Have the labels in the center of each bar.
To have the x-axis labelled 0 1 2 3 without .5 values, you can use the function plt.xticks() and provide as argument the values that you want on the x axis. In your case, since you want 0 1 2 3, you can call plt.xticks(range(4)).
To have the labels in the center of each bar, you can pass the argument align='left' to the plt.hist() function, which centers each bar on the left edge of its bin. Below is your code, minimally modified to do that.
import matplotlib.pyplot as plt
results = [0, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 1, 1, 0, 0, 0, 0, 2, 0, 3, 1, 0, 0, 2, 2, 0, 0, 0,
0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 1, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 1, 2, 2]
plt.hist(results, bins=range(5), align='left')
plt.xticks(range(4))
plt.show()
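If the range of the data is not known in advance, the bins and ticks can be derived from the data instead of being hard-coded. The following is only a sketch, assuming the data are integer-valued; the helper name integer_hist is hypothetical.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to show the figure interactively
import matplotlib.pyplot as plt

def integer_hist(values, ax=None):
    """Histogram with one centered bar per integer value (hypothetical helper)."""
    values = np.asarray(values)
    lo, hi = int(values.min()), int(values.max())
    if ax is None:
        ax = plt.gca()
    # Edges lo, lo+1, ..., hi+1 give one unit-wide bin per integer value;
    # align='left' centers each bar on the left edge of its bin.
    counts, edges, _ = ax.hist(values, bins=np.arange(lo, hi + 2), align='left')
    ax.set_xticks(np.arange(lo, hi + 1))
    return counts, edges

counts, edges = integer_hist([0, 2, 0, 3, 1, 0, 2])
print(counts.tolist())  # [3.0, 1.0, 2.0, 1.0]
```

Calling plt.show() afterwards (with an interactive backend) would display the figure.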
As Jarad pointed out in his answer, a bar plot is a neat way to do it. Here’s a short way of plotting one using pandas.
import pandas as pd
import matplotlib.pyplot as plt
arr = [ 0., 2., 0., 0., 0., 0., 3., 0., 0., 0., 0., 0., 0.,
0., 0., 2., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,
0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 1., 1.,
0., 0., 0., 0., 2., 0., 3., 1., 0., 0., 2., 2., 0.,
0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0.,
0., 0., 2., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 3., 1., 0., 0., 0., 0., 0., 0.,
0., 0., 1., 0., 0., 0., 1., 2., 2.]
col = 'name'
pd.DataFrame({col : arr}).groupby(col).size().plot.bar()
plt.show()
To center the labels on a matplotlib histogram of discrete values, it is enough to define the bins as a list of bin boundaries.
import matplotlib.pyplot as plt
# In a Jupyter notebook, also run: %matplotlib inline
example_data = [0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1]
fig = plt.figure(figsize=(5,5))
ax1 = fig.add_subplot()
ax1_bars = [0,1]
ax1.hist(
    example_data,
    bins=[x for i in ax1_bars for x in (i-0.4,i+0.4)],
    color='#404080')
ax1.set_xticks(ax1_bars)
ax1.set_xticklabels(['class 0 label','class 1 label'])
ax1.set_title("Example histogram")
ax1.set_yscale('log')
ax1.set_ylabel('quantity')
fig.tight_layout()
plt.show()
How does this work?
- The histogram bins parameter can be a list defining the boundaries of the bins. For a class that can assume the values 0 or 1, those boundaries should be [ -0.5, 0.5, 0.5, 1.5 ], which loosely translates as "bin 0" is from -0.5 to 0.5 and "bin 1" is from 0.5 to 1.5. Since the middles of those ranges are the discrete values, the labels will be in the expected place. The code above uses 0.4 instead of 0.5, which leaves a small gap between the bars.
- The expression [x for i in ax1_bars for x in (i-0.4,i+0.4)] is just a way to generate the list of boundaries for a list of values (ax1_bars).
- The expression ax1.set_xticks(ax1_bars) is important: it places ticks only at the discrete values.
- The rest should be self-explanatory.
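To see what the boundary list does to the counts, the same binning can be checked with np.histogram, without plotting anything. This is a sketch; the helper name discrete_edges and its half_width parameter are hypothetical.

```python
import numpy as np

def discrete_edges(values, half_width=0.4):
    """Flat list of (v - half_width, v + half_width) pairs (hypothetical helper)."""
    return [edge for v in sorted(values) for edge in (v - half_width, v + half_width)]

example_data = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1]
edges = discrete_edges([0, 1])           # roughly -0.4, 0.4, 0.6, 1.4
counts, _ = np.histogram(example_data, bins=edges)
print(counts.tolist())  # [17, 0, 5]
```

The middle bin (from 0.4 to 0.6) is the gap between the two bars, so for integer data it always stays empty.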
Use numpy to have bins centered at your requested values:
import matplotlib.pyplot as plt
import numpy as np
plt.hist(results, bins=np.arange(-0.5, 5))
plt.show()
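The hard-coded upper limit can also be taken from the data. A minimal sketch, assuming integer-valued data; the helper name centered_edges is hypothetical.

```python
import numpy as np

def centered_edges(values):
    """Bin edges at half-integers so that every integer value sits mid-bin (hypothetical helper)."""
    values = np.asarray(values)
    # From min - 0.5 up to and including max + 0.5, in steps of 1.
    return np.arange(values.min() - 0.5, values.max() + 1.5)

edges = centered_edges([0, 2, 0, 3, 1])
print(edges.tolist())  # [-0.5, 0.5, 1.5, 2.5, 3.5]
```

The result can be passed as bins to plt.hist(), in place of np.arange(-0.5, 5).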