How to plot a histogram of a single dataframe column and exclude 0s
Question:
I have a huge dataframe that looks like this:
Date value
1 2022-01-01 00:00:00 0.000
2 2022-01-01 01:00:00 0.000
3 2022-01-01 02:00:00 0.000
4 2022-01-01 08:00:00 0.058
5 2022-01-01 09:00:00 4.419
6 2022-01-01 10:00:00 14.142
I want to plot a histogram of only the column ‘value’, excluding the 0 values. How do I do that?
I have tried:
plt.hist(df['value' >0], bins=50)
plt.hist(df['value' =! 0], bins=50)
but no luck. Any ideas?
Many thanks
Answers:
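The problem is df['value' > 0]: it compares the string 'value' with 0 (which raises a TypeError in Python 3) instead of comparing the column. It needs to be either df[df['value'] > 0] or df[df.value > 0]. Note also that the inequality operator is !=, not =!, so the second attempt is a genuine syntax error.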
The idea is that you create a boolean index with df.value:
>>> df.value > 0
0 False
1 True
2 False
3 True
4 True
5 True
Name: value, dtype: bool
And then use that index on df to retrieve the rows where it is True:
>>> df[df.value > 0]
date value
1 2022/01/01 0.100
3 2022/01/01 0.058
4 2022/01/01 4.419
5 2022/01/01 14.142
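As a minimal, self-contained sketch of the masking step: the frame below is made up from the six rows shown in the question, and the names mask and nonzero are only illustrative. It uses != 0 because the question asks to drop zeros; > 0 would also drop any negative values.

```python
import pandas as pd

# Small stand-in for the real dataframe, built from the rows in the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2022-01-01 00:00:00", "2022-01-01 01:00:00", "2022-01-01 02:00:00",
        "2022-01-01 08:00:00", "2022-01-01 09:00:00", "2022-01-01 10:00:00",
    ]),
    "value": [0.0, 0.0, 0.0, 0.058, 4.419, 14.142],
})

# Boolean Series with one True/False entry per row.
mask = df["value"] != 0

# Keep only the 'value' column, and only the rows where the mask is True.
nonzero = df.loc[mask, "value"]
print(nonzero)
```

Using df.loc[mask, "value"] does the row filter and the column selection in one step, so the result is a Series that can be passed straight to a plotting call.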
On the plotting side, you can also plot directly with pandas. Select the value column first so that only that column ends up in the histogram:
>>> df[df.value > 0].value.plot.hist(bins=50)
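For the matplotlib route the question started from, a sketch along these lines should work. The sample frame is made up from the question's rows, and the Agg backend line is an assumption so the script also runs without a display; drop it in a notebook or interactive session.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in data modeled on the question's sample rows.
df = pd.DataFrame({
    "value": [0.0, 0.0, 0.0, 0.058, 4.419, 14.142],
})

# Filter out the zeros and keep just the column to plot.
nonzero = df.loc[df["value"] != 0, "value"]

fig, ax = plt.subplots()
# ax.hist returns the bin counts, the bin edges, and the drawn patches.
counts, edges, _ = ax.hist(nonzero, bins=50)
ax.set_xlabel("value")
ax.set_ylabel("count")
```

The pandas equivalent of the ax.hist call is nonzero.plot.hist(bins=50), which draws through matplotlib as well.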