How to show last row of Pandas DataFrame in box plot
Question:
Random data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.DataFrame(np.random.normal(size=(20,4)))
data
           0         1         2         3
0  -0.710006 -0.748083 -1.261515  0.048941
1   0.856541  0.533073  0.649113 -0.236297
2  -0.091005 -0.244658 -2.194779  0.632878
3  -0.059058  0.807661 -0.418446 -0.295255
4  -0.103701  0.775622  0.258412  0.024411
5  -0.447976 -0.034419 -1.521598 -0.903301
6   1.451105  0.549661 -1.655751 -0.147499
7   1.479374 -1.475347  0.665726  0.236611
8  -1.427979 -1.812916  0.522802  0.006066
9   0.198515  1.203476 -0.475389 -1.721707
10  0.286255  0.564450  0.590050 -0.657811
11 -1.076161  1.820218 -0.315127 -0.848114
12  0.061848  0.303502  0.978169  0.024630
13 -0.307827 -1.047835  0.547052 -0.647217
14  0.679214  0.734134  0.158803 -0.334951
15  0.469675  1.043391 -1.449727  1.335354
16 -0.483831 -0.988185  0.264027 -0.831833
17 -2.013968 -0.200699  1.076526  1.275300
18 -0.199473 -1.630597 -1.697146 -0.177458
19  1.245289  0.132349  1.054312 -0.082550
data.boxplot(vert= False, figsize = (15,10))
I want to add red dots to the box plot indicating the last value (bottom) in each column. For example (red dots I’ve edited in are not in their exact position, but this gives you a general idea):
Thank you.
Answers:
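You can overlay a scatter plot on the boxplot. The boxes are drawn at positions 1, 2, ..., one per column, so the values of the last row can be plotted against those positions.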
For the provided example, it looks like this:
fig, ax = plt.subplots(figsize=(8,5))
# `data` is the DataFrame from the question
data.boxplot(vert=False, patch_artist=True, ax=ax, zorder=1)
lastrow = data.iloc[-1,:]
print(lastrow)
ax.scatter(x=lastrow, y=[*range(1,len(lastrow)+1)], color='r', zorder=2)
# for displaying the values of the red points:
for i, val in enumerate(lastrow,1):
    ax.annotate(text=f"{val:.2f}", xy=(val,i+0.1))
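For reference, the approach above can be put together as one standalone script. This is only a minimal sketch under a few assumptions that are not in the original post: the seeded random generator (so the plot is reproducible), the non-interactive Agg backend, the names last_row, y_positions and points, and the output file name boxplot_last_row.png are all chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render to a file instead of opening a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random data as in the question; the seed is an assumption for reproducibility
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(20, 4)))

fig, ax = plt.subplots(figsize=(8, 5))
data.boxplot(vert=False, ax=ax, zorder=1)

# Last row of the DataFrame: one value per column
last_row = data.iloc[-1]

# Horizontal boxes sit at y = 1, 2, ..., n in column order
y_positions = np.arange(1, len(last_row) + 1)

# Red dots drawn above the boxes (higher zorder)
points = ax.scatter(last_row.to_numpy(), y_positions, color="r", zorder=2)

# Label each dot with its value, slightly above the marker
for y, val in zip(y_positions, last_row):
    ax.annotate(f"{val:.2f}", xy=(val, y + 0.1))

# Hypothetical output file name
fig.savefig("boxplot_last_row.png")
```

In an interactive session, the matplotlib.use("Agg") line can be dropped and fig.savefig replaced with plt.show().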