How to plot medians of grouped data in pandas

Question:

Consider two histograms like the ones produced by the following code:

from pandas import DataFrame
import numpy as np
x = ['A']*300 + ['B']*400 
y = np.random.randn(700)
df = DataFrame({'Letter': x, 'N': y})
df.hist('N', by='Letter')

I am trying to plot the median of each group on its histogram. I would also like to reverse the order of the plots (Group B on the left and Group A on the right).

Asked By: Ale Rey

Answers:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = ['A']*300 + ['B']*400
y = np.random.randn(700)
df = pd.DataFrame({'Letter': x, 'N': y})

# medians for each group
medians = df.groupby('Letter')['N'].median()

colors = {'A': 'blue', 'B': 'green'}

# one subplot per group; drawing B on axs[0] puts it on the left, A on the right
fig, axs = plt.subplots(1, 2)
df[df['Letter'] == 'B'].hist('N', ax=axs[0], color=colors['B'], grid=False)
df[df['Letter'] == 'A'].hist('N', ax=axs[1], color=colors['A'], grid=False)

# median lines
axs[0].axvline(medians['B'], color='r', linestyle='dashed', linewidth=1)
axs[1].axvline(medians['A'], color='r', linestyle='dashed', linewidth=1)

axs[0].set_title('Group B')
axs[1].set_title('Group A')
axs[0].set_xlabel('Value')
axs[1].set_xlabel('Value')
axs[0].set_ylabel('Frequency')
axs[1].set_ylabel('Frequency')

plt.tight_layout()
plt.show()
Answered By: Tech Savvy
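
The answer above hard-codes the two groups. If there are more groups, or the left-to-right order should be configurable, the same idea can be written as a loop. The following is only a sketch under some assumptions: the helper name plot_group_histograms and its parameters are made up for illustration, the Agg backend is selected so the code runs without a display, and a seeded random generator replaces np.random.randn so the output is reproducible.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same shape of data as in the question, but seeded for reproducibility
rng = np.random.default_rng(0)
df = pd.DataFrame({"Letter": ["A"] * 300 + ["B"] * 400,
                   "N": rng.standard_normal(700)})


def plot_group_histograms(df, by, column, order):
    """Hypothetical helper: one histogram per group, drawn left to right
    in the given order, each with a dashed line at the group median."""
    medians = df.groupby(by)[column].median()
    # squeeze=False keeps axs two-dimensional even when there is one group
    fig, axs = plt.subplots(1, len(order), squeeze=False)
    for ax, group in zip(axs[0], order):
        ax.hist(df.loc[df[by] == group, column])
        ax.axvline(medians[group], color="r", linestyle="dashed", linewidth=1)
        ax.set_title(f"Group {group}")
        ax.set_xlabel("Value")
        ax.set_ylabel("Frequency")
    fig.tight_layout()
    return fig, medians


# Group B on the left, Group A on the right
fig, medians = plot_group_histograms(df, by="Letter", column="N", order=["B", "A"])
```

With an interactive backend, calling plt.show() afterwards displays the figure; with Agg, fig.savefig("medians.png") writes it to a file instead (the file name is just an example).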