Categorical histograms in seaborn from a dataframe
Question:
Consider I have the following dataframe:
sample | tca_after | avg_length_after | cd_after | tca_before | avg_length_before | cd_before |
---|---|---|---|---|---|---|
1 | 0.015385 | 50.513499 | 1.4 | 0.005139 | 31.844415 | 0.4 |
2 | 0.005040 | 19.209373 | 1.0 | 0.004603 | 20.831459 | 0.6 |
3 | 0.057218 | 31.869649 | 10.0 | 0.008687 | 17.926937 | 1.0 |
4 | 0.037175 | 45.543659 | 3.8 | 0.035760 | 56.937708 | 2.8 |
I want to compare TCA, avg_length, and CD, before and after a certain process. So I would like to create three categorical histograms like that using seaborn. On x-axis I have all four samples, on y-axis I have either TCA, avg_length or CD for both before and after. I have no idea how to do it 🙁
Could you please help?
Answers:
You could stack the before and after columns together and add a new column, e.g. "when", to indicate before vs. after. That new column can then be used as "hue".
Here is how you could create a histogram:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# Read the table straight from the question page
df = pd.read_html('https://stackoverflow.com/questions/73348014')[0]
# "before" columns, renamed to the shared names tca / avg_length / cd
df_before = df[['sample', 'tca_before', 'avg_length_before', 'cd_before']]
df_before = df_before.rename(columns={'tca_before': 'tca', 'avg_length_before': 'avg_length', 'cd_before': 'cd'})
df_before['when'] = 'before'
# "after" columns, renamed the same way
df_after = df[['sample', 'tca_after', 'avg_length_after', 'cd_after']]
df_after = df_after.rename(columns={'tca_after': 'tca', 'avg_length_after': 'avg_length', 'cd_after': 'cd'})
df_after['when'] = 'after'
# Stack both halves; drop=True keeps the old index from becoming an extra column
df_new = pd.concat([df_before, df_after]).reset_index(drop=True)
sns.histplot(data=df_new, x='tca', hue='when', palette='copper_r', multiple='dodge')
To create a bar plot with the sample IDs on the x axis, you can change sns.histplot to sns.barplot:
sns.barplot(data=df_new, x='sample', y='tca', hue='when', palette='copper_r', dodge=True)
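Since the question asks for three plots (TCA, avg_length and CD), the same bar plot can be repeated once per metric on a shared figure. Here is a rough sketch of how that might look. It assumes the data is typed in by hand from the table in the question instead of being read from the web page, and the names `metrics` and `df_long` as well as the figure size are just illustrative choices:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Data copied by hand from the table in the question.
df = pd.DataFrame({
    "sample": [1, 2, 3, 4],
    "tca_after": [0.015385, 0.005040, 0.057218, 0.037175],
    "avg_length_after": [50.513499, 19.209373, 31.869649, 45.543659],
    "cd_after": [1.4, 1.0, 10.0, 3.8],
    "tca_before": [0.005139, 0.004603, 0.008687, 0.035760],
    "avg_length_before": [31.844415, 20.831459, 17.926937, 56.937708],
    "cd_before": [0.4, 0.6, 1.0, 2.8],
})

metrics = ["tca", "avg_length", "cd"]

# Build one long frame: one row per (sample, when), one column per metric.
parts = []
for when in ["before", "after"]:
    cols = {f"{m}_{when}": m for m in metrics}
    part = df[["sample", *cols]].rename(columns=cols)
    part["when"] = when
    parts.append(part)
df_long = pd.concat(parts, ignore_index=True)

# One bar plot per metric, samples on the x axis, before/after as hue.
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, metric in zip(axes, metrics):
    sns.barplot(data=df_long, x="sample", y=metric, hue="when",
                hue_order=["before", "after"], palette="copper_r", ax=ax)
    ax.set_title(metric)
fig.tight_layout()
```

Call plt.show() (or fig.savefig with a file name of your choice) afterwards to see the result. Each metric gets its own y axis here, which seems sensible because the three quantities are on very different scales.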