countplot from several columns
Question:
I have a dataframe with several categorical columns. I know how to use countplot, which routinely plots ONE column.
Q: how can I plot the count of the main category from ALL columns in one plot?
Here is an example dataframe to clarify the question:
import pandas as pd
import numpy as np
import seaborn as sns
testdf=pd.DataFrame(({ 'Ahome' : pd.Categorical(["home"]*10),
'Bsearch' : pd.Categorical(["search"]*8 + ["NO"]*2),
'Cbuy' : pd.Categorical(["buy"]*5 + ["NO"]*5),
'Dcheck' : pd.Categorical(["check"]*3 + ["NO"]*7),
} ))
testdf.head(10)
sns.countplot(data=testdf,x='Bsearch');
The last line just uses the normal countplot for one column. I'd like to have the column categories (home, search, buy and check) on the x-axis and their frequencies on the y-axis.
Answers:
As @HarvIpan points out, using melt you would create a long-form dataframe with the column names as entries. Calling countplot on this dataframe produces the correct plot.
In contrast to the existing solution, I would recommend not using the hue argument at all.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df=pd.DataFrame(({ 'Ahome' : pd.Categorical(["home"]*10),
'Bsearch' : pd.Categorical(["search"]*8 + ["NO"]*2),
'Cbuy' : pd.Categorical(["buy"]*5 + ["NO"]*5),
'Dcheck' : pd.Categorical(["check"]*3 + ["NO"]*7),
} ))
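# Reshape to long form: one row per cell, with the column name in "variable"
df2 = df.melt(value_vars=df.columns)
# Drop the "NO" entries so only the real category of each column is counted
df2 = df2[df2["value"] != "NO"]
# One bar per original column
sns.countplot(data=df2, x="variable")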
plt.show()
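To check what the plot should show, the same counts can be computed with pandas alone. This is only a sketch under the assumptions of the example above; the names long_df, per_column and per_category are made up for illustration. It also shows that counting the "value" column instead of "variable" gives the labels home, search, buy and check that the question asks for.

```python
import pandas as pd

df = pd.DataFrame({
    'Ahome': pd.Categorical(["home"] * 10),
    'Bsearch': pd.Categorical(["search"] * 8 + ["NO"] * 2),
    'Cbuy': pd.Categorical(["buy"] * 5 + ["NO"] * 5),
    'Dcheck': pd.Categorical(["check"] * 3 + ["NO"] * 7),
})

# Long form: one row per cell, columns "variable" (old column name) and "value"
long_df = df.melt()

# Keep only the real categories, as in the answer above
kept = long_df[long_df["value"] != "NO"]

# Counts keyed by the original column name (what x="variable" plots)
per_column = kept["variable"].value_counts().sort_index()

# Counts keyed by the category label (what x="value" would plot)
per_category = kept["value"].value_counts()

print(per_column)
print(per_category)
```

If the category labels are wanted on the x-axis, passing x="value" to countplot on the filtered dataframe should give the same bar heights with those labels.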