countplot from several columns

Question:

I have a dataframe with several categorical columns. I know how to make a countplot, which routinely plots ONE column.
Q: How can I plot the counts from ALL columns in a single plot?

Here is an example dataframe to clarify the question:

import pandas as pd
import seaborn as sns

# Each column holds one category of interest plus "NO" as filler
testdf = pd.DataFrame({
    'Ahome': pd.Categorical(["home"]*10),
    'Bsearch': pd.Categorical(["search"]*8 + ["NO"]*2),
    'Cbuy': pd.Categorical(["buy"]*5 + ["NO"]*5),
    'Dcheck': pd.Categorical(["check"]*3 + ["NO"]*7),
})
testdf.head(10)
sns.countplot(data=testdf, x='Bsearch');

The last line is just a normal countplot for one column. I'd like to have the column categories (home, search, buy and check) on the x-axis and their frequencies on the y-axis.

Asked By: physiker


Answers:

Melt the dataframe into long form first, then call countplot on the rows that are not "NO":

df = pd.melt(testdf)
sns.countplot(data=df.loc[df['value']!="NO"], x='variable', hue='value')


Answered By: harvpan

As @HarvIpan points out, melt creates a long-form dataframe in which the original column names become entries of a "variable" column. Calling countplot on this dataframe produces the correct plot.

Unlike the existing answer, I would recommend not using the hue argument at all. After the "NO" rows are dropped, each column contains only one category, so hue adds no information.
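To make the long-form shape concrete, here is a minimal sketch that needs only pandas. It assumes the same data as in the question, but uses plain strings instead of pd.Categorical to keep it short; the names long_df, kept and counts are made up for this illustration.

```python
import pandas as pd

# Same data as in the question, with plain strings instead of pd.Categorical
testdf = pd.DataFrame({
    "Ahome": ["home"] * 10,
    "Bsearch": ["search"] * 8 + ["NO"] * 2,
    "Cbuy": ["buy"] * 5 + ["NO"] * 5,
    "Dcheck": ["check"] * 3 + ["NO"] * 7,
})

# melt stacks the columns: one row per original cell, with the column
# name in "variable" and the cell content in "value"
long_df = testdf.melt()
print(long_df.shape)  # (40, 2): 4 columns x 10 rows

# Drop the "NO" rows, then count the remaining rows per original column
kept = long_df[long_df["value"] != "NO"]
counts = kept["variable"].value_counts()
print(counts)
```

These per-column counts (10, 8, 5 and 3) are the bar heights that countplot draws from the filtered long-form dataframe.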

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Ahome': pd.Categorical(["home"]*10),
    'Bsearch': pd.Categorical(["search"]*8 + ["NO"]*2),
    'Cbuy': pd.Categorical(["buy"]*5 + ["NO"]*5),
    'Dcheck': pd.Categorical(["check"]*3 + ["NO"]*7),
})

# Long form: column names go to "variable", cell contents to "value"
df2 = df.melt(value_vars=df.columns)
# Keep only the rows that are not "NO"
df2 = df2[df2["value"] != "NO"]
# One bar per original column, no hue
sns.countplot(data=df2, x="variable")
plt.show()
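If only the bar heights are needed, the counts can also be computed without melting. This is just a sketch of an alternative, not part of either answer; it again assumes plain-string columns, and the name counts is made up here.

```python
import pandas as pd

df = pd.DataFrame({
    "Ahome": ["home"] * 10,
    "Bsearch": ["search"] * 8 + ["NO"] * 2,
    "Cbuy": ["buy"] * 5 + ["NO"] * 5,
    "Dcheck": ["check"] * 3 + ["NO"] * 7,
})

# True where a cell is not "NO"; summing the booleans counts them per column
counts = df.ne("NO").sum()
print(counts)

# counts.plot.bar() would draw these values with pandas' own plotting
# (it requires matplotlib to be installed)
```

This gives one number per column, which is enough for a plain bar chart; the melt approach is still the one to use when seaborn's countplot is wanted.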

