Plotting categorical data with pandas and matplotlib

Question

I have a data frame with categorical data:

     colour  direction
1    red     up
2    blue    up
3    green   down
4    red     left
5    red     right
6    yellow  down
7    blue    down

I want to generate some graphs, like pie charts and histograms based on the categories. Is it possible without creating dummy numeric variables? Something like

df.plot(kind='hist')

Asked By: Ivan

||

Source

Answer 1

You can simply use value_counts on the series:

df['colour'].value_counts().plot(kind='bar')

Answered By: Alexander

Answer 2

like this :

df.groupby('colour').size().plot(kind='bar')

Answered By: steboc

Answer 3

You might find useful mosaic plot from statsmodels. Which can also give statistical highlighting for the variances.

from statsmodels.graphics.mosaicplot import mosaic
plt.rcParams['font.size'] = 16.0
mosaic(df, ['direction', 'colour']);

enter image description here

But beware of the 0 sized cell – they will cause problems with labels.

See this answer for details

Answered By: Primer

Answer 4

You could also use countplot from seaborn. This package builds on pandas to create a high level plotting interface. It gives you good styling and correct axis labels for free.

import pandas as pd
import seaborn as sns
sns.set()

df = pd.DataFrame({'colour': ['red', 'blue', 'green', 'red', 'red', 'yellow', 'blue'],
                   'direction': ['up', 'up', 'down', 'left', 'right', 'down', 'down']})
sns.countplot(df['colour'], color='gray')

It also supports coloring the bars in the right color with a little trick

sns.countplot(df['colour'],
              palette={color: color for color in df['colour'].unique()})

Answered By: Jarno

Answer 5

To plot multiple categorical features as bar charts on the same plot, I would suggest:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {
        "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
        "direction": ["up", "up", "down", "left", "right", "down", "down"],
    }
)

categorical_features = ["colour", "direction"]
fig, ax = plt.subplots(1, len(categorical_features))
for i, categorical_feature in enumerate(df[categorical_features]):
    df[categorical_feature].value_counts().plot("bar", ax=ax[i]).set_title(categorical_feature)
fig.show()

Answered By: Roman Orac

Answer 6

You can simply use value_counts with sort option set to False. This will preserve ordering of the categories

df['colour'].value_counts(sort=False).plot.bar(rot=0)

Answered By: msenior_

Answer 7

Using plotly

import plotly.express as px
px.bar(df["colour"].value_counts())

Answered By: Biman Pal

Answer 8

Roman’s answer is very helpful and correct but in latest version you also need to specify kind as the parameter’s order can change.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {
    "colour": ["red", "blue", "green", "red", "red", "yellow", "blue"],
    "direction": ["up", "up", "down", "left", "right", "down", "down"],
    }
)

categorical_features = ["colour", "direction"]
fig, ax = plt.subplots(1, len(categorical_features))
for i, categorical_feature in enumerate(df[categorical_features]):
    df[categorical_feature].value_counts().plot(kind="bar", ax=ax[i]).set_title(categorical_feature)
fig.show()

Answered By: ahtasham nazeer

Answer 9

Pandas.Series.plot.pie

https://pandas.pydata.org/docs/reference/api/pandas.Series.plot.pie.html

We can do a little better than that without straying from the built-in functionality.

People love to hate on pie charts, but they have the same benefit as a mosaic/tree; they help keep proportion-to-the-whole interpretable.

kwargs = dict(
    startangle = 90,
    colormap   = 'Pastel2',
    fontsize   = 13,
    explode    = (0.1,0.1,0.1),
    figsize    = (60,5),
    autopct    = '%1.1f%%',
    title      = 'Chemotherapy Stratification'
)

df['treatment_chemo'].value_counts().plot.pie(**kwargs)

Answered By: Kermit

Plotting categorical data with pandas and matplotlib

Question:

Answers:

`Pandas.Series.plot.pie`