Pandas: how to plot multiple 0/1 distributions in one figure

Question:

I have a dataset that looks like this:

label colA colB colC
  0     1    0    0
  0     0    1    0
  1     0    0    1
  1     1    0    1

Each row has label 0 or 1, and only one of colA, colB and colC will be 1; the others will be 0.
I want to plot one figure with counts on the Y axis and the label (0/1) on the X axis.

Based on the example given, there will be 6 bars in total: 3 for label 0 and 3 for label 1. How do I do that? I know how to plot one column with
df[df['colA']==1]['label'].plot(), but I'm not sure how to combine multiple columns.

Asked By: zxcisnoias


Answers:

If I’ve understood the question correctly, maybe you are looking for something like this:

import matplotlib.pyplot as plt

# Indicator columns: every column except the label (named 'label' in the question's data)
cols = df.drop(columns='label').columns
width = 0.25
color_dict = {0: 'red', 1: 'blue', 2: 'green'}
for i, c in enumerate(cols):
  # One bar per (label value, column): the number of rows with that label where the column is 1
  plt.bar(0 + width*(i - len(cols)//2), len(df[(df['label'] == 0) & (df[c] == 1)]), width, color=color_dict[i%len(cols)], label=c)
  plt.bar(1 + width*(i - len(cols)//2), len(df[(df['label'] == 1) & (df[c] == 1)]), width, color=color_dict[i%len(cols)], label=c)
# One tick per label value
plt.xticks([0, 1])

# Each column is drawn twice, so keep a single legend entry per column name
handles, labels = plt.gca().get_legend_handles_labels()
by_label = dict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys())

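For the data you've provided, this draws two groups of three bars: at label 0, colA and colB have height 1 and colC has height 0; at label 1, colA has height 1, colB has height 0 and colC has height 2.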
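
A note on the bar positions: the offset width*(i - len(cols)//2) centers the bars on the tick when there are three columns, but with an even number of columns the group ends up shifted to the left. Below is a minimal sketch of a more general offset, assuming adjacent bars of equal width; bar_offsets is a hypothetical helper name, not part of matplotlib.

```python
def bar_offsets(n_bars, width):
    """Return x offsets that center n_bars adjacent bars of equal width on a tick."""
    # The middle of the group sits at index (n_bars - 1) / 2, so shift by that much.
    return [(i - (n_bars - 1) / 2) * width for i in range(n_bars)]


print(bar_offsets(3, 0.25))  # [-0.25, 0.0, 0.25]
print(bar_offsets(2, 0.25))  # [-0.125, 0.125]
```

In the loop above, 0 + bar_offsets(len(cols), width)[i] could then stand in for 0 + width*(i - len(cols)//2).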

Answered By: griko

It looks like you might want a groupby.sum then plot.bar:

df.groupby('label').sum().plot.bar()

NB. if you want the proportion, use df.groupby('label').mean().plot.bar().

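
To make the one-liner easier to check, here is a small self-contained sketch that rebuilds the sample data from the question and prints the tables that would be plotted. The helper plot_grouped and its ylabel parameter are hypothetical names introduced for illustration only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "label": [0, 0, 1, 1],
    "colA":  [1, 0, 0, 1],
    "colB":  [0, 1, 0, 0],
    "colC":  [0, 0, 1, 1],
})

# Summing 0/1 indicator columns within each label counts the 1s.
counts = df.groupby("label").sum()
print(counts)

# Averaging instead gives the share of rows per label where the column is 1.
proportions = df.groupby("label").mean()
print(proportions)


def plot_grouped(table, ylabel="count"):
    """Hypothetical helper: one bar group per label, one bar per column."""
    ax = table.plot.bar(rot=0)  # rot=0 keeps the 0/1 tick labels horizontal
    ax.set_xlabel("label")
    ax.set_ylabel(ylabel)
    return ax
```

Calling plot_grouped(counts) should give the grouped bar chart of counts, and plot_grouped(proportions, ylabel="proportion") the proportion version; both need matplotlib installed, since pandas plotting relies on it.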

Answered By: mozway