Pandas DataFrame plot, colors are not unique

Question:

According to the pandas documentation, the colormap parameter can be used to select colors from a matplotlib colormap object. However, for a bar chart, the color of each bar has to be selected manually. That is not practical: with a lot of bars, the manual effort is annoying. I expected that, if no color is specified, each object/class would get a unique color. Unfortunately, this is not the case. The colors repeat, and only 10 unique colors are provided.

Code for reproduction:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0,100,size=(100, 25)), columns=list('ABCDEFGHIJKLMNOPQRSTUVWXY'))
df.set_index('A', inplace=True)
df.plot(kind='bar', stacked=True, figsize=(20, 10))
plt.title("some_name")
plt.savefig("some_name" + '.png')

Does somebody have any idea how to get a unique color for each class in the diagram?
Thanks in advance

Asked By: Jürgen K.

||

Answers:

That's probably because the default property cycle contains only 10 colors.
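To check this, the default cycle can be inspected directly. This is a minimal sketch, assuming matplotlib 2.0 or later; it reads the library defaults (rcParamsDefault) rather than any locally customized style:

```python
import matplotlib as mpl

# Colors of the library-default property cycle (ignores local style overrides).
default_colors = mpl.rcParamsDefault["axes.prop_cycle"].by_key()["color"]

print(len(default_colors))  # 10 with the stock default cycle
print(default_colors)
```

With 24 plotted columns, the 10 colors are reused more than twice, which would explain the repeated segments in the stacked bars.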

A workaround would be to build a list of random colors (in your case, 24) and pass it as the color keyword argument to pandas.DataFrame.plot:

import random

list_colors= ["#"+"".join([random.choice("0123456789ABCDEF") for j in range(6)])
              for i in range(len(df.columns))]

df.plot(kind="bar", stacked=True, figsize=(20, 10), color=list_colors)

Update:

It might be hard to find a palette of 24 very distinct colors. However, you can use one of the palettes available in seaborn:

import seaborn as sns #pip install seaborn

list_colors = sns.color_palette("hsv", n_colors=24)

df.plot(kind="bar", stacked=True, figsize=(20, 10), color=list_colors)
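If installing seaborn is not an option, a similar palette can probably be built from matplotlib alone. The sketch below is an assumption-laden alternative, not part of the original answer: it samples matplotlib's "hsv" colormap at evenly spaced positions, and the name n_colors is only illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

n_colors = 24  # one color per plotted column, as in the question

# Sample the cyclic "hsv" colormap at evenly spaced positions.
# endpoint=False skips position 1.0, which is almost the same red as 0.0.
cmap = plt.get_cmap("hsv")
list_colors = [to_hex(cmap(x)) for x in np.linspace(0, 1, n_colors, endpoint=False)]

print(list_colors[:3])
```

The resulting list can be passed as color=list_colors to the df.plot call above. As the question notes, df.plot also takes a colormap argument, so colormap="hsv" may be a shorter route to a similar result.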

Another solution would be to use scipy.spatial.distance.euclidean from SciPy to keep only random colors that are far enough apart from each other in RGB space:

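import random

import seaborn as sns #pip install seaborn
from scipy.spatial import distance #pip install scipy

def hex_to_rgb(hex_color):
    # "#RRGGBB" -> (R, G, B), each channel an int in 0..255
    return tuple(int(hex_color[i:i+2], 16) for i in (1, 3, 5))

def distinct_colors(n):
    # Rejection sampling: keep a random color only if its Euclidean distance
    # in RGB space to every color accepted so far is greater than 50.
    colors = []
    while len(colors) < n:
        color = "#" + "".join(random.choice("0123456789ABCDEF") for _ in range(6))
        if all(distance.euclidean(hex_to_rgb(color), hex_to_rgb(c)) > 50 for c in colors):
            colors.append(color)
    return colors

colors = distinct_colors(24)
sns.palplot(colors)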
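One caveat with the rejection loop above: it has no exit condition, so it could spin forever if n is too large for the chosen distance. Below is a hedged variant that caps the number of attempts and can be seeded for reproducibility. The name distinct_colors_bounded and the parameters min_dist, max_tries and seed are hypothetical, and math.dist (Python 3.8+) stands in for the SciPy call:

```python
import math
import random


def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' to an (R, G, B) tuple of ints in 0..255."""
    return tuple(int(hex_color[i:i + 2], 16) for i in (1, 3, 5))


def distinct_colors_bounded(n, min_dist=50, max_tries=10_000, seed=None):
    """Pick n random hex colors whose pairwise RGB distance exceeds min_dist.

    Raises ValueError instead of looping forever when max_tries is exhausted.
    """
    rng = random.Random(seed)  # private generator, so seeding is local
    colors = []
    for _ in range(max_tries):
        if len(colors) == n:
            break
        candidate = "#" + "".join(rng.choice("0123456789ABCDEF") for _ in range(6))
        rgb = hex_to_rgb(candidate)
        # Accept the candidate only if it is far enough from every kept color.
        if all(math.dist(rgb, hex_to_rgb(c)) > min_dist for c in colors):
            colors.append(candidate)
    if len(colors) < n:
        raise ValueError(f"found only {len(colors)} of {n} colors in {max_tries} tries")
    return colors


colors = distinct_colors_bounded(24, seed=0)
print(colors)
```

The list can be used like the others, for example df.plot(kind="bar", stacked=True, figsize=(20, 10), color=colors).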

Answered By: Timeless