matplotlib multicolored line from pandas DataFrame with colors from value in dataframe

Question:

I am trying to plot a DataFrame containing 3 columns: the first 2 are the coordinates of each point and the third determines the color of the line at that point:

X Y C
1 2 R
2 1 R
3 4 B
4 3 R
5 1 R
6 5 G
7 6 G
8 8 B

I grouped the data into segments of the same color:

df.groupby((df['C']!=df['C'].shift()).cumsum())

And then tried to call .plot for each group, but the displayed plot had discontinuities and was also extremely slow as the amount of data is quite large.

I found this example and I believe using LineCollection and ListedColormap could be the right solution, but being new to the ecosystem, I’m failing to understand how I could adapt it to work with the described DataFrame.

Asked By: roign

Answers:

Adapting the linked code to your example is quite straightforward.

Note that the last color won’t be used: n points give n-1 segments, and each segment takes the color of the row where it starts.

Some remarks:

  • The colors in your list aren’t valid matplotlib colors. The single-letter codes need to be in lowercase ('r', 'g', 'b').
  • The code uses segments of two points. If you tried to merge consecutive segments of the same color into longer segments, the fast numpy array operations could no longer be used.
  • autoscale_view() or explicitly setting the x and y limits (as in the tutorial) is needed because matplotlib doesn’t rescale the axes automatically when elements are added with add_collection() (instead of plotted).
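
To make the second remark concrete, here is a minimal sketch of how the two-point segments are built. It assumes the question's table typed in as an inline DataFrame (a stand-in for however you load your data) and only needs numpy and pandas:

```python
import numpy as np
import pandas as pd

# Assumption: inline copy of the table from the question
df = pd.DataFrame({
    "X": [1, 2, 3, 4, 5, 6, 7, 8],
    "Y": [2, 1, 4, 3, 1, 5, 6, 8],
    "C": list("RRBRRGGB"),
})

# (n, 2) array with one x, y pair per row
points = np.c_[df["X"], df["Y"]]

# Put each point next to its successor, then reshape to (n-1, 2, 2):
# one entry per segment, each holding a start point and an end point
segments = np.c_[points[:-1], points[1:]].reshape(-1, 2, 2)

print(points.shape, segments.shape)
print(segments[0])
```

Segment i runs from row i to row i+1, so 8 points give 7 segments and the color of the last row has no segment to apply to.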

Working directly with the colors

from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection
import pandas as pd
import numpy as np

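# Read the question's table (needs network access and an HTML parser such as lxml)
df = pd.read_html('https://stackoverflow.com/questions/75695487')[0]
points = np.c_[df['X'], df['Y']]  # (n, 2) array of x, y pairs
# Pair each point with its successor: (n-1, 2, 2) array of two-point segments
segments = np.c_[points[:-1], points[1:]].reshape(-1, 2, 2)

# One color per segment, taken from the row where the segment starts
lc = LineCollection(segments, colors=df['C'].str.lower())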

fig, ax = plt.subplots()
ax.add_collection(lc)
ax.autoscale_view()
plt.show()

[Figure: multi-colored line from dataframe]
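
The third remark mentions setting the limits explicitly as an alternative to autoscale_view(). Here is a minimal sketch of that variant. It assumes the question's table typed in as an inline DataFrame instead of read from the page, and it trims the color list to one entry per segment:

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection

# Assumption: inline copy of the table from the question
df = pd.DataFrame({
    "X": [1, 2, 3, 4, 5, 6, 7, 8],
    "Y": [2, 1, 4, 3, 1, 5, 6, 8],
    "C": list("RRBRRGGB"),
})

points = np.c_[df["X"], df["Y"]]
segments = np.c_[points[:-1], points[1:]].reshape(-1, 2, 2)

# One lowercase color per segment; the last row's color has no segment
colors = df["C"].str.lower().tolist()[:-1]
lc = LineCollection(segments, colors=colors)

fig, ax = plt.subplots()
ax.add_collection(lc)

# Explicit limits taken from the data, instead of ax.autoscale_view()
ax.set_xlim(df["X"].min(), df["X"].max())
ax.set_ylim(df["Y"].min(), df["Y"].max())
```

Call plt.show() afterwards to display the figure. Limits taken straight from the data's min and max leave no margin around the line, so widen them a little if you want padding.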

Creating a colormap from the dataframe column

If you have a really large dataframe, you could create a listed colormap with all the colors. pd.Categorical will create both the list of colors and their internal numeric representation.
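
As a quick illustration of what pd.Categorical produces, here is a small sketch using the C column from the question, typed in as an inline Series (an assumption made so the snippet runs without loading the page):

```python
import pandas as pd

# Assumption: the C column from the question, typed in by hand
c = pd.Series(list("RRBRRGGB"))

cat = pd.Categorical(c)

# Sorted unique labels; lowercased, these become the colormap's colors
print(list(cat.categories))                   # ['B', 'G', 'R']
print(list(cat.categories.str.lower()))       # ['b', 'g', 'r']

# Integer code per row, indexing into the categories above
print(cat.codes.tolist())                     # [2, 2, 0, 2, 2, 1, 1, 0]
```

The codes are what gets passed as array= and the lowercased categories are what ListedColormap turns into colors, so code k ends up drawn in the k-th category's color.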

from matplotlib import pyplot as plt
from matplotlib.colors import ListedColormap
from matplotlib.collections import LineCollection
import pandas as pd
import numpy as np

df = pd.read_html('https://stackoverflow.com/questions/75695487')[0]
points = np.c_[df['X'], df['Y']]
segments = np.c_[points[:-1], points[1:]].reshape(-1, 2, 2)

df['C'] = pd.Categorical(df['C'])  # explicitly make categorical

lc = LineCollection(segments,
                    cmap=ListedColormap(df['C'].cat.categories.str.lower()),
                    array=df['C'].cat.codes)
fig, ax = plt.subplots()
ax.add_collection(lc)
ax.autoscale_view()
plt.show()
Answered By: JohanC