In seaborn, how can I group by a variable without using the "hue" argument?

Question:

In seaborn, is it possible to group observations based on a column without using the hue argument?

For example, how could I get these two lines to show up in the same colour, but as separate lines?

[plot: two line segments in different colours, one per group]

Code for generating this is below.

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame(
    {
        'group': ["group01", "group01", "group02", "group02"],
        'x': [1, 2, 3, 5],
        'y': [2, 4, 3, 5]
    }
)
sns.lineplot(df, x='x', y='y', hue='group')
plt.show()

This is straightforward in R’s ggplot: map the grouping variable to the group aesthetic rather than to colour. For example, see this.

The reason I want to do this is that I want to show multiple overlaid plots all in the same colour. This helps to show variability across different datasets. The different colours that I would get with seaborn’s hue are unnecessary and distracting, especially when there would be dozens of them. Here is the sort of plot I want to create:

[image]

Asked By: Nayef

Answers:

seaborn.lineplot has a units parameter, which seems to be equivalent to ggplot’s group:

units: vector or key in data

Grouping variable identifying sampling units. When used, a separate line will be drawn for each unit with appropriate semantics, but no legend entry will be added. Useful for showing distribution of experimental replicates when exact identities are not needed.

sns.lineplot(df, x='x', y='y', units='group')

Output:

[plot: two line segments in the same colour, no legend]

Combining units and hue in a more complex example:

df = pd.DataFrame(
    {
        'group': ["group01", "group01", "group02", "group02", "group01", "group01"],
        'group2': ['A', 'A', 'A', 'A', 'B', 'B'],
        'x': [1, 2, 3, 5, 2, 4],
        'y': [2, 4, 3, 5, 3, 2]
    }
)
sns.lineplot(df, x='x', y='y', units='group', hue='group2')

Output:

[plot: three line segments coloured by group2, two for A (one per group) and one for B]

Answered By: mozway