In seaborn, how can I group by a variable without using the "hue" argument?
Question:
In seaborn, is it possible to group observations based on a column without using the hue argument?
For example, how could I get these two lines to show up in the same colour, but as separate lines?
Code for generating this is below.
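import matplotlib.pyplot as plt  # needed for plt.show() below
import pandas as pd
import seaborn as sns

df = pd.DataFrame(
    {
        'group': ["group01", "group01", "group02", "group02"],
        'x': [1, 2, 3, 5],
        'y': [2, 4, 3, 5]
    }
)
# hue draws one line per group, but gives each line its own colour
sns.lineplot(df, x='x', y='y', hue='group')
plt.show()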
This is straightforward to do in R’s ggplot, by mapping the group variable to group, rather than to colour. For example, see this.
The reason I want to do this is that I want to show multiple overlaid plots all in the same colour. This helps to show variability across different datasets. The different colours that I would get with seaborn’s hue are unnecessary and distracting, especially when there would be dozens of them. Here is the sort of plot I want to create:
Answers:
seaborn.lineplot has a units parameter, which seems to be equivalent to ggplot’s group:
units: vector or key in data
Grouping variable identifying sampling units. When used, a separate line will be drawn for each unit with appropriate semantics, but no legend entry will be added. Useful for showing distribution of experimental replicates when exact identities are not needed.
sns.lineplot(df, x='x', y='y', units='group')
Output:
Combining units and hue in a more complex example:
df = pd.DataFrame(
    {
        'group': ["group01", "group01", "group02", "group02", "group01", "group01"],
        'group2': ['A', 'A', 'A', 'A', 'B', 'B'],
        'x': [1, 2, 3, 5, 2, 4],
        'y': [2, 4, 3, 5, 3, 2]
    }
)
# units separates the lines; hue colours them by group2
sns.lineplot(df, x='x', y='y', units='group', hue='group2')
Output: