Hide certain categorical elements from legend in Plotnine

Question:

In Plotnine, is it possible to hide certain legend elements?

mpg_select = mpg[mpg["manufacturer"].isin(pd.Series(["audi", "ford", "honda", "hyundai"]))]

I have selected only 4 manufacturers. But when I plot the data, I still see the manufacturers that are not in the data as elements for my legend.

(ggplot(mpg_select, aes(x="displ", y="cty"))
    + geom_jitter(aes(size="hwy", color="manufacturer"))
    + geom_smooth(aes(color="manufacturer"), method="lm", se=False)
    + labs(title="Bubble chart")
)

Plotnine result showing complete legends for manufacturer

How do I show only the manufacturers that I selected (audi, ford, honda, and hyundai) in my legend?

Asked By: jxu


Answers:

It’s because the manufacturer column is categorical, and filtering the rows does not remove its categories: the column still lists every manufacturer as a category. If you remove the unused categories from the column, the extra values will disappear from the legend.

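import pandas as pd
from plotnine import ggplot, aes, geom_jitter, geom_smooth, labs
from plotnine.data import mpg

desired_manufacturers = ['audi', 'ford', 'honda', 'hyundai']
# .copy() avoids pandas' SettingWithCopyWarning when a column is added below
mpg_select = mpg.loc[mpg['manufacturer'].isin(desired_manufacturers)].copy()
# Rebuild the column so it carries only the four selected categories
mpg_select['manufacturer_subset'] = pd.Categorical(mpg_select['manufacturer'],
                                                   categories=desired_manufacturers)

(ggplot(mpg_select, aes(x="displ", y="cty"))
    + geom_jitter(aes(size="hwy", color="manufacturer_subset"))
    + geom_smooth(aes(color="manufacturer_subset"), method="lm", se=False)
    + labs(title="Bubble chart")
)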

plot_with_manufacturer_subset
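
To check the cause without plotting anything, you can inspect the category levels directly. The following is a minimal sketch on a small made-up DataFrame (the values and the names `df`, `wanted` and `subset` are illustrative, not part of the mpg data); it assumes only pandas:

```python
import pandas as pd

# Toy stand-in for the mpg data; the values are made up for illustration.
df = pd.DataFrame({
    "manufacturer": pd.Categorical(["audi", "ford", "honda", "toyota", "audi"]),
    "cty": [18, 15, 28, 21, 20],
})

wanted = ["audi", "ford"]
subset = df.loc[df["manufacturer"].isin(wanted)].copy()

# Filtering drops the rows, but the column keeps all of its category levels.
levels_after_filter = list(subset["manufacturer"].cat.categories)

# Rebuilding the column with an explicit category list keeps only those levels.
subset["manufacturer_subset"] = pd.Categorical(subset["manufacturer"],
                                               categories=wanted)
levels_after_rebuild = list(subset["manufacturer_subset"].cat.categories)

print(levels_after_filter)
print(levels_after_rebuild)
```

The first list still contains the manufacturers that were filtered out, which is why they show up in the legend; the second contains only the selected ones.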

Answered By: cookesd

I had a similar issue and found that remove_unused_categories() does a cleaner job. You don’t need to create a new variable; it simply drops the categories that no longer appear in the column after filtering:

    from plotnine import ggplot, aes, geom_jitter, geom_smooth, labs
    from plotnine.data import mpg

    desired_manufacturers = ['audi', 'ford', 'honda', 'hyundai']
    # .copy() avoids pandas' SettingWithCopyWarning on the assignment below
    mpg_select = mpg.loc[mpg['manufacturer'].isin(desired_manufacturers)].copy()

    # Drop the categories that no longer occur in the filtered column
    mpg_select["manufacturer"] = mpg_select["manufacturer"].cat.remove_unused_categories()

    (ggplot(mpg_select, aes(x="displ", y="cty"))
        + geom_jitter(aes(size="hwy", color="manufacturer"))
        + geom_smooth(aes(color="manufacturer"), method="lm", se=False)
        + labs(title="Bubble chart")
    )
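
If you want to confirm what remove_unused_categories() does before plotting, here is a small sketch on made-up data (the DataFrame and the names `before` and `after` are illustrative only); it assumes only pandas:

```python
import pandas as pd

# Toy categorical column; the values are made up for illustration.
df = pd.DataFrame({
    "manufacturer": pd.Categorical(["audi", "ford", "honda", "toyota", "audi"]),
})
subset = df.loc[df["manufacturer"].isin(["audi", "honda"])].copy()

# Category levels before and after dropping the unused ones.
before = list(subset["manufacturer"].cat.categories)
subset["manufacturer"] = subset["manufacturer"].cat.remove_unused_categories()
after = list(subset["manufacturer"].cat.categories)

print(before)
print(after)
```

The column keeps its name and values; only the list of categories shrinks to the ones still present.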


Answered By: Germán Mandrini