bokeh.charts is gone – what library can do interactive, colored scatterplots?

Question:

One of the visualizations I find myself doing most often is the following: I have x,y data, labeled by category. I need to plot this in a scatterplot, automatically coloring the dots according to the label and generating a legend. The visualization should also be interactive (zoomable, showing metadata when hovering over points, etc.).

This is the perfect example of what I am looking for, something that existed in the now-deprecated bokeh.charts library:

[image: example plot made with bokeh.charts]

I know I can do this non-interactively with seaborn:

fig = sns.lmplot(x="Length", y="Boot", hue="class", fit_reg=False, data=df)

I can plot interactively with bokeh, but only on a low level, without colors or legend:

p = Scatter(df, x="Length", y="Boot", plot_width=800, plot_height=600, 
    tooltips=TOOLTIPS, title="Cars")

I also know there are various workarounds, such as manually defining color palettes. However, this is ridiculously convoluted for something that used to be a simple one-liner (and still is in R). The replacement, HoloViews, does not seem to support coloring in scatterplots.

So, any recommendations for a Python package that supports this out of the box in a one-liner, instead of writing this code manually at a low level?

Asked By: Thomas

Answers:

The HoloViews library is a good replacement for the bokeh.charts API. However, since its API is quite unfamiliar to people who are used to imperative plotting APIs, we have recently released a new library called hvPlot. It tries to mirror the pandas plotting API closely while providing the interactivity afforded by bokeh as well as some of the more advanced features of HoloViews (e.g. automatic faceting, widgets, and datashader integration). To recreate the plot from above, you would do something like this:

import bokeh.sampledata.autompg as mpg
import hvplot.pandas  # adds hvplot method to pandas objects

myPlot = mpg.autompg.hvplot.scatter(
        x='displ', y='hp', by='cyl',
        fields={'hp': 'Horsepower', 'displ': 'Displacement'},
        title='HP vs. DISPL (shaded by CYL)'
)
hvplot.show(myPlot) # view the plot in a browser (hvplot doesn't support modal dialogs)

Note that you will be able to replace the fields argument with more explicit xlabel and ylabel arguments in the next release.

Answered By: philippjfr

This is pretty trivial to achieve in modern versions of Bokeh:

from bokeh.plotting import figure, show
from bokeh.sampledata.iris import flowers as df
from bokeh.transform import factor_cmap

SPECIES = ['setosa', 'versicolor', 'virginica']

p = figure(tooltips="species: @species")
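# legend_field builds one legend entry per value of the column; it replaces the
# older legend= keyword, which has been deprecated since Bokeh 1.4
p.scatter("petal_length", "sepal_width", source=df, legend_field="species", alpha=0.5,
          size=12, color=factor_cmap('species', 'Category10_3', SPECIES))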

show(p)
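
As a rough mental model of what the categorical color mapping in the snippet above does, here is a plain-Python sketch. It is not Bokeh's implementation (Bokeh applies the mapping in the browser); the function name map_factors_to_colors is hypothetical, the palette tuple is copied from Bokeh's Category10_3, and the gray fallback mirrors the default nan_color of factor_cmap.

```python
# The three colors of Bokeh's Category10_3 palette, in order.
CATEGORY10_3 = ("#1f77b4", "#ff7f0e", "#2ca02c")


def map_factors_to_colors(values, factors, palette, nan_color="gray"):
    """Return one color per value: the palette entry at the factor's index.

    Values that are not listed in factors get nan_color.
    """
    lookup = dict(zip(factors, palette))
    return [lookup.get(value, nan_color) for value in values]


species_column = ["setosa", "virginica", "versicolor", "unknown"]
colors = map_factors_to_colors(
    species_column,
    factors=["setosa", "versicolor", "virginica"],
    palette=CATEGORY10_3,
)
print(colors)  # ['#1f77b4', '#2ca02c', '#ff7f0e', 'gray']
```

The order of the factors list decides which category gets which color, which is why SPECIES is spelled out explicitly in the Bokeh snippet.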

Answered By: bigreddot