Python – Line Chart Plotly – Is there a simple way to plot by average if a single date has multiple data points?

Question:

I’m plotting sales data and can’t find a simple way to plot by average if a single date has multiple sales.

For example, if 12/10/2022 has 3 data points (100, 100 and 400), my current line graph passes through the data point 100 (as this is the middle of the three), but I’d like it to pass through the average of all sales on that day: (100 + 100 + 400) / 3 = 200.

I can write a function to group the sales by date and average each group, but I’m wondering if this functionality is built into Plotly? There’s also the option of using Seaborn, but this requires Pandas and I’d like to find a lighter-weight solution.

I’ve googled, checked Stack Overflow and checked the Plotly docs but can’t seem to find a simple way of achieving this.

Edit:
Here’s a copy of my code, the data and the graphed result.

    from datetime import date
    import plotly.express as px
    import plotly.graph_objects as go

    x = [date(2021, 8, 9), date(2021, 7, 5), date(2021, 7, 5),
         date(2021, 7, 5), date(2021, 6, 7), date(2021, 6, 7),
         date(2021, 6, 7), date(2021, 6, 7)]

    y = [317, 286, 269, 294, 286, 274, 323, 286]

    fig1 = px.scatter(
        x=x,
        y=y,  
    )

    fig2 = px.line(
        x=x,
        y=y,  
    )

    fig3 = go.Figure(data=fig1.data + fig2.data)

    fig3.update_layout(autotypenumbers='convert types')

    chart = fig3.to_html()
    context = {"chart": chart, "sales_data_list": sales_data_list}

    return render(request, "index.html", context) 


Asked By: s_m_lima


Answers:

Plotly supports aggregations through transforms. This requires building the figure as a dictionary and rendering it with plotly.io (instead of plotly.graph_objects or plotly.express). Note that transforms are deprecated in plotly v5 and will be removed in a future version, so once that happens this solution will only work with a pinned older version of plotly.

Following the example from the plotly documentation, you can create a figure dictionary with transforms as a key in the trace, then run pio.show(fig_dict):

import plotly.io as pio
from datetime import datetime

# sample data (the 2021-06-07 date appears 4 times so that x and y both have 8 entries)
x = [datetime(2021,8,9)] + [datetime(2021,7,5)]*3 + [datetime(2021,6,7)]*4
y = [317,286,269,294,286,274,323,286]

data = [dict(
    type='scatter',
    x=x,
    y=y,
    mode='markers+lines',
    transforms=[dict(
        type='aggregate',
        groups=x,
        aggregations=[dict(
            target='y', 
            func='avg', 
            enabled=True
            )]
        )]
)]

fig_dict = dict(data=data, layout=dict(autotypenumbers='convert types'))
pio.show(fig_dict, validate=False)
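
If depending on deprecated transforms is a concern, the same per-date averaging can be done in plain Python before plotting, without Pandas. The following is only a minimal sketch using the standard library; the helper name daily_average is hypothetical and not part of Plotly:

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def daily_average(dates, values):
    """Group values by date and return (sorted dates, mean value per date).

    Hypothetical helper for illustration; not a Plotly function.
    """
    groups = defaultdict(list)
    for d, v in zip(dates, values):
        groups[d].append(v)
    days = sorted(groups)  # chronological order for the line chart
    return days, [mean(groups[d]) for d in days]


# same sample data as in the question
x = [date(2021, 8, 9)] + [date(2021, 7, 5)] * 3 + [date(2021, 6, 7)] * 4
y = [317, 286, 269, 294, 286, 274, 323, 286]

avg_x, avg_y = daily_average(x, y)
```

The averaged lists can then be passed to px.line(x=avg_x, y=avg_y) for the line, while the scatter trace keeps the raw x and y so every individual sale stays visible.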


Answered By: Derek O