Fan plot in Python from quantiles

Question:

I want to visualize my data in a plot similar to this one, where the shading runs from darkest around the 50th percentile to lightest at the 10th percentile (bottom) and the 90th percentile (top).

I have calculated the quantiles for my time series and have them in a DataFrame.

I want something that looks like this image:

PV daily production

I can already make a chart like this one, but it is not the same:

What I have already made

My DataFrame looks like this:

Time | pct0.1 | pct0.2 | pct0.3 | pct0.4 | pct0.5 | pct0.6 | pct0.7 | pct0.8 | pct0.9
01:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
02:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
03:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
04:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
05:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
06:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
07:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
08:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.4 | 1.2
09:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 46.2 | 138.6
10:00 | 246.4 | 340.8 | 445.0 | 559.0 | 673.0 | 737.8 | 802.6 | 843.2 | 859.6
11:00 | 429.8 | 620.6 | 777.8 | 901.4 | 1025.0 | 1153.8 | 1282.6 | 1362.8 | 1394.4
12:00 | 559.2 | 840.4 | 1025.8 | 1115.4 | 1205.0 | 1367.8 | 1530.6 | 1630.4 | 1667.2
13:00 | 606.4 | 968.8 | 1154.8 | 1164.4 | 1174.0 | 1351.2 | 1528.4 | 1648.0 | 1710.0
14:00 | 514.4 | 856.8 | 1031.8 | 1039.4 | 1047.0 | 1232.2 | 1417.4 | 1541.2 | 1603.6
15:00 | 386.0 | 620.0 | 760.4 | 807.2 | 854.0 | 1026.8 | 1199.6 | 1309.0 | 1355.0
16:00 | 259.0 | 331.0 | 391.4 | 440.2 | 489.0 | 621.4 | 753.8 | 836.6 | 869.8
17:00 | 87.2 | 100.4 | 110.2 | 116.6 | 123.0 | 174.2 | 225.4 | 252.6 | 255.8
18:00 | 0.4 | 0.8 | 1.6 | 2.8 | 4.0 | 4.0 | 4.0 | 4.0 | 4.0
19:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
20:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
21:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
22:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
23:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
00:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0

Thanks in advance for any help

Asked By: cebep27


Answers:

You could use ax.fill_between() to color the ranges between the quantiles:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from io import StringIO

data_str = '''
Time | pct0.1 | pct0.2 | pct0.3 | pct0.4 | pct0.5 | pct0.6 | pct0.7 | pct0.8 | pct0.9
01:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
02:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
03:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
04:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
05:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
06:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
07:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
08:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.4 | 1.2
09:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 46.2 | 138.6
10:00 | 246.4 | 340.8 | 445.0 | 559.0 | 673.0 | 737.8 | 802.6 | 843.2 | 859.6
11:00 | 429.8 | 620.6 | 777.8 | 901.4 | 1025.0 | 1153.8 | 1282.6 | 1362.8 | 1394.4
12:00 | 559.2 | 840.4 | 1025.8 | 1115.4 | 1205.0 | 1367.8 | 1530.6 | 1630.4 | 1667.2
13:00 | 606.4 | 968.8 | 1154.8 | 1164.4 | 1174.0 | 1351.2 | 1528.4 | 1648.0 | 1710.0
14:00 | 514.4 | 856.8 | 1031.8 | 1039.4 | 1047.0 | 1232.2 | 1417.4 | 1541.2 | 1603.6
15:00 | 386.0 | 620.0 | 760.4 | 807.2 | 854.0 | 1026.8 | 1199.6 | 1309.0 | 1355.0
16:00 | 259.0 | 331.0 | 391.4 | 440.2 | 489.0 | 621.4 | 753.8 | 836.6 | 869.8
17:00 | 87.2 | 100.4 | 110.2 | 116.6 | 123.0 | 174.2 | 225.4 | 252.6 | 255.8
18:00 | 0.4 | 0.8 | 1.6 | 2.8 | 4.0 | 4.0 | 4.0 | 4.0 | 4.0
19:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
20:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
21:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
22:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
23:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
00:00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0'''
# The separator is a regex: a literal pipe with optional surrounding whitespace
df = pd.read_csv(StringIO(data_str), sep=r'\s*\|\s*', engine='python')

fig, ax = plt.subplots(figsize=(12, 4))
xs = np.arange(len(df))
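# Light shades for the outer bands, darker shades toward the median
colors = plt.cm.Reds(np.linspace(0.3, 0.8, 4))
# Pair the quantiles symmetrically: 0.1-0.9, 0.2-0.8, 0.3-0.7, 0.4-0.6
lowers = [f'pct0.{i}' for i in range(1, 5)]
uppers = [f'pct0.{i}' for i in range(9, 5, -1)]
for lower, upper, color in zip(lowers, uppers, colors):
    ax.fill_between(xs, df[lower], df[upper], color=color, label=lower + '-' + upper)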
ax.plot(xs, df['pct0.5'], color='black', lw=2, label='Median')
ax.set_xticks(xs)
ax.set_xticklabels(df['Time'])
ax.legend()
ax.margins(x=0)
ax.set_ylim(bottom=0)
for sp in ['top', 'right']:
    ax.spines[sp].set_visible(False)
plt.tight_layout()
plt.show()
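
The loop above hard-codes the four column pairs. As a possible generalization (a sketch, not part of the original answer), the pairs can be derived from the column names. The helper names quantile_pairs and fan_plot are made up for illustration, and the columns are assumed to be named like pct0.1 ... pct0.9:

```python
import re

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def quantile_pairs(columns, prefix="pct"):
    """Pair quantile columns symmetrically around the median, outermost first.

    Assumes column names such as 'pct0.1' ... 'pct0.9'; other columns are ignored.
    """
    levels = {}
    for col in columns:
        match = re.fullmatch(re.escape(prefix) + r"(\d*\.?\d+)", str(col))
        if match:
            levels[float(match.group(1))] = col
    ordered = [levels[q] for q in sorted(levels)]
    half = len(ordered) // 2
    # Lowest with highest, second lowest with second highest, and so on.
    return list(zip(ordered[:half], reversed(ordered[-half:])))


def fan_plot(df, label_col=None, median_col="pct0.5", cmap_name="Reds", ax=None):
    """Draw nested quantile bands from a wide table of quantile columns."""
    if ax is None:
        _, ax = plt.subplots(figsize=(12, 4))
    xs = np.arange(len(df))
    pairs = quantile_pairs(df.columns)
    # Light shades for the outer bands, darker shades toward the median.
    colors = plt.get_cmap(cmap_name)(np.linspace(0.3, 0.8, len(pairs)))
    for (lower, upper), color in zip(pairs, colors):
        ax.fill_between(xs, df[lower], df[upper], color=color, label=f"{lower}-{upper}")
    if median_col in df.columns:
        ax.plot(xs, df[median_col], color="black", lw=2, label="Median")
    if label_col is not None:
        ax.set_xticks(xs)
        ax.set_xticklabels(df[label_col])
    ax.legend()
    return ax


# Three rows and five quantile columns taken from the table in the question.
demo = pd.DataFrame({
    "Time": ["10:00", "11:00", "12:00"],
    "pct0.1": [246.4, 429.8, 559.2],
    "pct0.3": [445.0, 777.8, 1025.8],
    "pct0.5": [673.0, 1025.0, 1205.0],
    "pct0.7": [802.6, 1282.6, 1530.6],
    "pct0.9": [859.6, 1394.4, 1667.2],
})
ax = fan_plot(demo, label_col="Time")
```

Call plt.show() afterwards to display the figure. Because the pairs are built from whatever quantile columns are present, the same function should work with a different number of quantiles, as long as they are symmetric around the median.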


Answered By: JohanC

If you have the raw dataset (in a pandas DataFrame or Series), you can use seaborn, which computes the percentile intervals for you. It's as simple as:

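import seaborn as sns

# Each call draws the median line plus one translucent band; the bands overlap,
# so the shading gets darker toward the median.
for interval in [10, 20, 30, 40, 50, 60, 70, 80, 90]:
    sns.lineplot(df, estimator="median", errorbar=("pi", interval), color="tab:red")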

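The "pi" in errorbar=("pi", interval) stands for percentile interval. The number is the width of the interval, centered on the median, so ("pi", 80) spans the 10th to the 90th percentile. You can read more about it in the seaborn tutorial "Statistical estimation and error bars".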

This might be useful if you don’t want to calculate the percentiles by hand.
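
For comparison, computing the percentiles by hand is also fairly short in pandas. This is a sketch under assumptions: raw is taken to be an hourly Series with a DatetimeIndex, and the values here are synthetic. It produces a table with the same pct0.1 ... pct0.9 columns as in the question:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for raw readings: 30 days of hourly values (illustration only).
rng = np.random.default_rng(0)
index = pd.Timestamp("2023-06-01") + pd.to_timedelta(np.arange(30 * 24), unit="h")
hours = index.hour.to_numpy()
profile = 1700 * np.exp(-((hours - 13) ** 2) / 8)  # bell-shaped daily curve
raw = pd.Series(profile * rng.uniform(0.3, 1.0, size=len(index)),
                index=index, name="power")

# One row per hour of day, one column per quantile level.
levels = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
quantiles = raw.groupby(raw.index.hour).quantile(levels).unstack()
quantiles.columns = [f"pct{q:.1f}" for q in quantiles.columns]
quantiles.index.name = "hour"
```

The resulting quantiles table has 24 rows and can be passed to a fill_between-based plot like the one in the other answer.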

Answered By: Ewout ter Hoeven