How to fill_between based on a condition

Question:

I have a df from 2002 to 2017 that looks like this:

link to gdrive with .csv

Team,Finish.1_x,Finish.1_y,Win%Detroit,Win%Chicago,Date,GdpDetroit,GdpChicago,D_GDP_Change,C_GDP_Change,detroitdummy,chicagodummy
2002–03,1st,6th,0.61,0.366,2002-01-01,49650,54744,933.0,-27.0,1,0
2003–04,2nd,8th,0.659,0.28,2003-01-01,51101,55273,1451.0,529.0,1,1
2004–05,1st,2nd,0.659,0.573,2004-01-01,50935,56507,-166.0,1234.0,0,1
2005–06,1st,4th[k],0.78,0.5,2005-01-01,52028,57608,1093.0,1101.0,1,1
2006–07,1st,3rd,0.646,0.598,2006-01-01,50576,58717,-1452.0,1109.0,0,1
2007–08,1st,4th,0.72,0.402,2007-01-01,50450,59240,-126.0,523.0,0,1
2008–09,3rd,2nd,0.476,0.5,2008-01-01,47835,57197,-2615.0,-2043.0,0,0
2009–10,5th,3rd,0.329,0.5,2009-01-01,43030,54802,-4805.0,-2395.0,0,0
2010–11,4th,1st,0.366,0.756,2010-01-01,45735,55165,2705.0,363.0,1,1
2012–13,4th,2nd,0.354,0.549,2012-01-01,48469,57254,926.0,1463.0,1,1
2013–14,4th,2nd,0.354,0.585,2013-01-01,48708,56939,239.0,-315.0,1,0
2014–15,5th,2nd,0.39,0.61,2014-01-01,49594,57823,886.0,884.0,1,1
2015–16,3rd,4th,0.537,0.512,2015-01-01,50793,59285,1199.0,1462.0,1,1
2016–17,5th,4th,0.451,0.5,2016-01-01,51578,60191,785.0,906.0,1,1
2017–18,4th,5th,0.476,0.329,2017-01-01,52879,61170,1301.0,979.0,1,1

I am trying to have a chart like this one:

[image: the desired plot]

with the Date column on the x-axis, Win% on the y-axis, and the fill_between shading conditional on the sign of the GDP change: blue if GDP increased and red if it decreased.
The code I’m using:

fig, ax = plt.subplots()
ax.fill_between(df["Date"],np.max(df["Win%Detroit"]), where=(np.sign(df["D_GDP_Change"])<= 0), color='r', alpha=.1)
ax.fill_between(df["Date"],np.max(df["Win%Detroit"]), where=(np.sign(df["D_GDP_Change"])> 0), color='b', alpha=.1)
ax.plot(df["Date"],df["Win%Detroit"])
plt.show()

I cannot understand what I am doing wrong: the filled patches have empty gaps between them. Could you give me a hint?

I feel like the answer is here: conditional matplotlib fill_between for dataframe. Yet, I cannot figure out how to make the function sharper in this case.

Asked By: KArrow'sBest


Answers:

Your version of fill_between with where= only fills the interval between two consecutive dates when the condition is true at both dates. Every interval where the sign of the GDP change flips is therefore left empty.
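To see which intervals end up empty, the sketch below applies that rule by hand to the D_GDP_Change values from the question. It relies on Matplotlib's documented behavior for where= (fill between x[i] and x[i+1] only if where[i] and where[i+1]); the helper name filled_intervals is hypothetical and only used for illustration.

```python
import numpy as np

# D_GDP_Change values copied from the question's table (15 rows).
change = np.array([933, 1451, -166, 1093, -1452, -126, -2615, -4805,
                   2705, 926, 239, 886, 1199, 785, 1301])


def filled_intervals(mask):
    """Return one boolean per interval [i, i+1]: True if where= would fill it."""
    mask = np.asarray(mask, dtype=bool)
    # An interval is filled only when the condition holds at both of its ends.
    return mask[:-1] & mask[1:]


blue = filled_intervals(change > 0)   # intervals shaded by the blue call
red = filled_intervals(change <= 0)   # intervals shaded by the red call
gaps = ~(blue | red)                  # intervals shaded by neither call

print("unfilled interval indices:", np.flatnonzero(gaps))
```

Each reported index i refers to the span between row i and row i+1, which is where the sign changes and neither call draws anything.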

To obtain a background fill that depends on a condition, you could fill under a step function that switches between 0 and the maximum:

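import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Random stand-in data with the same columns as the question's dataframe
df = pd.DataFrame({"Date": pd.date_range('20020101', '20170101', freq='YS'),
                   "Win%Detroit": np.random.uniform(0.3, 0.7, 16),
                   "D_GDP_Change": np.random.randint(-2000, 2000, 16)})

fig, ax = plt.subplots()
# Height of the background bands: slightly above the highest data point
max_val = np.max(df["Win%Detroit"]) * 1.05
# Red where GDP did not increase, blue where it did; step='post' holds each value until the next date
ax.fill_between(df["Date"], np.where(df["D_GDP_Change"] <= 0, max_val, 0), color='r', alpha=.1, step='post')
ax.fill_between(df["Date"], np.where(df["D_GDP_Change"] > 0, max_val, 0), color='b', alpha=.1, step='post')
ax.plot(df["Date"], df["Win%Detroit"])
ax.margins(x=0, y=0)  # remove the padding so the bands reach the edges of the axes
plt.show()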

[image: fill_between using step function]

With the dataframe from the post, after converting the ‘Date’ column to a datetime type (df['Date'] = pd.to_datetime(df['Date'])):

[image: fill_between using step function with original data]
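As a rough sketch of that last step, the same approach can be applied to the data from the question. Only three columns are copied from the posted table and inlined as a string so the example is self-contained; the variable CSV and the function plot_shaded are hypothetical names, and reading the real .csv file with pd.read_csv would work the same way.

```python
import io

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Three columns copied from the question's table (there is no 2011 row there either).
CSV = """Date,Win%Detroit,D_GDP_Change
2002-01-01,0.61,933.0
2003-01-01,0.659,1451.0
2004-01-01,0.659,-166.0
2005-01-01,0.78,1093.0
2006-01-01,0.646,-1452.0
2007-01-01,0.72,-126.0
2008-01-01,0.476,-2615.0
2009-01-01,0.329,-4805.0
2010-01-01,0.366,2705.0
2012-01-01,0.354,926.0
2013-01-01,0.354,239.0
2014-01-01,0.39,886.0
2015-01-01,0.537,1199.0
2016-01-01,0.451,785.0
2017-01-01,0.476,1301.0
"""

df = pd.read_csv(io.StringIO(CSV))
# read_csv leaves Date as strings; convert so Matplotlib uses a real time axis.
df["Date"] = pd.to_datetime(df["Date"])


def plot_shaded(df):
    """Plot Win%Detroit over a red/blue background keyed to the sign of D_GDP_Change."""
    fig, ax = plt.subplots()
    max_val = df["Win%Detroit"].max() * 1.05
    not_increasing = df["D_GDP_Change"] <= 0
    # step='post' keeps each year's band until the next date in the data.
    ax.fill_between(df["Date"], np.where(not_increasing, max_val, 0),
                    color="r", alpha=0.1, step="post")
    ax.fill_between(df["Date"], np.where(~not_increasing, max_val, 0),
                    color="b", alpha=0.1, step="post")
    ax.plot(df["Date"], df["Win%Detroit"])
    ax.margins(x=0, y=0)
    return fig, ax


fig, ax = plot_shaded(df)
# plt.show()  # uncomment when running interactively
```

Because of step='post', each band extends from its date to the next one, so the 2010 band also covers the missing 2011 and the final 2017 point gets no band of its own.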

Answered By: JohanC