How to fill_between based on a condition
Question:
I have a df from 2002 to 2017 that looks like this:
Team,Finish.1_x,Finish.1_y,Win%Detroit,Win%Chicago,Date,GdpDetroit,GdpChicago,D_GDP_Change,C_GDP_Change,detroitdummy,chicagodummy
2002–03,1st,6th,0.61,0.366,2002-01-01,49650,54744,933.0,-27.0,1,0
2003–04,2nd,8th,0.659,0.28,2003-01-01,51101,55273,1451.0,529.0,1,1
2004–05,1st,2nd,0.659,0.573,2004-01-01,50935,56507,-166.0,1234.0,0,1
2005–06,1st,4th[k],0.78,0.5,2005-01-01,52028,57608,1093.0,1101.0,1,1
2006–07,1st,3rd,0.646,0.598,2006-01-01,50576,58717,-1452.0,1109.0,0,1
2007–08,1st,4th,0.72,0.402,2007-01-01,50450,59240,-126.0,523.0,0,1
2008–09,3rd,2nd,0.476,0.5,2008-01-01,47835,57197,-2615.0,-2043.0,0,0
2009–10,5th,3rd,0.329,0.5,2009-01-01,43030,54802,-4805.0,-2395.0,0,0
2010–11,4th,1st,0.366,0.756,2010-01-01,45735,55165,2705.0,363.0,1,1
2012–13,4th,2nd,0.354,0.549,2012-01-01,48469,57254,926.0,1463.0,1,1
2013–14,4th,2nd,0.354,0.585,2013-01-01,48708,56939,239.0,-315.0,1,0
2014–15,5th,2nd,0.39,0.61,2014-01-01,49594,57823,886.0,884.0,1,1
2015–16,3rd,4th,0.537,0.512,2015-01-01,50793,59285,1199.0,1462.0,1,1
2016–17,5th,4th,0.451,0.5,2016-01-01,51578,60191,785.0,906.0,1,1
2017–18,4th,5th,0.476,0.329,2017-01-01,52879,61170,1301.0,979.0,1,1
I am trying to have a chart like this one:
with the Date column on the x, the %Win on the y, and the fill_between conditional on the sign of GDP change: blue if GDP increased and red if it decreased.
The code I’m using:
fig, ax = plt.subplots()
ax.fill_between(df["Date"],np.max(df["Win%Detroit"]), where=(np.sign(df["D_GDP_Change"])<= 0), color='r', alpha=.1)
ax.fill_between(df["Date"],np.max(df["Win%Detroit"]), where=(np.sign(df["D_GDP_Change"])> 0), color='b', alpha=.1)
ax.plot(df["Date"],df["Win%Detroit"])
plt.show()
I cannot understand what I am doing wrong: the patches have empty spaces between them. Could you give me a hint?
I feel like the answer is here: conditional matplotlib fill_between for dataframe. Yet, I cannot figure out how to make the function sharper in this case.
Answers:
Your version of fill_between with where= only fills between consecutive dates when the condition is true for both dates.
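To see why this leaves gaps, here is a small sketch in plain Python that mimics that rule (fill between x[i] and x[i+1] only if the condition holds at both i and i+1). The helper name filled_intervals is made up for illustration and is not part of matplotlib; the values are the first eight D_GDP_Change entries from the question.

```python
# D_GDP_Change for 2002..2009, copied from the question's data.
years = list(range(2002, 2010))
gdp_change = [933.0, 1451.0, -166.0, 1093.0, -1452.0, -126.0, -2615.0, -4805.0]


def filled_intervals(mask):
    """Hypothetical helper: index pairs (i, i+1) whose two endpoints both satisfy the mask."""
    return [(i, i + 1) for i in range(len(mask) - 1) if mask[i] and mask[i + 1]]


increase = [v > 0 for v in gdp_change]   # condition used for the blue fill
decrease = [v <= 0 for v in gdp_change]  # condition used for the red fill

blue = filled_intervals(increase)
red = filled_intervals(decrease)

# Any interval whose endpoints disagree is covered by neither fill.
covered = set(blue) | set(red)
gaps = [
    (years[i], years[i + 1])
    for i in range(len(years) - 1)
    if (i, i + 1) not in covered
]
print("blue:", blue)
print("red:", red)
print("unfilled:", gaps)
```

Every interval where the sign of the GDP change flips between its two endpoints stays empty, which matches the empty spaces in the question's plot.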
To obtain a background filling depending on a condition, you could use a step function for filling, letting it step between 0 and the maximum:
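import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Random stand-in data: one row per year from 2002 to 2017 (16 rows)
df = pd.DataFrame({"Date": pd.date_range('20020101', '20170101', freq='YS'),
                   "Win%Detroit": np.random.uniform(0.3, 0.7, 16),
                   "D_GDP_Change": np.random.randint(-2000, 2000, 16)})
fig, ax = plt.subplots()
# Height of the background bands, slightly above the highest data point
max_val = np.max(df["Win%Detroit"]) * 1.05
# Each band steps up to max_val where its condition holds and down to 0 elsewhere;
# step='post' keeps the value constant from one date until the next
ax.fill_between(df["Date"], np.where(df["D_GDP_Change"] <= 0, max_val, 0), color='r', alpha=.1, step='post')
ax.fill_between(df["Date"], np.where(df["D_GDP_Change"] > 0, max_val, 0), color='b', alpha=.1, step='post')
ax.plot(df["Date"], df["Win%Detroit"])
ax.margins(x=0, y=0)  # remove padding so the bands reach the axes edges
plt.show()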
The same approach works with the dataframe from the post, after converting the 'Date' column to a date type (df['Date'] = pd.to_datetime(df['Date'])).