How to count the number of quarters between two dates
Question:
Question 1:
How would you count the number of quarters between the starting date (base_date
variable) and the y
column in the pandas dataframe?
from dateutil.relativedelta import relativedelta
import pandas as pd
import numpy as np
base_date=pd.to_datetime('07/01/2019')
fake_data=pd.DataFrame([(x,pd.to_datetime('04/01/2020')+relativedelta(months=y)) for x in list(range(0,100)) for y in list(range(0,100))],columns=['x','y'])
fake_data['z']=fake_data['x']*np.random.uniform(low=1,high=1000)
UPDATE: one option seems to work for question1:
fake_data['month_diff']=(fake_data['y'].dt.year-base_date.year)*12+fake_data['y'].dt.month-base_date.month
fake_data['quarter']=(fake_data['month_diff']/3).astype(int)
Question 2:
If the x
and y
variables are now set to a pandas dataframe multi-level index. How would you perform the same task as in question 1?
fake_data.set_index(['x','y'],drop=True,inplace=True)
Answers:
Question 1:
A year contains 4 quarters. A simple way to count the difference of quarters between to dates is to convert them to year * 4 + quarter and use the difference:
fake_data.y.dt.year * 4 + fake_data.y.dt.quarter - (base_date.year * 4 + base_date.quarter)
Question 2:
The previous method still applies, but on index.levels[1]
:
fake_data.index.levels[1].year * 4 + fake_data.index.levels[1].quarter - (base_date.year * 4 + base_date.quarter)
Simply in first question you get a plain Series, while here it is an Index…
This is a pretty neat way to use pandas built-in period differencing:
import pandas as pd
t = pd.to_datetime('2025Q4').to_period(freq='Q') - pd.to_datetime('1850Q2').to_period(freq='Q')
print(t.n) # -> 702
You could extend this easily to your particular application.
Question 1:
How would you count the number of quarters between the starting date (base_date
variable) and the y
column in the pandas dataframe?
from dateutil.relativedelta import relativedelta
import pandas as pd
import numpy as np
base_date=pd.to_datetime('07/01/2019')
fake_data=pd.DataFrame([(x,pd.to_datetime('04/01/2020')+relativedelta(months=y)) for x in list(range(0,100)) for y in list(range(0,100))],columns=['x','y'])
fake_data['z']=fake_data['x']*np.random.uniform(low=1,high=1000)
UPDATE: one option seems to work for question1:
fake_data['month_diff']=(fake_data['y'].dt.year-base_date.year)*12+fake_data['y'].dt.month-base_date.month
fake_data['quarter']=(fake_data['month_diff']/3).astype(int)
Question 2:
If the x
and y
variables are now set to a pandas dataframe multi-level index. How would you perform the same task as in question 1?
fake_data.set_index(['x','y'],drop=True,inplace=True)
Question 1:
A year contains 4 quarters. A simple way to count the difference of quarters between to dates is to convert them to year * 4 + quarter and use the difference:
fake_data.y.dt.year * 4 + fake_data.y.dt.quarter - (base_date.year * 4 + base_date.quarter)
Question 2:
The previous method still applies, but on index.levels[1]
:
fake_data.index.levels[1].year * 4 + fake_data.index.levels[1].quarter - (base_date.year * 4 + base_date.quarter)
Simply in first question you get a plain Series, while here it is an Index…
This is a pretty neat way to use pandas built-in period differencing:
import pandas as pd
t = pd.to_datetime('2025Q4').to_period(freq='Q') - pd.to_datetime('1850Q2').to_period(freq='Q')
print(t.n) # -> 702
You could extend this easily to your particular application.