Generating future dataframe with annual growth rate
Question:
I am trying to generate annual data for a certain product when I have data for the base year and growth rate.
In the toy example, each product has different annual growth rate in efficiency by its ‘color’, and I want to generate yearly data until 2030.
Therefore, I have base year data (base_year) as follows:
year color shape efficiency
0 2018 red circle 50
1 2018 red square 30
2 2018 blue circle 100
3 2018 blue square 60
And each type of product’s growth rate (growthrate) as:
color rate
0 red 30
1 blue 20
Results I desire is:
year color shape efficiency
0 2018 red circle 50
1 2018 red square 30
2 2018 blue circle 100
3 2018 blue square 60
4 2019 red circle 65
5 2019 red square 39
6 2019 blue circle 120
7 2019 blue square 72
8 2020 red circle 84.5
... (until 2030)
The data used in the toy code is..
base_year = pd.DataFrame(data = {'year': [2018,2018,2018,2018],
'color': ['red', 'red', 'blue', 'blue'],
'shape' : ['circle', 'square', 'circle', 'square'],
'efficiency' : [50, 30, 100, 60]}, columns = ['year', 'color', 'shape', 'efficiency'])
growthrate = pd.DataFrame(data = {'color': ['red', 'blue'],
'rate' : [30, 20]}, columns = ['color', 'rate'])
I’ve been trying some approach using .loc, but it seems such approach is quite inefficient.
Any suggestions or hints would be appreciated. Thank you in advance!
Answers:
Here is one way to do this:
years = 2031 - 2018
df = (pd.concat([df.assign(year=df['year']+i,
efficiency=df['efficiency']*((df['rate']/100+1)**i))
for i, df in enumerate([base_year.merge(growthrate, on='color')] * years)])
.drop('rate', axis=1))
what if it is GDP, and you have one value for the base year and just one growth rate, and you want to see what future gap values will be at the growth rate?
I am trying to generate annual data for a certain product when I have data for the base year and growth rate.
In the toy example, each product has different annual growth rate in efficiency by its ‘color’, and I want to generate yearly data until 2030.
Therefore, I have base year data (base_year) as follows:
year color shape efficiency
0 2018 red circle 50
1 2018 red square 30
2 2018 blue circle 100
3 2018 blue square 60
And each type of product’s growth rate (growthrate) as:
color rate
0 red 30
1 blue 20
Results I desire is:
year color shape efficiency
0 2018 red circle 50
1 2018 red square 30
2 2018 blue circle 100
3 2018 blue square 60
4 2019 red circle 65
5 2019 red square 39
6 2019 blue circle 120
7 2019 blue square 72
8 2020 red circle 84.5
... (until 2030)
The data used in the toy code is..
base_year = pd.DataFrame(data = {'year': [2018,2018,2018,2018],
'color': ['red', 'red', 'blue', 'blue'],
'shape' : ['circle', 'square', 'circle', 'square'],
'efficiency' : [50, 30, 100, 60]}, columns = ['year', 'color', 'shape', 'efficiency'])
growthrate = pd.DataFrame(data = {'color': ['red', 'blue'],
'rate' : [30, 20]}, columns = ['color', 'rate'])
I’ve been trying some approach using .loc, but it seems such approach is quite inefficient.
Any suggestions or hints would be appreciated. Thank you in advance!
Here is one way to do this:
years = 2031 - 2018
df = (pd.concat([df.assign(year=df['year']+i,
efficiency=df['efficiency']*((df['rate']/100+1)**i))
for i, df in enumerate([base_year.merge(growthrate, on='color')] * years)])
.drop('rate', axis=1))
what if it is GDP, and you have one value for the base year and just one growth rate, and you want to see what future gap values will be at the growth rate?