How to calculate a pandas column based on the previous value in the same column that is calculated?
Question:
I want to achieve the following table, where the values in "md" need to be calculated:
   msotc   md
0      0    0
1      1    1
2      2    3
3      3    7
4      4   15
5      5   31
6      6   63
7      7  127
Given:
- The total number of rows is based on a given value (msotc + 1)
- The first value in column "md" needs to be 0
- The values for row[1] to row[-1] are calculated with the formula: (prev_md * soss) + sopd
Solutions (I think):
- Create a new column with a formula
- Create an empty column "md" with the value 0 on index[0] and calculate the other rows
import pandas as pd
import numpy as np

msotc = 7
sopd = 1  # (= not fixed, could be e.g. 0.5)
soss = 2  # (= not fixed, could be e.g. 1.05)

arr = [np.nan] * (msotc + 1)  # np.nan: the np.NaN alias was removed in NumPy 2.0
arr[0] = 0
data = {
    "msotc": range(0, msotc + 1, 1),
    "md": arr
}
df = pd.DataFrame(
    data=data
)
# df["md"] = (df["md"].shift(1) * soss) + sopd <- This doesn't work
Answers:
This should work fine. It is quite easy to understand, just a simple loop.
arr = [0] * (msotc + 1)
# start at 1 so that arr[0] keeps its initial value of 0
for i in range(1, msotc + 1):
    arr[i] = (arr[i - 1] * soss) + sopd
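The loop above only fills a list; a minimal sketch of how it could be wrapped up and turned into the requested DataFrame follows. The function name `build_md` and its `md0` parameter are hypothetical, introduced here only for illustration.

```python
import pandas as pd


def build_md(msotc, soss, sopd, md0=0):
    """Return the md values for rows 0..msotc (hypothetical helper).

    Each value is computed from the previous one:
    md[i] = md[i - 1] * soss + sopd, starting from md[0] = md0.
    """
    md = [md0]
    for _ in range(msotc):
        md.append(md[-1] * soss + sopd)
    return md


msotc, sopd, soss = 7, 1, 2
df = pd.DataFrame({
    "msotc": range(msotc + 1),
    "md": build_md(msotc, soss, sopd),
})
print(df)
```

Because the list is built before the DataFrame exists, every row sees the already-computed previous value, which is what the `shift(1)` attempt in the question cannot do.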
Try this:
import pandas as pd
msotc = 7
sopd = 1
soss = 2
msotc_vals = []
arr = [0]
for val in range(msotc + 1):
    msotc_vals += [val]
    # appends one value too many overall; the extra one is dropped below with arr[:-1]
    arr += [arr[-1] * soss + sopd]
data = {"msotc": msotc_vals, "md": arr[:-1]}
df = pd.DataFrame(data=data)
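If you would rather not manage the list by hand, the same running calculation can probably be expressed with `itertools.accumulate` from the standard library. This is only a sketch, assuming Python 3.8 or later (for the `initial` keyword) and the same parameter names as in the question.

```python
from itertools import accumulate

import pandas as pd

msotc, sopd, soss = 7, 1, 2

# accumulate() feeds the previous result back in as `prev`; the elements of
# range(msotc) are ignored and only control how many steps are taken.
# With initial=0 the output has msotc + 1 values, starting with 0.
md = list(accumulate(range(msotc), lambda prev, _: prev * soss + sopd, initial=0))

df = pd.DataFrame({"msotc": range(msotc + 1), "md": md})
print(df)
```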
You can use math to convert your formula into a geometric series.
md[n] = md[n-1]*soss + sopd
can be expressed in terms of md[0] by unrolling the recurrence:
md[n] = md[0]*soss**n + sopd*(1 + soss + ... + soss**(n-1))
and, using the formula for the sum of powers (valid for soss != 1):
md[n] = md[0]*soss**n + sopd * (soss**n - 1)/(soss-1)
Thus no need to loop, you can vectorize:
import numpy as np
import pandas as pd
msotc = 7
sopd = 1
soss = 2
md0 = 0
n = np.arange(msotc+1)
# closed form of the recurrence; requires soss != 1
df = pd.DataFrame({'msotc': n, 'md': md0*soss**n + sopd*(soss**n-1)/(soss-1)})
output:
   msotc     md
0      0    0.0
1      1    1.0
2      2    3.0
3      3    7.0
4      4   15.0
5      5   31.0
6      6   63.0
7      7  127.0
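As a sanity check, the closed form md[n] = md[0]*soss**n + sopd*(soss**n - 1)/(soss - 1) can be compared against the plain loop for a few parameter sets, including a non-zero md[0] and the special case soss == 1, where the division would otherwise fail. The helper names `md_closed_form` and `md_loop` are hypothetical and only serve this sketch.

```python
import numpy as np


def md_closed_form(msotc, soss, sopd, md0=0.0):
    """Vectorized md values for n = 0..msotc (hypothetical helper)."""
    n = np.arange(msotc + 1)
    if soss == 1:
        # The geometric sum degenerates to n terms of sopd.
        return md0 + sopd * n
    return md0 * soss**n + sopd * (soss**n - 1) / (soss - 1)


def md_loop(msotc, soss, sopd, md0=0.0):
    """Reference implementation using the recurrence directly."""
    md = [md0]
    for _ in range(msotc):
        md.append(md[-1] * soss + sopd)
    return np.array(md)


# (soss, sopd, md0) combinations, including the examples from the question
for soss, sopd, md0 in [(2, 1, 0), (1.05, 0.5, 0), (1, 0.5, 3), (2, 1, 5)]:
    assert np.allclose(
        md_closed_form(7, soss, sopd, md0),
        md_loop(7, soss, sopd, md0),
    )
```

For large `msotc` or `soss` the powers grow quickly, so integer inputs may overflow NumPy's fixed-width integers; passing floats avoids that at the cost of rounding.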