Python: How can I iterate within columns to divide each value by its previous one?
Question:
I am going mad with this code.
I have a simple dataframe like this:
Business Date dic-22 gen-23 feb-23
03/10/2022 112,0 121,1 131,2
04/10/2022 87,0 103,0 122,5
05/10/2022 114,3 102,8 99,6
06/10/2022 101,7 116,6 104,3
07/10/2022 116,6 103,7 110,8
10/10/2022 108,8 107,3 112,0
I want to divide each value by the previous one, per column.
So like: 87/112; 114/87;… for each column
In order to have
Business Date dic-22 gen-23 feb-23
03/10/2022 0 0 0
04/10/2022 0,8 0,9 0,9
05/10/2022 1,3 1,0 0,8
06/10/2022 0,9 1,1 1,0
07/10/2022 1,1 0,9 1,1
10/10/2022 0,9 1,0 1,0
Then, I would like to get the natural logarithm of these numbers.
I have tried to do it but I’m stuck on the first part (dividing each value by the previous one).
The code does not work.
Offering virtual mojitos to anyone willing to help.
for i, column in df.items():
    for j, row in df.iterrows():
        # if j > 0: # Skip first row
        df.iloc[:, 1:] = df.iloc[:, 1:] / df.iloc[:, 1:].shift()
Answers:
You can use shift after temporarily setting your dates as the index:
tmp = df.set_index('Business Date')
out = (tmp/tmp.shift()).reset_index()
NB. In Python the decimal separator is "." and not ",", so make sure your data uses the correct format. Alternatively, convert from strings using:
tmp = df.set_index('Business Date').apply(lambda s: pd.to_numeric(s.str.replace(',', '.')))
Output:
Business Date dic-22 gen-23 feb-23
0 03/10/2022 NaN NaN NaN
1 04/10/2022 0.776786 0.850537 0.933689
2 05/10/2022 1.313793 0.998058 0.813061
3 06/10/2022 0.889764 1.134241 1.047189
4 07/10/2022 1.146509 0.889365 1.062320
5 10/10/2022 0.933105 1.034716 1.010830
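For reference, here is a minimal self-contained sketch of the steps above. It assumes the values arrive as strings with comma decimals, as in the question. The DataFrame is rebuilt by hand from the sample data, so adjust the construction to however your data is actually loaded.

```python
import pandas as pd

# Sample data from the question, with comma decimals stored as strings
# (assumption: this is how the values look after loading).
df = pd.DataFrame({
    "Business Date": ["03/10/2022", "04/10/2022", "05/10/2022",
                      "06/10/2022", "07/10/2022", "10/10/2022"],
    "dic-22": ["112,0", "87,0", "114,3", "101,7", "116,6", "108,8"],
    "gen-23": ["121,1", "103,0", "102,8", "116,6", "103,7", "107,3"],
    "feb-23": ["131,2", "122,5", "99,6", "104,3", "110,8", "112,0"],
})

# Move the dates to the index, then convert "112,0" -> 112.0 column by column.
tmp = df.set_index("Business Date").apply(
    lambda s: pd.to_numeric(s.str.replace(",", ".", regex=False))
)

# Divide every row by the previous row; the first row has no predecessor
# and therefore becomes NaN.
out = (tmp / tmp.shift()).reset_index()
print(out)
```

No explicit loop over rows or columns is needed, because the division is applied to the whole frame at once.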
If you want to fill the NaNs with zeros and round:
out = (tmp/tmp.shift()).fillna(0).round(1).reset_index()
Output:
Business Date dic-22 gen-23 feb-23
0 03/10/2022 0.0 0.0 0.0
1 04/10/2022 0.8 0.9 0.9
2 05/10/2022 1.3 1.0 0.8
3 06/10/2022 0.9 1.1 1.0
4 07/10/2022 1.1 0.9 1.1
5 10/10/2022 0.9 1.0 1.0
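The question also asks for the natural logarithm of these ratios. A possible sketch, assuming numpy is available and using a small hand-built subset of the sample data: take the log of the unrounded ratios and leave the first row as NaN, since filling it with 0 first would give log(0) = -inf.

```python
import numpy as np
import pandas as pd

# Small numeric subset of the question's data (already converted to floats).
tmp = pd.DataFrame(
    {"dic-22": [112.0, 87.0, 114.3], "gen-23": [121.1, 103.0, 102.8]},
    index=pd.Index(["03/10/2022", "04/10/2022", "05/10/2022"],
                   name="Business Date"),
)

# Ratio of each value to the previous one; the first row is NaN.
ratio = tmp / tmp.shift()

# Natural logarithm of the ratios; NaN stays NaN.
log_ratio = np.log(ratio)

# Equivalent formulation: log(a / b) == log(a) - log(b).
log_ratio_alt = np.log(tmp).diff()

log_out = log_ratio.reset_index()
print(log_out)
```

If you need zeros in the first row, applying fillna(0) after the logarithm rather than before avoids the -inf values.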
Handling non-consecutive dates differently
To generalize: your dates are not consecutive, and the example above shifts to the previous available date. If you instead wanted to use the exact previous calendar day (10/10/2022 -> 09/10/2022), you would need to change the code to:
df['Business Date'] = pd.to_datetime(df['Business Date'], dayfirst=True)
tmp = df.set_index('Business Date')
out = (tmp/tmp.shift(freq='1D')).fillna(0).round(1).reset_index()
Output:
Business Date dic-22 gen-23 feb-23
0 2022-10-03 0.0 0.0 0.0
1 2022-10-04 0.8 0.9 0.9
2 2022-10-05 1.3 1.0 0.8
3 2022-10-06 0.9 1.1 1.0
4 2022-10-07 1.1 0.9 1.1
5 2022-10-08 0.0 0.0 0.0
6 2022-10-10 0.0 0.0 0.0
7 2022-10-11 0.0 0.0 0.0
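The extra rows (2022-10-08 and 2022-10-11) appear because shift(freq='1D') moves the index itself, and the division then aligns on the union of both indexes. If you only want the original dates back, one option is to reindex the result. This is a sketch on a three-row subset of the sample data, not part of the code above:

```python
import pandas as pd

# Three rows from the question; 07/10 -> 10/10 skips a weekend.
df = pd.DataFrame({
    "Business Date": ["06/10/2022", "07/10/2022", "10/10/2022"],
    "dic-22": [101.7, 116.6, 108.8],
})
df["Business Date"] = pd.to_datetime(df["Business Date"], dayfirst=True)
tmp = df.set_index("Business Date")

# Shifting by one calendar day moves the index labels, so the division
# aligns on the union of the original and the shifted dates.
ratio = tmp / tmp.shift(freq="1D")

# Keep only the dates that were present in the original data.
out = ratio.reindex(tmp.index).reset_index()
print(out)
```

Here 2022-10-10 stays NaN because there is no row for 2022-10-09 to divide by.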