Iterating through columns and subtracting each from the last column in a pandas DataFrame
Question:
I am a Python newbie and am currently evaluating my simulations. I have read the results from the tab files into a pandas DataFrame.
My index is the frequency. The remaining columns represent the amplitude of the calculated PSD.
I want to subtract each of these columns (e.g. a, b, c, d, …) from the last column, which is my test data.
The first table is an example of my current DataFrame. I want to subtract each column from test_data, row by row, so that in the end I can get the standard deviation etc. of each column, as in the following table:
frequency (index) | A | B | C | test_data |
---|---|---|---|---|
1 | 1.2 | 5.0 | 2.4 | 1.9 |
2 | 2.1 | 3.0 | 2.7 | 2.6 |
3 | 3.0 | 6.0 | 2.9 | 2.8 |
The following table/dataframe is the wanted outcome after the loop.
frequency (index) | A | B | C | test_data |
---|---|---|---|---|
1 | test_data[1]-A[1] | test_data[1]-B[1] | test_data[1]-C[1] | 1.9 |
… | … | … | … | … |
3 | test_data[n]-A[n] | test_data[n]-B[n] | test_data[n]-C[n] | 2.8 |
average of column | 0.33 | -2.3 | -0.233 | |

frequency (index) | A | B | C | test_data |
---|---|---|---|---|
1 | 0.7 | -3.1 | -0.5 | 1.9 |
2 | 0.5 | -0.4 | -0.1 | 2.6 |
3 | -0.2 | -3.2 | -0.1 | 2.8 |
average of column | 0.33 | -2.3 | -0.233 | |
I would be very grateful for any help regarding the loop.
Answers:
You can use `drop` to get rid of the non-target columns, then `rsub` to subtract them from `test_data`. Finally, `concat` the result to the original dataset:
df2 = df.drop(columns=['frequency (index)', 'test_data']).rsub(df['test_data'], axis=0)
out = pd.concat([df.assign(**df2), df2.mean().to_frame().T])
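For reference, here is a minimal, self-contained sketch of the same approach run on the example data from the question. It assumes the frequency is the DataFrame index, as the question describes, so only `test_data` needs to be dropped. The names `diff` and `stats` are illustrative, and the summary row uses the column mean because the question asks for the average. Treat it as one possible way to do this, not the only one.

```python
import pandas as pd

# Example data from the question. Assumption: frequency is the index,
# not a regular column.
df = pd.DataFrame(
    {
        "A": [1.2, 2.1, 3.0],
        "B": [5.0, 3.0, 6.0],
        "C": [2.4, 2.7, 2.9],
        "test_data": [1.9, 2.6, 2.8],
    },
    index=pd.Index([1, 2, 3], name="frequency"),
)

# rsub computes "other - self", i.e. test_data - column, aligned row by row.
diff = df.drop(columns="test_data").rsub(df["test_data"], axis=0)

# Per-column statistics of the differences (mean and sample standard deviation).
stats = diff.agg(["mean", "std"])

# Original frame with A, B, C replaced by the differences, plus a trailing
# row holding the column means (test_data is left as NaN in that row).
summary_row = diff.mean().to_frame("average of column").T
out = pd.concat([df.assign(**diff), summary_row])

print(out)
print(stats)
```

No explicit loop over the columns is needed: `rsub` with `axis=0` broadcasts the `test_data` Series across all remaining columns at once.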