How to get the difference of columns in a pandas DataFrame?

Question:

I need a DataFrame containing the difference between each column and one column that I choose (for example, the last one).

I tried df.diff(axis=1, periods=1), but that computes the difference between each column and its neighboring column. I want the difference between every column and exactly one fixed column (the last one).
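
To illustrate the behavior described above, here is a minimal sketch. The sample values are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical sample data, chosen only to make the arithmetic easy to follow
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 6, 8], 'C': [10, 20, 30]})

# Each column minus the column to its left; the first column has no
# left neighbor, so it becomes NaN
adjacent = df.diff(axis=1, periods=1)
#     A    B     C
# 0 NaN  3.0   6.0
# 1 NaN  4.0  14.0
# 2 NaN  5.0  22.0
```

Column B here is B - A and column C is C - B, so no column is compared against one fixed reference column.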

Asked By: ddddd

Answers:

Use DataFrame.sub to subtract the last column, selected with DataFrame.iloc:

df1 = df.sub(df.iloc[:, -1], axis=0)

If you need to subtract a column selected by label:

df1 = df.sub(df['col'], axis=0)
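
As a quick check of what these calls return, here is a small worked sketch. The sample DataFrame and the final drop step are assumptions added for illustration. The axis=0 argument tells pandas to match the subtracted column against the row index rather than the column labels.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 6, 8], 'C': [10, 20, 30]})

# Subtract the last column from every column, row by row
df1 = df.sub(df.iloc[:, -1], axis=0)
#     A   B  C
# 0  -9  -6  0
# 1 -18 -14  0
# 2 -27 -22  0

# Same operation with the column chosen by label; the reference column
# is all zeros afterwards, so it can be dropped if it is not wanted
df2 = df.sub(df['C'], axis=0).drop(columns='C')
```
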
Answered By: jezrael

To get the difference between the last column and another column in a pandas DataFrame, you can use the following code:

import pandas as pd

# create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# calculate the difference between the last column and the 'A' column
diff = df.iloc[:, -1] - df['A']

# create a new DataFrame with the difference as a column
diff_df = diff.to_frame('Difference')

Here, df.iloc[:, -1] selects the last column of the DataFrame, and df['A'] selects the 'A' column. Subtracting one from the other gives the difference between the two columns. Finally, a new DataFrame is created with the difference as a column.

If you want to calculate the difference between the last column and all other columns, you can modify the code like this:

# calculate the difference between the last column and all other columns
diff_df = df.iloc[:, :-1].rsub(df.iloc[:, -1], axis=0)

Here, df.iloc[:, :-1] selects all columns except the last one, and rsub with axis=0 computes the last column minus each of them, row by row. The result is already a DataFrame whose columns correspond to the original columns except the last one. Note that writing df.iloc[:, -1] - df.iloc[:, :-1] directly does not work: pandas aligns the Series index with the DataFrame's column labels, finds no matches, and returns only NaN.
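
Alignment is the detail that matters most when subtracting a single column from several others. The sketch below reuses the sample DataFrame from this answer; the variable names are made up for illustration. It compares the plain - operator with sub and rsub using axis=0.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
last = df.iloc[:, -1]      # Series indexed by row labels 0, 1, 2
others = df.iloc[:, :-1]   # DataFrame with columns A and B

# Plain `-` matches the Series index (0, 1, 2) against the column
# labels (A, B), finds no overlap, and yields only NaN
misaligned = last - others

# rsub with axis=0 computes last - others, matching on the row index
last_minus_others = others.rsub(last, axis=0)

# sub with axis=0 gives the opposite sign: others - last
others_minus_last = others.sub(last, axis=0)
```

With this data, last_minus_others holds 6 in every row of column A and 3 in every row of column B, and others_minus_last holds the same values negated.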

Answered By: Yashodhan Advankar