How to get the difference of columns in a pandas DataFrame?
Question:
I need a DataFrame containing the difference between each column and one column that I choose (for example, the last one).
I tried df.diff(axis=1, periods=1), but that computes the difference between each column and the column next to it. I want the difference between every column and exactly one column (the last one).
Answers:
Use DataFrame.sub to subtract the last column, selected with DataFrame.iloc:
df1 = df.sub(df.iloc[:, -1], axis=0)
If you need to subtract a column selected by label:
df1 = df.sub(df['col'], axis=0)
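As a minimal sketch of what this does, the snippet below applies DataFrame.sub to a small made-up frame (the sample values are an assumption for illustration and are not from the question). Passing axis=0 aligns the subtracted Series on the row index, so every column has the last column subtracted from it row by row.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Subtract the last column from every column, aligning on the row index
df1 = df.sub(df.iloc[:, -1], axis=0)

# Column 'A' now holds A - C, column 'B' holds B - C,
# and the last column holds C - C, which is all zeros
print(df1)

# Optionally drop the reference column if the zeros are not needed
df2 = df1.iloc[:, :-1]
print(df2)
```

The reference column is kept in the result by default, so dropping it afterwards is a matter of preference.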
To get the difference between the last column and another column in a pandas DataFrame, you can use the following code:
import pandas as pd
# create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
# calculate the difference between the last column and the 'A' column
diff = df.iloc[:, -1] - df['A']
# create a new DataFrame with the difference as a column
diff_df = diff.to_frame('Difference')
Here, df.iloc[:, -1] selects the last column of the DataFrame, and df['A'] selects the 'A' column. Subtracting one from the other gives the row-wise difference between the two columns as a Series. Finally, a new DataFrame is created with the difference as a column.
If you want to calculate the difference between the last column and all other columns, you can modify the code like this:
# calculate the difference between the last column and all other columns
diff_df = df.iloc[:, :-1].rsub(df.iloc[:, -1], axis=0)
Here, df.iloc[:, :-1] selects all columns except the last one, and rsub with axis=0 subtracts each of them from the last column, aligning on the row index. The plain - operator is not used because it would match the Series index against the DataFrame's column labels and return only NaN. The result is already a DataFrame, with columns corresponding to the original columns except the last one.
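One detail worth checking when subtracting a DataFrame from a Series is how pandas aligns the two. The sketch below, which reuses the sample frame from this answer, contrasts the plain - operator with DataFrame.rsub and axis=0. The variable names last, others and naive are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
last = df.iloc[:, -1]      # Series for column 'C'
others = df.iloc[:, :-1]   # DataFrame with columns 'A' and 'B'

# The plain operator matches the Series index (0, 1, 2) against the
# column labels ('A', 'B'); nothing overlaps, so every value is NaN
naive = last - others
print(naive.isna().all().all())

# rsub computes last - others, and axis=0 aligns on the row index
diff_df = others.rsub(last, axis=0)
print(diff_df)
```

With axis=0 the result keeps the columns 'A' and 'B' and holds C - A and C - B for each row.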