Pandas: df.diff() into new column in a multi column dataframe

Question:

I have a DataFrame which is like that:

date Col.A Col.B Col.C
2022 1 1 1
2023 2 2 2
2024 3 3 3

And I want to add Columns where the Diff is calculated and looks like that afterwards:

date Col.A Diff Col.A Col.B Diff Col.B Col.C Diff Col.C
2022 1 nan 1 nan 1 nan
2023 2 1 2 1 2 1
2024 3 1 3 1 3 1

I tried it like that, but it doesnt work

df = pd.read_excel('example.xlsx', header=0).set_index(['date])

df.diff()

How can I do that in Pandas

Asked By: SebastianHeeg

||

Answers:

df = pd.concat([df,df.diff().add_prefix('Diff ')], axis=1)
df = df.reindex(columns=sorted(df.columns, key=lambda x: (x.split(' ')[-1], x.split(' ')[0])))
df
###
   date  Col.A  Diff Col.A  Col.B  Diff Col.B  Col.C  Diff Col.C
0  2022      1         NaN      1         NaN      1         NaN
1  2023      2         1.0      2         1.0      2         1.0
2  2024      3         1.0      3         1.0      3         1.0
Answered By: Baron Legendre

I rename new columns names by DataFrame.add_suffix for correct sorting without specify all columns names:

df = pd.read_excel('example.xlsx', header=0, index_col=['date'])

df = pd.concat([df, df.diff().add_suffix(' diff')], axis=1).sort_index(axis=1)

print (df)
      Col.A  Col.A diff  Col.B  Col.B diff  Col.C  Col.C diff
date                                                         
2022      1         NaN      1         NaN      1         NaN
2023      2         1.0      2         1.0      2         1.0
2024      3         1.0      3         1.0      3         1.0
Answered By: jezrael
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.