New column in DataFrame from other columns AND rows

Question:

I want to create a new column, V, in an existing DataFrame, df. I would like the value of the new column to be the difference between the value in the ‘x’ column in that row, and the value of the ‘x’ column in the row below it.

As an example, in the picture below, I want the value of the new column to be
93.244598 – 93.093285 = 0.151313.

I know how to create a new column based on existing columns in Pandas, but I don’t know how to reference other rows using this method. Is there a way to do this that doesn’t involve iterating over the rows in the dataframe? (since I have read that this is generally a bad idea)

DataFrame I want to modify

Asked By: Mark B

||

Answers:

You can use pandas.DataFrame.shift for your use case.

The last row will not have any row to subtract from so you will get the value for that cell as NaN

df['temp_x'] = df['x'].shift(-1)
df[`new_col`] = df['x'] - df['temp_x']

or one liner :

df[`new_col`] = df['x'] - df['x'].shift(-1)

the column new_col will contain the expected data

Answered By: Himanshu Poddar

An ideal solution is to use diff:

df['new'] = df['x'].diff(-1)
Answered By: mozway
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.