How to apply a user-defined function between rows in pandas using both rows' values?

Question:

I have two rows of data in a pandas DataFrame and want to apply a function to each column separately that uses both values, e.g.

import pandas as pd
df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})

   x  z   i   j   y
0  1  2   3   4   5
1  2  6  12  20  30

The function is something like the row 2 value minus the row 1 value, divided by the row 2 value, for each column separately, e.g.

(row2-row1)/row2

so I can get the following

0.5  0.667   0.75   0.8   0.833

Based on the following links

how to apply a user defined function column wise on grouped data in pandas

https://www.geeksforgeeks.org/apply-a-function-to-each-row-or-column-in-dataframe-using-pandas-apply/

https://pythoninoffice.com/pandas-how-to-calculate-difference-between-rows

Groupby and apply a defined function – Pandas

I tried the following

df.apply(lambda x,y: (x + y)/y, axis=0)

This does not work: apply passes only one argument (the column) to the function, so the lambda fails because y is missing.
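A minimal sketch of why that attempt fails, assuming a recent pandas version: with axis=0, apply calls the function once per column and hands it a single argument, the column as a Series. The helper name inspect and the variable names are hypothetical, used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})

# A two-parameter lambda fails: apply supplies only one positional argument.
error_type = None
try:
    df.apply(lambda x, y: (x + y) / y, axis=0)
except TypeError as exc:
    error_type = type(exc).__name__
    print(exc)

# Record what the function actually receives for each column.
received_types = []

def inspect(col):
    # col is one whole column of the DataFrame (hypothetical helper).
    received_types.append(type(col).__name__)
    return col.iloc[0]

df.apply(inspect, axis=0)
print(set(received_types))
```

Since the function already sees both row values inside that one Series, a second parameter is not needed.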

df.diff()

This runs, but it only gives the difference between the rows, which is not exactly the function I want.

Does anyone know how to achieve the result I expect?

Asked By: Juan Ossa

||

Answers:

df.diff(1).div(df)

Output:

     x     z     i    j     y
0  NaN   NaN   NaN  NaN   NaN
1  0.5  0.67  0.75  0.8  0.83

I answered based on your short example. If I have misunderstood something, please extend your example and I will answer again.
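As a rough sketch of how the one-liner above works: diff(1) subtracts the previous row from each row, and div(df) then divides element-wise by the original values, so row 1 holds (row2 - row1) / row2. The first row is NaN because it has no previous row. The dropna step is an addition that is not part of the original answer, and the intermediate variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})

# diff(1): each row minus the row before it (the first row becomes NaN).
step = df.diff(1)

# div(df): element-wise division by the original values, i.e. by the current row.
ratio = step.div(df)

# Drop the all-NaN first row to keep only the requested ratios.
result = ratio.dropna(how="all")
print(result.round(3))
```

This form also extends to more than two rows, where each row is compared with the one directly above it.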

Answered By: Panda Kim

After testing many things, I found that the lambda function does not need two variables (x, y). It takes just one, which is a vector holding all the values in the column, so the following solved the issue:

df.apply(lambda x: (x[1] - x[0]) / x[1], axis=0)

This avoids having a result with NaN in the first row.
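A hedged variant of the same idea, for illustration only: x[1] in the line above looks up the index label 1, which works here because the default index is 0, 1. Using .iloc selects by position regardless of the index labels. The second form skips apply and subtracts the two rows directly. The variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})

# apply with axis=0 passes each column as a Series; .iloc picks rows by position.
via_apply = df.apply(lambda col: (col.iloc[1] - col.iloc[0]) / col.iloc[1], axis=0)

# Equivalent without apply: each row is a Series, and arithmetic aligns on column names.
first, second = df.iloc[0], df.iloc[1]
vectorized = (second - first) / second

print(via_apply.round(3))
```

Both forms return a Series indexed by column name, with no NaN row to remove.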

Answered By: Juan Ossa