Apply varying function for pandas dataframe depending on column arguments being passed

Question

I would like to apply a function to each row of a pandas dataframe. Instead of the argument being variable across rows, it’s the function itself that is different for each row depending on the values in its columns. Let’s be more concrete:

import pandas as pd 
from scipy.interpolate import interp1d

d = {'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]}
df = pd.DataFrame(data=d)

	col1	col2	col3
0	1	2	3
1	2	4	6

Now, what I would like to achieve is to extrapolate columns 1 to 3 row-wise. For the first row, this would be:

f_1 = interp1d(range(df.shape[1]), df.loc[0], fill_value='extrapolate')

with the extrapolated value f_1(df.shape[1]).item() = 4.0.

So the column I would like to add would be:

col4
4
8

I’ve tried something like the following:

import numpy as np
def interp_row(row):
    n = row.shape[1]
    fun = interp1d(np.arange(n), row, fill_value='extrapolate')
    return fun(n+1).item()

df['col4'] = df.apply(lambda row: interp_row(row))

Can I make this work?

Asked By: HannesZ

Answer 1

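You were almost there. Three things need to change: pass axis=1 to apply so the function receives rows rather than columns; use row.shape[0], because each row arrives as a one-dimensional Series and has no shape[1]; and evaluate at n instead of n+1, since the existing columns sit at positions 0 to n-1 and the next one is at n: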

import pandas as pd 
from scipy.interpolate import interp1d
import numpy as np

d = {'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]}
df = pd.DataFrame(data=d)

def interp_row(row):
    n = row.shape[0]
    fun = interp1d(np.arange(n), row, fill_value='extrapolate')
    return fun(n).item()

df['col4'] = df.apply(lambda row: interp_row(row), axis=1)
print(df)

which returns:

   col1  col2  col3  col4
0     1     2     3   4.0
1     2     4     6   8.0
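
If the row-wise apply turns out to be slow on a larger frame, one possible variant is to hand interp1d the whole 2-D array at once, since it interpolates along the last axis by default. This is only a sketch: it assumes every column is numeric and that the columns are equally spaced at positions 0 to n-1, as above. The helper name extrapolate_next_column is made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d


def extrapolate_next_column(frame):
    """Linearly extrapolate one step past the last column, for all rows at once.

    Hypothetical helper; assumes all columns of `frame` are numeric.
    """
    values = frame.to_numpy(dtype=float)  # shape: (n_rows, n_cols)
    n = values.shape[1]
    # interp1d works along the last axis by default (axis=-1),
    # so each row of `values` is treated as its own curve.
    fun = interp1d(np.arange(n), values, fill_value='extrapolate')
    # Evaluating at the scalar n gives one value per row.
    return pd.Series(fun(n), index=frame.index)


df = pd.DataFrame({'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]})
df['col4'] = extrapolate_next_column(df)
print(df)
```

For the example frame this gives the same col4 values (4.0 and 8.0) as the apply version, without calling interp1d once per row.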

Answered By: Serge de Gosson de Varennes
