Apply varying function for pandas dataframe depending on column arguments being passed

Question:

I would like to apply a function to each row of a pandas dataframe. Instead of the argument being variable across rows, it’s the function itself that is different for each row depending on the values in its columns. Let’s be more concrete:

import pandas as pd 
from scipy.interpolate import interp1d

d = {'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]}
df = pd.DataFrame(data=d)
   col1  col2  col3
0     1     2     3
1     2     4     6

Now, what I would like to achieve is to extrapolate columns 1 to 3 row-wise. For the first row, this would be:

f_1 = interp1d(range(df.shape[1]), df.loc[0], fill_value='extrapolate')

with the extrapolated value f_1(df.shape[1]).item() = 4.0.

So the column I would like to add would be:

col4
4
8

I’ve tried something like the following:

import numpy as np
def interp_row(row):
    n = row.shape[1]
    fun = interp1d(np.arange(n), row, fill_value='extrapolate')
    return fun(n+1).item()

df['col4'] = df.apply(lambda row: interp_row(row))

Can I make this work?

Asked By: HannesZ


Answers:

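You were almost there. Three changes are needed: pass axis=1 so that apply hands each row (rather than each column) to the function; use row.shape[0], because each row arrives as a one-dimensional Series whose shape tuple has a single entry; and evaluate the interpolant at n rather than n+1, since the existing points sit at positions 0 to n-1, which makes n the next position: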

import pandas as pd 
from scipy.interpolate import interp1d
import numpy as np

d = {'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]}
df = pd.DataFrame(data=d)

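def interp_row(row):
    # Each row arrives as a 1-D Series, so its length is shape[0]
    n = row.shape[0]
    # Linear interpolant over positions 0..n-1, allowed to extrapolate
    fun = interp1d(np.arange(n), row, fill_value='extrapolate')
    # Evaluate at the next position, n
    return fun(n).item()

# axis=1 passes each row (not each column) to interp_row
df['col4'] = df.apply(interp_row, axis=1)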
print(df)

which returns:


   col1  col2  col3  col4
0     1     2     3   4.0
1     2     4     6   8.0
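
If the frame is large, calling apply once per row can get slow. As a possible alternative (a sketch, not part of the original answer), interp1d accepts a 2-D array and an axis argument, so a single interpolant can cover all rows at once. The helper name extrapolate_next is made up for illustration, and the sketch assumes every column is numeric and that the columns are equally spaced at positions 0 to n-1, as in the question.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

d = {'col1': [1, 2], 'col2': [2, 4], 'col3': [3, 6]}
df = pd.DataFrame(data=d)

def extrapolate_next(frame):
    # Hypothetical helper: extrapolate every row one step past its last column.
    values = frame.to_numpy()   # shape (n_rows, n_cols); assumes numeric columns
    n = values.shape[1]         # number of existing columns
    # axis=1 tells interp1d that the interpolation runs along the columns,
    # so each row is treated as its own curve over positions 0..n-1
    fun = interp1d(np.arange(n), values, axis=1, fill_value='extrapolate')
    # Evaluating at the scalar n gives one extrapolated value per row
    return fun(n)

df['col4'] = extrapolate_next(df)
print(df)
```

For the two-row example this should produce the same col4 as the apply version; the difference only matters once there are many rows.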