Pandas pipe throws an error saying df must be passed as an argument

Question:

I expected pipe to pass the DataFrame to the function as its first argument by default, but that is not happening in my case.

class Summary:
    def get_src_base_df(self):
        # <do stuff>
        return df

    @staticmethod
    def sum_agg(df):
        cols = 'FREQUENCY_ID|^FLAG_'
        df = (df.filter(regex=cols).fillna(0)
              .groupby('FREQUENCY_ID').agg(lambda x: x.astype(int).sum()))
        return df

    # few other @static methods

    def get_src_df(self):
        df = self.get_src_base_df().pipe(self.sum_agg())  # pipe chain continues
        # --> error: sum_agg() missing 1 required positional argument: 'df'
        # but the line below works
        # df = self.get_src_base_df().pipe((lambda x: self.sum_agg(x)))  # pipe chain continues

   
Asked By: rams

Answers:

By writing self.sum_agg(), you’re calling the sum_agg function immediately (a @staticmethod in Python behaves much like a plain function). Because that call passes no argument, it fails with the TypeError before pipe ever runs. pipe needs the function object itself, not the value returned by calling the function; it then calls that function with the DataFrame as the first argument.
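
To illustrate why the error comes from the call itself rather than from pipe, here is a minimal sketch in plain Python. The `pipe` helper and the list-based `sum_agg` are simplified, hypothetical stand-ins (not pandas' implementation); they only mimic the convention of calling the function with the object as its first argument.

```python
def pipe(obj, func, *args, **kwargs):
    # Hypothetical stand-in that mimics the calling convention of DataFrame.pipe
    return func(obj, *args, **kwargs)


def sum_agg(values):
    # Stand-in for the real aggregation: just sums a list
    return sum(values)


data = [1, 2, 3]

# Passing the function object: pipe calls sum_agg(data) for us
ok_result = pipe(data, sum_agg)

# Calling the function first: Python evaluates sum_agg() before pipe runs,
# so the TypeError is raised by that call, not by pipe
try:
    pipe(data, sum_agg())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(ok_result)
print(error_message)
```

The second call never reaches `pipe`: its argument expression raises first, which matches the traceback in the question.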

Do this instead:

def get_src_df(self):
    df = self.get_src_base_df().pipe(self.sum_agg)  # note: no parentheses
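
As a runnable illustration of the fix, the sketch below uses a small made-up DataFrame in place of the elided `get_src_base_df` body. The sample values and the extra `scale` step with its `factor` parameter are assumptions for demonstration only, not part of the original code. It also shows that pipe forwards extra arguments to the function, so a lambda is not needed for that either.

```python
import pandas as pd


class Summary:
    @staticmethod
    def sum_agg(df):
        # Same aggregation as in the question
        cols = 'FREQUENCY_ID|^FLAG_'
        return (df.filter(regex=cols).fillna(0)
                  .groupby('FREQUENCY_ID').agg(lambda x: x.astype(int).sum()))

    @staticmethod
    def scale(df, factor):
        # Hypothetical extra step, used only to show argument forwarding
        return df * factor

    def get_src_base_df(self):
        # Made-up sample data standing in for the real source
        return pd.DataFrame({
            'FREQUENCY_ID': [1, 1, 2],
            'FLAG_A': [1.0, None, 1.0],
            'FLAG_B': [1.0, 1.0, 0.0],
            'OTHER': ['x', 'y', 'z'],
        })

    def get_src_df(self):
        # Pass function objects (no parentheses); pipe supplies the DataFrame
        return (self.get_src_base_df()
                    .pipe(self.sum_agg)
                    .pipe(self.scale, factor=10))


result = Summary().get_src_df()
print(result)
```

Here `pipe(self.scale, factor=10)` ends up calling `scale(df, factor=10)`, with `df` being the output of the previous step in the chain.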
Answered By: Dominik Stańczak