Map dataframe function without lambda
Question:
I have the following function:
import nltk

def summarize(text, percentage=.6):
    # Keep the first `percentage` share of the sentences in `text`.
    sentences = nltk.sent_tokenize(text)
    sentences = sentences[:int(percentage*len(sentences))]
    summary = ''.join([str(sentence) for sentence in sentences])
    return summary
I want to map it over the rows of a dataframe. It works well when I use the following code:
df['summary'] = df['text'].map(summarize)
However, when I try to change the percentage argument in this call with df['summary'] = df['text'].map(summarize(percentage=.8)), I get an error saying that another argument, text, is required. Of course, this can be resolved with a lambda function as follows:
df['summary'] = df['text'].map(lambda x: summarize(x, percentage=.8))
But I do not want to use a lambda in the call. Is there another way to do it? For example, using kwargs inside the function to refer to the text column in the dataframe? Thank you.
Answers:
One possible solution is to use Series.apply instead of map, which lets you pass parameters as named arguments without a lambda. In the pandas version shown here, map rejects the extra keyword:
df['summary'] = df['text'].map(summarize, percentage=.8)
TypeError: map() got an unexpected keyword argument 'percentage'
Series.apply, on the other hand, forwards it to the function:
df['summary'] = df['text'].apply(summarize, percentage=.8)
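If you would rather keep map, another lambda-free option is functools.partial, which pre-binds percentage and returns a one-argument callable. The sketch below is only an illustration under some assumptions: the regex splitter is a stand-in for nltk.sent_tokenize so the example runs without NLTK data, sentences are joined with a space for readability, and the sample dataframe and the column names summary_apply and summary_map are made up.

```python
import re
from functools import partial

import pandas as pd


def summarize(text, percentage=.6):
    # Stand-in for nltk.sent_tokenize (assumption, so the sketch needs no NLTK data):
    # split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    sentences = sentences[:int(percentage * len(sentences))]
    # Joined with a space here for readability (the original uses '').
    return ' '.join(sentences)


# Hypothetical sample data.
df = pd.DataFrame({'text': ['One. Two. Three. Four. Five.']})

# Series.apply forwards extra keyword arguments to the function.
df['summary_apply'] = df['text'].apply(summarize, percentage=.8)

# partial() fixes percentage up front, so plain map works without a lambda.
df['summary_map'] = df['text'].map(partial(summarize, percentage=.8))

print(df[['summary_apply', 'summary_map']])
```

Both columns should hold the same result. partial is mostly useful when the same pre-configured function is reused in several calls.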