Returning a DataFrame from Series.apply when the supplied function returns a Series is deprecated
Question:
The way I have always split a column containing lists into multiple columns is:
df['column_with_lists'].apply(pd.Series)
This returns a new dataframe that can then be concatenated.
With pandas 2.1, this now raises: FutureWarning: Returning a DataFrame from Series.apply when the supplied function returns a Series is deprecated and will be removed in a future version.
What is now the recommended way to split a column containing lists?
Answers:
Making Series.apply return a DataFrame is deprecated because:
"This pattern was very slow and it’s recommended to use alternative methods to achieve the same goal." Check GH52116 for more details/context.
You can, for example, build the DataFrame directly from the lists, as below, which avoids the FutureWarning:
import pandas as pd

df = pd.DataFrame({"column_with_lists": [range(4, 6), range(1, 5)]})
out = pd.DataFrame(df["column_with_lists"].tolist())
Output:
print(out)
   0  1      2      3
0  4  5    NaN    NaN
1  1  2 3.0000 4.0000
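Since the question mentions concatenating the result back, here is a minimal sketch of that step. The "id" column, the custom index, and the "item_" prefix are made up for illustration and are not part of the original question. Passing index=df.index should keep the rows aligned when the original frame does not use a default RangeIndex.

```python
import pandas as pd

# Hypothetical frame: an extra "id" column and a non-default index.
df = pd.DataFrame(
    {"id": ["a", "b"], "column_with_lists": [[4, 5], [1, 2, 3, 4]]},
    index=[10, 20],
)

# Expand the lists into columns, reusing the original index so rows line up.
expanded = pd.DataFrame(df["column_with_lists"].tolist(), index=df.index)

# Rename the integer column labels 0, 1, ... to strings (the prefix is arbitrary).
expanded = expanded.add_prefix("item_")

# Put the expanded columns next to the remaining original columns.
result = pd.concat([df.drop(columns="column_with_lists"), expanded], axis=1)
print(result)
```

Shorter lists are padded with NaN, so any column that receives padding is stored as float.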