How to combine values of multiple rows in pandas

Question:

I have a DataFrame in which the text is split across multiple rows, like:

A B
aaa bbbb
ccccc NaN
NaN NaN
dddd ffff
eeee NaN
gg NaN

I want to concatenate the values of consecutive rows, starting a new record at each blank row, and get a DataFrame like:

A B
aaaccccc bbbb
ddddeeeegg ffff

Is there an efficient way to convert the DataFrame in Python?

Asked By: Yvonne

Answers:

You can build a boolean mask of the rows that are entirely NaN, use its cumulative sum as a group key, then use GroupBy.agg to join the strings within each group:

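# True for rows where every column is NaN (the separator rows)
mask = df.isna().all(axis=1)
# running count of separators: the id increases at each all-NaN row
group = mask.cumsum()

# drop the separator rows, group by id, and join the non-NaN strings per column
out = df[~mask].groupby(group).agg(lambda x: ''.join(x.dropna()))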

output:

            A     B
0    aaaccccc  bbbb
1  ddddeeeegg  ffff
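
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same mask / cumsum / agg approach. The sample frame is reconstructed from the question, and the function name merge_blocks is a hypothetical helper introduced only for illustration; it also assumes that blank rows are real NaN values rather than empty strings.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; the all-NaN row separates records.
df = pd.DataFrame({
    "A": ["aaa", "ccccc", np.nan, "dddd", "eeee", "gg"],
    "B": ["bbbb", np.nan, np.nan, "ffff", np.nan, np.nan],
})


def merge_blocks(frame):
    """Hypothetical helper: join the strings of each block of rows
    delimited by rows that are entirely NaN."""
    # True where every column in the row is NaN
    mask = frame.isna().all(axis=1)
    # group id that increases by one at each separator row
    group = mask.cumsum()
    # drop separators, then concatenate the non-NaN strings per column
    return frame[~mask].groupby(group).agg(lambda s: "".join(s.dropna()))


out = merge_blocks(df)
print(out)
```

With this sample, column A comes out as aaaccccc and ddddeeeegg, and column B as bbbb and ffff. If the blanks in your data are empty strings, one option is to convert them first, for example with df.replace("", np.nan), before calling the helper.
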
Answered By: mozway