Row value equal to the row value itself plus the following rows up to the next non-null value of another column

Question:

The title is a bit confusing, but I think an example would make it clear.

I have this dataframe:

Date Info
27/07/2022 This is
NAN an
NAN example
28/07/2022 and this
NAN is another one

And this is my desired output:

Date Info
27/07/2022 This is an example
28/07/2022 and this is another one

I made a few attempts with fillna(method="ffill") but wasn't able to come up with a solution, and I can't think of anything else that would solve this.
Thanks in advance!

Asked By: Alejandro Marín

Answers:

Initialize the input

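import numpy as np
import pandas as pd

# np.nan is used instead of np.NaN, which was removed in NumPy 2.0
df = pd.DataFrame({
    "Date": ["27/07/2022", np.nan, np.nan, "28/07/2022", np.nan],
    "Info": ["This is", "an", "example", "and this", "is another one"]
})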

Forward fill the dates

df.Date = df.Date.ffill()

Group by date and concatenate the strings

df.Info = df.groupby(df.Date)["Info"].transform(lambda x: ' '.join(x))

Drop duplicates to get the result

df.drop_duplicates()

Result:

         Date                     Info
0  27/07/2022       This is an example
3  28/07/2022  and this is another one
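
The steps above group on the filled-in date, so two separate blocks that happen to share the same date would be merged into one row. A minimal sketch of a variation that avoids this, assuming every non-null Date marks the start of a new block: it swaps the ffill step for a running count of non-null dates and groups on that instead. The names block and result are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": ["27/07/2022", np.nan, np.nan, "28/07/2022", np.nan],
    "Info": ["This is", "an", "example", "and this", "is another one"],
})

# Each non-null Date starts a new block; the running count labels the blocks.
# (Assumption: a block always begins on a row with a non-null Date.)
block = df["Date"].notna().cumsum().rename("block")

result = (
    df.groupby(block)
      # "first" takes the first non-null Date in each block;
      # " ".join concatenates the Info strings of the block.
      .agg(Date=("Date", "first"), Info=("Info", " ".join))
      .reset_index(drop=True)
)
print(result)
```

Because the original df is not modified, this version can be re-run safely on the same input.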
Answered By: srinath

You can do groupby and agg after ffill-ing the Date column:

df.assign(Date=df['Date'].ffill()).groupby('Date', as_index=False).agg(' '.join)

Output:

         Date                     Info
0  27/07/2022       This is an example
1  28/07/2022  and this is another one
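
One thing to watch for: groupby sorts the group keys by default, and the dates here are plain dd/mm/yyyy strings, so they sort alphabetically rather than chronologically. A small sketch of the same one-liner with sort=False, which keeps the groups in order of first appearance. The input below is hypothetical data chosen so that the later date sorts first as a string; it is not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "01/08/2022" comes later in time but sorts before
# "27/07/2022" when compared as a string.
df = pd.DataFrame({
    "Date": ["27/07/2022", np.nan, "01/08/2022", np.nan],
    "Info": ["This is", "an example", "and this", "is another one"],
})

out = (
    df.assign(Date=df["Date"].ffill())
      # sort=False keeps groups in the order they first appear in the frame
      .groupby("Date", as_index=False, sort=False)
      .agg({"Info": " ".join})
)
print(out)
```

Converting the column with pd.to_datetime(..., format="%d/%m/%Y") before grouping would be another way to get chronological order, if real datetimes are wanted.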
Answered By: SomeDude