Combining specific rows that have NaN for a different column on python using pandas

Question:

I am hoping this question makes sense.
I have a table I extracted from a PDF of chemical names that I am trying to format and I am having issues it looks like this: table

Some of the chemical names are split into multiple rows and I need each name in its won row. I did notice the chemicals whose names are split into multiple rows have an NaN in the first column.

EDIT: after running
dt.head(15).to_dict()

{‘Unnamed: 0’: {6: ‘1’,
7: nan,
8: ‘2’,
9: ‘3’,
10: nan,
11: nan,
12: ‘4’,
13: ‘5’,
14: nan,
15: nan,
16: ‘6’,
17: ‘7’,
18: ‘8’,
19: ‘9’,
20: nan},
‘Phenolics’: {6: ‘Dihydroquercetin’,
7: ‘7,30-dimethyl ether’,
8: ‘Artelin’,
9: ‘Esculin 7-‘,
10: ‘methylether’,
11: ‘(methylesculin)’,
12: ‘Esculin’,
13: ‘Scopoletin (7-‘,
14: ‘hydroxy-6-‘,
15: ‘methoxycoumarin)’,
16: ‘Axillarin’,
17: ‘Esculetin’,
18: ‘Isoscopoletin’,
19: ‘6-Beta-D-glucosyl-7-‘,
20: ‘methoxycoumarin’}}

Can anyone help me?
Thank you!

Asked By: Celin Younan

||

Answers:

df["group"] = df["Unnamed: 0"].ffill()
df.groupby("group").agg({"Phenolics": "".join})
Answered By: MoRe

A one-line solution

df = df.fillna(method='ffill').groupby('Unnamed: 0')['Phenolics'].apply(' '.join).reset_index()
Answered By: 7shoe
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.