Collapse Pandas rows to elliminate NaN entries

Question:

Let’s consider the following DataFrame

Name A B C D
tom 10.0 NaN NaN NaN
tom NaN 15.0 NaN NaN
tom NaN NaN 20.0 NaN
tom NaN NaN NaN 25.0
tom 30.0 NaN NaN NaN
tom NaN NaN NaN 40.0
john 1.0 NaN NaN NaN
john NaN 2.0 NaN NaN
john NaN NaN 3.0 NaN
john NaN NaN NaN 4.0
john 5.0 NaN NaN NaN
john NaN 6.0 NaN NaN
john NaN NaN 7.0 NaN
john NaN NaN NaN 8.0

I want to collapse it to limit the amount of NaN values in the DataFrame – can be sequential, i.e. combine the neighboring rows if possible, but all I care about is that the values of columns A-D correspond to the same Name after the collapse

My perfect outcome would be

Name A B C D
tom 10.0 15.0 20.0 25.0
tom 30.0 NaN NaN 40.0
john 1.0 2.0 3.0 4.0
john 5.0 6.0 7.0 8.0

From what I understand, Pandas groupby('Name') will not do the trick, because it will leave one entry for each name.

If that is of any help, I use a dictionary to create the dataframe. The dictionary looks like this:

{
    "a": {
        "tom": [10.0, 30.0],
        "john": [1.0, 5.0]
    },
    "b": {
        "tom": [15.0],
        "john": [2.0, 6.0]
    },
    .....
}

So, basically, I am taking every number in the dictionary then create a row with just this number, and then combine all of the rows.

Is there a simple way to collapse the resulting DataFrame or build a more compact DataFrame given such a dictionary

Asked By: Gabriel

||

Answers:

You can .groupby + .transform (where you "move" the values up). Then drop rows which contain all NaN values:

print(
    df.set_index("Name")
    .groupby(level=0)
    .transform(lambda x: sorted(x, key=lambda k: pd.isna(k)))
    .dropna(axis=0, how="all")
    .reset_index()
)

Prints:

   Name     A     B     C     D
0   tom  10.0  15.0  20.0  25.0
1   tom  30.0   NaN   NaN  40.0
2  john   1.0   2.0   3.0   4.0
3  john   5.0   6.0   7.0   8.0
Answered By: Andrej Kesely
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.