Pandas dataframe groupby and aggregate with conditions

Question:

Hi, is there a way to group my dataframe by specific columns and keep an empty value as well, but only when all of the values in that column are empty?

Example:
I have a dataframe that looks like this:
enter image description here

I am trying to group the dataframe by Name and Subject,
and my expected output looks like this:
enter image description here

So, if a person takes more than one subject and one of them is empty, drop that row so it is not included when the other rows are aggregated. If a person takes only one subject and it is empty, don't drop the row.

[Updated]
Original dataframe
enter image description here

The outcome will still be the same. It will take the first row's values if all subjects of a person are empty.

[Updated] Another new dataframe

enter image description here

The outcome will have the same number of subjects, but there will be 3 years.
enter image description here

Asked By: cyntha

||

Answers:

Here is a proposal using GroupBy.agg:

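# Keep one row per (ID, Name, Subject); repeated empty subjects collapse to the first row
df = df.drop_duplicates(subset=["ID", "Name", "Subject"])

# True for an empty Subject in a group that has more than one row
m = (df.groupby(["ID", "Name"])["Subject"].transform("size").gt(1)
     & df["Subject"].isnull())

# Drop the masked rows, then collect the remaining columns into lists per group
out = df.loc[~m].groupby(["ID", "Name"], as_index=False).agg(list)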

Output :

print(out)

   ID Name          Subject    Year
0   1   CC  [Math, English]  [1, 3]
1   2   DD        [Physics]     [2]
2   3   EE      [Chemistry]     [1]
3   4   FF            [nan]     [0]
4   5   GG            [nan]     [0]
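
To try this end to end, here is a minimal self-contained sketch. The input rows are hypothetical: the question's data was only posted as images, so this sample is an assumption, chosen to include a person with one empty subject among several (CC) and people whose subjects are all empty (FF, GG).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real rows were only shown as images.
df = pd.DataFrame({
    "ID":      [1, 1, 1, 2, 3, 4, 5, 5],
    "Name":    ["CC", "CC", "CC", "DD", "EE", "FF", "GG", "GG"],
    "Subject": ["Math", np.nan, "English", "Physics", "Chemistry",
                np.nan, np.nan, np.nan],
    "Year":    [1, 2, 3, 2, 1, 0, 0, 1],
})

# Keep one row per (ID, Name, Subject); NaN subjects count as equal here,
# so a person whose subjects are all empty keeps only the first row.
df = df.drop_duplicates(subset=["ID", "Name", "Subject"])

# True for an empty Subject inside a group that still has more than one row.
m = (df.groupby(["ID", "Name"])["Subject"].transform("size").gt(1)
     & df["Subject"].isnull())

# Drop the masked rows and collect the remaining values into lists.
out = df.loc[~m].groupby(["ID", "Name"], as_index=False).agg(list)
print(out)
```

With this sample, the empty CC row is dropped before aggregation, while FF and GG each keep a single row with an empty Subject.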
Answered By: Timeless