How do I 'ungroup' a grouped data frame?
Question:
I have a large data frame containing ~100 million observations of stock returns, with a date and an identifier for each specific stock. My problem is that for some stocks, on some dates, there are missing values in the return column. As a small example, the data frame looks something like this:
ID | Date | Return |
---|---|---|
10001 | 1970-01-01 | 0.043 |
10001 | 1970-01-02 | 0.023 |
10001 | 1970-01-03 | -0.010 |
10002 | 1970-01-01 | 0.010 |
10002 | 1970-01-02 | NaN |
10002 | 1970-01-03 | -0.034 |
10003 | 1970-01-01 | 0.040 |
10003 | 1970-01-02 | -0.041 |
10003 | 1970-01-03 | 0.009 |
I wish to remove every stock with one or more missing values in the return column, which I do by:
df = df[df['Return'].notna().groupby(df['ID']).transform('all')]
From the data frame example above, this generates what looks like a grouped data frame, without the stocks that have missing values in the return column:
ID | Date | Return |
---|---|---|
10001 | 1970-01-01 | 0.043 |
 | 1970-01-02 | 0.023 |
 | 1970-01-03 | -0.010 |
10003 | 1970-01-01 | 0.040 |
 | 1970-01-02 | -0.041 |
 | 1970-01-03 | 0.009 |
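As a self-contained sketch of this filtering step (the data frame is reconstructed here from the example above; grouping the mask by the ID column passed as a Series is an assumption about how the original frame is indexed):

```python
import numpy as np
import pandas as pd

# Rebuild the example data frame from the question.
df = pd.DataFrame({
    'ID': [10001] * 3 + [10002] * 3 + [10003] * 3,
    'Date': ['1970-01-01', '1970-01-02', '1970-01-03'] * 3,
    'Return': [0.043, 0.023, -0.010, 0.010, np.nan, -0.034,
               0.040, -0.041, 0.009],
})

# Per row, True when the return is present; grouped per ID,
# transform('all') broadcasts "does this ID have no missing returns?"
# back onto every row, giving a boolean mask aligned with df.
mask = df['Return'].notna().groupby(df['ID']).transform('all')
result = df[mask]
print(result)
```

Note that boolean indexing yields an ordinary DataFrame, not a GroupBy object, so no explicit "ungrouping" step should be needed; the grouped-looking display in the question is just how a MultiIndex is rendered.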
However, I need the data frame to be ungrouped again. Does anyone have a solution for this? Specifically, I want the data frame to look like this:
ID | Date | Return |
---|---|---|
10001 | 1970-01-01 | 0.043 |
10001 | 1970-01-02 | 0.023 |
10001 | 1970-01-03 | -0.010 |
10003 | 1970-01-01 | 0.040 |
10003 | 1970-01-02 | -0.041 |
10003 | 1970-01-03 | 0.009 |
Thank you in advance for any possible solution.
I have already tried using .unstack() and .reset_index(), without any luck.
Answers:
You don't need grouping here; simple boolean indexing with .isin() is enough:
x = df.loc[df['Return'].isna(), 'ID'].unique()
print(df[~df['ID'].isin(x)])
Prints:
ID Date Return
0 10001 1970-01-01 0.043
1 10001 1970-01-02 0.023
2 10001 1970-01-03 -0.010
6 10003 1970-01-01 0.040
7 10003 1970-01-02 -0.041
8 10003 1970-01-03 0.009
For anyone wondering, this can also be fixed by passing group_keys=False to the groupby call:
df = df[df['Return'].notna().groupby(df['ID'], group_keys=False).transform('all')]
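For completeness, pandas also offers `DataFrameGroupBy.filter`, which drops entire groups failing a predicate and concatenates the survivors back into a regular, ungrouped frame. A sketch on the same example data (reconstructed here; column names assumed from the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [10001] * 3 + [10002] * 3 + [10003] * 3,
    'Date': ['1970-01-01', '1970-01-02', '1970-01-03'] * 3,
    'Return': [0.043, 0.023, -0.010, 0.010, np.nan, -0.034,
               0.040, -0.041, 0.009],
})

# Keep only the groups whose returns are all present; filter() returns
# a plain DataFrame with the original row order and index preserved.
clean = df.groupby('ID').filter(lambda g: g['Return'].notna().all())
print(clean)
```

On ~100 million rows, the vectorized mask approaches above will typically be much faster than a Python-level lambda per group, so filter() is best reserved for readability on smaller data.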