group-by

Pandas Intersection after grouping based on common errors between each group

Question: I have the following dataframe: I want to find intersections based on ID, that is, the IDs that consistently have errors in every Run and therefore repeat in all Runs. I tried to group the data by Run first, then as per this similar question. …

Total answers: 1
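A minimal sketch of one way to approach this, not taken from the answer itself. The question's dataframe is not shown in the excerpt, so the column names `Run` and `ID` and all the values below are assumptions made up for illustration: collect the set of IDs per Run, then intersect those sets.

```python
import pandas as pd

# Hypothetical data: each row is one error, logged with its Run and ID.
df = pd.DataFrame({
    "Run": [1, 1, 1, 2, 2, 3, 3],
    "ID": ["a", "b", "c", "a", "b", "a", "c"],
})

# One set of IDs per Run.
ids_per_run = df.groupby("Run")["ID"].apply(set)

# IDs that appear in every Run.
common_ids = set.intersection(*ids_per_run)

# Keep only the rows whose ID is common to all Runs.
common_rows = df[df["ID"].isin(common_ids)]
```

With this made-up data only `"a"` occurs in all three Runs, so `common_rows` keeps the three rows for that ID.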

Frequency rolling count with groupby, Pandas

Question: I’m trying to get the frequency count of a groupby that is grouped by name and date. I am having trouble figuring out how to do a 3-day rolling count prior to the current day. Example: on 2022-01-05, John’s 3-day range is 2022-01-05 and 2022-01-01 with …

Total answers: 1

Ranking using multiple columns within groups allowing for tied ranks in Pandas

Question: Intro and problem How can I rank observations within groups where the ranking is based on more than just one column and where the ranking allows for tied ranks? I know how to calculate aggregated group-level statistics using the groupby() method and …

Total answers: 1

Count consecutive boolean values in Python/pandas array for whole subset

Question: I am looking for a way to aggregate a pandas DataFrame by runs of consecutive equal values and perform actions like count or max on this aggregation. For example, if I had one column in df: my_column 0 0 1 0 2 1 3 1 …

Total answers: 2
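A minimal sketch of a common idiom for this kind of task, not necessarily what either answer proposes. It uses only the four rows visible in the excerpt (the rest of the column is truncated), and the names `run_id` and `runs` are illustrative: a new run starts wherever a value differs from the one before it, so a cumulative sum of those change points labels each run.

```python
import pandas as pd

# Only the four rows visible in the excerpt; the original column is longer.
df = pd.DataFrame({"my_column": [0, 0, 1, 1]})

# True where the value differs from the previous row; the cumulative sum
# then gives every run of equal consecutive values its own label.
changed = df["my_column"] != df["my_column"].shift()
run_id = changed.cumsum().rename("run_id")

# Aggregate per run: the repeated value and how many rows the run spans.
runs = (
    df.groupby(run_id)["my_column"]
    .agg(value="first", length="size")
    .reset_index(drop=True)
)

# Length of the longest run in the column.
longest_run = runs["length"].max()
```

Grouping by `run_id` rather than by the value itself keeps separate runs of the same value apart, which a plain `groupby("my_column")` would merge.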

Pandas cumulative count across different groups

Question: I’ve got the following DataFrame: df = pd.DataFrame({'A': ['Nadal', 'Federer', 'Djokovic', 'Nadal', 'Nadal', 'Murray', 'Nadal'], 'B': ['Djokovic', 'Nadal', 'Murray', 'Murray', 'Djokovic', 'Federer', 'Murray'], 'Winner': ['Nadal', 'Federer', 'Djokovic', 'Murray', 'Nadal', 'Federer', 'Murray'], 'Loser': ['Djokovic', 'Nadal', 'Murray', 'Nadal', 'Djokovic', 'Murray', 'Nadal']}) And I’d like to create new features based …

Total answers: 1

Group and multiply columns with conditions

Question: I’m trying to multiply two columns until I get a desired value (8), but I need to group first. I also need to keep the first multiplication if the value is already under the desired value (this is the problematic part). MPRO ID Nuevo_I Nuevo_P 1 ID1 5 3 1 ID1 …

Total answers: 1

Pandas: How to improve performance when comparing rows inside groups

Question: I have written a Python program to compare rows inside groups, but the performance is poor. The data come from a Change Data Capture system. For every change, there is a Sequence id and an Operation number. For an Update operation, there are two …

Total answers: 1

pandas apply groupby and agg, and update orig columns

Question: I have a dataframe df: Group1 Group2 Val 0 1 Q 2 1 1 Q 3 2 2 R 8 3 4 Y 9 I want to update df with a list of values per group, so the new df will be Group Group2 Val new 0 …

Total answers: 2

Very slow aggregate on Pandas 2.0 dataframe with pyarrow as dtype_backend

Question: Let’s say I have the following dataframe: Code Price AA1 10 AA1 20 BB2 30 And I want to perform the following operation on it: df.groupby("code").aggregate({ "price": "sum" }) I have tried playing with the new pyarrow dtypes introduced in Pandas 2.0 and …

Total answers: 1