pandas-groupby

Pandas get frequency of item occurrences in a column as percentage

Question: I want to get the percentage of a particular value in a df column. Say I have a df with columns (col1, col2, col3, gender), where the gender column has values of M, F, or Other. I want to get the percentage of M, F, …

Total answers: 5
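A minimal sketch for the question above, assuming a small made-up frame (only the `gender` column and its values M, F, Other come from the question). `Series.value_counts(normalize=True)` returns relative frequencies, which can be scaled to percentages:

```python
import pandas as pd

# Hypothetical sample data; only the column name and its values follow the question.
df = pd.DataFrame({"gender": ["M", "F", "M", "Other", "M", "F", "M", "F"]})

# normalize=True gives fractions that sum to 1; multiply by 100 for percentages.
pct = df["gender"].value_counts(normalize=True) * 100
print(pct)
```

Note that `value_counts` drops missing values by default, so the percentages are relative to the non-null rows unless `dropna=False` is passed.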

How to keep original index of a DataFrame after groupby 2 columns?

Question: Is there any way I can retain the original index of my large dataframe after I perform a groupby? The reason I need to do this is that I need to do an inner merge back to my original df (after my groupby) …

Total answers: 4

Pandas groupby with categories with redundant nan

Question: I am having issues using pandas groupby with categorical data. Theoretically, it should be super efficient: you are grouping and indexing via integers rather than strings. But it insists that, when grouping by multiple categories, every combination of categories must be accounted for. I sometimes use categories …

Total answers: 6
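The excerpt above is truncated, so this sketch assumes the issue is the empty rows produced for category combinations that never occur in the data. The `observed` parameter of `groupby` controls this; the frame and column names below are hypothetical:

```python
import pandas as pd

# Hypothetical data: 'a' has an unused category 'z', so 3 x 2 = 6 combinations exist.
df = pd.DataFrame({
    "a": pd.Categorical(["x", "x", "y"], categories=["x", "y", "z"]),
    "b": pd.Categorical(["p", "q", "p"], categories=["p", "q"]),
    "value": [1, 2, 3],
})

# observed=False keeps every combination of categories, including unseen ones.
all_combos = df.groupby(["a", "b"], observed=False)["value"].sum()

# observed=True keeps only the combinations that actually appear in the data.
seen_only = df.groupby(["a", "b"], observed=True)["value"].sum()

print(len(all_combos), len(seen_only))
```

Passing `observed` explicitly avoids depending on the default, which has differed between pandas versions.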

How to groupby().transform() to value_counts() in pandas?

Question: I am processing a pandas dataframe df1 with prices of items.
     Item  Price  Minimum  Most_Common_Price
0  Coffee      1        1                  2
1  Coffee      2        1                  2
2  Coffee      2        1                  2
3     Tea      3        3                  4
4     Tea      4        3                  4
5     Tea      4        3                  4
I create …

Total answers: 3
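One possible way to produce the two derived columns shown in the question, sketched under assumptions: the excerpt does not show how `Minimum` was built, so `transform("min")` here is a guess, and the lambda for the most common price is just one option (ties resolve to whichever value `value_counts` lists first):

```python
import pandas as pd

# Item and Price values are taken from the table in the question.
df1 = pd.DataFrame({
    "Item": ["Coffee", "Coffee", "Coffee", "Tea", "Tea", "Tea"],
    "Price": [1, 2, 2, 3, 4, 4],
})

prices = df1.groupby("Item")["Price"]

# transform broadcasts each group's result back onto the original rows.
df1["Minimum"] = prices.transform("min")

# value_counts() sorts by frequency, so idxmax() picks the most frequent price.
df1["Most_Common_Price"] = prices.transform(lambda s: s.value_counts().idxmax())

print(df1)
```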

Python Pandas groupby apply lambda arguments

Question: In a Coursera video about Python Pandas groupby (in the Introduction to Data Science in Python course) the following example is given: df.groupby('Category').apply(lambda df,a,b: sum(df[a] * df[b]), 'Weight (oz.)', 'Quantity') Where df is a DataFrame, and the lambda is applied to calculate the sum of the product of two columns. If …

Total answers: 2
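A small sketch of how the extra arguments reach the lambda, assuming made-up data (only the column names come from the question). `GroupBy.apply` forwards any positional arguments after the function to that function, so each group's sub-frame arrives as the first parameter and the two column names as `a` and `b`:

```python
import pandas as pd

# Hypothetical values; the column names follow the example in the question.
df = pd.DataFrame({
    "Category": ["A", "A", "B"],
    "Weight (oz.)": [1.0, 2.0, 3.0],
    "Quantity": [10, 20, 30],
})

# Selecting the value columns first keeps the grouping column out of each sub-frame.
totals = df.groupby("Category")[["Weight (oz.)", "Quantity"]].apply(
    lambda g, a, b: (g[a] * g[b]).sum(),  # g is one group's DataFrame
    "Weight (oz.)",                       # passed through as a
    "Quantity",                           # passed through as b
)

print(totals)
```

The sketch uses the vectorized `.sum()` rather than the built-in `sum` from the quoted example; both give the same total here.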

When is it appropriate to use df.value_counts() vs df.groupby('…').count()?

Question: I’ve heard that in Pandas there are often multiple ways to do the same thing, but I was wondering: if I’m trying to group data by a value within a specific column and count the number of items with that value, when does it make sense …

Total answers: 4
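To make the comparison concrete, here is a short sketch on hypothetical data (the column names `key` and `val` are invented for illustration). It shows three counting idioms and how they treat missing values:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in 'val'.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "b"],
    "val": [1.0, np.nan, 3.0, 4.0, 5.0],
})

# Rows per key, sorted by frequency (most common first).
vc = df["key"].value_counts()

# Non-null entries per key for every remaining column, sorted by key.
cnt = df.groupby("key").count()

# Rows per key regardless of nulls, sorted by key.
size = df.groupby("key").size()

print(vc, cnt, size, sep="\n\n")
```

In this sketch `value_counts()` and `groupby(...).size()` agree on the totals and differ only in ordering, while `groupby(...).count()` reports 1 for key `a` because it skips the missing `val`.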

Subset pandas dataframe up to when condition is met the first time

Question: I have not had any luck accomplishing a task where I want to subset a pandas dataframe up to a value, grouping by id. In the actual dataset I have several columns in between 'id' and 'status'. For example: d …

Total answers: 2

Pandas, groupby and count

Question: I have a dataframe, say like this:
>>> df = pd.DataFrame({'user_id':['a','a','s','s','s'], 'session':[4,5,4,5,5], 'revenue':[-1,0,1,2,1]})
>>> df
   revenue  session user_id
0       -1        4       a
1        0        5       a
2        1        4       s
3        2        5       s
4        1        5       s
And each value of session and revenue represents a kind of type, …

Total answers: 3

pandas: GroupBy .pipe() vs .apply()

Question: In the example from the pandas documentation about the new .pipe() method for GroupBy objects, an .apply() method accepting the same lambda would return the same results.
In [195]: import numpy as np
In [196]: n = 1000
In [197]: df = pd.DataFrame({'Store': np.random.choice(['Store_1', 'Store_2'], n),
   .....:                    'Product': np.random.choice(['Product_1', …

Total answers: 2