pandas-groupby | Page 3

Pandas get frequency of item occurrences in a column as percentage

Question: I want to get the percentage of a particular value in a df column. Say I have a df with columns (col1, col2, col3, gender), where the gender column has values of M, F, or Other. I want to get the percentage of M, F, …

Total answers: 5
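
For the percentage question above, one possible approach is `value_counts(normalize=True)`. This is a minimal sketch, not the accepted answer; the sample rows are made up for illustration, and only the `gender` column name and its values come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name and its values (M, F, Other)
# are taken from the question.
df = pd.DataFrame({"gender": ["M", "F", "F", "Other", "M", "F", "M", "F"]})

# normalize=True returns relative frequencies (fractions that sum to 1);
# multiplying by 100 turns them into percentages.
percentages = df["gender"].value_counts(normalize=True) * 100

print(percentages)
```

The result is a Series indexed by the distinct values, so a single percentage can be looked up with `percentages["F"]`.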

How to keep original index of a DataFrame after groupby 2 columns?

Question: Is there any way I can retain the original index of my large dataframe after I perform a groupby? The reason I need to do this is that I need to do an inner merge back to my original df (after my groupby) …

Total answers: 4

Pandas groupby with categories with redundant nan

Question: I am having issues using pandas groupby with categorical data. Theoretically, it should be super efficient: you are grouping and indexing via integers rather than strings. But it insists that, when grouping by multiple categories, every combination of categories must be accounted for. I sometimes use categories …

Total answers: 6
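
The behavior described in the categorical question above can be reproduced, and possibly avoided, with the `observed` parameter of `groupby`. The sketch below uses hypothetical column names (`a`, `b`, `v`) and made-up data; it passes `observed` explicitly because the default has differed between pandas versions.

```python
import pandas as pd

# Hypothetical data: two categorical key columns and one value column.
df = pd.DataFrame({
    "a": pd.Categorical(["x", "y"]),
    "b": pd.Categorical(["p", "q"]),
    "v": [1.0, 2.0],
})

# observed=False keeps every combination of categories, so the combinations
# that never occur in the data (x/q and y/p) appear with NaN means.
all_combos = df.groupby(["a", "b"], observed=False)["v"].mean()

# observed=True keeps only the combinations actually present in the data.
observed_only = df.groupby(["a", "b"], observed=True)["v"].mean()

print(all_combos)
print(observed_only)
```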

How to groupby().transform() to value_counts() in pandas?

Question: I am processing a pandas dataframe df1 with prices of items.

         Item  Price  Minimum  Most_Common_Price
    0  Coffee      1        1                  2
    1  Coffee      2        1                  2
    2  Coffee      2        1                  2
    3     Tea      3        3                  4
    4     Tea      4        3                  4
    5     Tea      4        3                  4

I create …

Total answers: 3
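
The table in the question above can be rebuilt from its `Item` and `Price` columns. The following is only a sketch of one way to derive the other two columns; the lambda using `value_counts().idxmax()` is an assumption about a workable approach, not the asker's code or any specific answer.

```python
import pandas as pd

# Item and Price values copied from the table in the question.
df1 = pd.DataFrame({
    "Item": ["Coffee", "Coffee", "Coffee", "Tea", "Tea", "Tea"],
    "Price": [1, 2, 2, 3, 4, 4],
})

prices_by_item = df1.groupby("Item")["Price"]

# transform broadcasts one value per group back onto every row of that group.
df1["Minimum"] = prices_by_item.transform("min")

# value_counts() sorts by frequency, so idxmax() picks the most frequent price.
df1["Most_Common_Price"] = prices_by_item.transform(
    lambda s: s.value_counts().idxmax()
)

print(df1)
```

If two prices are equally frequent within a group, this sketch keeps whichever one `idxmax()` encounters first.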

Python Pandas groupby apply lambda arguments

Question: In a Coursera video about Python Pandas groupby (in the Introduction to Data Science in Python course) the following example is given: df.groupby('Category').apply(lambda df, a, b: sum(df[a] * df[b]), 'Weight (oz.)', 'Quantity') Where df is a DataFrame, and the lambda is applied to calculate the sum of two columns. If …

Total answers: 2

When is it appropriate to use df.value_counts() vs df.groupby('…').count()?

Question: I’ve heard that in Pandas there are often multiple ways to do the same thing, but I was wondering: if I’m trying to group data by a value within a specific column and count the number of items with that value, when does it make sense …

Total answers: 4
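
One concrete difference relevant to the question above is how missing values are counted. The sketch below uses hypothetical columns (`color`, `size`) and made-up data: `value_counts()` on a column counts rows per value, while `groupby(...).count()` counts non-null entries in each remaining column, and `groupby(...).size()` counts rows per group.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in the non-key column.
df = pd.DataFrame({
    "color": ["red", "blue", "red", "red"],
    "size": [1.0, np.nan, 3.0, np.nan],
})

# Rows per distinct value of "color", sorted by frequency.
vc = df["color"].value_counts()

# Non-null entries per group, reported for each remaining column.
gc = df.groupby("color").count()

# Rows per group, regardless of missing values.
gs = df.groupby("color").size()

print(vc, gc, gs, sep="\n\n")
```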

Pandas groupby and aggregation output should include all the original columns (including the ones not aggregated on)

Question: I have the following data frame and want to: (1) group records by month; (2) sum QTY_SOLD and NET_AMT of each unique UPC_ID (per month); (3) include the rest of the columns as well in the resulting dataframe. The way I thought …

Total answers: 2

Subset pandas dataframe up to when condition is met the first time

Question: I have not had any luck accomplishing a task where I want to subset a pandas dataframe up to a value, grouping by id. In the actual dataset I have several columns in between 'id' and 'status'. For example: d …

Total answers: 2

Pandas, groupby and count

Question: I have a dataframe, say like this:

    >>> df = pd.DataFrame({'user_id': ['a', 'a', 's', 's', 's'], 'session': [4, 5, 4, 5, 5], 'revenue': [-1, 0, 1, 2, 1]})
    >>> df
       revenue  session user_id
    0       -1        4       a
    1        0        5       a
    2        1        4       s
    3        2        5       s
    4        1        5       s

And each value of session and revenue represents a kind of type, …

Total answers: 3

pandas: GroupBy .pipe() vs .apply()

Question: In the example from the pandas documentation about the new .pipe() method for GroupBy objects, an .apply() method accepting the same lambda would return the same results.

    In [195]: import numpy as np
    In [196]: n = 1000
    In [197]: df = pd.DataFrame({'Store': np.random.choice(['Store_1', 'Store_2'], n),
       .....:                    'Product': np.random.choice(['Product_1', …

Total answers: 2