pandas-groupby

From Dummy to a List pandas

Question: I have a dataframe with many dummy variables. Instead of many separate dummy columns, I want a single column in which each row contains a string listing only the dummy variables equal to 1. index a b c 0 1 1 1 1 0 …

Total answers: 2
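
A minimal sketch of one way to approach the dummy-variable question above, assuming 0/1 integer columns named a, b and c as in the excerpt. The sample values, the comma separator, and the output column name "active" are illustrative assumptions, not taken from the question.

```python
import pandas as pd

# Hypothetical sample data: three 0/1 dummy columns.
df = pd.DataFrame({"a": [1, 0, 1], "b": [1, 0, 0], "c": [1, 1, 0]})
dummy_cols = ["a", "b", "c"]

# For each row, join the names of the columns whose value is 1.
df["active"] = df[dummy_cols].apply(
    lambda row: ",".join(col for col in dummy_cols if row[col] == 1),
    axis=1,
)
print(df)
```

A row-wise apply is easy to read but runs a Python function per row, so it may be slow on very large frames.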

Groupby sample pandas with keeping the groups lower than n if applicable

Question: I have a dataset that I want to sample after a groupby. In general this can be achieved with df.groupby("some_id").sample(n=100). The problem is that some groups have fewer than 100 samples (and yes, replace=True is an option, but what …

Total answers: 3
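
A possible sketch for the sampling question above: shuffle all rows once, then keep at most n rows per group, so that groups smaller than n are kept whole. The sample data, n = 3, and the fixed random_state are assumptions chosen for illustration; some_id is the column name used in the question.

```python
import pandas as pd

# Hypothetical data: group "x" has 5 rows, group "y" has only 2.
df = pd.DataFrame({"some_id": ["x"] * 5 + ["y"] * 2, "value": range(7)})
n = 3

# Shuffle the whole frame, then take the first n rows of each group.
# Groups with fewer than n rows contribute all of their rows.
sampled = df.sample(frac=1, random_state=0).groupby("some_id").head(n)
print(sampled.groupby("some_id").size())
```

Because the shuffle is uniform, the first n rows of each group form a random sample without replacement from that group.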

Filter non-duplicated records in Python-pandas, based on group-by column and row-level comparison

Question: This is a complicated issue that I have not been able to figure out, and I would really appreciate your help. The dataframe below is generated with the pandas function DataFrame.duplicated(): based on 'Loc' (the group-by column) and 'Category', repeated records are marked as …

Total answers: 2

Python Pandas shift by given value in cell within groupby

Question: Given the following dataframe df = pd.DataFrame(data={'name': ['a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c'], 'lag': [1, 1, 1, 2, 2, 2, 2, 2, 2, 2], 'value': range(10)}) print(df) lag name value 0 1 a 0 1 1 a 1 2 1 …

Total answers: 2
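
The excerpt above is cut off before the desired output, so the following is only a sketch under one reading of the question: shift value within each name group by that group's lag. It assumes lag is constant within a group, as it is in the sample data; the column name "shifted" is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c"],
    "lag": [1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "value": range(10),
})

# Shift each group's values by the lag stored in that group's first row
# (assumption: lag does not vary inside a group).
parts = [
    group["value"].shift(int(group["lag"].iloc[0]))
    for _, group in df.groupby("name")
]

# The pieces keep their original index, so assignment aligns row by row.
df["shifted"] = pd.concat(parts)
print(df)
```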

Convert SAS proc sql to Python(pandas)

Question: I am rewriting some code from SAS to Python using the pandas library. I have the code below and have no idea what to do with it. Can you help me? It is too complicated for me to do correctly. I’ve changed the names of the columns (for encrypt …

Total answers: 3

Faster Way to GroupBy Apply Python Pandas?

Question: How can I make the groupby apply run faster, or how can I write it a different way? import numpy as np import pandas as pd df = pd.DataFrame({'ID':[1,1,1,1,1,2,2,2,2,2], 'value':[1,2,np.nan,3,np.nan,1,2,np.nan,4,np.nan]}) result = df.groupby("ID").apply(lambda x: len(x[x['value'].notnull()].index) if((len(x[x['value']==1].index)>=1)& (len(x[x['value']==4].index)==0)) else 0) output: Index 0 1 3 2 0 My …

Total answers: 2
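
One way the lambda in the question above could be rewritten without a per-group Python function is to compute each condition as its own grouped aggregate. This is a sketch that reproduces the logic visible in the excerpt (count non-null values when the group contains a 1 and no 4, otherwise 0); the intermediate variable names are made up here, and any speed-up would need to be measured on real data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "value": [1, 2, np.nan, 3, np.nan, 1, 2, np.nan, 4, np.nan],
})

# Number of non-null values per ID (count() skips NaN).
non_null = df.groupby("ID")["value"].count()

# Per-ID flags: does the group contain a 1? does it contain a 4?
has_one = df["value"].eq(1).groupby(df["ID"]).any()
has_four = df["value"].eq(4).groupby(df["ID"]).any()

# Keep the count only where the condition holds, otherwise use 0.
result = non_null.where(has_one & ~has_four, 0)
print(result)
```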

Split one excel file into multiple with specific number of rows in Pandas

Question: Let’s say I have an Excel file with 101 rows. I need to split it and write it into 11 Excel files of 10 rows each, except the last one, which holds the single remaining row. This …

Total answers: 2
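
A rough sketch of the splitting step described in the question above. The helper names split_rows and write_chunks and the "part" file prefix are hypothetical, and the in-memory frame stands in for the result of reading the real workbook. Writing .xlsx files with DataFrame.to_excel needs an Excel engine such as openpyxl to be installed, so the write helper is defined but not called here.

```python
import pandas as pd


def split_rows(df, rows_per_file):
    """Return consecutive slices of df with at most rows_per_file rows each."""
    return [
        df.iloc[start:start + rows_per_file]
        for start in range(0, len(df), rows_per_file)
    ]


def write_chunks(chunks, prefix="part"):
    """Write each chunk to its own workbook: part_1.xlsx, part_2.xlsx, ..."""
    for number, chunk in enumerate(chunks, start=1):
        chunk.to_excel(f"{prefix}_{number}.xlsx", index=False)


# Stand-in for the 101-row sheet; in practice this would come from pd.read_excel.
df = pd.DataFrame({"row": range(101)})
chunks = split_rows(df, 10)
print(len(chunks), len(chunks[-1]))
```

Calling write_chunks(chunks) would then produce one file per chunk in the current directory.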