running-count

Pandas cumulative count across different groups

Question: I've got the following DataFrame: df = pd.DataFrame({'A': ['Nadal', 'Federer', 'Djokovic', 'Nadal', 'Nadal', 'Murray', 'Nadal'], 'B': ['Djokovic', 'Nadal', 'Murray', 'Murray', 'Djokovic', 'Federer', 'Murray'], 'Winner': ['Nadal', 'Federer', 'Djokovic', 'Murray', 'Nadal', 'Federer', 'Murray'], 'Loser': ['Djokovic', 'Nadal', 'Murray', 'Nadal', 'Djokovic', 'Murray', 'Nadal']}) And I'd like to create new features based …

Total answers: 1

Running count of rows before a specific date for each group in a dataframe

Question: I have the following Pandas dataframe in Python: ID Date E105 28/4/2021 E105 28/2/2021 E105 23/12/2020 E105 29/11/2020 E076 7/7/2021 E076 20/6/2021 E076 26/5/2021 E076 8/4/2021 E076 3/3/2021 E076 3/2/2021 E076 13/1/2021 E076 23/12/2020 E066 2/6/2021 E066 8/5/2021 E066 8/4/2021 …

Total answers: 1

The most efficient way to number the instance of a value in a Series

Question: I have a dataframe of visits that includes a person ID column, and a given person may have more than one visit. I want to number the visits for a given person. I can sort the dataframe by visit date, …

Total answers: 2
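
For this kind of per-person numbering, one commonly used pattern is `groupby(...).cumcount()`. The sketch below is only an illustration: the column names `person_id` and `visit_date` and the sample rows are assumptions, since the excerpt does not show the asker's actual data.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions, not taken from the question.
visits = pd.DataFrame({
    "person_id": [1, 2, 1, 2, 1],
    "visit_date": pd.to_datetime(
        ["2021-03-01", "2021-01-15", "2021-01-10", "2021-02-20", "2021-02-05"]
    ),
})

# Sort so that the numbering follows visit order within each person.
visits = visits.sort_values(["person_id", "visit_date"]).reset_index(drop=True)

# cumcount() numbers the rows of each group starting at 0; add 1 for 1-based numbering.
visits["visit_number"] = visits.groupby("person_id").cumcount() + 1
```

Because `cumcount()` numbers rows in their current order, the sort step decides what "first visit" means; without it, the numbers would follow the original row order instead.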

Add column with a specific sequence of numbers depending on value

Question: I have this dataframe: df = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'Condition': [False, False, True, False, False, False, False, False, False, False, True, False]}) ID Condition 0 1 False 1 1 False 2 1 …

Total answers: 3

Identify non-unique rows, grouping any pairs

Question: I am trying to figure out a non-looping way to identify (an auto-incrementing int would be ideal) the non-unique groups of rows (a group can contain 1 or more rows) within each TDateID, GroupID combination. Here is an example DataFrame that looks like Index Cents SD_YF TDateID GroupID 10 …

Total answers: 1

Cumulatively count values between range by group in a pandas dataframe

Question: Say I have the following data. For each user_id I want to get a cumulative count every time the difference score is <= -2 until it reaches a positive value. The count should then reset to zero and stay at that value until …

Total answers: 1

For loop on pandas dataframe causing slow performance

Question: I currently have a loop in a script that is designed to process a raw test data file and perform a bunch of calculations on the sanitised data. During the script, I need to figure out exactly how many cycles there are in each test. A …

Total answers: 1

Cumulative count between two specific events in Pandas

Question: I want to count all steps of each user after 'Start' event in Pandas. My dataset: ID Event 1 Start 1 Event1 1 Start 1 Event1 1 Event2 1 Start 2 Start 2 Event1 2 Start I want: ID Event Count 1 Start 1 1 Event1 …

Total answers: 1
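
One possible way to approach this question is to treat every 'Start' row as the beginning of a new block and number the rows inside each block. The sketch below uses the dataset shown in the excerpt; since the expected output is cut off, it assumes that the count restarts at 1 on every 'Start' row and increases by 1 for each following event of the same ID. The helper name `block` is introduced here for illustration only.

```python
import pandas as pd

# Dataset as shown in the question excerpt.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 1, 2, 2, 2],
    "Event": ["Start", "Event1", "Start", "Event1", "Event2",
              "Start", "Start", "Event1", "Start"],
})

# Each 'Start' opens a new block within its ID: a per-ID running total of 'Start' rows.
block = df["Event"].eq("Start").groupby(df["ID"]).cumsum()

# Number rows within each (ID, block) pair; the 'Start' row gets 1 (an assumption).
df["Count"] = df.groupby(["ID", block]).cumcount() + 1
```

If the intended count should instead start at 0 on the 'Start' row, dropping the `+ 1` would give that variant.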

Copying and appending rows to a dataframe with increment to timestamp column by a minute

Question: Here is the dataframe I have: df = pd.DataFrame([[pd.Timestamp(2017, 1, 1, 12, 32, 0), 2, 3], [pd.Timestamp(2017, 1, 2, 12, 32, 0), 4, 9]], columns=['time', 'feature1', 'feature2']) For every timestamp value found in the df (i.e. for every value …

Total answers: 2