aggregate

Aggregating df columns but not duplicates

Question: Is there a neat way to aggregate columns into a new column without duplicating information? For example, if I have a df:

       Description  Information
    0  text1        text1
    1  text2        text3
    2  text4        text5

And I want to create a new column called ‘Combined’, which aggregates ‘Description’ and ‘Information’ …

Total answers: 5
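
For the column-combining question above, one possible approach (a sketch, not taken from the posted answers) is to join each row's values while dropping repeats. The desired output is cut off in the excerpt, so the `", "` separator and the helper name `combine_unique` are assumptions made for illustration.

```python
import pandas as pd

# Sample frame copied from the question excerpt.
df = pd.DataFrame({
    "Description": ["text1", "text2", "text4"],
    "Information": ["text1", "text3", "text5"],
})


def combine_unique(row):
    """Join a row's values, keeping each distinct value once (hypothetical helper)."""
    # dict.fromkeys preserves first-seen order while discarding duplicates.
    # The ", " separator is an assumption; the excerpt does not show one.
    return ", ".join(dict.fromkeys(row))


df["Combined"] = df[["Description", "Information"]].apply(combine_unique, axis=1)
print(df)
```

Row-wise `apply` is simple to read but slow on large frames; for two columns a vectorised comparison may be a better fit.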

How to count specific rows?

Question: I have an example of a PySpark dataframe:

    X   Y   Z   DATE
    23  41  63  2016-01-01
    23  41  5   2016-01-01
    23  41  75  2016-01-01
    23  41  46  2016-12-01
    23  41  23  2016-12-01
    27  41  5   2016-01-01
    27  41  75  2016-01-01
    27  41  85  2016-01-01
    27  41  71  2016-01-01

What I …

Total answers: 1

Snakemake | Creating an aggregate without specifying a list in expand

Question: My directory structure looks like this:

    path
        parameter_combination_1
            time_average.property1.csv
            time_average.property2.csv
            …
        parameter_combination_2
            time_average.property1.csv
            time_average.property2.csv
            …
        …

I would like to create a rule which aggregates information of all files which carry the time_average …

Total answers: 2

Multiple aggregations on multiple columns in Python polars

Question: Checking out how to implement binning with Python polars, I can easily calculate aggregates for individual columns:

    import polars as pl
    import numpy as np

    t, v = np.arange(0, 100, 2), np.arange(0, 100, 2)
    df = pl.DataFrame({"t": t, "v0": v, "v1": v})
    df = df.with_column((pl.datetime(2022,10,30) + …

Total answers: 1

Groupby aggregate multiple columns with same function

Question: I am trying to do a groupby on this table. I have multiple value columns; some of them I want to sum and some I want to count.

    df_p = pd.DataFrame({
        'H1': ['A1', 'A1', 'C1', 'B1', 'A1', 'C1'],
        'H2': ['X1', 'Y1', 'Z1', 'X1', 'Y1', 'Z1'],
        'V1': [1, 2, 3, 4, 5, 6],
        'V2': [10, 20, 30, 40, 50, 60],
        'V3': [100, 200, 300, 400, 500, 600],
        'V4': [1000, 2000, 3000, 4000, 5000, 6000],
        'V5': [11, 12, 13, 14, 15, 16],
        'V6': [110, 120, 130, 140, 150, 160],
        'V7': [1100, 1200, 1300, 1400, 1500, 1600],
        'V8': [1100, 1200, 1300, 1400, 1500, 1600],
    })

I am trying to achieve that like this, which …

Total answers: 1
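
For the groupby question above, a common pattern is to build the aggregation mapping from two column lists and pass it to `agg`. This is a sketch under assumptions, not the posted answer: the excerpt is cut off before it says which columns are summed and which are counted, so the split below (`sum_cols`, `count_cols`) is hypothetical, and only the first four value columns are reproduced to keep it short.

```python
import pandas as pd

# Subset of the frame from the question (V5..V8 omitted for brevity).
df_p = pd.DataFrame({
    "H1": ["A1", "A1", "C1", "B1", "A1", "C1"],
    "H2": ["X1", "Y1", "Z1", "X1", "Y1", "Z1"],
    "V1": [1, 2, 3, 4, 5, 6],
    "V2": [10, 20, 30, 40, 50, 60],
    "V3": [100, 200, 300, 400, 500, 600],
    "V4": [1000, 2000, 3000, 4000, 5000, 6000],
})

# Hypothetical split: which columns get which function is not stated
# in the excerpt.
sum_cols = ["V1", "V2"]
count_cols = ["V3", "V4"]

# One dict entry per column, so each function is named only once.
agg_map = {**{c: "sum" for c in sum_cols}, **{c: "count" for c in count_cols}}

result = df_p.groupby(["H1", "H2"], as_index=False).agg(agg_map)
print(result)
```

Building the mapping from lists keeps the call readable when many columns share the same function.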

How do I aggregate 5 random samples of a dict?

Question: I’m trying to sample 5 items from a dict then aggregate them by sum. In this case by building a sum of all 5 prices. How do I do that?

    import random

    mydict = {
        1: {"name":"item1","price": 16},
        2: {"name":"item2","price": 14},
        3: {"name":"item3","price": 16},
        …

Total answers: 3
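
For the dict-sampling question above, a minimal sketch (not taken from the posted answers) is to draw five values with `random.sample` and sum their prices. The dict in the excerpt is truncated after three entries, so entries 4 to 6 below are made up purely so that a sample of five is possible.

```python
import random

mydict = {
    1: {"name": "item1", "price": 16},
    2: {"name": "item2", "price": 14},
    3: {"name": "item3", "price": 16},
    # Entries below are hypothetical; the original dict is cut off here.
    4: {"name": "item4", "price": 12},
    5: {"name": "item5", "price": 18},
    6: {"name": "item6", "price": 20},
}

# random.sample needs a sequence, so the dict values are copied into a list.
# It draws without replacement, so no item is picked twice.
sampled = random.sample(list(mydict.values()), 5)

# Sum the "price" field across the five sampled items.
total = sum(item["price"] for item in sampled)
print([item["name"] for item in sampled], total)
```

`random.sample` raises `ValueError` if the dict holds fewer than five items, so a length check may be worth adding for real data.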

Group by and aggregate the values in pandas dataframe

Question: I have the following dataframe in Python:

        meddra_id  meddra_label           soc       cross_ref                        soc_term
    2   10000081   Abdominal pain         10017947  http://snomed.info/id/21522001   Gastrointestinal disorders
    3   10017999   Gastrointestinal pain  10017947  http://snomed.info/id/21522001   Gastrointestinal disorders
    15  10000340   Abstains from alcohol  10041244  http://snomed.info/id/105542008  Social circumstances
    35  10001022   Acute psychosis        10037175  http://snomed.info/id/69322001   Psychiatric disorders
    36  …

Total answers: 1

groupby in pandas with custom function over a subset of rows in each group

Question: I have a pandas DataFrame of the following format: Input: X [OTHER_COLUMNS] version branch v1 overall 2475.0 -1 . A 1712.5 1 . B 257.5 2 . C 392.5 2 D 112.5 3 v2 overall 2475.0 -1 A 2341.5 1 …

Total answers: 1

Pandas aggregation function: Merge text rows, but insert spaces between them?

Question: I managed to group rows in a dataframe, given one column (id). The problem is that one column consists of parts of sentences, and when I add them together, the spaces are missing. An example probably makes it easier to understand… My dataframe …

Total answers: 1
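
For the text-merging question above, one way to keep the spaces is to aggregate with `" ".join` instead of `sum`. This is a sketch, not the posted answer: the asker's dataframe is not shown in the excerpt, so the sample rows and the column name `text` are invented for illustration (only the `id` column is mentioned in the question).

```python
import pandas as pd

# Hypothetical data: sentence fragments that share an id.
df = pd.DataFrame({
    "id": [1, 1, 2],
    "text": ["The quick", "brown fox", "Hello"],
})

# " ".join receives each group's strings and inserts one space between them,
# whereas summing strings concatenates them with no separator.
merged = df.groupby("id")["text"].agg(" ".join).reset_index()
print(merged)
```

If the column can contain missing values, they need to be dropped or converted to strings first, since `str.join` accepts only strings.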

aggregate based on if column values exist

Question: I have a dataframe and would like to aggregate based on whether certain values exist in the column Result. So if for any index1 and index2 there is an A, then my total_result column should be A. If there is no A but there is a …

Total answers: 1