How to add a new field with the counts per group criteria in python polars?

Question

I have a small use case and here is a polars dataframe.

df_names = pl.DataFrame({'LN'['Mallesham','Bhavik','Mallesham','Bhavik','Mahesh','Naresh','Sharath','Rakesh','Mallesham'],
                         'FN':['Yamulla','Yamulla','Yamulla','Yamulla','Dayala','Burre','Velmala','Uppu','Yamulla'],
                        'SSN':['123','456','123','456','893','111','222','333','123'],
                        'Address':['A','B','C','D','E','F','G','H','S']})

Here I would like to group on LN,FN,SSN and create a new column in which how many number of observations for this group combination and below is the expected output.

‘Mallesham’,’Yamulla’,’123′ is appeared 3 times, hence LN_FN_SSN_count field is filled up with 3.

Asked By: myamulla_ciencia

||

Source

Answer 1

You can use an expression using over (which is like grouping, aggregating and self-joining in other libs, but without the need for the join):

df_names.with_column(pl.count().over(['LN', 'FN', 'SSN']).alias('LN_FN_SSN_count'))

┌───────────┬─────────┬─────┬─────────┬─────────────────┐
│ LN        ┆ FN      ┆ SSN ┆ Address ┆ LN_FN_SSN_count │
│ ---       ┆ ---     ┆ --- ┆ ---     ┆ ---             │
│ str       ┆ str     ┆ str ┆ str     ┆ u32             │
╞═══════════╪═════════╪═════╪═════════╪═════════════════╡
│ Mallesham ┆ Yamulla ┆ 123 ┆ A       ┆ 3               │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Bhavik    ┆ Yamulla ┆ 456 ┆ B       ┆ 2               │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Mallesham ┆ Yamulla ┆ 123 ┆ C       ┆ 3               │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Bhavik    ┆ Yamulla ┆ 456 ┆ D       ┆ 2               │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ...       ┆ ...     ┆ ... ┆ ...     ┆ ...             │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Naresh    ┆ Burre   ┆ 111 ┆ F       ┆ 1               │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Sharath   ┆ Velmala ┆ 222 ┆ G       ┆ 1               │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Rakesh    ┆ Uppu    ┆ 333 ┆ H       ┆ 1               │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Mallesham ┆ Yamulla ┆ 123 ┆ S       ┆ 3               │
└───────────┴─────────┴─────┴─────────┴─────────────────┘

Answered By: datenzauber.ai

How to add a new field with the counts per group criteria in python polars?

Question:

Answers: