Python Polars Window Function With Literal Type
Question:
Say I have a DataFrame with an id column like this:
┌─────┐
│ id  │
│ --- │
│ i64 │
╞═════╡
│ 1   │
│ 1   │
│ 1   │
│ 2   │
│ 2   │
│ 3   │
│ 3   │
└─────┘
I want to aggregate a running count over the id column, giving this result:
┌─────┬───────┐
│ id  ┆ count │
│ --- ┆ ---   │
│ i64 ┆ i64   │
╞═════╪═══════╡
│ 1   ┆ 1     │
│ 1   ┆ 2     │
│ 1   ┆ 3     │
│ 2   ┆ 1     │
│ 2   ┆ 2     │
│ 3   ┆ 1     │
│ 3   ┆ 2     │
└─────┴───────┘
My attempt creates a dummy column. I think it produces the desired result, but it seems a bit hacky:
(
    df.with_columns(
        pl.lit(1).alias("ones")
    )
    .with_columns(
        (pl.col("ones").cumsum().over("id")).alias("count")
    )
    .drop("ones")
)
However, when I try this:
(
    df.with_columns(
        (pl.lit(1).cumsum().over("id")).alias("count")
    )
)
I get the error "ComputeError: the length of the window expression did not match that of the group".
Is there a better way to do this? What am I missing in my attempt above?
Answers: