What is the equivalent of `DataFrame.drop_duplicates()` from pandas in polars?

Question

What is the equivalent of drop_duplicates() from pandas in polars?

import polars as pl
df = pl.DataFrame({"a":[1,1,2], "b":[2,2,3], "c":[1,2,3]})
df

Output:

shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 1   ┆ 2   ┆ 1   │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
│ 1   ┆ 2   ┆ 2   │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
│ 2   ┆ 3   ┆ 3   │
└─────┴─────┴─────┘

Code:

df.drop_duplicates(["a", "b"])

This raises the following error:

AttributeError: drop_duplicates not found

Asked By: keiv.fly

Answer 1

The equivalent method in polars is `.unique()`. Its `subset` argument selects the columns that define a duplicate.

import polars as pl
df = pl.DataFrame({"a":[1,1,2], "b":[2,2,3], "c":[1,2,3]})
df.unique(subset=["a","b"])

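This returns one row per unique combination of `a` and `b`. In recent polars versions, which of the duplicate rows is kept, and the order of the result, are only guaranteed if you pass `keep=` and `maintain_order=True`: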

shape: (2, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 1   ┆ 2   ┆ 1   │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
│ 2   ┆ 3   ┆ 3   │
└─────┴─────┴─────┘

Answered By: keiv.fly

Answer 2

It has been renamed to `.unique()`.

See the Polars documentation.

Answered By: Claus8528
