Implement qcut functionality using polars

Question:

I have been using polars, but it seems to lack the qcut functionality that pandas has.

I am not sure why, but is it possible to achieve the same effect as pandas qcut using the functionality currently available in polars?

The following example shows what I can do with pandas qcut.

import pandas as pd

data = pd.Series([11, 1, 2, 2, 3, 4, 5, 1, 2, 3, 4, 5])
pd.qcut(data, [0, 0.2, 0.4, 0.6, 0.8, 1], labels=['q1', 'q2', 'q3', 'q4', 'q5'])

The results are as follows:

0     q5
1     q1
2     q1
3     q1
4     q3
5     q4
6     q5
7     q1
8     q1
9     q3
10    q4
11    q5
dtype: category

So I am curious: how can I get the same result using polars?

Thanks for your help.

Asked By: lebesgue

Answers:

Update:
Series.qcut was added in polars version 0.16.15:

import polars as pl

data = pl.Series([11, 1, 2, 2, 3, 4, 5, 1, 2, 3, 4, 5])

data.qcut([0.2, 0.4, 0.6, 0.8], labels=['q1', 'q2', 'q3', 'q4', 'q5'], maintain_order=True)
shape: (12, 3)
┌──────┬─────────────┬──────────┐
│      ┆ break_point ┆ category │
│ ---  ┆ ---         ┆ ---      │
│ f64  ┆ f64         ┆ cat      │
╞══════╪═════════════╪══════════╡
│ 11.0 ┆ inf         ┆ q5       │
│ 1.0  ┆ 2.0         ┆ q1       │
│ 2.0  ┆ 2.0         ┆ q1       │
│ 2.0  ┆ 2.0         ┆ q1       │
│ …    ┆ …           ┆ …        │
│ 2.0  ┆ 2.0         ┆ q1       │
│ 3.0  ┆ 3.6         ┆ q3       │
│ 4.0  ┆ 4.8         ┆ q4       │
│ 5.0  ┆ inf         ┆ q5       │
└──────┴─────────────┴──────────┘

Old answer:

From what I can tell, pandas .qcut() takes the linearly interpolated quantile of the data at each of the given bin values and uses those as the bin edges?

If so, you could implement that part "manually":

import polars as pl

data = pl.Series([11, 1, 2, 2, 3, 4, 5, 1, 2, 3, 4, 5])
bins = [0.2, 0.4, 0.6, 0.8]
labels = ["q1", "q2", "q3", "q4", "q5"]

pl.cut(data, bins=[data.quantile(val, "linear") for val in bins], labels=labels)
shape: (12, 3)
┌──────┬─────────────┬──────────┐
│      ┆ break_point ┆ category │
│ ---  ┆ ---         ┆ ---      │
│ f64  ┆ f64         ┆ cat      │
╞══════╪═════════════╪══════════╡
│ 1.0  ┆ 2.0         ┆ q1       │
│ 1.0  ┆ 2.0         ┆ q1       │
│ 2.0  ┆ 2.0         ┆ q1       │
│ 2.0  ┆ 2.0         ┆ q1       │
│ 2.0  ┆ 2.0         ┆ q1       │
│ 3.0  ┆ 3.6         ┆ q3       │
│ 3.0  ┆ 3.6         ┆ q3       │
│ 4.0  ┆ 4.8         ┆ q4       │
│ 4.0  ┆ 4.8         ┆ q4       │
│ 5.0  ┆ inf         ┆ q5       │
│ 5.0  ┆ inf         ┆ q5       │
│ 11.0 ┆ inf         ┆ q5       │
└──────┴─────────────┴──────────┘
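
If it helps to see where those break points come from, here is a rough standard-library sketch of the same idea: linearly interpolated quantiles used as right-closed bin edges. The helper names `linear_quantile` and `qcut_sketch` are made up for illustration and are not part of polars or pandas. I am also assuming the usual "linear" definition, which interpolates at position (n - 1) * q in the sorted data.

```python
from bisect import bisect_left


def linear_quantile(sorted_vals, q):
    """Quantile of already-sorted data, interpolating linearly between ranks."""
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def qcut_sketch(values, quantiles, labels):
    """Hypothetical helper: label each value by its right-closed quantile bin."""
    ordered = sorted(values)
    breaks = [linear_quantile(ordered, q) for q in quantiles]
    # bisect_left counts the break points strictly below v, so a value that
    # equals a break point stays in the lower bin (bins are right-closed).
    return breaks, [labels[bisect_left(breaks, v)] for v in values]


data = [11, 1, 2, 2, 3, 4, 5, 1, 2, 3, 4, 5]
breaks, cats = qcut_sketch(
    data, [0.2, 0.4, 0.6, 0.8], ["q1", "q2", "q3", "q4", "q5"]
)
print(breaks)  # roughly 2.0, 2.4, 3.6, 4.8 (up to float rounding)
print(cats)
```

On this data the second bin covers only values above 2.0 and up to about 2.4, and nothing falls in that range. That is why q2 never shows up in the pandas output from the question or in the tables above.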
Answered By: jqurious