Pandas – group by quantile position

Question:

I have a pandas dataframe with the following structure:

import pandas as pd
df = pd.DataFrame(
    {
        "value": [0, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0],
        "group": ["A"] * 6 + ["B"] * 6
    }
)

I would like to obtain a new dataframe with the same number of rows that gives, for each row, the quantile position of its value within its group.

For this case, the output would be like this:

[Image: expected output table, not preserved]

There can be a very large number of groups, with values on completely different scales, and the groups may differ in size.

Asked By: David Masip


Answers:

import pandas as pd

df = pd.DataFrame(
    {
        "value": [0, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0],
        "group": ["A"] * 6 + ["B"] * 6,
    }
)


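def norm_group(group):
    # Min-max scale the values of one group to the range [0, 1].
    group = group.copy()  # avoid mutating the object that pandas passes in
    g_min = group.value.min()
    g_max = group.value.max()
    group["quantile"] = (group.value - g_min) / (g_max - g_min)
    return group


# group_keys=False keeps the original flat index, matching the output below.
print(df.groupby("group", group_keys=False).apply(norm_group))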

seems to do the trick:

    value group  quantile
0       0     A       0.0
1       1     A       0.2
2       2     A       0.4
3       3     A       0.6
4       4     A       0.8
5       5     A       1.0
6       5     B       1.0
7       4     B       0.8
8       3     B       0.6
9       2     B       0.4
10      1     B       0.2
11      0     B       0.0
Answered By: AKX
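
A possible variation, offered as a sketch rather than as part of the answer above: the same min-max position can be computed without `apply` by using `groupby(...).transform`, which may scale better when there are many groups. The formula in the answer is min-max scaling, so if a rank-based quantile position is what is needed, `rank(pct=True)` within each group is one option. The column names `minmax_position` and `rank_pct` and the helper variable `span` are made up for illustration, and mapping constant groups to NaN is an assumption about the desired behavior.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "value": [0, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0],
        "group": ["A"] * 6 + ["B"] * 6,
    }
)

grouped = df.groupby("group")["value"]

# Per-row group minimum and maximum, aligned with the original index.
g_min = grouped.transform("min")
g_max = grouped.transform("max")

# Assumption: a group whose values are all equal gets NaN instead of
# triggering a division by zero.
span = (g_max - g_min).where(g_max != g_min)

# Same formula as norm_group in the answer above, but vectorized.
df["minmax_position"] = (df["value"] - g_min) / span

# Rank-based alternative: rank within the group divided by the group size.
df["rank_pct"] = grouped.rank(pct=True)

print(df)
```

The two columns answer slightly different questions: `minmax_position` depends on the distance between values, while `rank_pct` depends only on their order within the group, so they diverge as soon as the values are unevenly spaced.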