Getting index counter on a data frame
Question:
I have the following data frame:
import pandas as pd
data = {
"id_1": [1, 1, 1, 2, 2, 2],
"id_2": [1, 1, 1, 2, 2, 2],
"foo": [0.1, 0.1, 0.1, 0.2, 0.2, 0.2],
}
df = pd.DataFrame(data)
df = df.set_index(["id_1", "id_2"])
which looks like this:
foo
id_1 id_2
1 1 0.1
1 0.1
1 0.1
2 2 0.2
2 0.2
2 0.2
I want to have another column (index
) that starts from 1
and goes up to the length of the index
group, 3
. The output should look like this:
foo index
id_1 id_2
1 1 0.1 1
1 0.1 2
1 0.1 3
2 2 0.2 1
2 0.2 2
2 0.2 3
How do I go about this, please?
Answers:
You can groupby
the id_1
index whose level is 1 then transform
on foo
column to get length
df['index'] = (df.groupby(level=1)['foo']
.transform(lambda col: range(1, len(col)+1)))
print(df)
foo index
id_1 id_2
1 1 0.1 1
1 0.1 2
1 0.1 3
2 2 0.2 1
2 0.2 2
2 0.2 3
One approach could be as follows:
- Use
df.groupby
with level=0
, and apply Groupby.cumcount
to get consecutive numbers for each item in each group.
- Since the numbering will start at
0
, we use Series.add
to add 1
to the result.
df['index'] = df.groupby(level=0).cumcount().add(1)
df
foo index
id_1 id_2
1 1 0.1 1
1 0.1 2
1 0.1 3
2 2 0.2 1
2 0.2 2
2 0.2 3
I have the following data frame:
import pandas as pd
data = {
"id_1": [1, 1, 1, 2, 2, 2],
"id_2": [1, 1, 1, 2, 2, 2],
"foo": [0.1, 0.1, 0.1, 0.2, 0.2, 0.2],
}
df = pd.DataFrame(data)
df = df.set_index(["id_1", "id_2"])
which looks like this:
foo
id_1 id_2
1 1 0.1
1 0.1
1 0.1
2 2 0.2
2 0.2
2 0.2
I want to have another column (index
) that starts from 1
and goes up to the length of the index
group, 3
. The output should look like this:
foo index
id_1 id_2
1 1 0.1 1
1 0.1 2
1 0.1 3
2 2 0.2 1
2 0.2 2
2 0.2 3
How do I go about this, please?
You can groupby
the id_1
index whose level is 1 then transform
on foo
column to get length
df['index'] = (df.groupby(level=1)['foo']
.transform(lambda col: range(1, len(col)+1)))
print(df)
foo index
id_1 id_2
1 1 0.1 1
1 0.1 2
1 0.1 3
2 2 0.2 1
2 0.2 2
2 0.2 3
One approach could be as follows:
- Use
df.groupby
withlevel=0
, and applyGroupby.cumcount
to get consecutive numbers for each item in each group. - Since the numbering will start at
0
, we useSeries.add
to add1
to the result.
df['index'] = df.groupby(level=0).cumcount().add(1)
df
foo index
id_1 id_2
1 1 0.1 1
1 0.1 2
1 0.1 3
2 2 0.2 1
2 0.2 2
2 0.2 3