Grouping pandas series based on condition

Question:

I have a pandas DataFrame with one column containing the following values.

      Data
0      A
1      A 
2      B
3      A
4      A 
5      A
6      B
7      A
8      A
9      B

I want to group these values so that the group number increases after each occurrence of the value B, as follows:

      Data  Group
0      A      1
1      A      1
2      B      1
3      A      2
4      A      2
5      A      2
6      B      2
7      A      3
8      A      3
9      B      3

How can this be achieved using pandas built-ins, possibly by creating helper columns to facilitate the task?

Asked By: Inderjeet Singh

||

Answers:

You can compare the series with B, shift the result by one place so that each B stays in the group it closes, and then take the cumsum:

df['Data'].eq('B').shift(fill_value=False).cumsum().add(1)

0    1
1    1
2    1
3    2
4    2
5    2
6    2
7    3
8    3
9    3
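
For reference, here is a self-contained sketch of the same steps, assuming the question's frame is rebuilt by hand. The intermediate names is_b and closed are only illustrative and do not appear in the original one-liner.

```python
import pandas as pd

# The question's data, rebuilt by hand (assumed to match the original frame).
df = pd.DataFrame({"Data": list("AABAAABAAB")})

is_b = df["Data"].eq("B")               # True on every B row
closed = is_b.shift(fill_value=False)   # True on the row right after each B
df["Group"] = closed.cumsum().add(1)    # running count of closed groups, starting at 1

print(df)
```

Without the shift, each B would already carry the incremented number and would land in the following group instead of closing its own.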
Answered By: anky

I notice the group numbers here come out in descending order. But if you only need the labels to split the data into groups, the resulting grouping is the same:

s=df.Data.eq('B').iloc[::-1].cumsum()
s
9    1
8    1
7    1
6    2
5    2
4    2
3    2
2    3
1    3
0    3
Name: Data, dtype: int64
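
If ascending numbers are needed, the reversed labels can be flipped afterwards. This is a minimal sketch, assuming the question's data is rebuilt by hand; the name rev and the max-minus-label step are illustrative additions and not part of the answer above.

```python
import pandas as pd

# The question's data, rebuilt by hand (assumed to match the original frame).
df = pd.DataFrame({"Data": list("AABAAABAAB")})

# Count B rows from the bottom up; each B opens a new label in reversed order.
rev = df["Data"].eq("B").iloc[::-1].cumsum()

# Flip the labels so the first group becomes 1. Column assignment aligns on
# the index, so the reversed row order of `rev` does not matter here.
df["Group"] = rev.max() - rev + 1

print(df)
```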
Answered By: BENY

You can also use pandas.core.groupby.GroupBy.cumcount() in combination with the pandas.Series.bfill() method, like this:

>>> df['Group'] = (df[df.Data == 'B'].groupby('Data').Data.cumcount() + 1)
>>> df['Group'] = df.Group.bfill()
>>> print(df)
  Data  Group
0    A    1.0
1    A    1.0
2    B    1.0
3    A    2.0
4    A    2.0
5    A    2.0
6    B    2.0
7    A    3.0
8    A    3.0
9    B    3.0
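
The Group column comes out as float because the rows that are not B hold NaN before the back-fill. Below is a hedged sketch of the same idea with an integer result, assuming the question's data is rebuilt by hand; the cast to the nullable "Int64" dtype is an addition that is not in the answer above.

```python
import pandas as pd

# The question's data, rebuilt by hand (assumed to match the original frame).
df = pd.DataFrame({"Data": list("AABAAABAAB")})

is_b = df["Data"] == "B"

# Number the B rows 1, 2, 3, ...; after index alignment the A rows hold NaN.
df["Group"] = df[is_b].groupby("Data").cumcount() + 1

# Back-fill so each A row takes the number of the next B below it, then cast
# to the nullable integer dtype to get rid of the ".0".
df["Group"] = df["Group"].bfill().astype("Int64")

print(df)
```

Rows that come after the last B have nothing to back-fill from, so they would stay missing (<NA>) with this approach.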
Answered By: Jaroslav Bezděk