Count up if value changes in other dataframe columns

Question

My input is:

d = {'x': [1479903, 1479879, 1479926, 1479736, 1479760, 1479784, 1479808, 1479831, 1480331, 1480355, 1480949, 1480973, 1480997, 1481021, 1481877, 1481901, 1481924, 1481948, 1481996, 1482020], 'y': [7195293, 7195293, 7195293, 7194829, 7194829, 7194829, 7194829, 7194829, 7194829, 7194829, 7194384, 7194384, 7194384, 7194384, 7194384, 7194384, 7194384, 7194384, 7193920, 7193920], 'group':[1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3]}

df = pd.DataFrame(data=d)
df = df.sort_values(by = ['group', 'y','x'], ascending = [True, False, True])

The dataframe is already sorted. The desired output is to create a column called ‘subgroup’ that starts at 1 and then only counts up when the y-value changes and starts over at 1 when the group number changes. Also create another column called ‘count’ that counts up until the ‘subgroup’ changes, when ‘subgroup’ changes then the ‘count’ starts over at 1. The desired output should look like:

group	subgroup	count	x	y
1	1	1	1479879	7195293
1	1	2	1479903	7195293
1	1	3	1479926	7195293
1	2	1	1479736	7194829
1	2	2	1479760	7194829
1	2	3	1479784	7194829
1	2	4	1479808	7194829
1	2	5	1479831	7194829
2	1	1	1480331	7194829
2	1	2	1480355	7194829
2	2	1	1480949	7194384
2	2	2	1480973	7194384
2	2	3	1480997	7194384
2	2	4	1481021	7194384
3	1	1	1481877	7194384
3	1	2	1481901	7194384
3	1	3	1481924	7194384
3	1	4	1481948	7194384
3	2	1	1481996	7193920
3	2	2	1482020	7193920

Thanks in advance!

Asked By: matteo112

||

Source

Answer 1

Try:

df['subgroup new'] = df.groupby('group').apply(lambda x: (x['y'] != x['y'].shift()).cumsum()).values
df['count new'] = df.groupby(['group', 'subgroup new']).cumcount() + 1

print(df)

Prints:

    group  subgroup  count        x        y  subgroup new  count new
0       1         1      1  1479879  7195293             1          1
1       1         1      2  1479903  7195293             1          2
2       1         1      3  1479926  7195293             1          3
3       1         2      1  1479736  7194829             2          1
4       1         2      2  1479760  7194829             2          2
5       1         2      3  1479784  7194829             2          3
6       1         2      4  1479808  7194829             2          4
7       1         2      5  1479831  7194829             2          5
8       2         1      1  1480331  7194829             1          1
9       2         1      2  1480355  7194829             1          2
10      2         2      1  1480949  7194384             2          1
11      2         2      2  1480973  7194384             2          2
12      2         2      3  1480997  7194384             2          3
13      2         2      4  1481021  7194384             2          4
14      3         1      1  1481877  7194384             1          1
15      3         1      2  1481901  7194384             1          2
16      3         1      3  1481924  7194384             1          3
17      3         1      4  1481948  7194384             1          4
18      3         2      1  1481996  7193920             2          1
19      3         2      2  1482020  7193920             2          2

Answered By: Andrej Kesely

Count up if value changes in other dataframe columns

Question:

Answers: