Pandas Dataframe count occurrences that only happen immediately

Question

I have the following DataFrame ‘A’:

Index	1or0
1	0
2	0
3	0
…	…
8	0
9	1
10	1
…	…

I want to count how many times the 0 (or the 1) occurs consecutively, i.e. in rows that directly follow one another in the Index column, and write the result into a new DataFrame ‘B’ like the one below:

StartNum	EndNum	Size
1	3	3
8	8	1
9	10	2

What is the fastest or best way to do this? Should I just iterate as I would over an array, or is there a better way using pandas?

Asked By: Natoshi SakiSaki

Answer 1

IIUC, you can label each run of consecutive equal values and then aggregate per run:

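# Flag every 0<>1 transition, then cumsum the flags to get one label per run
ser = A["1or0"].ne(A["1or0"].shift().bfill()).cumsum()

# The run labels go to the index and are dropped at the end; as_index=False
# treats an external grouper differently across pandas versions.
B = (
    A.groupby(ser)
        .agg({"Index": ["first", "last", "count"]})
        .set_axis(["StartNum", "EndNum", "Size"], axis=1)
        .reset_index(drop=True)
)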

Output:

print(B)

   StartNum  EndNum  Size
0         1       3     3
1         4       7     4
2         8       8     1
3         9      10     2
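
To see why the grouping works, it can help to look at the intermediate run labels. The sketch below rebuilds the sample column (values copied from the "DataFrame used" listing; the names `values`, `changed` and `run_id` are only illustrative) and shows that every 0<>1 transition bumps the label by one:

```python
import pandas as pd

# Sample column, same values as the "1or0" column of A.
values = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 1, 1], name="1or0")

# True wherever a value differs from the previous one.
# shift() leaves NaN in the first row; bfill() fills it with the first value,
# so the first row never counts as a transition.
changed = values.ne(values.shift().bfill())

# The running sum of the transition flags gives one label per run.
run_id = changed.cumsum()

print(run_id.tolist())
# [0, 0, 0, 1, 1, 1, 1, 2, 3, 3]
```

Rows that share a label form one run, so grouping by these labels and taking first/last/count of "Index" gives the start, end and size of each run.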

Update (based on comments), adding the value of each run:

B = (
    A.groupby(ser)
        .agg({"Index": ["first", "last", "count"],
              "1or0": "unique"})
        .set_axis(["StartNum", "EndNum", "Size", "Value"], axis=1)
        # "unique" returns a one-element array per run; turn "[0]" into "0"
        .assign(Value=lambda d: d["Value"].astype(str).str.strip("[]"))
        .reset_index(drop=True)
)

print(B)

   StartNum  EndNum  Size Value
0         1       3     3     0
1         4       7     4     1
2         8       8     1     0
3         9      10     2     1
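
As a possible variant (not the approach above, just a sketch), named aggregation avoids the renaming step, and taking the "first" value of each run keeps Value as an integer instead of a string. The name `run_id` and the final filter on zeros are assumptions added for illustration:

```python
import pandas as pd

# Same data as the "DataFrame used" listing.
A = pd.DataFrame({
    "Index": range(1, 11),
    "1or0": [0, 0, 0, 1, 1, 1, 1, 0, 1, 1],
})

# One label per run of consecutive equal values (renamed to avoid
# sharing a name with the "1or0" column).
run_id = A["1or0"].ne(A["1or0"].shift().bfill()).cumsum().rename("run")

B = (
    A.groupby(run_id)
     .agg(
         StartNum=("Index", "first"),
         EndNum=("Index", "last"),
         Size=("Index", "count"),
         Value=("1or0", "first"),  # every row in a run has the same value
     )
     .reset_index(drop=True)
)

# Example: keep only the runs of zeros.
zeros_only = B[B["Value"].eq(0)]
print(zeros_only)
```

With an integer Value column, filtering for runs of 0 or 1 is a plain comparison rather than a string match.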

DataFrame used:

print(A)

   Index  1or0
0      1     0
1      2     0
2      3     0
3      4     1
4      5     1
5      6     1
6      7     1
7      8     0
8      9     1
9     10     1
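
To reproduce the results, the frame printed above can be built like this (a minimal sketch; the values are copied from the listing):

```python
import pandas as pd

A = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "1or0":  [0, 0, 0, 1, 1, 1, 1, 0, 1, 1],
})

print(A)
```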

Answered By: Timeless
