Determinate Consecutive Values (Invoices) Pandas

Question

I have a dataset with supplier and its invoices and I need to determinate which of the invoices are consecutives marking it with a 1 or a 0.

For example:

df1 = pd.DataFrame()
df1['Supplier'] = ['ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC']
df1['Invoice'] = ['101','102','105','109','110','114','120','121','122','130','131']

   Supplier Invoice
0       ABC     101
1       ABC     102
2       ABC     105
3       ABC     109
4       ABC     110
5       ABC     114
6       ABC     120
7       ABC     121
8       ABC     122
9       ABC     130
10      ABC     131

And what I want is a third column like this:

   Supplier Invoice Consecutive
0       ABC     101           0
1       ABC     102           1
2       ABC     105           0
3       ABC     109           0
4       ABC     110           1
5       ABC     114           0
6       ABC     120           0
7       ABC     121           1
8       ABC     122           1
9       ABC     130           0
10      ABC     131           1

EDIT

Thanks for your answers, this options works great, but when I tried it in a real database I realized that I need to compare the suppliers, if the next row is a new supplier the consecutive must be 0.

For example:

   Supplier Invoice Consecutive
0       ABC     101           0
1       ABC     102           1
2       ABC     105           0
3       ABC     109           0
4       ABC     110           1
5       ABC     114           0
6       ABC     120           0
7       ABC     121           1
8       ABC     122           1
9       ABC     130           0
10      ABC     131           1
11      DEF     132           0
12      DEF     133           1
13      GHI     134           0

Thanks in advance!

Asked By: Gustavo A. Marín Acevedo

||

Source

Answer 1

Make sure your ‘Invoice’ column is of a numeric type first. Then you can use:

df1['Consecutive'] = (df1.Invoice - df1.Invoice.shift(1) == 1).astype(int)

prints:

   Supplier  Invoice  Consecutive
0       ABC      101            0
1       ABC      102            1
2       ABC      105            0
3       ABC      109            0
4       ABC      110            1
5       ABC      114            0
6       ABC      120            0
7       ABC      121            1
8       ABC      122            1
9       ABC      130            0
10      ABC      131            1

Answered By: sophocles

Answer 2

using npwhere and diff

# take difference b/w consecutive rows using diff
# update to 0 when difference is not 1

df1['Consecutive']=(df1.groupby('Supplier')['Invoice']
                    .transform(lambda x: np.where(
                                           x.astype(int).diff() !=1, 0, 1)) )
df1
df1

    Supplier    Invoice     Consecutive
0   ABC     101     0
1   ABC     102     1
2   ABC     105     0
3   ABC     109     0
4   ABC     110     1
5   ABC     114     0
6   ABC     120     0
7   ABC     121     1
8   ABC     122     1
9   ABC     130     0
10  ABC     131     1
11  DEF     132     0
12  DEF     133     1
13  GHI     134     0

Answered By: Naveed

Answer 3

df1['Consecutive']=df1.Invoice.astype(int).diff().eq(1).astype(int)

   Supplier Invoice  Consecutive
0       ABC     101            0
1       ABC     102            1
2       ABC     105            0
3       ABC     109            0
4       ABC     110            1
5       ABC     114            0
6       ABC     120            0
7       ABC     121            1
8       ABC     122            1
9       ABC     130            0
10      ABC     131            1

Answered By: Алексей Р

Answer 4

Thanks to all for your answers, here is the code for the case we have multiple suppliers.

df1 = pd.DataFrame()
df1['Supplier'] = ['ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','DEF','DEF','GHI']
df1['Invoice'] = ['101','102','105','109','110','114','120','121','122','130','131','132','133','134']
df1['Consecutive']= np.where((df1['Invoice'].astype('int64').diff() ==1) 
                                             & (df1.Supplier.eq(df1.Supplier.shift())), 1, 0)

Result:

   Supplier Invoice  Consecutive
0       ABC     101            0
1       ABC     102            1
2       ABC     105            0
3       ABC     109            0
4       ABC     110            1
5       ABC     114            0
6       ABC     120            0
7       ABC     121            1
8       ABC     122            1
9       ABC     130            0
10      ABC     131            1
11      DEF     132            0
12      DEF     133            1
13      GHI     134            0

Answered By: Gustavo A. Marín Acevedo

Determinate Consecutive Values (Invoices) Pandas

Question:

Answers: