Retrieve Pandas DataFrame rows whose column values consecutively match the values of a list

Question:

How do I retrieve the rows of a Pandas DataFrame where the values of a single column are consecutively equal to the values of a list?

For example, given this:

import pandas as pd

df = pd.DataFrame({'col1': [10, 20, 30, 40, 50, 88, 99, 30, 40, 50]})

lst = [30, 40, 50]

I want to extract the DataFrame rows from 30 to 50, but only the first sequence of consecutive values (that is, only the rows at index 2 to 4).

Asked By: Marco


Answers:

You can use a rolling comparison:

s = df['col1'][::-1].rolling(len(lst)).apply(lambda x: x.eq(lst[::-1]).all())[::-1].eq(1)

if s.any():
    idx = s.idxmax()
    out = df.iloc[idx:idx+len(lst)]
    print(out)
else:
    print('Not found')

Output:

   col1
2    30
3    40
4    50
Answered By: mozway

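This should do the trick:

df = pd.DataFrame({'col1': [10, 20, 30, 40, 50, 88, 99, 30, 40, 50]})
lst = [30, 40, 50]
values = df['col1'].tolist()
ans = []
# Slide a window of len(lst) over the column and stop at the first exact match,
# so only consecutive values count and lst itself is never modified.
for i in range(len(values) - len(lst) + 1):
    if values[i:i + len(lst)] == lst:
        ans = list(range(i, i + len(lst)))
        break

print(ans)  # positions of the matching rows; use df.iloc[ans] to get the rows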
Answered By: Ohad Sharet

Try:

lst = [30, 40, 50]

if any(lst == (found := s).to_list() for s in df["col1"].rolling(len(lst))):
    print(df.loc[found.index])

Prints:

   col1
2    30
3    40
4    50
Answered By: Andrej Kesely
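
As a further sketch, not taken from any of the answers above, the same first-match search can be wrapped in a reusable function built on NumPy's sliding_window_view. The helper name first_consecutive_match and its parameters are hypothetical, and it assumes NumPy 1.20 or later. Because it slices by position with iloc, it should also work when the DataFrame does not have the default 0..n-1 index.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def first_consecutive_match(df, column, values):
    """Return the rows of the first run in df[column] equal to values.

    Hypothetical helper for illustration; returns an empty frame
    when no run matches.
    """
    arr = df[column].to_numpy()
    n = len(values)
    if n == 0 or n > len(arr):
        return df.iloc[0:0]
    # Each row of `windows` is one length-n slice of the column.
    windows = sliding_window_view(arr, n)
    hits = (windows == np.asarray(values)).all(axis=1)
    if not hits.any():
        return df.iloc[0:0]
    start = int(hits.argmax())  # position of the first matching window
    return df.iloc[start:start + n]


df = pd.DataFrame({'col1': [10, 20, 30, 40, 50, 88, 99, 30, 40, 50]})
print(first_consecutive_match(df, 'col1', [30, 40, 50]))
```

Comparing all windows at once avoids calling a Python function per window, which may matter on long columns; for short ones, any of the approaches above is likely just as practical.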