Check for presence of a sliced list in Python

Question:

I want to write a function that determines if a sublist exists in a larger list.

list1 = [1,0,1,1,1,0,0]
list2 = [1,0,1,0,1,0,1]

#Should return true
sublistExists(list1, [1,1,1])

#Should return false
sublistExists(list2, [1,1,1])

Is there a Python function that can do this?

Asked By: Jonathan


Answers:

If you are sure that your inputs will only contain the single digits 0 and 1 then you can convert to strings:

def sublistExists(list1, list2):
    return ''.join(map(str, list2)) in ''.join(map(str, list1))

This creates two temporary strings, so it is not the most efficient solution, but since it takes advantage of Python’s optimized substring search it’s probably good enough for most purposes.

If efficiency is very important you can look at the Boyer-Moore string searching algorithm, adapted to work on lists.

A naive search is O(n*m) in the worst case, but it is a reasonable choice if you cannot use the string-conversion trick and performance is not a concern.
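
To see why the single-digit restriction matters, here is a small sketch of the failure mode. The name `sublist_exists_str` is only an illustrative stand-in for the string-based function above, and the inputs are made up for the demonstration.

```python
def sublist_exists_str(list1, list2):
    # Same string-joining trick as above: only safe for single-character items.
    return ''.join(map(str, list2)) in ''.join(map(str, list1))

# Works as intended for single digits.
print(sublist_exists_str([1, 0, 1, 1, 1, 0, 0], [1, 1, 1]))  # True

# False positive: [1, 1] is not a sublist of [11, 0],
# but '11' is a substring of '110'.
print(sublist_exists_str([11, 0], [1, 1]))  # True, although it should be False
```

Once items can span more than one character, the joined strings no longer record where one item ends and the next begins.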

Answered By: Mark Byers

No function that I know of

def sublistExists(lst, sublist):
    # lst, not list, so the built-in list type is not shadowed
    for i in range(len(lst)-len(sublist)+1):
        if sublist == lst[i:i+len(sublist)]:
            return True #return position (i) if you wish
    return False #or -1

As Mark noted, this is not the most efficient search (it’s O(n*m)). This problem can be approached in much the same way as string searching.
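
The code comments above hint at returning the position instead of a boolean. A minimal sketch of that variant follows; the name `find_sublist` and the `-1` convention (borrowed from `str.find`) are assumptions, not part of the original answer.

```python
def find_sublist(haystack, needle):
    """Return the index of the first occurrence of needle in haystack, or -1."""
    n = len(needle)
    # Only start positions where a full-length slice still fits are tried.
    for i in range(len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            return i
    return -1

print(find_sublist([1, 0, 1, 1, 1, 0, 0], [1, 1, 1]))  # 2
print(find_sublist([1, 0, 1, 0, 1, 0, 1], [1, 1, 1]))  # -1
```

An empty needle matches at index 0 here, which mirrors how `str.find` treats the empty string.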

Answered By: sas4740

Let’s get a bit functional, shall we? 🙂

def contains_sublist(lst, sublst):
    n = len(sublst)
    return any((sublst == lst[i:i+n]) for i in range(len(lst)-n+1))

Note that any() stops at the first match of sublst within lst, or returns False if there is no match, after O(m*n) operations.

Answered By: Nas Banov

Here is a way that will work for simple lists that is slightly less fragile than Mark’s

def sublistExists(haystack, needle):
    def munge(s):
        return ", " + str(s)[1:-1] + ","
    return munge(needle) in munge(haystack)
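
As a rough illustration of what "slightly less fragile" means, the sketch below repeats the same comma-delimited comparison under the hypothetical name `sublist_exists_munge` and tries it on made-up inputs, including one where it still goes wrong.

```python
def sublist_exists_munge(haystack, needle):
    # Same idea as above: compare the comma-delimited text of both lists.
    def munge(s):
        return ", " + str(s)[1:-1] + ","
    return munge(needle) in munge(haystack)

# Multi-digit integers are handled correctly, unlike the plain join trick.
print(sublist_exists_munge([11, 0], [1, 1]))   # False
print(sublist_exists_munge([1, 11, 0], [11]))  # True

# Nested lists can still produce a false positive: 2 is not a top-level item.
print(sublist_exists_munge([[1, 2, 3]], [2]))  # True, although it should be False
```

The delimiters protect item boundaries only as long as no item's own text contains the same ", " pattern.
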
Answered By: John La Rooy

If I am understanding this correctly, you have a larger list, like:

list_A= ['john', 'jeff', 'dave', 'shane', 'tim']

Then there are other lists:

list_B= ['sean', 'bill', 'james']

list_C= ['cole', 'wayne', 'jake', 'moose']

I then append the lists B and C to list A:

list_A.append(list_B)

list_A.append(list_C)

So when I print list_A

print (list_A)

I get the following output:

['john', 'jeff', 'dave', 'shane', 'tim', ['sean', 'bill', 'james'], ['cole', 'wayne', 'jake', 'moose']]

Now I want to check whether a sublist exists:

for value in list_A:
    # isinstance() tests the type directly instead of parsing its string form
    if isinstance(value, list):
        print("True")
    else:
        print("False")

This prints ‘True’ for each element of the larger list that is itself a list, and ‘False’ for every other element.

Answered By: Suhail

def sublistExists(x, y):
    # Candidate start positions: every index where x matches the first item of y
    occ = [i for i, a in enumerate(x) if a == y[0]]
    for b in occ:
        if x[b:b+len(y)] == y:
            print('YES-- SUBLIST at : ', b)
            return True
    # No candidate matched, or there were no candidates at all
    print('NO SUBLIST')
    return False

list1 = [1,0,1,1,1,0,0]
list2 = [1,0,1,0,1,0,1]

#should return True
sublistExists(list1, [1,1,1])

#Should return False
sublistExists(list2, [1,1,1])

Answered By: SuperNova

Might as well throw in a recursive version of @NasBanov’s solution

def foo(sub, lst):
    '''Checks if sub is in lst.

    Expects both arguments to be lists
    '''
    if len(lst) < len(sub):
        return False
    return sub == lst[:len(sub)] or foo(sub, lst[1:])
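
One practical caveat with the recursive form, sketched below with the same `foo`: every position costs a stack frame and a list copy, so a long list without an early match exceeds Python's recursion limit. The list length is an arbitrary choice made only to exceed that limit.

```python
import sys

def foo(sub, lst):
    # Recursive search as above: one stack frame (and one list copy) per position.
    if len(lst) < len(sub):
        return False
    return sub == lst[:len(sub)] or foo(sub, lst[1:])

print(foo([1, 1, 1], [1, 0, 1, 1, 1, 0, 0]))  # True

# A haystack longer than the recursion limit with no match overflows the stack.
long_list = [0] * (sys.getrecursionlimit() + 100)
try:
    foo([1], long_list)
except RecursionError:
    print("RecursionError: list too long for the recursive version")
```

For short lists this does not matter; for long ones an iterative loop is the safer choice.
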
Answered By: wwii

The efficient way to do this is to use the Boyer-Moore algorithm, as Mark Byers suggests. I have done it already here: Boyer-Moore search of a list for a sub-list in Python, but will paste the code here. It’s based on the Wikipedia article.

The search() function returns the index of the sub-list being searched for, or -1 on failure.

def search(haystack, needle):
    """
    Search list `haystack` for sublist `needle`.
    """
    if len(needle) == 0:
        return 0
    char_table = make_char_table(needle)
    offset_table = make_offset_table(needle)
    i = len(needle) - 1
    while i < len(haystack):
        j = len(needle) - 1
        while needle[j] == haystack[i]:
            if j == 0:
                return i
            i -= 1
            j -= 1
        # Items that do not occur in the needle allow a shift of the full needle length
        i += max(offset_table[len(needle) - 1 - j], char_table.get(haystack[i], len(needle)))
    return -1

    
def make_char_table(needle):
    """
    Makes the jump table based on the mismatched character information.
    """
    table = {}
    for i in range(len(needle) - 1):
        table[needle[i]] = len(needle) - 1 - i
    return table
    
def make_offset_table(needle):
    """
    Makes the jump table based on the scan offset in which mismatch occurs.
    """
    table = []
    last_prefix_position = len(needle)
    for i in reversed(range(len(needle))):
        if is_prefix(needle, i + 1):
            last_prefix_position = i + 1
        table.append(last_prefix_position - i + len(needle) - 1)
    for i in range(len(needle) - 1):
        slen = suffix_length(needle, i)
        table[slen] = len(needle) - 1 - i + slen
    return table
    
def is_prefix(needle, p):
    """
    Is needle[p:end] a prefix of needle?
    """
    j = 0
    for i in range(p, len(needle)):
        if needle[i] != needle[j]:
            return 0
        j += 1    
    return 1
    
def suffix_length(needle, p):
    """
    Returns the maximum length of the substring ending at p that is a suffix.
    """
    length = 0
    j = len(needle) - 1
    for i in reversed(range(p + 1)):
        if needle[i] == needle[j]:
            length += 1
        else:
            break
        j -= 1
    return length

Here is the example from the question:

def main():
    list1 = [1,0,1,1,1,0,0]
    list2 = [1,0,1,0,1,0,1]
    index = search(list1, [1, 1, 1])
    print(index)
    index = search(list2, [1, 1, 1])
    print(index)

if __name__ == '__main__':
    main()

Output:

2
-1
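Answered By: user325117

def sublist(l1, l2):
    # Return True if l1 occurs as a contiguous run inside l2
    if len(l1) > len(l2):
        return False
    for i in range(len(l2) - len(l1) + 1):
        for j in range(len(l1)):
            if l2[i + j] != l1[j]:
                break  # mismatch: try the next start position
        else:
            return True  # inner loop finished without a mismatch
    return False

Answered By: Ashutosh K Singh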

My favourite simple solution is the following (it is brute force, though, so I don’t recommend it for huge data):

>>> l1 = ['z','a','b','c']
>>> l2 = ['a','b']
>>> any(l1[i:i+len(l2)] == l2 for i in range(len(l1)))
True

The code above creates all slices of l1 with the length of l2 and compares them with l2 one after another.

Detailed explanation

Read this explanation only if you don’t understand how it works and want to know; otherwise there is no need to read it.

Firstly, this is how you can iterate over the indexes of the items in l1:

>>> [i for i in range(len(l1))]
[0, 1, 2, 3]

Because i represents the index of an item in l1, you can use it to show the actual item instead of the index number:

>>> [l1[i] for i in range(len(l1))]
['z', 'a', 'b', 'c']

Then create slices (sub-selections of items from a list) from l1 with the length of l2:

>>> [l1[i:i+len(l2)] for i in range(len(l1))]
[['z', 'a'], ['a', 'b'], ['b', 'c'], ['c']] #last one is shorter, because there is no next item.

Now you can compare each slice with l2 and see that the second one matches:

>>> [l1[i:i+len(l2)] == l2 for i in range(len(l1))]
[False, True, False, False] #notice that the second one is that matching one

Finally, with the built-in function any, you can check whether at least one of the booleans is True:

>>> any(l1[i:i+len(l2)] == l2 for i in range(len(l1)))
True
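
The walkthrough shows that the last slice is shorter than l2 and can never match. A possible refinement, sketched here under the hypothetical name `contains_sublist`, is to stop the range at the last index where a full-length slice still fits.

```python
def contains_sublist(l1, l2):
    """Return True if l2 occurs as a contiguous slice of l1."""
    n = len(l2)
    # Stop at the last start index where a full-length slice still fits.
    return any(l1[i:i + n] == l2 for i in range(len(l1) - n + 1))

l1 = ['z', 'a', 'b', 'c']
print(contains_sublist(l1, ['a', 'b']))  # True
print(contains_sublist(l1, ['c', 'z']))  # False
print(contains_sublist(l1, []))          # True
```

The result for non-empty inputs is the same as with the one-liner above; the shorter range only skips comparisons that cannot succeed.
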
Answered By: Jan Musil

I know this might not be quite relevant to the original question, but it may be an elegant one-line solution for someone else if the order of the items in the two lists doesn’t matter. The result below is True if all List1 elements are in List2 (regardless of order). If the order matters, don’t use this solution.

List1 = [10, 20, 30]
List2 = [10, 20, 30, 40]
result = set(List1).intersection(set(List2)) == set(List1)
print(result)

Output

True
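
The same order-insensitive test can also be written with `set.issubset`. Because sets discard duplicates, the sketch below additionally shows a `collections.Counter` comparison for the case where repeated items should count; the `[10, 10]` input is made up for illustration.

```python
from collections import Counter

List1 = [10, 20, 30]
List2 = [10, 20, 30, 40]

# Order-insensitive check, ignoring duplicates.
print(set(List1).issubset(List2))  # True

# Sets drop duplicates, so [10, 10] also passes the set check.
print(set([10, 10]).issubset(List2))  # True

# Counter subtraction keeps only positive counts, so an empty result
# means List2 has at least as many of each item as the first list.
print(not (Counter([10, 10]) - Counter(List2)))  # False
print(not (Counter(List1) - Counter(List2)))     # True
```

Neither check says anything about order or contiguity, so they answer a different question from the slice-based searches.
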
Answered By: Chadee Fouad