Pattern matching of lists in Python

Question:

I want to do some pattern matching on lists in Python. For example, in Haskell, I can do something like the following:

fun (head : rest) = ...

So when I pass in a list, head will be the first element, and rest will be the trailing elements.

Likewise, in Python, I can automatically unpack tuples:

(var1, var2) = func_that_returns_a_tuple()

I want to do something similar with lists in Python. Right now, I have a function that returns a list, and a chunk of code that does the following:

ls = my_func()
(head, rest) = (ls[0], ls[1:])

I wondered if I could somehow do that in one line in Python, instead of two.

Asked By: mipadi

Answers:

That’s very much a ‘pure functional’ approach, and as such it is a sensible idiom in Haskell, but it’s probably not so appropriate in Python. Python has only a very limited concept of patterns in this sense, and I suspect you might need a somewhat more rigid type system to implement that sort of construct (Erlang buffs invited to disagree here).

What you have is probably as close as you would get to that idiom, but you are probably better off using a list comprehension or imperative approach rather than recursively calling a function with the tail of the list.

As has been stated on a few occasions before, Python is not actually a functional language; it just borrows ideas from the FP world. It does not perform the tail-call optimization you would expect to find built into a functional language, so you would have some difficulty doing this sort of recursive operation on a large data set without using a lot of stack space.
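
To make the stack-space point concrete, here is a minimal sketch. The function name recursive_sum is illustrative and not taken from the answer above, and the example assumes a Python 3 interpreter, where the failure surfaces as RecursionError.

```python
import sys


def recursive_sum(items):
    """Sum a list by recursing on its tail, Haskell style."""
    if not items:
        return 0
    head, rest = items[0], items[1:]
    return head + recursive_sum(rest)


print(recursive_sum([1, 2, 3, 4]))  # short lists are fine

# Build a list longer than the interpreter's current recursion limit.
big = list(range(sys.getrecursionlimit() + 100))

try:
    recursive_sum(big)
    hit_limit = False
except RecursionError:
    # Tail calls are not eliminated, so every element costs a stack frame.
    hit_limit = True

print(hit_limit)
print(sum(big))  # the built-in, loop-based sum has no such limit
```

Raising the limit with sys.setrecursionlimit only postpones the problem, which is why a loop is usually the better fit here.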

So far as I know there’s no way to make it a one-liner in current Python without introducing another function, e.g.:

split_list = lambda lst: (lst[0], lst[1:])
head, rest = split_list(my_func())

However, in Python 3.0 the specialized syntax used for variadic argument signatures and argument unpacking will become available for this type of general sequence unpacking as well, so in 3.0 you’ll be able to write:

head, *rest = my_func()

See PEP 3132 for details.
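
As a rough illustration of how this unpacking behaves at the edges, the sketch below uses a hypothetical my_func (a stand-in for the asker's function) and assumes Python 3.

```python
def my_func():
    # Hypothetical stand-in for the asker's function.
    return ["a", "b", "c"]


head, *rest = my_func()
print(head, rest)

# The starred name always receives a list, whatever the source iterable is.
first, *others = (n * n for n in range(4))
print(first, others)

# A single element leaves the starred name empty.
only, *nothing = ["x"]
print(only, nothing)

# An empty iterable cannot supply `head`, so unpacking raises ValueError.
try:
    head2, *rest2 = []
    empty_failed = False
except ValueError as exc:
    empty_failed = True
    print("unpacking failed:", exc)
```

The last case is worth keeping in mind: unlike ls[1:], which happily returns an empty list, the starred form still needs at least one element for head.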

Answered By: James Bennett

Well, why do you want it in one line in the first place?

If you really want to, you can always do a trick like this:

def x(func):
  y = func()
  return y[0], y[1:]

# then, instead of calling my_func() call x(my_func)
(head, rest) = x(my_func) # that's one line :)

Answered By: kender

Extended unpacking was introduced in Python 3.0:
http://www.python.org/dev/peps/pep-3132/

Answered By: bpowah

First of all, please note that the “pattern matching” of functional languages and the assignment to tuples you mention are not really that similar. In functional languages the patterns are used to give partial definitions of a function. So f (x : s) = e does not mean take the head and tail of the argument of f and return e using them, but it means that if the argument of f is of the form x : s (for some x and s), then f (x : s) is equal to e.

The assignment in Python is more like a multiple assignment (I suspect that was its original intention). So you write, for example, x, y = y, x to swap the values in x and y without needing a temporary variable (as you would with a simple assignment statement). This has little to do with pattern matching, as it is basically a shorthand for the “simultaneous” execution of x = y and y = x. Although Python allows arbitrary sequences instead of comma-separated lists, I would not suggest calling this pattern matching. With pattern matching you check whether or not something matches a pattern; in a Python assignment you must ensure that the sequences on both sides have the same length, otherwise a ValueError is raised.
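
A small sketch of that difference, with illustrative variable names: the swap works because the right-hand side is evaluated first, and a length mismatch is an error rather than a failed match that falls through to another case.

```python
x, y = 1, 2
x, y = y, x  # the right side is evaluated first, then assigned left to right
print(x, y)

try:
    a, b = [1, 2, 3]  # three values, two targets
    mismatch_is_error = False
except ValueError as exc:
    # There is no "does not match" branch: a mismatch is simply an error.
    mismatch_is_error = True
    print("cannot unpack:", exc)
```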

To do what you seem to want, you would usually (also in functional languages) use either an auxiliary function (as mentioned by others) or something similar to let or where constructs (which you can regard as using anonymous functions). For example:

(head, tail) = (x[0], x[1:]) where x = my_func()

Or, in actual Python:

(head, tail) = (lambda x: (x[0], x[1:]))(my_func())

Note that this is essentially the same as the solutions given by others with an auxiliary function except that this is the one-liner you wanted. It is, however, not necessarily better than a separate function.
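
For what it is worth, Python 3.8 and later offer something closer to a let binding through assignment expressions. The sketch below is not part of the original answer; my_func is a hypothetical stand-in for the asker's function.

```python
def my_func():
    # Hypothetical stand-in for the asker's function.
    return [1, 2, 3, 4]


# `x` is bound inside the first tuple element and reused in the second,
# because the elements of the tuple are evaluated left to right.
head, tail = (x := my_func())[0], x[1:]
print(head, tail)
```

Note that x stays bound in the enclosing scope afterwards, so this is a convenience rather than a true local binding.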

(Sorry if my answer is a bit over the top. I just think it’s important to make the distinction clear.)

Answered By: mweerden

There was a recipe in the Python Cookbook to do this. I can’t seem to find it now, but here is the code (I modified it slightly):

def peel(iterable,result=tuple):
    '''Removes the requested items from the iterable and stores the remaining in a tuple
    >>> x,y,z=peel('test')
    >>> print repr(x),repr(y),z
    't' 'e' ('s', 't')
    '''
    def how_many_unpacked():
        import inspect,opcode
        f = inspect.currentframe().f_back.f_back
        if ord(f.f_code.co_code[f.f_lasti])==opcode.opmap['UNPACK_SEQUENCE']:
            return ord(f.f_code.co_code[f.f_lasti+1])
        raise ValueError("Must be a generator on RHS of a multiple assignment!!")
    iterator=iter(iterable)
    hasItems=True
    amountToUnpack=how_many_unpacked()-1
    next=None
    for num in xrange(amountToUnpack):
        if hasItems:        
            try:
                next = iterator.next()
            except StopIteration:
                next = None
                hasItems = False
        yield next
    if hasItems:
        yield result(iterator)
    else:
        yield None

However, you should note that this only works in an unpacking assignment because of the way it inspects the calling frame. Still, it’s quite useful. (As written, the code targets Python 2: it uses the print statement, xrange and iterator.next().)
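
If the bytecode inspection is more magic than you want, a plainer variant is to pass the number of leading items explicitly. The sketch below is a swapped-in technique for Python 3, not the cookbook recipe itself; the name peel_n and its count parameter are assumptions made for illustration. It keeps the padding with None for missing items, but returns an empty remainder instead of None when the input runs out.

```python
from itertools import islice


def peel_n(iterable, count, result=tuple):
    """Return the first `count` items, then the remainder wrapped by `result`."""
    iterator = iter(iterable)
    heads = list(islice(iterator, count))
    # Pad with None if the iterable had fewer than `count` items.
    heads.extend([None] * (count - len(heads)))
    return (*heads, result(iterator))


x, y, z = peel_n("test", 2)
print(repr(x), repr(y), z)
```

The caller has to keep count in sync with the number of targets on the left, which is exactly the bookkeeping the frame-inspecting version avoids.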

Answered By: Jake

Further to the other answers, note that the equivalent head/tail operation in Python, including Python 3’s extension of the * syntax, is generally going to be less efficient than Haskell’s pattern matching.

Python lists are implemented as vectors (contiguous arrays), so obtaining the tail requires copying the list. This is O(n) in the size of the list, whereas an implementation using linked lists, as in Haskell, can merely use the tail pointer, an O(1) operation.
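
A quick way to see that the tail really is a copy (variable names here are illustrative):

```python
ls = [1, 2, 3, 4]
head, rest = ls[0], ls[1:]

rest[0] = 99       # mutate the tail
print(ls)          # the original list is unchanged, so `rest` was a copy
print(rest)

head2, *rest2 = ls  # star-unpacking also builds a fresh list
print(rest2 is ls)
```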

The only exception may be iterator-based approaches, where the list isn’t actually returned, but an iterator is. However, this may not be applicable everywhere a list is desired (e.g. when iterating multiple times).

For instance, Cipher’s approach, if modified to return the iterator rather than converting it to a tuple, will have this behaviour. Alternatively, a simpler two-item-only method that does not rely on the bytecode would be:

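def head_tail(lst):
    it = iter(lst)   # iterate over the argument, not the built-in list type
    yield next(it)   # first item: the head
    yield it         # second item: the iterator over the remaining elements

>>> a, tail = head_tail([1,2,3,4,5])
>>> b, tail = head_tail(tail)
>>> a,b,tail
(1, 2, <listiterator object at 0x2b1c810>)
>>> list(tail)
[3, 4, 5]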

Obviously, though, you still have to wrap it in a utility function rather than having nice syntactic sugar for it.
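
One caveat on newer interpreters: since Python 3.7, a StopIteration escaping from inside a generator is turned into a RuntimeError, so the generator form fails awkwardly on empty input. A possible variant, sketched here as a plain function that returns a tuple (the choice of ValueError for empty input is an assumption, not part of the original answer):

```python
def head_tail(iterable):
    """Return (first item, iterator over the rest) without copying."""
    it = iter(iterable)
    try:
        head = next(it)
    except StopIteration:
        raise ValueError("cannot split an empty iterable") from None
    return head, it


a, tail = head_tail([1, 2, 3, 4, 5])
b, tail = head_tail(tail)   # iter() on an iterator returns the same iterator
print(a, b, list(tail))
```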

Answered By: Brian

Unlike Haskell or ML, Python had no built-in pattern matching of structures until the match statement was added in Python 3.10. Without it, the most Pythonic way of doing pattern matching is with a try-except block:

def recursive_sum(x):
    try:
        head, tail = x[0], x[1:]
        return head + recursive_sum(tail)
    except IndexError:  # empty list: [][0] raises IndexError
        return 0

Note that this only works with objects with slice indexing. Also, if the function gets complicated, something in the body after the head, tail line might raise IndexError, which will lead to subtle bugs. However, this does allow you to do things like:

for frob in eggs.frob_list:
    try:
        frob.spam += 1
    except AttributeError:
        eggs.no_spam_count += 1
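
Returning to the caveat about a stray IndexError in the function body: one way to reduce that risk, sketched here under the same assumptions as recursive_sum above, is to guard only the lookup that signals an empty sequence.

```python
def recursive_sum(x):
    try:
        head = x[0]      # only this lookup is guarded
    except IndexError:   # empty sequence
        return 0
    # An IndexError raised below now propagates instead of being read as "empty".
    return head + recursive_sum(x[1:])


print(recursive_sum([1, 2, 3]))
print(recursive_sum([]))
```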

In Python, tail recursion is generally better implemented as a loop with an accumulator, i.e.:

def iterative_sum(x):
    ret_val = 0
    for i in x:
        ret_val += i
    return ret_val

This is the one obvious, right way to do it 99% of the time. Not only is it clearer to read, it’s faster and it will work on things other than lists (sets, for instance). If there’s an exception waiting to happen in there, the function will happily fail and deliver it up the chain.

Answered By: giltay

I’m working on pyfpm, a library for pattern matching in Python with a Scala-like syntax. You can use it to unpack objects like this:

from pyfpm import Unpacker

unpacker = Unpacker()

unpacker('head :: tail') << (1, 2, 3)

unpacker.head # 1
unpacker.tail # (2, 3)

Or in a function’s arglist:

from pyfpm import match_args

@match_args('head :: tail')
def f(head, tail):
    return (head, tail)

f(1)          # (1, ())
f(1, 2, 3, 4) # (1, (2, 3, 4))

Answered By: Martin Blech

For your specific use case – emulating Haskell’s fun (head : rest) = ..., sure. Function definitions have supported parameter unpacking for quite some time:

def my_method(head, *rest):
    # ...
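
One detail that is easy to miss: a signature like this splits positional arguments, not a list passed as a single argument. A minimal sketch, with a placeholder body that simply returns what it received:

```python
def my_method(head, *rest):
    # `rest` arrives as a tuple of the remaining positional arguments.
    return head, rest


my_list = ["alpha", "bravo", "charlie"]

# The list has to be spread at the call site with `*`.
print(my_method(*my_list))

# Passing it whole binds the entire list to `head` and leaves `rest` empty.
print(my_method(my_list))
```

So the caller writes my_method(*my_func()) rather than my_method(my_func()), and the tail comes back as a tuple rather than a list.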

As of Python 3.0, as @bpowah mentioned, Python supports unpacking on assignment as well:

my_list = ['alpha', 'bravo', 'charlie', 'delta', 'echo']
head, *rest = my_list
assert head == 'alpha'
assert rest == ['bravo', 'charlie', 'delta', 'echo']

Note that the asterisk (the “splat”) means “the remainder of the iterable”, not “until the end”. The following works fine:

first, *middle, last = my_list
assert first == 'alpha'
assert last == 'echo'
assert middle == ['bravo', 'charlie', 'delta']

first, *middle, last = ['alpha', 'bravo']
assert first == 'alpha'
assert last == 'bravo'
assert middle == []

Answered By: Lyndsy Simon