What is the under-the-hood reason that we can use nested for loops in list comprehensions?

Question:

I’ve been studying list comprehensions, and something has had me stuck for days.

A simple list comprehension has the form

[expression for item in iterable]

the equivalent for loop is

li=[]
for item in iterable
    li.append(item)

If I’m right, a list comprehension generally iterates through the iterable, evaluates the expression on each iteration, and appends the result to the list.

Whatever should happen inside the for loop is written at the beginning of the listcomp.

We can think of it this way: in a listcomp Python allows only one expression, and the for loop’s suite is permitted to contain only an if clause or nested for loops.

A book I was reading states:

Since list comprehensions produce lists, that is, iterables, and since the syntax for list comprehensions requires an iterable, it is possible to nest list comprehensions. This is the equivalent of having nested for … in loops.

This confused my understanding.

Does this give the reason we can have a listcomp like [s+z for s in iterable_1 for z in iterable_2]?

Can someone please explain what this says?

Answers:

Your first translation should be

li=[]
for item in iterable:
    li.append(expression)

Your example [s+z for s in iterable_1 for z in iterable_2] is translated as

li=[]
for s in iterable_1:
    for z in iterable_2:
        li.append(s+z)
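
A quick way to check this translation is to run both forms on small inputs. In the sketch below, the contents of iterable_1 and iterable_2 are made-up sample values, not anything from the question:

```python
# Sample inputs, chosen only for illustration.
iterable_1 = ["a", "b"]
iterable_2 = ["x", "y", "z"]

# The comprehension form.
comp = [s + z for s in iterable_1 for z in iterable_2]

# The hand-written nested loops: the leftmost `for` is the outermost loop.
li = []
for s in iterable_1:
    for z in iterable_2:
        li.append(s + z)

print(comp == li)  # True
print(comp)        # ['ax', 'ay', 'az', 'bx', 'by', 'bz']
```

The order of the for clauses in the comprehension matches the order of the loops read top to bottom.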

Congrats, you have discovered … monads! They are essentially what you’ve described: generalized nested loops.

Nested loops just produce a plain stream of results. Nested lists, when flattened, also turn into a plain stream of elements. That’s the similarity. A lazy append is pretty much like yield.
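
To make the "lazy append is like yield" remark concrete, here is a small sketch. The function names pairs_eager and pairs_lazy are hypothetical, introduced only for this illustration:

```python
def pairs_eager(xs, ys):
    """Build the whole result list up front, using append."""
    out = []
    for s in xs:
        for z in ys:
            out.append(s + z)
    return out


def pairs_lazy(xs, ys):
    """Produce the same results one at a time, using yield."""
    for s in xs:
        for z in ys:
            yield s + z


xs, ys = [1, 2], [10, 20]
print(pairs_eager(xs, ys))       # [11, 21, 12, 22]
print(list(pairs_lazy(xs, ys)))  # [11, 21, 12, 22]
```

The generator expression (s + z for s in xs for z in ys) is the lazy counterpart of the list comprehension in the same way.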

Each monad type is defined by how it implements its version of the flatMap function, which is a map followed by the flattening of the resulting nested structure. The flattening of the nested structure at each nesting level allows for an arbitrary depth of nesting to be flattened:

M [M (a)]  ==>  M (a)

M [M [M (a)]]  ==>   # flatten the two outer layers first:
                       M [M (a)]  ==>  M (a)
               OR:
               ==>   # flatten the two inner layers first:
                       M [M (a)]  ==>  M (a)

See the difference? There isn’t any! Any type that does the above is a "monad". Like with lists,

    [ [ [a, b], [c] ], [], [ [], [d, e, f] ] ]
    [   [a, b], [c]  ,       [], [d, e, f]   ]
    [    a, b ,  c   ,            d, e, f    ]

    # or,
    [ [ [a, b], [c] ], [], [ [], [d, e, f] ] ]
    [ [  a, b ,  c  ], [], [      d, e, f  ] ]
    [    a, b ,  c   ,            d, e, f    ]
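
The two flattening orders shown above can be checked directly in Python. This is only a sketch: flatten is a hypothetical helper written for this example (Python has no built-in of that name), and the letters are replaced by strings:

```python
from itertools import chain


def flatten(nested):
    """Remove exactly one level of nesting from a list of lists."""
    return list(chain.from_iterable(nested))


nested = [[["a", "b"], ["c"]], [], [[], ["d", "e", "f"]]]

# Flatten the two outer layers first, then the remaining one.
outer_first = flatten(flatten(nested))

# Flatten the two inner layers first (inside each element), then the outer one.
inner_first = flatten([flatten(inner) for inner in nested])

print(outer_first)                 # ['a', 'b', 'c', 'd', 'e', 'f']
print(outer_first == inner_first)  # True
```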

There really is no difference when we’re focusing on the actual values inside, on the deepest level. At this stage there’s no point to all the "ornamental" noise anymore. It made a difference in how we got there, to that deepest level; but once there, it has played its role, and is no more.

Still the outer layer remains, enclosing the values in one particular type of container/producer/etc. value, maintaining the encapsulation of the inner value(s) by the representative, wrapping type of value. All the "flattening" is done "on the inside", "under wraps". It is well defined when the flattening can be done in any order, leading to the same final result, so the order can be left unspecified. It is like multiplication, where 3*4*2 = (3*4)*2 = 3*(4*2). With plain values like numbers this is known as a Monoid in maths, but with higher-order, encapsulating values like lists it is known as a Monad (which could be referred to as a higher-order monoid, but isn’t, as far as I know).

So it is with loops as well, which can be nested to an arbitrary depth — two, three, whatever, it doesn’t matter. The whole structure is still producing its results one by one, and these are the results which the innermost loop is producing, one by one.

That is the under the hood reason why we can use nested for loops in list comprehensions. Or, saying the same thing in a fancy way, it is because list comprehensions are just like monadic chains of operations (and can be translated as such).
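
As a rough sketch of what "translated as a monadic chain" could look like in Python: flat_map below is a hypothetical helper (the language has no built-in by that name), and the sample lists are made up for illustration.

```python
def flat_map(f, xs):
    """Apply f to each element (f returns a list), then flatten one level."""
    out = []
    for x in xs:
        out.extend(f(x))
    return out


iterable_1 = [1, 2]
iterable_2 = [10, 20]

# The comprehension from the question.
comp = [s + z for s in iterable_1 for z in iterable_2]

# The same computation as nested flat_map calls, one per `for` clause.
chained = flat_map(
    lambda s: flat_map(lambda z: [s + z], iterable_2),
    iterable_1,
)

print(comp)             # [11, 21, 12, 22]
print(comp == chained)  # True
```

Each for clause becomes one flat_map call, and the final expression is wrapped in a one-element list so that it can be flattened like everything else.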


The fixed nested loops, i.e. those that are all known in advance (as shown above with the lists example), correspond to so-called Applicative (a.k.a. Monoidal) Functors. As a side note, for lists specifically there are two distinct ways of combining two nested lists into one: sequentially, leading to a "cross-product"-like combination, and in parallel, leading to a zipping-like combination, like the dot product in maths (sans the final summation).
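
In Python terms, the two ways of combining lists look roughly like this, using itertools.product for the sequential case and the built-in zip for the parallel one. The sample lists are made up for illustration:

```python
from itertools import product

xs = [1, 2]
ys = [10, 20]

# Sequential combination: every x is paired with every y ("cross-product"-like).
crossed = [x + y for x, y in product(xs, ys)]

# Parallel combination: elements are paired up by position (zipping-like).
zipped = [x + y for x, y in zip(xs, ys)]

print(crossed)  # [11, 21, 12, 22]
print(zipped)   # [11, 22]
```

In both cases the two input lists are fixed before the iteration starts; neither depends on a value taken from the other.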

Non-nested "loops" (i.e. yielding producers) with modifiable yielded values are just plain Functors.

The crucial distinction of monads is the ability to handle a nested loop whose iterable is calculated from values yielded by the loops above it. Or in your case,

li=[]
for s in iterable_1:
    for z in foo(s):      # for some appropriate foo()
        li.append(s+z)

or the equivalent [s+z for s in iterable_1 for z in foo(s)].
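
For a runnable version, here is a sketch with one possible foo. The body of foo (returning range(s)) is an assumption picked purely for illustration, as are the sample values; any function that returns an iterable would do:

```python
def foo(s):
    """Hypothetical stand-in: the inner iterable depends on the outer value s."""
    return range(s)


iterable_1 = [1, 2, 3]

li = []
for s in iterable_1:
    for z in foo(s):  # the inner loop's range is computed from s
        li.append(s + z)

comp = [s + z for s in iterable_1 for z in foo(s)]

print(li)          # [1, 2, 3, 3, 4, 5]
print(li == comp)  # True
```

Here the inner loop runs a different number of times for each s, which a fixed cross-product of two lists prepared in advance cannot express.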

This amounts to an Interpreter Pattern, in general terms.

Answered By: Will Ness