How to save repeated computation in list comprehension in Python?

Question:

In the following Python code:

keyboards = [3, 1]
drivers = [5, 2, 8]
upper_limit = 10
sums = [k + d for k in keyboards for d in drivers if (k + d) <= upper_limit]

I’d like to store the result of k + d inside the list comprehension so that it can be referred to elsewhere in the same comprehension. Is this possible in Python 3?

I know that we can do the following:

sums = []
for k in keyboards:
    for d in drivers:
        s = k + d
        if s <= upper_limit:
            sums.append(s)

But I wish to avoid the side effect operation of append.

Asked By: Yu Shen

||

Answers:

If you are using Python 3.8 or newer, you can use an assignment expression (the := operator, a.k.a. the walrus operator) to bind a new name inside a list comprehension. For your specific example, you’d do that in the left-hand operand of the <= comparison:

sums = [
    s
    for k in keyboards
    for d in drivers
    if (s := k + d) <= upper_limit
]

Note that you have to use parentheses around the assignment here, as the walrus operator has the lowest precedence of all Python operators; without parentheses you’d be assigning the result of k + d <= upper_limit to s, so a boolean value.
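To make the precedence point concrete, here is a minimal sketch outside of any comprehension (Python 3.8+). The name t is only introduced for illustration; the values are taken from the first pair in the example above.

```python
upper_limit = 10
k, d = 3, 5

# Parenthesized as in the answer: s is bound to the sum, then compared.
passes = (s := k + d) <= upper_limit
print(s, passes)  # 8 True

# Here the whole comparison sits to the right of :=, so t (a name used
# only for this illustration) is bound to the boolean result instead.
(t := k + d <= upper_limit)
print(t)  # True
```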

Take into account that s will be visible in the surrounding scope. The name s has the same scope as sums: local inside a function, global if you run the list comprehension at the module level. k and d, on the other hand, are local to the list comprehension loop and are not visible outside of the comprehension.

Demo:

>>> keyboards = [3, 1]
>>> drivers = [5, 2, 8]
>>> upper_limit = 10
>>> [s for k in keyboards for d in drivers if (s := k + d) <= upper_limit]
[8, 5, 6, 3, 9]
>>> s
9
>>> k
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'k' is not defined

In earlier Python versions, the only way to bind a new name inside a list comprehension is as the target of a for loop, so if you need a ‘local’ variable to hold the result of a calculation, you need to find a way to add an extra loop.

So in Python 3.7 or older, you can add another loop over a single-element tuple that calculates k + d:

sums = [
    s
    for k in keyboards
    for d in drivers
    for s in (k + d,)
    if s <= upper_limit
]

(k + d,) is a single-element tuple, so the for s in (k + d,) clause executes exactly once for each iteration over keyboards and drivers, effectively assigning k + d to s.

You could also use a generator expression to produce k + d sums for the two nested for loops, then iterate over the results of that expression:

sums = [
    s
    for s in (
        k + d
        for k in keyboards
        for d in drivers
    )
    if s <= upper_limit
]

In the latter case you can store that expression as a separate variable first:

s_calc = (k + d for k in keyboards for d in drivers)
sums = [s for s in s_calc if s <= upper_limit]

With these options, s is always local to the comprehension loop and is not visible at the sums scope level. It doesn’t ‘bleed’ out of the comprehension expression.
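The scoping difference between the walrus version and the extra-loop version can be checked with a small sketch like the one below. The two function names are made up for this illustration; each function returns the sums together with a flag saying whether s ended up among the function's own local names.

```python
keyboards = [3, 1]
drivers = [5, 2, 8]
upper_limit = 10

def with_walrus():
    # Python 3.8+: := binds s in the enclosing function scope.
    sums = [s for k in keyboards for d in drivers if (s := k + d) <= upper_limit]
    return sums, "s" in locals()

def with_extra_loop():
    # s is a loop target, so it stays inside the comprehension.
    sums = [s for k in keyboards for d in drivers for s in (k + d,) if s <= upper_limit]
    return sums, "s" in locals()

print(with_walrus())      # ([8, 5, 6, 3, 9], True)
print(with_extra_loop())  # ([8, 5, 6, 3, 9], False)
```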

Demo of the latter options:

>>> [s for k in keyboards for d in drivers for s in (k + d,) if s <= upper_limit]
[8, 5, 6, 3, 9]
>>> [s for s in (k + d for k in keyboards for d in drivers) if s <= upper_limit]
[8, 5, 6, 3, 9]
>>> s_calc = (k + d for k in keyboards for d in drivers)
>>> [s for s in s_calc if s <= upper_limit]
[8, 5, 6, 3, 9]

Answered By: Martijn Pieters

In Python 3.8 and above, you can use the walrus operator for this:

sums = [s for k in keyboards for d in drivers if (s := k + d) <= upper_limit]

The performance advantage appears to be slight for this example:

$ python -m timeit "[s for k in range(1000) for d in range(1000) if (s := k + d) <= 1000]"
5 loops, best of 5: 72.5 msec per loop
$ python -m timeit "[k + d for k in range(1000) for d in range(1000) if k + d <= 1000]"
5 loops, best of 5: 75.6 msec per loop

The k + d computation would, presumably, have to be a lot more complex to show a major benefit.
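Rather than timing, one way to see what is being saved is to count calls. The sketch below (Python 3.8+) swaps k + d for a hypothetical function, combined_cost, that records each call; it stands in for a costlier computation and is not part of the original code.

```python
keyboards = [3, 1]
drivers = [5, 2, 8]
upper_limit = 10

call_log = []

def combined_cost(k, d):
    """Hypothetical stand-in for an expensive computation; logs every call."""
    call_log.append((k, d))
    return k + d

# Naive version: the function runs once in the filter and again for
# every pair that passes the filter.
repeated = [
    combined_cost(k, d)
    for k in keyboards
    for d in drivers
    if combined_cost(k, d) <= upper_limit
]
repeated_calls = len(call_log)

call_log.clear()

# Walrus version: the function runs once per pair and the result is reused.
reused = [
    s
    for k in keyboards
    for d in drivers
    if (s := combined_cost(k, d)) <= upper_limit
]
reused_calls = len(call_log)

print(repeated_calls, reused_calls)  # 11 6
```

With 6 pairs and 5 of them passing the filter, the naive version makes 11 calls and the walrus version 6, so the saving grows with the share of pairs that pass and with the cost of each call.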

Answered By: Karl Knechtel