How to limit the size of a comprehension?

Question:

I have a list and want to build another list from it via a comprehension. I would like this new list to be limited in size, via a condition.

The following code will fail:

a = [1, 2, 1, 2, 1, 2]
b = [i for i in a if i == 1 and len(b) < 3]

with

Traceback (most recent call last):
  File "compr.py", line 2, in <module>
    b = [i for i in a if i == 1 and len(b) < 3]
  File "compr.py", line 2, in <listcomp>
    b = [i for i in a if i == 1 and len(b) < 3]
NameError: name 'b' is not defined

because b is not defined yet at the time the comprehension is built.

Is there a way to limit the size of the new list at build time?

Note: I could break the comprehension into a for loop with the proper break when a counter is reached but I would like to know if there is a mechanism which uses a comprehension.

Asked By: WoJ

||

Answers:

You can use a generator expression to do the filtering, then use islice() to limit the number of iterations:

from itertools import islice

filtered = (i for i in a if i == 1)
b = list(islice(filtered, 3))

This ensures you don’t do more work than necessary to produce those 3 elements.

Note that there is no longer any point in using a list comprehension here: a list comprehension can’t be broken out of, so you are locked into iterating to the end.
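
To make the early stop visible, here is a minimal sketch. The `tracked` helper and the `seen` list are hypothetical names introduced only for illustration; they record which elements of the source were actually pulled.

```python
from itertools import islice

a = [1, 2, 1, 2, 1, 2, 1, 2]
seen = []  # hypothetical log of every element pulled from `a`

def tracked(items):
    """Yield items unchanged, recording each one as it is consumed."""
    for item in items:
        seen.append(item)
        yield item

filtered = (i for i in tracked(a) if i == 1)
b = list(islice(filtered, 3))

print(b)     # [1, 1, 1]
print(seen)  # [1, 2, 1, 2, 1]
```

Only the first five elements of `a` are consumed; the remaining three are never touched.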

Answered By: Martijn Pieters

@Martijn Pieters is completely right that itertools.islice is the best way to solve this. However, if you don’t mind an additional (external) library, you can use iteration_utilities, which wraps a lot of these itertools and their applications (and some additional ones). It could make this a bit easier, at least if you like functional programming:

>>> from iteration_utilities import Iterable

>>> Iterable([1, 2, 1, 2, 1, 2]).filter((1).__eq__)[:2].as_list()
[1, 1]

>>> (Iterable([1, 2, 1, 2, 1, 2])
...          .filter((1).__eq__)   # like "if item == 1"
...          [:2]                  # like "islice(iterable, 2)"
...          .as_list())           # like "list(iterable)"
[1, 1]

The iteration_utilities.Iterable class uses generators internally, so it will only process as many items as necessary once you call any of the as_* (or get_*) methods.


Disclaimer: I’m the author of the iteration_utilities library.

Answered By: MSeifert

You could use itertools.count to generate a counter and itertools.takewhile to stop iterating over a generator when the counter reaches the desired integer (3 in this case):

from itertools import count, takewhile
c = count()
b = list(takewhile(lambda x: next(c) < 3, (i for i in a if i == 1)))
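
One subtlety worth knowing, sketched below under the assumption that the filtered generator is kept in a variable (here called `filtered`, a name chosen for illustration): takewhile has to pull one extra matching element in order to see the predicate fail, and that element is discarded.

```python
from itertools import count, takewhile

a = [1, 2, 1, 4, 1, 1, 1, 1]  # contains six 1s
filtered = (i for i in a if i == 1)

c = count()
b = list(takewhile(lambda x: next(c) < 3, filtered))
leftover = list(filtered)  # whatever the generator still had to offer

print(b)         # [1, 1, 1]
print(leftover)  # [1, 1]  (the fourth 1 was consumed and dropped)
```

With islice(filtered, 3) the leftover would contain three items instead, because islice stops without requesting a fourth.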

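Or a similar idea: build a construct that raises StopIteration to terminate the generator. That is the closest you’ll get to your original idea of breaking out of the list comprehension, but I would not recommend it as best practice. Note that it only works on Python 3.6 and earlier: since Python 3.7 (PEP 479), a StopIteration raised inside a generator expression is converted into a RuntimeError: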

c = count()
b = list(i if next(c) < 3 else next(iter([])) for i in a if i == 1)

Examples:

>>> a = [1,2,1,4,1,1,1,1]

>>> c = count()
>>> list(takewhile(lambda x: next(c) < 3, (i for i in a if i == 1)))
[1, 1, 1]

>>> c = count()
>>> list(i if next(c) < 3 else next(iter([])) for i in a if i == 1)  # Python 3.6 and earlier; RuntimeError on 3.7+
[1, 1, 1]
Answered By: Chris_Rands

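Use enumerate. Note that i is the position in a, so this only looks at the first 3 elements of a rather than collecting the first 3 matches, and it still iterates over all of a:

b = [n for i,n in enumerate(a) if n==1 and i<3]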
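
The difference between limiting by position and limiting by number of matches can be sketched as follows. The variable names are illustrative only. The second variant enumerates the already filtered items, so the index counts matches; it still walks through all of `a`, so it does not stop early.

```python
a = [1, 2, 1, 2, 1, 2]

# Index refers to positions in `a`: only a[0], a[1], a[2] are eligible.
by_position = [n for i, n in enumerate(a) if n == 1 and i < 3]
print(by_position)  # [1, 1]

# Index refers to positions among the matches: the first 3 matches are kept.
matches = (n for n in a if n == 1)
by_match_count = [n for i, n in enumerate(matches) if i < 3]
print(by_match_count)  # [1, 1, 1]
```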
Answered By: Dorianux

itertools.islice is the natural way to extract n items from a generator.

But you can also implement this yourself using a helper function. Just like the itertools.islice pseudo-code, we catch StopIteration to limit the number of items yielded.

This is more adaptable because it allows you to specify logic if n is greater than the number of items in your generator.

def take_n(gen, n):
    for _ in range(n):
        try:
            yield next(gen)
        except StopIteration:
            break

g = (i**2 for i in range(5))
res = list(take_n(g, 20))

print(res)

[0, 1, 4, 9, 16]
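
As one possible illustration of such custom logic, here is a sketch of a variant that pads the result when the source runs out. The name `take_n_padded` and its `fill` parameter are hypothetical, not part of itertools.

```python
def take_n_padded(iterable, n, fill=None):
    """Yield exactly n items, substituting `fill` once the source is exhausted."""
    it = iter(iterable)
    for _ in range(n):
        try:
            yield next(it)
        except StopIteration:
            # An exhausted iterator keeps raising StopIteration,
            # so every remaining slot is filled.
            yield fill

g = (i**2 for i in range(3))
padded = list(take_n_padded(g, 5, fill=0))
print(padded)  # [0, 1, 4, 0, 0]
```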
Answered By: jpp

Same solution, just without islice:

filtered = (i for i in a if i == 1)
b = [next(filtered) for j in range(3)]

BTW, note that if the generator is empty or yields fewer than 3 items, you’ll get a StopIteration exception.

To prevent that, you may want to use next() with a default value. For example:

b = [next(filtered, None) for j in range(3)]

And if you don’t want ‘None’ in your list:

b = [i for i in b if i is not None]
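
Filtering on None would also drop legitimate None values from the data. A minimal sketch of one way around that, assuming a private sentinel object (the name `_MISSING` is hypothetical):

```python
a = [1, 2, 1, 2]
filtered = (i for i in a if i == 1)

_MISSING = object()  # hypothetical sentinel that cannot collide with real data

# Pull at most 3 items, then discard the sentinel placeholders.
pulled = (next(filtered, _MISSING) for _ in range(3))
b = [x for x in pulled if x is not _MISSING]

print(b)  # [1, 1]
```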
Answered By: madaniel