Is using the function "next" a good practice for finding the first occurrence in an iterable?

Question:

I’ve been learning about iterators and discovered this interesting way of getting the first element of an iterable that satisfies a condition (with a default value in case nothing matches):

 first_occurence = next((x for x in range(1,10) if x > 5), None)

To me, it seems like a very useful, clear way of obtaining the result.

But since I’ve never seen it in production code, and since next is a bit more "low-level" in Python, I was wondering whether it could be bad practice for some reason. Is that the case? And if so, why?

Asked By: Matheus Oliveira


Answers:

It’s fine. It’s efficient (the generator stops as soon as the first match is found) and it’s fairly readable.
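To illustrate the efficiency point, here is a minimal sketch that counts how many items actually get tested before next returns. The helper name first_match_with_count and its counting wrapper are made up for this illustration; they are not part of the question or of any library.

```python
def first_match_with_count(items, predicate):
    """Return (first matching item or None, number of items tested).

    Hypothetical helper, used only to show that the search stops early.
    """
    tested = 0

    def counting_predicate(x):
        nonlocal tested
        tested += 1
        return predicate(x)

    # The generator expression is lazy: next() pulls items one at a time
    # and stops at the first one that satisfies the predicate.
    match = next((x for x in items if counting_predicate(x)), None)
    return match, tested


# Only the first six items of a very large range are ever examined.
print(first_match_with_count(range(1, 1_000_000), lambda x: x > 5))  # (6, 6)
```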

If you’re expecting a result, or None is itself a possible result (so using None as a placeholder makes it hard to tell whether you got a result or the default), it may be better to use the EAFP form rather than providing a default. You can either catch the StopIteration that next raises when no item is found, or just let it bubble up if the problem is that the caller’s input doesn’t meet specs (so it’s up to them to handle it). It looks even cleaner at the point of use that way:

first_occurence = next(x for x in range(1,10) if x > 5)
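As a rough sketch of the "catch it" variant, the snippet below wraps the call in try/except. The function name first_over and the choice to re-raise as ValueError are assumptions made for illustration; pick whatever exception fits your own code.

```python
def first_over(values, threshold):
    """Return the first value greater than `threshold`.

    Hypothetical helper: raises ValueError when nothing matches, instead of
    leaking StopIteration to the caller.
    """
    try:
        return next(x for x in values if x > threshold)
    except StopIteration:
        # Translate the "nothing found" signal into a more descriptive error.
        raise ValueError(f"no value greater than {threshold}") from None


print(first_over(range(1, 10), 5))  # 6
```

Translating the exception is mostly a matter of taste, but it keeps a stray StopIteration from being mistaken for the normal end of some enclosing iteration.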

Alternatively, when None is a valid result, you can use an explicit sentinel object that’s guaranteed to be unique, like so:

sentinel = object()  # An anonymous object you construct can't possibly appear in the input
first_occurence = next((x for x in range(1,10) if x > 5), sentinel)
if first_occurence is not sentinel:  # Compare with `is` for performance and so a broken __eq__ can't compare equal to sentinel
    print(first_occurence)  # placeholder: a match was found, use it here
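To see why the sentinel matters, consider a search where None can legitimately be the match. The following is only a sketch; the names _MISSING and describe_first_falsy are invented for this example.

```python
_MISSING = object()  # unique marker that cannot occur in the data


def describe_first_falsy(items):
    """Report the first falsy item, which may legitimately be None."""
    found = next((x for x in items if not x), _MISSING)
    if found is _MISSING:
        return "no match"
    return f"found {found!r}"


print(describe_first_falsy([1, None, 0]))  # found None
print(describe_first_falsy([1, 2]))        # no match
```

With a default of None, both calls would have produced None and been indistinguishable.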

A common use case for one of these constructs is to replace a call to any when you need to know not only whether any item passed the test, but which item did (any can only return True or False, so it’s unsuited to finding which item passed).
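A small side-by-side comparison, using a made-up list, shows the difference between the two:

```python
values = [3, 8, 1, 12]  # sample data for illustration

# any() only tells us that some item passes the test...
has_large = any(x > 5 for x in values)

# ...while next() hands back the first item that does.
first_large = next((x for x in values if x > 5), None)

print(has_large)    # True
print(first_large)  # 8
```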

Answered By: ShadowRanger

We can wrap it up in a function to provide an even nicer interface:

_raise = object()
# can pass either an iterable or an iterator
def first(iterable, condition, *, default=_raise, exctype=None):
    """Get the first value from `iterable` which meets `condition`.
    Will consume elements from the iterable.
    default -> if no element meets the condition, return this instead.
    exctype -> if no element meets the condition and there is no default,
               raise this kind of exception rather than `StopIteration`.
               (It will be chained from the original `StopIteration`.)
    """
    try:
        # `iter` is idempotent; this makes sure we have an iterator
        return next(filter(condition, iter(iterable)))
    except StopIteration as e:
        if default is not _raise:
            return default
        if exctype:
            raise exctype() from e
        raise

Let’s test it:

>>> first(range(10), lambda x: x > 5)
6
>>> first(range(10), lambda x: x > 11)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in first
StopIteration
>>> first(range(10), lambda x: x > 11, exctype=ValueError)
Traceback (most recent call last):
  File "<stdin>", line 4, in first
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in first
ValueError
>>> first(range(10), lambda x: x > 11, default=None)
>>> 

Answered By: Karl Knechtel