what's the difference between yield from and yield in python 3.3.2+

Question:

Since Python 3.3, Python has supported a new syntax for creating generator functions:

yield from <expression>

I gave it a quick try:

>>> def g():
...     yield from [1,2,3,4]
...
>>> for i in g():
...     print(i)
...
1
2
3
4
>>>

It seems simple to use, but the PEP document is complex. My question is: is there any other difference compared to the previous yield statement? Thanks.

Asked By: Erxin

Answers:

Here is an example that illustrates it:

>>> def g():
...     yield from range(5)
... 
>>> list(g())
[0, 1, 2, 3, 4]
>>> def g():
...     yield range(5)
... 
>>> list(g())
[range(0, 5)]
>>>

yield from yields each item of the iterable, but yield yields the iterable itself.
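
For completeness, here is a quick sketch (not part of the original answer) showing that yield from accepts any iterable, not just lists:

>>> def letters_and_numbers():
...     yield from "ab"                         # a string is iterable, so each character is yielded
...     yield from (x * 10 for x in range(2))   # generator expressions work too
...
>>> list(letters_and_numbers())
['a', 'b', 0, 10]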

Answered By: zondo

At first sight, yield from is an algorithmic shortcut for:

def generator1():
    for item in generator2():
        yield item
    # do more things in this generator

Which is then mostly equivalent to just:

def generator1():
    yield from generator2()
    # more things on this generator

In English: when used inside a generator, yield from yields each element of another iterable, as if those items were coming from the first generator, from the point of view of the code calling the first generator.

The main reason for its creation is to allow easy refactoring of code that relies heavily on iterators. Code that uses ordinary functions can always, at very little extra cost, have blocks of one function refactored into other functions which are then called; that divides tasks, simplifies reading and maintaining the code, and makes small code snippets more reusable.

So, large functions like this:

def func1():
    # some calculation
    for i in somesequence:
        # complex calculation using i 
        # ...
        # ...
        # ...
    # some more code to wrap up results
    # finalizing
    # ...

Can become code like this, without drawbacks:

def func2(i):
    # complex calculation using i 
    # ...
    # ...
    # ...
    return calculated_value

def func1():
    # some calculation
    for i in somesequence:
        func2(i)
    # some more code to wrap up results
    # finalizing
    # ...

When it comes to iterators, however, the form

def generator1():
    for item in generator2():
        yield item
    # do more things in this generator

for item in generator1():
    # do things

requires that, for each item consumed from generator2, the running context first be switched to generator1, where nothing is done, and then switched to generator2 – and when that one yields a value, there is another intermediate context switch to generator1 before the value reaches the code actually consuming it.

With yield from, these intermediate context switches are avoided, which can save quite some resources if there are a lot of iterators chained: the context switches straight from the context consuming the outermost generator to the innermost generator, skipping the contexts of the intermediate generators altogether, until the inner ones are exhausted.
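
A small illustrative sketch of such a chain, with one generator producing the values and two intermediate generators that do nothing but delegate:

def inner():
    yield from range(3)      # the actual values come from here

def middle():
    yield from inner()       # pure delegation, no per-item work

def outer():
    yield from middle()      # pure delegation again

# While the chain is being consumed, values flow from inner() to the
# consumer without re-entering middle() and outer() for every item,
# unlike the explicit "for item in ...: yield item" version.
print(list(outer()))         # [0, 1, 2]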

Later on, the language took advantage of this “tunnelling” through intermediate contexts to use these generators as co-routines: functions that can make asynchronous calls. With a proper framework in place, as described in https://www.python.org/dev/peps/pep-3156/ , these co-routines are written so that whenever they call a function that would take a long time to resolve (due to a network operation, or a CPU-intensive operation that can be offloaded to another thread), that call is made with a yield from statement. The framework's main loop then arranges for the expensive function to be properly scheduled and retakes execution (the framework main loop is always the code calling the co-routines themselves). When the expensive result is ready, the framework makes the called co-routine behave like an exhausted generator, and execution of the first co-routine resumes.

From the programmer's point of view, it is as if the code were running straight through, with no interruptions. From the process's point of view, the co-routine was paused at the point of the expensive call, while other calls (possibly parallel calls to the same co-routine) continued running.

So, one might write some code along these lines as part of a web crawler:

@asyncio.coroutine
def crawler(url):
    page_content = yield from async_http_fetch(url)
    urls = parse(page_content)
    ...

This could fetch tens of HTML pages concurrently when called from the asyncio loop.

Python 3.4 added the asyncio module to the stdlib as the default provider for this kind of functionality. It worked so well that in Python 3.5 several new keywords were added to the language to distinguish co-routines and asynchronous calls from the generator usage described above. These are described in https://www.python.org/dev/peps/pep-0492/
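
For comparison, a rough sketch of how the same crawler might look with the Python 3.5+ syntax from PEP 492 (async_http_fetch and parse are hypothetical placeholders, just as in the snippet above):

async def crawler(url):
    # "async def" replaces the @asyncio.coroutine decorator and
    # "await" replaces "yield from" in the earlier example.
    page_content = await async_http_fetch(url)   # hypothetical asynchronous HTTP helper
    urls = parse(page_content)                    # hypothetical parser
    ...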

Answered By: jsbueno

For most applications, yield from just yields everything from the delegated iterable, in order:

def iterable1():
    yield 1
    yield 2

def iterable2():
    yield from iterable1()
    yield 3

assert list(iterable2()) == [1, 2, 3]

For 90% of users who see this post, I’m guessing that this will be explanation enough for them. yield from simply delegates to the iterable on the right hand side.


Coroutines

However, there are some more esoteric generator circumstances that are also important here. A lesser-known fact about generators is that they can be used as co-routines. This isn't super common, but you can send data to a generator if you want:

def coroutine():
    x = yield None
    yield 'You sent: %s' % x

c = coroutine()
next(c)
print(c.send('Hello world'))

Aside: you might be wondering what the use-case for this is (and you're not alone). One example is the contextlib.contextmanager decorator. Co-routines can also be used to parallelize certain tasks. I don't know too many places where this is taken advantage of, but Google App Engine's ndb datastore API uses it for asynchronous operations in a pretty nifty way.
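
To make send() a bit more concrete, here is a classic running-average coroutine (my own sketch, unrelated to the libraries above):

def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average    # pause here until the caller send()s a number
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)             # prime the coroutine: advance it to the first yield
print(avg.send(10))   # 10.0
print(avg.send(20))   # 15.0
print(avg.send(30))   # 20.0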

Now, let's assume you send data to a generator that is yielding data from another generator ... How does the original generator get notified? The answer is that it doesn't in Python 2.x, where you need to wrap the generator yourself:

def python2_generator_wrapper():
    for item in some_wrapped_generator():
        yield item

At least not without a whole lot of pain:

def python2_coroutine_wrapper():
    """This doesn't work.  Somebody smarter than me needs to fix it. . .

    Pain.  Misery. Death lurks here :-("""
    # See https://www.python.org/dev/peps/pep-0380/#formal-semantics for actual working implementation :-)
    g = some_wrapped_generator()
    for item in g:
        try:
            val = yield item
        except Exception as forward_exception:  # What exceptions should I not catch again?
            g.throw(forward_exception)
        else:
            if val is not None:
                g.send(val)  # Oops, we just consumed another cycle of g ... How do we handle that properly ...

This all becomes trivial with yield from:

def coroutine_wrapper():
    yield from coroutine()

Because yield from truly delegates (everything!) to the underlying generator.
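
A quick sketch showing that send() really does reach the inner generator through the wrapper (using the coroutine() and coroutine_wrapper() definitions from above):

w = coroutine_wrapper()
next(w)                         # advances to the first yield inside coroutine()
print(w.send('Hello world'))    # prints 'You sent: Hello world' -- the value went straight through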


Return semantics

Note that the PEP in question also changes the return semantics. While not directly part of the OP's question, it's worth a quick digression if you are up for it. In Python 2.x, you can't do the following:

def iterable():
    yield 'foo'
    return 'done'

It's a SyntaxError. With the update to yield, the above function is now legal. Again, the primary use-case is with coroutines (see above). You can send data to the generator and it can do its work magically (maybe using threads?) while the rest of the program does other things. When flow control passes back to the generator, StopIteration will be raised (as is normal at the end of a generator), but now the StopIteration will carry a data payload. It is the same as if the programmer had instead written:

 raise StopIteration('done')

Now the caller can catch that exception and do something with the data payload to benefit the rest of humanity.
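
In practice, that payload is most often consumed via yield from itself: the inner generator's return value becomes the value of the yield from expression. A small sketch:

def iterable():
    yield 'foo'
    return 'done'                     # becomes StopIteration('done') under the hood

def delegator():
    result = yield from iterable()    # result receives the inner generator's return value
    print('inner generator returned:', result)

for item in delegator():
    print(item)
# Output:
# foo
# inner generator returned: done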

Answered By: mgilson

The difference is simple:

yield:

[Extra info: if you already know how generators work, you can skip this part]

yield is used to produce a single value from a generator function. Calling a generator function gives you a generator object; when that generator is advanced, the function body starts executing, and when a yield statement is encountered, it temporarily suspends execution, returns the value to the caller, and saves its current state. The next time the generator is advanced, it resumes from where it left off and continues until it hits the next yield statement.
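
A tiny sketch of that suspend/resume behaviour, stepping a generator manually with next():

def counter():
    print('started')
    yield 1          # execution pauses here after producing 1
    print('resumed')
    yield 2

c = counter()
print(next(c))   # prints 'started', then 1
print(next(c))   # prints 'resumed', then 2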

In the example below, generator1 and generator2 each produce their values wrapped in a generator object. combined_generator also returns a generator object; if it used a plain yield, its items would be the other generator objects themselves. To get the values out of these nested generators directly, we use yield from.

class Gen:
    def generator1(self):
        yield 1
        yield 2
        yield 3

    def generator2(self):
        yield 'a'
        yield 'b'
        yield 'c'

    def combined_generator(self):
        """
        This method yields from two generators, each of which yields its own values,
        so we use `yield from` to let the consuming code receive those values directly.
        """
        yield from self.generator1()
        yield from self.generator2()

    def run(self):
        print("Gen running ...")
        for item in self.combined_generator():
            print(item)

g = Gen()
g.run()

The output of the above is:

Gen running ...
1
2
3
a
b
c
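
For contrast, if combined_generator used a plain yield instead, the consuming loop would receive the generator objects themselves rather than their values (a quick sketch):

def generator1():
    yield 1
    yield 2
    yield 3

def combined_plain_yield():
    yield generator1()    # yields the generator object, not its values

for item in combined_plain_yield():
    print(item)           # prints something like <generator object generator1 at 0x...>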

Answered By: Deepanshu Mehta