Is it safe to yield from within a "with" block in Python (and why)?

Question:

The combination of coroutines and resource acquisition seems like it could have some unintended (or unintuitive) consequences.

The basic question is whether or not something like this works:

def coroutine():
    with open(path, 'r') as fh:
        for line in fh:
            yield line

Which it does. (You can test it!)

The deeper concern is that with is supposed to be an alternative to finally, ensuring that a resource is released at the end of the block. Coroutines can suspend and resume execution from within the with block, so how is the conflict resolved?

For example, if you open a file with read/write both inside and outside a coroutine while the coroutine hasn’t yet returned:

def coroutine():
    with open('test.txt', 'r+') as fh:
        for line in fh:
            yield line

a = coroutine()
assert a.next() # Open the filehandle inside the coroutine first.
with open('test.txt', 'r+') as fh: # Then open it outside.
    for line in fh:
        print 'Outside coroutine: %r' % line
assert a.next() # Can we still use it?

Update

I was going for write-locked file handle contention in the previous example, but since most OSes allocate filehandles per-process there will be no contention there. (Kudos to @Miles for pointing out the example didn’t make too much sense.) Here’s my revised example, which shows a real deadlock condition:

import threading

lock = threading.Lock()

def coroutine():
    with lock:
        yield 'spam'
        yield 'eggs'

generator = coroutine()
assert generator.next()
with lock: # Deadlock!
    print 'Outside the coroutine got the lock'
assert generator.next()
Asked By: cdleary


Answers:

I don’t really understand what conflict you’re asking about, nor the problem with the example: it’s fine to have two coexisting, independent handles to the same file.

One thing I didn’t know, which I learned in response to your question, is that there is a new close() method on generators:

close() raises a new GeneratorExit exception inside the generator to terminate the iteration. On receiving this exception, the generator’s code must either raise GeneratorExit or StopIteration.

close() is called when a generator is garbage-collected, so this means the generator’s code gets one last chance to run before the generator is destroyed. This last chance means that try...finally statements in generators can now be guaranteed to work; the finally clause will now always get a chance to run. This seems like a minor bit of language trivia, but using generators and try...finally is actually necessary in order to implement the with statement described by PEP 343.

http://docs.python.org/whatsnew/2.5.html#pep-342-new-generator-features

So that handles the situation where a with statement is used in a generator that yields in the middle but never returns: the context manager’s __exit__ method will be called when the generator is garbage-collected.
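
A quick way to see this in action (a minimal sketch; Tracker is a made-up context manager for illustration): closing a suspended generator raises GeneratorExit at the yield, which unwinds the with block and runs __exit__:

class Tracker(object):
    def __enter__(self):
        print('entered')
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        print('exited')  # runs when the generator is closed or collected
        return False     # don't suppress the GeneratorExit

def gen():
    with Tracker():
        yield 'spam'
        yield 'eggs'

g = gen()
next(g)    # prints 'entered'; the generator suspends at the first yield
g.close()  # raises GeneratorExit at that yield; prints 'exited'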


Edit:

With regards to the file handle issue: I sometimes forget that there exist platforms that aren’t POSIX-like. 🙂

As far as locks go, I think Rafał Dowgird hits the nail on the head when he says “You just have to be aware that the generator is just like any other object that holds resources.” I don’t think the with statement is really that relevant here, since this function suffers from the same deadlock issues:

def coroutine():
    lock.acquire()
    yield 'spam'
    yield 'eggs'
    lock.release()  # never reached if the generator is abandoned mid-iteration

generator = coroutine()
generator.next()
lock.acquire() # whoops!
Answered By: Miles

That would be how I expected things to work. Yes, the block will not release its resources until it completes, so in that sense the resource has escaped its lexical nesting. However, this is no different from making a function call that tries to use the same resource within a with block: nothing helps you when the block has not yet terminated, for whatever reason. It’s not really anything specific to generators.
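
For instance (a minimal sketch; helper is a hypothetical function), an ordinary call that tries to re-acquire the same non-reentrant lock deadlocks just the same, with no generator involved:

import threading

lock = threading.Lock()

def helper():
    with lock:  # deadlock: the caller already holds this non-reentrant lock
        pass

with lock:
    helper()  # blocks forever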

One thing that might be worth worrying about, though, is the behaviour if the generator is never resumed. I would have expected the with block to act like a finally block and call the __exit__ part on termination, but that doesn’t seem to be the case.

Answered By: Brian

I don’t think there is a real conflict. You just have to be aware that the generator is just like any other object that holds resources, so it is the creator’s responsibility to make sure it is properly finalized (and to avoid conflicts/deadlock with the resources held by the object). The only (minor) problem I see here is that generators don’t implement the context management protocol (at least as of Python 2.5), so you cannot just:

with coroutine() as cr:
  doSomething(cr)

but instead have to:

cr = coroutine()
try:
  doSomething(cr)
finally:
  cr.close()

The garbage collector does the close() anyway, but it’s bad practice to rely on that for freeing resources.
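
As it happens, contextlib.closing (available since Python 2.5) packages up exactly this try/finally pattern; it calls close() on the wrapped object when the block exits:

from contextlib import closing

with closing(coroutine()) as cr:
  doSomething(cr)  # cr.close() runs on the way out, even on exceptions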

Answered By: Rafał Dowgird

Because yield can execute arbitrary code, I’d be very wary of holding a lock over a yield statement. You can get a similar effect in lots of other ways, though, including calling a method or function that might have been overridden or otherwise modified.

Generators, however, are always (nearly always) “closed”, either with an explicit close() call, or just by being garbage-collected. Closing a generator throws a GeneratorExit exception inside the generator and hence runs finally clauses, with statement cleanup, etc. You can catch the exception, but you must then re-raise it or exit the function (returning raises StopIteration), rather than yield again. It’s probably poor practice to rely on the garbage collector to close the generator in cases like you’ve written, because that could happen later than you might want, and if someone calls os._exit(), then your cleanup might not happen at all.
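
To illustrate (a minimal sketch): a generator may intercept GeneratorExit to do its own cleanup, but it must then return or re-raise; yielding another value at that point raises a RuntimeError:

def gen():
    try:
        yield 'spam'
    except GeneratorExit:
        print('cleaning up')
        raise  # or simply return; yielding again here would be an error

g = gen()
next(g)    # advance to the first yield
g.close()  # prints 'cleaning up'; the generator then finishes cleanly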

Answered By: Doug

For a TLDR, look at it this way:

with Context():
    yield 1
    pass  # explicitly do nothing *after* yield
# exit context after explicitly doing nothing

The context exits after pass is done (i.e. nothing happens), and pass executes only after yield is done (i.e. after execution resumes). So the with block ends only after control is returned to the yield.

TLDR: A with context remains held when yield releases control.


There are actually just two rules that are relevant here:

  1. When does with release its resource?

    It does so once, directly after its block is done. The former means it does not release at a yield, since a generator may yield several times. The latter means it does release once the yield has completed.

  2. When does yield complete?

    Think of yield as a reverse call: control is passed up to a caller, not down to a called one. Similarly, yield completes when control is passed back to it, just like when a call returns control.
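
A short trace makes the timing concrete (a minimal sketch, reusing a threading.Lock as in the question): the lock is still held while the generator is suspended at the yield, and is released only once the generator is resumed and the with block completes:

import threading

lock = threading.Lock()

def gen():
    with lock:
        yield 'inside'   # suspended here; the lock stays held
    yield 'outside'      # the with block has completed by now

g = gen()
next(g)               # -> 'inside'
print(lock.locked())  # True: suspended inside the with block
next(g)               # resumes; the with block finishes and releases the lock
print(lock.locked())  # False: released before 'outside' was yielded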

Note that both with and yield are working as intended here! The point of a with lock is to protect a resource and it stays protected during a yield. You can always explicitly release this protection:

def safe_generator():
  while True:
    with lock:
      # keep lock for critical operation
      result = protected_operation()
    # release lock before releasing control
    yield result
Answered By: MisterMiyagi