Using the same iterator in nested for loops
Question:
Consider the following code:
from itertools import chain

lst = ['a', 1, 2, 3, 'b', 4, 5, 'c', 6]

def nestedForLoops():
    it = iter(lst)
    for item0 in it:
        if isinstance(item0, str):
            print(item0)
        else:
            # this shouldn't happen because of
            # 1. lst[0] is a str, and
            # 2. line A
            print(f"this shouldn't happen: {item0=}")
            pass
        for item1 in it:
            if not isinstance(item1, int):
                break
            print(f'\t{item1}')
        else: # no-break
            # reached end of iterator
            return
        # reached a str
        assert isinstance(item1, str)
        it = chain(item1, it) # line A

nestedForLoops()
I was expecting it to print
a
1
2
3
b
4
5
c
6
but instead it printed
a
1
2
3
this shouldn't happen: item0=4
this shouldn't happen: item0=5
c
this shouldn't happen: item0=6
I wrote what I thought was equivalent code using while loops instead of for loops:
from itertools import chain

lst = ['a', 1, 2, 3, 'b', 4, 5, 'c', 6]

def nestedWhileLoops():
    it = iter(lst)
    while True:
        try:
            item0 = next(it)
        except StopIteration:
            break
        if isinstance(item0, str):
            print(item0)
        else:
            # this shouldn't happen because of
            # 1. lst[0] is a str, and
            # 2. line B
            print(f"this shouldn't happen: {item0=}")
            pass
        while True:
            try:
                item1 = next(it)
            except StopIteration:
                # reached end of iterator
                return
            if not isinstance(item1, int):
                break
            print(f'\t{item1}')
        # reached a str
        assert isinstance(item1, str)
        it = chain(item1, it) # line B

nestedWhileLoops()
and this while loop version does print what I expected, namely
a
1
2
3
b
4
5
c
6
So why does nestedForLoops behave differently than nestedWhileLoops?
Answers:
Once a for loop is entered, the iterator that loop is consuming is fixed. Re-assigning the name it inside the loop body does not change what that loop iterates over.
Here:
for item0 in it:
you start iterating over the iterator that the name it refers to at that moment, and the body of this for loop runs until the end of the function.
When you reassign it inside that body:
it = chain(item1, it) # line A
the assignment has no effect on the iterator the outer loop is already consuming.
The reason it "kind of" works for the inner for loop is that you leave that loop on each string, so the next time the inner for statement runs it picks up whatever the name it refers to at that point.
So in short, your for loop example does the following:
1. Enter the outer for loop and start iterating over the original it.
2. Enter the inner for loop and continue iterating over the original it.
3. Exit the inner for loop and re-assign it.
4. Back in the outer loop, continue iterating over the original it.
5. Re-enter the inner for loop and start iterating over the newly assigned it.
6. Repeat steps 3 through 5.
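The steps above can be traced with a small sketch. It reuses the structure of the question's function on a shorter list and, instead of printing, records which loop consumed each item. The function name trace_nested_for and the shortened input list are assumptions made for illustration only.

```python
from itertools import chain


def trace_nested_for(lst):
    """Record which loop consumed each item (illustrative helper, not from the question)."""
    log = []
    it = iter(lst)
    for item0 in it:              # the outer loop's iterator is obtained once, here
        log.append(("outer", item0))
        for item1 in it:          # evaluates the current binding of `it` on every entry
            log.append(("inner", item1))
            if not isinstance(item1, int):
                break
        else:
            return log            # inner loop ran out of items
        it = chain(item1, it)     # rebinds the name only; the outer loop is unaffected
    return log


for who, item in trace_nested_for(['a', 1, 'b', 2, 3]):
    print(who, repr(item))
# outer 'a'
# inner 1
# inner 'b'
# outer 2
# inner 'b'
# outer 3
# inner 'b'
```

The string 'b' is only ever seen by the inner loop (once from the original iterator, then from each new chain), while the outer loop keeps pulling 2 and 3 from the original iterator.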
Doing something like this may give you a better understanding:
it = [1, 2, 3, 4]
for item in it:
    print(item)
    it = None
# 1
# 2
# 3
# 4
A for loop includes an implicit call to iter to get an iterator for the given iterable. Once the loop has that iterator, you cannot swap it for a different one. Assigning a new iterable to the name it doesn’t affect the outer loop, because the outer loop only looks at the value bound to it when the loop starts.
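To make the implicit call to iter visible, here is a rough hand-written equivalent of a for loop. It is a simplified sketch rather than the interpreter's actual implementation, and the names run_for_loop_by_hand and hidden are made up for illustration.

```python
def run_for_loop_by_hand():
    """Spell out, approximately, what `for item in it:` does."""
    it = iter([1, 2, 3])
    seen = []

    hidden = iter(it)              # evaluated once, before the first iteration
    while True:
        try:
            item = next(hidden)    # the loop advances its own reference, not the name `it`
        except StopIteration:
            break
        # --- loop body starts here ---
        seen.append(item)
        it = iter(['x', 'y'])      # rebinding the name leaves `hidden` untouched
    return seen


print(run_for_loop_by_hand())
# [1, 2, 3]
```

The body rebinds it on every pass, yet the loop still walks through 1, 2, 3, because next is always called on hidden.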
With your while loop, you make explicit calls to next(it), meaning you have the opportunity to change the value associated with the name it between calls. The implicit calls to next made by the for loop don’t use the name it as their argument; the loop has its own, private reference to the original iterator, and you can’t replace that reference from inside the loop body. (chain(item, it) creates a new iterator, rather than modifying the one it currently refers to.)
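Since the loop keeps a reference to the iterator object rather than to the name, one way to keep the nested for loops is to mutate a shared iterator object instead of rebinding it. The Pushback class below is a hypothetical helper written for this sketch (it is not part of itertools), and the function collects items in a list instead of printing them.

```python
class Pushback:
    """Hypothetical iterator wrapper that lets a consumer hand an item back."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._pending = []            # pushed-back items, served before self._it

    def __iter__(self):
        return self                   # every for loop over this object shares its state

    def __next__(self):
        if self._pending:
            return self._pending.pop()
        return next(self._it)         # raises StopIteration when exhausted

    def push(self, item):
        self._pending.append(item)


def nested_for_loops(lst):
    """Nested for loops over one shared iterator; returns items in visit order."""
    visited = []
    it = Pushback(lst)
    for item0 in it:
        assert isinstance(item0, str)   # the outer loop now only receives strings
        visited.append(item0)
        for item1 in it:
            if not isinstance(item1, int):
                break
            visited.append(item1)
        else:
            return visited              # reached end of iterator
        it.push(item1)                  # mutate the shared object; no rebinding
    return visited


print(nested_for_loops(['a', 1, 2, 3, 'b', 4, 5, 'c', 6]))
# ['a', 1, 2, 3, 'b', 4, 5, 'c', 6]
```

Both loops hold a reference to the same Pushback instance, so an item pushed back by the inner loop is the next thing the outer loop receives.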