pythonic way to iterate over part of a list
Question:
I want to iterate over everything in a list except the first few elements, e.g.:
for line in lines[2:]:
foo(line)
This is concise, but copies the whole list, which is unnecessary. I could do:
del lines[0:2]
for line in lines:
foo(line)
But this modifies the list, which isn’t always good.
I can do this:
for i in xrange(2, len(lines)):
line = lines[i]
foo(line)
But, that’s just gross.
Better might be this:
for i,line in enumerate(lines):
if i < 2: continue
foo(line)
But it isn’t quite as obvious as the very first example.
So: What’s a way to do it that is as obvious as the first example, but doesn’t copy the list unnecessarily?
Answers:
def skip_heading( iterable, items ):
the_iter= iter( iterable ):
for i, e in enumerate(the_iter):
if i == items: break
for e in the_iter:
yield e
Now you can for i in skip_heading( lines, 2 ):
without worrying.
You can try itertools.islice(iterable[, start], stop[, step])
:
import itertools
for line in itertools.islice(lines, start, stop):
foo(line)
You might build a helper generator:
def rangeit(lst, rng):
for i in rng:
yield lst[i]
for e in rangeit(["A","B","C","D","E","F"], range(2,4)):
print(e)
for fooable in (line for i,line in enumerate(lines) if i >= 2):
foo(fooable)
I prefer to use dropwhile for this. It feels natural after using it in Haskell and some other languages, and seems reasonably clear. You can also use it in many other cases where you want to look for a “trigger” condition more complex than the item’s index for the start of the iteration.
from itertools import dropwhile
for item in dropwhile(lambda x: x[0] < 2, enumerate(lst)):
# ... do something with item
Although itertools.islice
appears to be the optimal solution for this problem, somehow, the extra import just seems like overkill for something so simple.
Personally, I find the enumerate
solution perfectly readable and succinct – although I would prefer to write it like this:
for index, line in enumerate(lines):
if index >= 2:
foo(line)
The original solution is, in most cases, the appropriate one.
for line in lines[2:]:
foo(line)
While this does copy the list, it is only a shallow copy, and is quite quick. Don’t worry about optimizing until you have profiled the code and found this to be a bottleneck.
why not just try
for i in range(2, len(lines)):
foo(lines[i])
This gives you the index just fine(without doing any copying) and, while you don’t get to give the element a proper name(like in enumerate), it strikes me as simple and pythonic. xrange, aside from being deprecated, is less efficient than range since it has to construct the integer each iteration.
I want to iterate over everything in a list except the first few elements, e.g.:
for line in lines[2:]:
foo(line)
This is concise, but copies the whole list, which is unnecessary. I could do:
del lines[0:2]
for line in lines:
foo(line)
But this modifies the list, which isn’t always good.
I can do this:
for i in xrange(2, len(lines)):
line = lines[i]
foo(line)
But, that’s just gross.
Better might be this:
for i,line in enumerate(lines):
if i < 2: continue
foo(line)
But it isn’t quite as obvious as the very first example.
So: What’s a way to do it that is as obvious as the first example, but doesn’t copy the list unnecessarily?
def skip_heading( iterable, items ):
the_iter= iter( iterable ):
for i, e in enumerate(the_iter):
if i == items: break
for e in the_iter:
yield e
Now you can for i in skip_heading( lines, 2 ):
without worrying.
You can try itertools.islice(iterable[, start], stop[, step])
:
import itertools
for line in itertools.islice(lines, start, stop):
foo(line)
You might build a helper generator:
def rangeit(lst, rng):
for i in rng:
yield lst[i]
for e in rangeit(["A","B","C","D","E","F"], range(2,4)):
print(e)
for fooable in (line for i,line in enumerate(lines) if i >= 2):
foo(fooable)
I prefer to use dropwhile for this. It feels natural after using it in Haskell and some other languages, and seems reasonably clear. You can also use it in many other cases where you want to look for a “trigger” condition more complex than the item’s index for the start of the iteration.
from itertools import dropwhile
for item in dropwhile(lambda x: x[0] < 2, enumerate(lst)):
# ... do something with item
Although itertools.islice
appears to be the optimal solution for this problem, somehow, the extra import just seems like overkill for something so simple.
Personally, I find the enumerate
solution perfectly readable and succinct – although I would prefer to write it like this:
for index, line in enumerate(lines):
if index >= 2:
foo(line)
The original solution is, in most cases, the appropriate one.
for line in lines[2:]:
foo(line)
While this does copy the list, it is only a shallow copy, and is quite quick. Don’t worry about optimizing until you have profiled the code and found this to be a bottleneck.
why not just try
for i in range(2, len(lines)):
foo(lines[i])
This gives you the index just fine(without doing any copying) and, while you don’t get to give the element a proper name(like in enumerate), it strikes me as simple and pythonic. xrange, aside from being deprecated, is less efficient than range since it has to construct the integer each iteration.