itertools.ifilter Vs. filter Vs. list comprehensions
Question:
I am trying to become more familiar with the itertools module and have found a function called ifilter.
From what I understand, it filters an iterable based on the given function and returns an iterator over a list containing the elements of the iterable on which the function evaluates to True.
Question 1: is my understanding thus far correct?
Question 2: aside from the fact that this returns an iterator, how is it different from the built-in filter function?
Question 3: Which is faster?
From what I can tell, ifilter is not. Am I missing something? (I ran the following test.)
>>> import itertools
>>> itertools.ifilter(lambda x: x%2, range(5))
<itertools.ifilter object at 0x7fb1a101b210>
>>> for i in itertools.ifilter(lambda x: x%2, range(5)): print i
...
1
3
>>> filter(lambda x: x%2, range(5))
[1, 3]
>>> function = lambda x: x%2
>>> [item for item in range(5) if function(item)]
[1, 3]
Answers:
ifilter returns an iterator, not a list.
Iterators produce their items on the fly as they are needed, instead of allocating the entire list first. That’s the only difference between ifilter and filter.
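A minimal sketch of that on-demand behaviour, written for Python 3, where the built-in filter already returns an iterator the way itertools.ifilter did in Python 2. The names `seen` and `is_odd` are illustrative only and do not come from the answer above.

```python
seen = []  # records every value the predicate is asked about

def is_odd(x):
    seen.append(x)
    return x % 2 == 1

lazy = filter(is_odd, range(5))   # nothing has been tested yet
assert seen == []

first = next(lazy)                # pulls items only until the first match
print(first, seen)                # 1 [0, 1]

rest = list(lazy)                 # draining the iterator tests the remainder
print(rest, seen)                 # [3] [0, 1, 2, 3, 4]
```

The predicate runs only when a value is requested, which is the behaviour the answer describes for ifilter.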
Here, you can see the difference:
filter(function, iterable): Construct a list from those elements of iterable for which function returns true.
itertools.ifilter(predicate, iterable): Make an iterator that filters elements from iterable returning only those for which the predicate is True.
This means that to obtain the ‘ifiltered’ items you must iterate over the returned iterator, whereas filter returns all the elements in a list, with no iteration needed.
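One practical consequence is that an iterator can be consumed only once, while a list can be reused. The sketch below assumes Python 3, so the lazy built-in filter stands in for ifilter and list(filter(...)) stands in for the Python 2 filter; `is_odd` is an illustrative helper.

```python
def is_odd(x):
    return x % 2 == 1

as_iterator = filter(is_odd, range(5))      # lazy, like ifilter in Python 2
as_list = list(filter(is_odd, range(5)))    # eager, like Python 2's filter

first_pass = list(as_iterator)
second_pass = list(as_iterator)   # the iterator is already exhausted

print(first_pass, second_pass)    # [1, 3] []
print(as_list, as_list)           # the list can be read again: [1, 3] [1, 3]
```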
Your understanding is correct: the only difference is that ifilter returns an iterator, while using filter is like calling:
list(ifilter(...))
You may also be interested in what PEP 289 says about filter and ifilter:
List comprehensions greatly reduced the need for filter() and map(). Likewise, generator expressions are expected to minimize the need for itertools.ifilter() and itertools.imap(). […]
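To make the parallel in that quote concrete, here is a small sketch assuming Python 3 (lazy built-in filter in place of ifilter). A list comprehension plays the eager role and a generator expression plays the lazy role; `is_odd` and `data` are illustrative names.

```python
def is_odd(x):
    return x % 2 == 1

data = range(10)

# Eager forms: both build the whole list at once.
eager_filter = list(filter(is_odd, data))
eager_comprehension = [x for x in data if is_odd(x)]

# Lazy forms: both yield matches one at a time.
lazy_filter = filter(is_odd, data)
lazy_genexp = (x for x in data if is_odd(x))

assert eager_filter == eager_comprehension == [1, 3, 5, 7, 9]
assert list(lazy_filter) == list(lazy_genexp) == eager_comprehension
```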
Also note that ifilter became filter in Python 3 (hence it was removed from itertools).
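If code has to run on both versions, one common approach is a guarded import. This is only a sketch; binding the local name `ifilter` to the built-in is a choice made here for illustration.

```python
try:
    from itertools import ifilter   # Python 2
except ImportError:
    ifilter = filter                # Python 3: the built-in is already lazy

odds = ifilter(lambda x: x % 2, range(5))
print(list(odds))                   # [1, 3]
```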
The example below uses a number generator that prints a message immediately before yielding each number. It shows how filter() consumes the whole generator and builds the result list before the loop sees a single item, whereas itertools.ifilter filters as it goes, never building a list. If you’re filtering 500,000 items, you want ifilter, so you’re not building a list.
# Python 2
import itertools

def number_generator():
    for i in range(0, 3):
        print "yield", i
        yield i
    print "stopping"

function = lambda x: x > 0

# Lazy: each number is filtered as soon as it is yielded.
numbers = number_generator()
print "itertools.ifilter:"
for n in itertools.ifilter(function, numbers):
    print n

# Eager: the generator is drained before the loop body runs.
print "filter:"
numbers = number_generator()
for n in filter(function, numbers):
    print n
Output:
itertools.ifilter:
yield 0
yield 1
1
yield 2
2
stopping
filter:
yield 0
yield 1
yield 2
stopping
1
2
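The same point matters when only a few matches are needed from a large input. The sketch below assumes Python 3 (lazy built-in filter in place of ifilter) and uses itertools.islice to stop early; `counter` and `is_positive_multiple_of_1000` are illustrative names.

```python
import itertools

counter = {"tested": 0}  # how many items the predicate has examined

def is_positive_multiple_of_1000(n):
    counter["tested"] += 1
    return n > 0 and n % 1000 == 0

big_input = range(500000)   # lazy in Python 3, so no list is built here
matches = filter(is_positive_multiple_of_1000, big_input)

# Take only the first three matches; the rest of the input is never touched.
first_three = list(itertools.islice(matches, 3))
print(first_three)          # [1000, 2000, 3000]
print(counter["tested"])    # 3001
```

Only the values 0 through 3000 are tested. An eager filter would have examined all 500,000 items before returning anything.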