Is there a zip-like function that pads to longest length?
Question:
Is there a built-in function that works like zip() but that will pad the results so that the length of the resultant list is the length of the longest input rather than the shortest input?
>>> a = ['a1']
>>> b = ['b1', 'b2', 'b3']
>>> c = ['c1', 'c2']
>>> zip(a, b, c)
[('a1', 'b1', 'c1')]
>>> What command goes here?
[('a1', 'b1', 'c1'), (None, 'b2', 'c2'), (None, 'b3', None)]
Answers:
For Python 2.6+ use the itertools module's izip_longest. For Python 3 use zip_longest instead (no leading i).
>>> import itertools
>>> list(itertools.izip_longest(a, b, c))
[('a1', 'b1', 'c1'), (None, 'b2', 'c2'), (None, 'b3', None)]
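If the same code has to run on both Python 2 and Python 3, one common pattern is to try the Python 3 name first and fall back to the Python 2 one. This is only a sketch of that idea, not something the answer above prescribes:

```python
# Try the Python 3 name first; fall back to the Python 2 name.
try:
    from itertools import zip_longest
except ImportError:  # Python 2.6 / 2.7
    from itertools import izip_longest as zip_longest

a = ['a1']
b = ['b1', 'b2', 'b3']
c = ['c1', 'c2']

padded = list(zip_longest(a, b, c))
print(padded)
# [('a1', 'b1', 'c1'), (None, 'b2', 'c2'), (None, 'b3', None)]
```

After this import the rest of the code can call zip_longest under a single name on either version.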
In Python 3 you can use itertools.zip_longest:
>>> import itertools
>>> list(itertools.zip_longest(a, b, c))
[('a1', 'b1', 'c1'), (None, 'b2', 'c2'), (None, 'b3', None)]
You can pad with a value other than None by using the fillvalue parameter:
>>> list(itertools.zip_longest(a, b, c, fillvalue='foo'))
[('a1', 'b1', 'c1'), ('foo', 'b2', 'c2'), ('foo', 'b3', 'foo')]
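As an illustration of where a non-None fillvalue is handy, here is a small sketch that sums the columns of rows with uneven lengths. The rows data and the column_sums name are made up for this example:

```python
from itertools import zip_longest

# Hypothetical sample data: rows of different lengths.
rows = [[1, 2, 3], [10, 20], [100]]

# Pad missing entries with 0 so every column can be summed safely.
column_sums = [sum(col) for col in zip_longest(*rows, fillvalue=0)]
print(column_sums)  # [111, 22, 3]
```

With the default fillvalue of None, sum() would fail on the padded columns, so choosing a neutral fill value avoids a separate clean-up step.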
With Python 2 you can either use itertools.izip_longest (Python 2.6+), or you can use map with None. This is a little-known feature of map (but map changed in Python 3.x, so this only works in Python 2.x).
>>> map(None, a, b, c)
[('a1', 'b1', 'c1'), (None, 'b2', 'c2'), (None, 'b3', None)]
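To make the Python 3 caveat concrete, the following sketch shows what happens if the same call is tried there. map() is lazy in Python 3 and no longer treats None as an identity function, so the failure only appears once the result is consumed. The lazy and failed names exist only for this demonstration:

```python
a = ['a1']
b = ['b1', 'b2', 'b3']

# Creating the map object succeeds because nothing has been called yet.
lazy = map(None, a, b)

# Consuming it tries to call None(...) and raises TypeError.
try:
    list(lazy)
    failed = False
except TypeError:
    failed = True

print(failed)  # True
```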
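A non-itertools Python 3 solution:
def zip_longest(*lists):
    def g(l):
        # Yield the items of l, then pad with None forever.
        for item in l:
            yield item
        while True:
            yield None
    gens = [g(l) for l in lists]
    # Stop after as many rows as the longest input has items.
    for _ in range(max(map(len, lists))):
        yield tuple(next(g) for g in gens)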
My non-itertools Python 2 solution, which pads the shorter of two lists in place:
if len(list1) < len(list2):
    list1.extend([None] * (len(list2) - len(list1)))
else:
    list2.extend([None] * (len(list1) - len(list2)))
I'm using a 2D array, but the concept is similar, using Python 2.x:
if len(set([len(p) for p in printer])) > 1:
    printer = [column+['']*(max([len(p) for p in printer])-len(column)) for column in printer]
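The same padding idea can be wrapped in a small function that returns new lists instead of rebinding the original. This is a sketch for Python 3; pad_columns and its fill parameter are hypothetical names chosen for this example:

```python
def pad_columns(columns, fill=''):
    """Return new lists, each padded with `fill` to the longest length."""
    # default=0 keeps max() from failing when there are no columns at all.
    width = max((len(col) for col in columns), default=0)
    return [col + [fill] * (width - len(col)) for col in columns]

printer = [['a', 'b', 'c'], ['d'], ['e', 'f']]
padded = pad_columns(printer)
print(padded)
# [['a', 'b', 'c'], ['d', '', ''], ['e', 'f', '']]
```

Computing the width once also avoids recalculating the maximum for every column, which the one-line version does.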
In addition to the accepted answer, if you're working with iterables that might be different lengths but shouldn't be, it's recommended to pass strict=True to zip() (supported since Python 3.10).
To quote the documentation:
zip() is often used in cases where the iterables are assumed to be of equal length. In such cases, it's recommended to use the strict=True option. Its output is the same as regular zip():
>>> list(zip(('a', 'b', 'c'), (1, 2, 3), strict=True))
[('a', 1), ('b', 2), ('c', 3)]
Unlike the default behavior, it checks that the lengths of iterables are identical, raising a ValueError if they aren't:
>>> list(zip(range(3), ['fee', 'fi', 'fo', 'fum'], strict=True))
Traceback (most recent call last):
...
ValueError: zip() argument 2 is longer than argument 1
Without the strict=True argument, any bug that results in iterables of different lengths will be silenced, possibly manifesting as a hard-to-find bug in another part of the program.
To add to the answers already given, the following works for any iterable and does not use itertools, answering @ProdIssue's question:
def zip_longest(*iterables, default_value):
    iterators = tuple(iter(i) for i in iterables)
    sentinel = object()
    while True:
        new = tuple(next(i, sentinel) for i in iterators)
        if all(n is sentinel for n in new):
            return
        yield tuple(default_value if n is sentinel else n for n in new)
The use of sentinel is needed so that an iterator yielding default_value will not be erroneously identified as empty.
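A short demonstration of why the sentinel matters: the function is repeated here so the snippet runs on its own, and it is fed a list that legitimately contains None together with a generator. The countdown helper is hypothetical and exists only for this example:

```python
def zip_longest(*iterables, default_value):
    iterators = tuple(iter(i) for i in iterables)
    sentinel = object()  # unique marker that no input can ever yield
    while True:
        new = tuple(next(i, sentinel) for i in iterators)
        if all(n is sentinel for n in new):
            return
        yield tuple(default_value if n is sentinel else n for n in new)

def countdown(n):
    """Hypothetical generator with no len(): yields n, n-1, ..., 1."""
    while n:
        yield n
        n -= 1

result = list(zip_longest([None, None, None], countdown(2), default_value=None))
print(result)
# [(None, 2), (None, 1), (None, None)]
```

If exhaustion were detected by comparing against None instead of a private sentinel, the third row would look like "every input is empty" and be dropped, giving two rows instead of three.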
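Just use iterators, nothing fancy.
def zip_longest(*iterables):
    # Find the longest input; this requires every input to support len().
    items = 0
    for iterable in iterables:
        items = max(items, len(iterable))
    iters = [iter(iterable) for iterable in iterables]
    # Emit one padded tuple per position of the longest input.
    while items:
        yield (*[next(i, None) for i in iters],)
        items -= 1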