Concatenation of many lists in Python
Question:
Suppose I have a function like this:
def getNeighbors(vertex):
which returns a list of vertices that are neighbors of the given vertex. Now I want to create a list with all the neighbors of the neighbors. I do that like this:
listOfNeighborsNeighbors = []
for neighborVertex in getNeighbors(vertex):
    listOfNeighborsNeighbors.extend(getNeighbors(neighborVertex))
Is there a more pythonic way to do that?
Answers:
[x for n in getNeighbors(vertex) for x in getNeighbors(n)]
or
sum((getNeighbors(n) for n in getNeighbors(vertex)), [])
Appending lists can be done with + and sum():
>>> c = [[1, 2], [3, 4]]
>>> sum(c, [])
[1, 2, 3, 4]
As usual, the itertools module contains a solution:
>>> l1=[1, 2, 3]
>>> l2=[4, 5, 6]
>>> l3=[7, 8, 9]
>>> import itertools
>>> list(itertools.chain(l1, l2, l3))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
If speed matters, it may be better to use this:
from functools import reduce  # needed on Python 3
from operator import iadd

reduce(iadd, (getNeighbors(n) for n in getNeighbors(vertex)))
The point of this code is that it concatenates whole lists via list.extend, where a list comprehension would add items one by one, as if calling list.append. That saves a bit of overhead, making the former (according to my measurements) about three times faster. (The iadd operator is normally written as += and does the same thing as list.extend.)
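As a minimal sketch of the reduce(iadd, ...) idea (with made-up input lists), note one caveat: iadd mutates its left operand, so the first inner list ends up changed in place:

```python
from functools import reduce
from operator import iadd

lists = [[1, 2], [3, 4], [5, 6]]

# reduce(iadd, lists) repeatedly does acc += next_list, i.e. list.extend,
# so the accumulator is lists[0] itself, mutated in place.
flat = reduce(iadd, lists)
print(flat)      # [1, 2, 3, 4, 5, 6]
print(lists[0])  # [1, 2, 3, 4, 5, 6] -- the first input list was extended
```

If you need the inputs untouched, seed the reduce with a fresh list: reduce(iadd, lists, []).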
Using list comprehensions (the first solution, by Ignacio) is still usually the right way: it is easier to read. But definitely avoid sum(..., []), because it runs in quadratic time. That is very impractical for many lists (more than a hundred or so).
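To see where the quadratic time comes from: each + builds a brand-new list holding everything accumulated so far. A rough sketch (my own illustration, not Python's actual implementation) that counts element copies:

```python
def sum_flatten(lists):
    """Mimic sum(lists, []) while counting how many elements get copied."""
    acc = []
    copies = 0
    for l in lists:
        acc = acc + l       # allocates a fresh list every iteration
        copies += len(acc)  # every element accumulated so far is copied again
    return acc, copies

# 100 lists of 10 items each: only 1000 elements in the result...
flat, copies = sum_flatten([[0] * 10] * 100)
print(len(flat), copies)  # 1000 50500 -- ~50x more copying than elements
```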
I like the itertools.chain approach because it runs in linear time (sum(...) runs in quadratic time), but @Jochen didn't show how to deal with a dynamic number of lists. Here is a solution for the OP's question.
import itertools
list(itertools.chain(*[getNeighbors(n) for n in getNeighbors(vertex)]))
You can drop the list(...) call if an iterable is sufficient for you.
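itertools.chain.from_iterable also fits the OP's setting and avoids unpacking the intermediate list of lists into arguments. A sketch with a stub getNeighbors over a made-up adjacency map (hypothetical data, just for illustration):

```python
import itertools

# Hypothetical stand-in for the question's graph.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}

def getNeighbors(vertex):
    return graph[vertex]

# from_iterable consumes the inner generator lazily; nothing is unpacked
# into an argument tuple, unlike chain(*[...]).
flat = list(itertools.chain.from_iterable(
    getNeighbors(n) for n in getNeighbors("a")))
print(flat)  # ['a', 'd', 'a']
```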
Quickest to slowest:
list_of_lists = [[x,1] for x in range(1000)]
%timeit list(itertools.chain.from_iterable(list_of_lists))
30 µs ± 320 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit list(itertools.chain(*list_of_lists))
33.4 µs ± 761 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
min(timeit.repeat("ll=[]\nfor l in list_of_lists:\n    ll.extend(l)", "list_of_lists=[[x,1] for x in range(1000)]", repeat=3, number=100))/100.0
4.1411130223423245e-05
%timeit [y for z in list_of_lists for y in z]
53.9 µs ± 156 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit sum(list_of_lists, [])
1.5 ms ± 10.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
(Python 3.7.10)
Python2:
list_of_lists = [[x,1] for x in xrange(1000)]
%timeit list(itertools.chain(*list_of_lists))
100000 loops, best of 3: 14.6 µs per loop
%timeit list(itertools.chain.from_iterable(list_of_lists))
10000 loops, best of 3: 60.2 µs per loop
min(timeit.repeat("ll=[]\nfor l in list_of_lists:\n    ll.extend(l)", "list_of_lists=[[x,1] for x in xrange(1000)]", repeat=3, number=100))/100.0
9.620904922485351e-05
%timeit [y for z in list_of_lists for y in z]
10000 loops, best of 3: 108 µs per loop
%timeit sum(list_of_lists, [])
100 loops, best of 3: 3.7 ms per loop
Using .extend() (which updates in place), combined with reduce instead of sum() (which creates a new object each time), should be more efficient; however, I'm too lazy to test that 🙂
from functools import reduce  # needed on Python 3

mylist = [[1, 2], [3, 4], [5, 6]]
reduce(lambda acc_l, sl: acc_l.extend(sl) or acc_l, mylist)