Concatenation of many lists in Python

Question:

Suppose I have a function like this:

def getNeighbors(vertex)

which returns a list of vertices that are neighbors of the given vertex. Now I want to create a list with all the neighbors of the neighbors. I do that like this:

listOfNeighborsNeighbors = []
for neighborVertex in getNeighbors(vertex):
    listOfNeighborsNeighbors.append(getNeighbors(neighborVertex))

Is there a more pythonic way to do that?

Asked By: Björn Pollex

Answers:

[x for n in getNeighbors(vertex) for x in getNeighbors(n)]

or

sum((getNeighbors(n) for n in getNeighbors(vertex)), [])

Appending lists can be done with + and sum():

>>> c = [[1, 2], [3, 4]]
>>> sum(c, [])
[1, 2, 3, 4]
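To make the two one-liners concrete, here is a minimal sketch on a toy graph. The GRAPH dict and this body of getNeighbors are assumptions made up for illustration; the question does not show how the graph is stored.

```python
# Hypothetical adjacency-list graph; not from the question.
GRAPH = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}

def getNeighbors(vertex):
    """Return the list of vertices adjacent to `vertex`."""
    return GRAPH[vertex]

vertex = "a"

# Nested comprehension: outer loop over neighbors, inner loop over their neighbors.
flat = [x for n in getNeighbors(vertex) for x in getNeighbors(n)]

# Same result via sum(); the generator expression needs its own parentheses
# because it is not the only argument.
flat_sum = sum((getNeighbors(n) for n in getNeighbors(vertex)), [])

print(flat)      # ['a', 'd', 'a']
print(flat_sum)  # ['a', 'd', 'a']
```

Note that duplicates (and the starting vertex itself) are kept; wrap the result in set(...) if that is not wanted.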
Answered By: Sjoerd

As usual, the itertools module contains a solution:

>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> l3 = [7, 8, 9]
>>> import itertools
>>> list(itertools.chain(l1, l2, l3))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Answered By: Jochen

If speed matters, it may be better to use this:

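from functools import reduce  # reduce is a builtin only on Python 2
from operator import iadd
# Start from [] so the first neighbor list is not extended in place.
reduce(iadd, (getNeighbors(n) for n in getNeighbors(vertex)), [])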

The point of this code is that it concatenates whole lists with list.extend, whereas a list comprehension adds items one by one, as if calling list.append. That saves a bit of overhead, making the former (according to my measurements) about three times faster. (The iadd operator is normally written as += and, for lists, does the same thing as list.extend.)
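As a small sketch of what iadd does to a list (the variable names here are made up for illustration), the snippet below shows that it extends the left operand in place, and why passing [] as the initial value to reduce is a reasonable precaution:

```python
from functools import reduce  # reduce is not a builtin on Python 3
from operator import iadd

a = [1, 2]
b = iadd(a, [3, 4])   # same effect as: a += [3, 4]  or  a.extend([3, 4])
print(b is a)         # True: the left operand is extended in place
print(a)              # [1, 2, 3, 4]

parts = [[1, 2], [3, 4], [5, 6]]
# With [] as the initial value, reduce accumulates into a fresh list,
# so parts[0] is not modified as a side effect.
flat = reduce(iadd, parts, [])
print(flat)           # [1, 2, 3, 4, 5, 6]
print(parts[0])       # [1, 2]
```

Without the initial [], the first list returned by getNeighbors would itself be extended, which matters if that list is part of the graph's own storage.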

Using a list comprehension (the first solution, by Ignacio) is still usually the right way, because it is easier to read.

But definitely avoid using sum(..., []), because it runs in quadratic time. That is very impractical for many lists (more than a hundred or so).
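A rough way to see where the quadratic cost comes from: each + inside sum() builds a brand-new list and copies everything accumulated so far. The sketch below only counts element copies under that simplified model (it ignores allocator details), and the function names are invented for this example.

```python
def copies_made_by_sum(num_lists, list_len):
    """Count elements copied when sum(lists, []) builds a new list per step."""
    copied = 0
    result_len = 0
    for _ in range(num_lists):
        # acc + next_list allocates a new list and copies both operands into it.
        result_len += list_len
        copied += result_len
    return copied

def copies_made_by_extend(num_lists, list_len):
    """extend() appends in place, so each element is copied once."""
    return num_lists * list_len

print(copies_made_by_sum(1000, 2))     # 1001000
print(copies_made_by_extend(1000, 2))  # 2000
```

Doubling the number of lists roughly quadruples the first count but only doubles the second.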

Answered By: emu

I like the itertools.chain approach because it runs in linear time (sum(…) runs in quadratic time), but @Jochen didn’t show how to deal with a dynamic number of lists. Here is a solution for the OP’s question.

import itertools
list(itertools.chain(*[getNeighbors(n) for n in getNeighbors(vertex)]))

You can drop the list(...) call if an iterable is sufficient for you.
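If only an iterable is needed, one possible variant is itertools.chain.from_iterable fed by a generator expression, which avoids both the * unpacking and the intermediate list of lists. The GRAPH dict and the getNeighbors body below are hypothetical stand-ins, since the question does not show them.

```python
import itertools

# Hypothetical graph and helper, for illustration only.
GRAPH = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2]}

def getNeighbors(vertex):
    return GRAPH[vertex]

vertex = 1

# from_iterable takes a single iterable of iterables and consumes it lazily.
lazy = itertools.chain.from_iterable(
    getNeighbors(n) for n in getNeighbors(vertex)
)

for v in lazy:   # iterate directly, without calling list(...)
    print(v)     # prints 1, 4, 1 on separate lines
```

Keep in mind that the chain object is a one-shot iterator: once the loop finishes it is exhausted.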

Answered By: renadeen

Quickest to slowest:

list_of_lists = [[x,1] for x in range(1000)]

%timeit list(itertools.chain.from_iterable(list_of_lists))
30 µs ± 320 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit list(itertools.chain(*list_of_lists))
33.4 µs ± 761 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

min(timeit.repeat("ll=[];\nfor l in list_of_lists:\n ll.extend(l)", "list_of_lists=[[x,1] for x in range(1000)]",repeat=3, number=100))/100.0
4.1411130223423245e-05

%timeit [y for z in list_of_lists for y in z]
53.9 µs ± 156 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit sum(list_of_lists, [])
1.5 ms ± 10.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

(Python 3.7.10)

Python2:

list_of_lists = [[x,1] for x in xrange(1000)]

%timeit list(itertools.chain(*list_of_lists))
100000 loops, best of 3: 14.6 µs per loop

%timeit list(itertools.chain.from_iterable(list_of_lists))
10000 loops, best of 3: 60.2 µs per loop

min(timeit.repeat("ll=[];\nfor l in list_of_lists:\n ll.extend(l)", "list_of_lists=[[x,1] for x in xrange(1000)]",repeat=3, number=100))/100.0
9.620904922485351e-05

%timeit [y for z in list_of_lists for y in z]
10000 loops, best of 3: 108 µs per loop

%timeit sum(list_of_lists, [])
100 loops, best of 3: 3.7 ms per loop
Answered By: Yariv

Using .extend() (update in place) combined with reduce instead of sum() (new object each time) should be more efficient; however, I’m too lazy to test that 🙂

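from functools import reduce  # needed on Python 3
mylist = [[1,2], [3,4], [5,6]]
# [] as the initial value keeps mylist[0] from being extended in place
reduce(lambda acc_l, sl: acc_l.extend(sl) or acc_l, mylist, [])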
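For anyone who wants to check the claim, a minimal timing sketch could look like the following. The function names and the test data are assumptions chosen for illustration, and the numbers will vary by machine and Python version, so no results are quoted here.

```python
import timeit
from functools import reduce

list_of_lists = [[x, 1] for x in range(1000)]

def with_sum():
    # Builds a new list on every addition.
    return sum(list_of_lists, [])

def with_reduce_extend():
    # extend() returns None, so "or acc" hands the accumulator back to reduce.
    # [] as the initial value keeps list_of_lists[0] untouched.
    return reduce(lambda acc, sub: acc.extend(sub) or acc, list_of_lists, [])

# Both approaches should produce the same flattened list.
assert with_sum() == with_reduce_extend()

for fn in (with_sum, with_reduce_extend):
    best = min(timeit.repeat(fn, repeat=3, number=100)) / 100
    print(f"{fn.__name__}: {best * 1e6:.1f} µs per call")
```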
Answered By: realmaniek