How is sum() used to merge lists?

Question:

I am learning about trees in Python, and I was shown these functions for building a tree:

def tree(label, branches=[]):
    return [label] + list(branches)

def label(tree):
    return tree[0]

def branches(tree):
    return tree[1:]

There is a function that can extract all the nodes of a tree into a list:

def all_nodes(tree):
    return [label(tree)] + sum([all_nodes(b) for b in branches(tree)], [])

T = tree(1, [tree(2, [tree(4), tree(5)]), tree(3, [tree(6), tree(7)])])
print(all_nodes(T))
# >>> [1, 2, 4, 5, 3, 6, 7]

You can see that this works well, but I am confused about how sum() is used here.

I know that a list can be added to another list:

print([1] + [2]) # >>> [1, 2]

But I can’t make it work by using sum():

a, b = [1], [2]
print(sum(a, b))
# >>> TypeError: can only concatenate list (not "int") to list
print(sum([a, b]))
# >>> TypeError: unsupported operand type(s) for +: 'int' and 'list'

In the all_nodes function, how does sum() work to merge all the lists?

Asked By: jxie0755

||

Answers:

sum operates on an iterable of elements, such as sum([1, 2, 3]) (producing 6) or sum([ [1], [2] ], []) (producing [1, 2]). There is an optional second argument, the start value. For instance, sum([1, 2, 3], 10) starts the summation at 10, giving 16. start defaults to 0, so if you’re summing non-numeric objects, you have to provide a compatible start value.
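A minimal sketch of the start argument in action, using made-up variable names purely for illustration:

```python
# sum(iterable, start) adds each item of the iterable onto start, left to right.
numbers = [1, 2, 3]
total_default = sum(numbers)        # start defaults to 0
total_from_ten = sum(numbers, 10)   # summation starts at 10

nested = [[1], [2]]
merged = sum(nested, [])            # start is an empty list, so + concatenates

print(total_default, total_from_ten, merged)
```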

When you call sum(a, b), the list a is the iterable of items to add up, and b is the start value. What sum did was to (correctly) iterate through the items of a, adding each one to the start value you provided. The logic is something like this:

result = b
for element in a:
    result = result + element

Thus, the first thing your call tried to do was result = [2] + 1, which raises the TypeError you saw. Remember, the first argument is a sequence of the things you want to add up. The most trivial change (although not the most readable) that would let your attempt work is

sum([a], b)

which produces [2, 1], since b is the starting value.
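To make this concrete, here is a rough pure-Python model of that loop. The helper name my_sum is hypothetical, and the real built-in differs in details, but the behaviour for these cases should match:

```python
def my_sum(iterable, start=0):
    """Rough model of the built-in sum (hypothetical helper, not the real code)."""
    result = start
    for element in iterable:
        result = result + element
    return result

a, b = [1], [2]

# Iterating over a yields the int 1, so the first step is [2] + 1, which fails.
try:
    my_sum(a, b)
    error = None
except TypeError as exc:
    error = exc

fixed = my_sum([a], b)        # [2] + [1]
merged = my_sum([a, b], [])   # [] + [1] + [2]
print(fixed, merged)
```

If you want the lists in their original order, passing both lists in the first argument with an empty list as the start value, as in the last call, is probably the clearer form.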

Does that explain what happened?

Answered By: Prune

The built-in function sum uses the + operator to add up the elements of an iterable. Its second argument is the starting value.

By default the starting value is 0, meaning sum([[1], [2]]) is equivalent to 0 + [1] + [2] which raises a TypeError.

To concatenate lists you want the initial value to be [], an empty list. Then, sum([[1], [2], [3]], []) is equivalent to [] + [1] + [2] + [3] as desired.
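Applied to the question's all_nodes, the same idea can be traced step by step. This sketch reuses the functions from the question; the names per_branch and flattened are only there for illustration:

```python
def tree(label, branches=[]):
    return [label] + list(branches)

def label(tree):
    return tree[0]

def branches(tree):
    return tree[1:]

def all_nodes(tree):
    return [label(tree)] + sum([all_nodes(b) for b in branches(tree)], [])

T = tree(1, [tree(2, [tree(4), tree(5)]), tree(3, [tree(6), tree(7)])])

# The list comprehension yields one flat list of labels per branch of the root.
per_branch = [all_nodes(b) for b in branches(T)]
print(per_branch)

# sum(per_branch, []) then computes [] + [2, 4, 5] + [3, 6, 7].
flattened = sum(per_branch, [])
print([label(T)] + flattened)
```

For a leaf, branches(tree) is empty, so sum([], []) simply returns the start value [] and the result is just [label(tree)].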

Performance

Using sum to concatenate a list of lists is not recommended, because every addition creates a new list and copies everything accumulated so far. Instead, use a solution that traverses all the lists and appends their items to a single new list.

def concat_lists(lists):
    new_list = []
    for l in lists:
        new_list.extend(l)  # add every item of l to new_list in place
    return new_list

Alternatively, use itertools.chain:

from itertools import chain

new_list = list(chain(*lists))
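
As a quick check that these approaches agree, here is a small sketch on made-up sample data. It uses chain.from_iterable, which should behave like chain(*lists) without unpacking the outer list:

```python
from itertools import chain

lists = [[1], [2, 3], [], [4]]   # sample data, made up for illustration

via_sum = sum(lists, [])         # builds a brand-new list at every +

via_extend = []
for sub in lists:
    via_extend.extend(sub)       # grows a single list in place

via_chain = list(chain.from_iterable(lists))   # lazily walks each sublist in turn

print(via_sum == via_extend == via_chain)
```

All three give the same result; the difference is only in how much copying happens along the way.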

Answered By: Olivier Melançon