averaging list of lists python column-wise

Question:

I have a list of lists, something like:

data = [[240, 240, 239],
        [250, 249, 237], 
        [242, 239, 237],
        [240, 234, 233]]

And I want to average this out like

[average_column_1, average_column_2, average_column_3]

My current code is not very elegant.
It is the naive approach: going through the list, keeping the sums in a separate container, and then dividing by the number of elements.

I think there is a pythonic way to do this.
Any suggestions?
Thanks

Asked By: frazman

||

Answers:

Use zip(), like so:

averages = [sum(col) / float(len(col)) for col in zip(*data)]

zip() takes multiple iterable arguments and yields tuples that group the i-th element of each iterable, stopping as soon as the shortest iterable is exhausted. Unpacking the rows with *data passes each row as a separate argument, so in effect zip() performs a transpose operation, akin to matrices.

>>> data = [[240, 240, 239],
...         [250, 249, 237], 
...         [242, 239, 237],
...         [240, 234, 233]]

>>> [list(col) for col in zip(*data)]
[[240, 250, 242, 240],
 [240, 249, 239, 234],
 [239, 237, 237, 233]]

By performing sum() on each of those slices, you effectively get the column-wise sum. Simply divide by the length of the column to get the mean.
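To make the two steps concrete, here is a minimal sketch (assuming Python 3, where / is true division) that computes the column sums first and then the means. The names columns and column_sums are only illustrative and are not part of the original answer.

```python
data = [[240, 240, 239],
        [250, 249, 237],
        [242, 239, 237],
        [240, 234, 233]]

# zip(*data) yields one tuple per column.
columns = list(zip(*data))

# Step 1: sum each column.
column_sums = [sum(col) for col in columns]

# Step 2: divide each sum by the number of values in that column.
averages = [total / len(col) for total, col in zip(column_sums, columns)]

print(column_sums)  # [972, 962, 946]
print(averages)     # [243.0, 240.5, 236.5]
```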

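Side point: In Python 2.x, dividing one integer by another floors the result by default, which is why float() is called to “promote” one operand to a floating point type. In Python 3, / always performs true division, so the float() call is harmless but unnecessary.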
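A small illustration of the difference, written for Python 3: the // operator stands in for what / did with two integers on Python 2.x. The numbers are the sum and length of the second column from the question; the variable names are just for this sketch.

```python
total, count = 962, 4  # sum and length of the second column

floored = total // count         # floor division: what Python 2's / did for two ints
true_div = total / count         # true division (Python 3 behaviour of /)
promoted = total / float(count)  # the float() form, same result on 2.x and 3.x

print(floored)   # 240
print(true_div)  # 240.5
print(promoted)  # 240.5
```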

Answered By: voithos

Pure Python:

from __future__ import division
def mean(a):
    return sum(a) / len(a)
a = [[240, 240, 239],
     [250, 249, 237], 
     [242, 239, 237],
     [240, 234, 233]]
print(list(map(mean, zip(*a))))  # runs on both Python 2 and Python 3

printing

[243.0, 240.5, 236.5]

NumPy:

import numpy

a = numpy.array([[240, 240, 239],
                 [250, 249, 237],
                 [242, 239, 237],
                 [240, 234, 233]])
# axis=0 averages down the rows, giving one mean per column
print(numpy.mean(a, axis=0))

Python 3:

from statistics import mean
a = [[240, 240, 239],
     [250, 249, 237], 
     [242, 239, 237],
     [240, 234, 233]]
print(*map(mean, zip(*a)))

Answered By: Sven Marnach

data = [[240, 240, 239],
        [250, 249, 237], 
        [242, 239, 237],
        [240, 234, 233]]
avg = [float(sum(col))/len(col) for col in zip(*data)]
# [243.0, 240.5, 236.5]

This works because zip(*data) groups the values by column. The float() call is only necessary on Python 2.x, which uses integer division unless from __future__ import division is used.
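One caveat, shown as a sketch rather than as part of the original answer: zip() stops at the shortest row, so if the rows have different lengths the trailing values are silently dropped. The helper column_means below is hypothetical; it simply checks the row lengths before averaging (Python 3 assumed).

```python
def column_means(rows):
    """Hypothetical helper: column-wise means that rejects ragged input."""
    rows = list(rows)
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError("rows have differing lengths: %s" % sorted(widths))
    return [sum(col) / len(col) for col in zip(*rows)]

# With plain zip(), the 3 in the first row is ignored without any warning.
ragged = [[1, 2, 3], [4, 5]]
truncated = [sum(col) / len(col) for col in zip(*ragged)]
print(truncated)  # [2.5, 3.5]

# The checked version raises instead of truncating.
try:
    column_means(ragged)
except ValueError as exc:
    print("rejected:", exc)
```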

Answered By: Andrew Clark

import numpy as np

data = [[240, 240, 239],
        [250, 249, 237], 
        [242, 239, 237],
        [240, 234, 233]]

np.mean(data, axis=0)
# array([ 243. ,  240.5,  236.5])

Seems to work.

Answered By: Oren

You can use map and zip:

list(map(lambda x: sum(x)/len(x), zip(*data)))
# [243.0, 240.5, 236.5]

Answered By: Nicolas Gervais