averaging list of lists python column-wise

Question

I have a list of lists, something like:

data = [[240, 240, 239],
        [250, 249, 237], 
        [242, 239, 237],
        [240, 234, 233]]

I want to average each column, to get a result like:

[average_column_1, average_column_2, average_column_3]

My current code is not very elegant.
It is the naive approach: go through the list, keep the sums in a separate container, and then divide by the number of elements.

I think there is a pythonic way to do this.
Any suggestions?
Thanks

Asked By: frazman


Answer 1

Use zip(), like so:

averages = [sum(col) / float(len(col)) for col in zip(*data)]

zip() takes multiple iterable arguments and returns tuples holding the i-th element of each one, stopping as soon as the shortest iterable is exhausted. With the rows unpacked via *data, the effect is a transpose operation, as with matrices.

>>> data = [[240, 240, 239],
...         [250, 249, 237], 
...         [242, 239, 237],
...         [240, 234, 233]]

>>> [list(col) for col in zip(*data)]
[[240, 250, 242, 240],
 [240, 249, 239, 234],
 [239, 237, 237, 233]]

By performing sum() on each of those slices, you effectively get the column-wise sum. Simply divide by the length of the column to get the mean.

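Side point: In Python 2.x, dividing two integers with / performs floor division and discards the fractional part, which is why float() is called to promote one operand to a floating-point type. In Python 3, / always performs true division, so the float() call is unnecessary there.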
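As a minimal sketch of the points above, assuming Python 3 (where / is true division), the following shows the column tuples that zip(*data) produces, the resulting averages, and how zip() truncates to the shortest row. The `ragged` list is made up purely for illustration.

```python
data = [[240, 240, 239],
        [250, 249, 237],
        [242, 239, 237],
        [240, 234, 233]]

# zip(*data) yields one tuple per column (a transpose of the rows).
columns = list(zip(*data))
print(columns[0])  # (240, 250, 242, 240)

# In Python 3, / is true division, so no float() call is needed.
averages = [sum(col) / len(col) for col in columns]
print(averages)  # [243.0, 240.5, 236.5]

# zip() stops at the shortest row, so extra values are silently dropped.
ragged = [[1, 2, 3], [4, 5]]
truncated = list(zip(*ragged))
print(truncated)  # [(1, 4), (2, 5)]
```

Note that the trailing 3 in the first row of `ragged` never appears in the output, so rows of unequal length give averages over fewer columns without any error.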

Answered By: voithos

Answer 2

Pure Python (Python 2):

from __future__ import division
def mean(a):
    return sum(a) / len(a)
a = [[240, 240, 239],
     [250, 249, 237], 
     [242, 239, 237],
     [240, 234, 233]]
print map(mean, zip(*a))

printing

[243.0, 240.5, 236.5]

NumPy:

import numpy

a = numpy.array([[240, 240, 239],
                 [250, 249, 237],
                 [242, 239, 237],
                 [240, 234, 233]])
# axis=0 averages down each column
print(numpy.mean(a, axis=0))

Python 3:

from statistics import mean
a = [[240, 240, 239],
     [250, 249, 237], 
     [242, 239, 237],
     [240, 234, 233]]
print(*map(mean, zip(*a)))
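If a plain list of floats is wanted rather than printed values, one possible variation uses statistics.fmean, which always returns a float. This is a sketch that assumes Python 3.8 or later, where fmean was added; the name `column_means` is chosen here only for illustration.

```python
from statistics import fmean  # fmean was added in Python 3.8

a = [[240, 240, 239],
     [250, 249, 237],
     [242, 239, 237],
     [240, 234, 233]]

# fmean converts its input to floats and always returns a float.
column_means = [fmean(col) for col in zip(*a)]
print(column_means)  # [243.0, 240.5, 236.5]
```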

Answered By: Sven Marnach

Answer 3

data = [[240, 240, 239],
        [250, 249, 237], 
        [242, 239, 237],
        [240, 234, 233]]
avg = [float(sum(col))/len(col) for col in zip(*data)]
# [243.0, 240.5, 236.5]

This works because zip(*data) groups the values by column. The float() call is only necessary on Python 2.x, which uses integer division unless from __future__ import division is used.
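To make the division point concrete, here is a small sketch run under Python 3, using the second column of the data. The // operator gives the floored result that Python 2's / would produce for two integers, while / gives the true quotient.

```python
col = (240, 249, 239, 234)   # second column of data
total = sum(col)             # 962

floored = total // len(col)          # floor division, like Python 2's int / int
true_quotient = total / len(col)     # true division in Python 3
promoted = float(total) / len(col)   # works the same way on Python 2 and 3

print(floored)        # 240
print(true_quotient)  # 240.5
print(promoted)       # 240.5
```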

Answered By: Andrew Clark

Answer 4

import numpy as np

data = [[240, 240, 239],
        [250, 249, 237], 
        [242, 239, 237],
        [240, 234, 233]]

np.mean(data, axis=0)
# array([ 243. ,  240.5,  236.5])

np.mean accepts the nested list directly, and axis=0 averages down each column.

Answered By: Oren

Answer 5

You can use map and zip:

list(map(lambda x: sum(x)/len(x), zip(*data)))

[243.0, 240.5, 236.5]
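Because zip() silently truncates to the shortest row, it may be worth wrapping the one-liner in a function that checks row lengths first. The following is only a sketch, assuming Python 3; `column_means` is a hypothetical helper name, not part of any library.

```python
def column_means(rows):
    """Return the mean of each column (hypothetical helper, for illustration)."""
    rows = list(rows)
    if not rows:
        return []
    width = len(rows[0])
    # zip() would drop values from longer rows without warning, so check first.
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    return [sum(col) / len(col) for col in zip(*rows)]


data = [[240, 240, 239],
        [250, 249, 237],
        [242, 239, 237],
        [240, 234, 233]]
result = column_means(data)
print(result)  # [243.0, 240.5, 236.5]
```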

Answered By: Nicolas Gervais
