Calculating the average of a column in a list in Python

Question:

I have to calculate the average of each column in a nested list and then save the averages into a new list. So far, I have read the original data into a nested list and transposed it so that I can read the columns. I am just not sure how to code the average.

#First open the data text file
import re

f = open('C:\Python27\Fake1.txt', 'r')

#Convert to a nested list
nestedlist = []
q = f.read()
f.close()
numbers = re.split('\n', q) #Splits the \n and \t out of the list
newlist = []
for row in numbers:
   newlist.append(row.split('\t'))


#Reading in columns
def mytranspose(nestedlist):
   list_prime = []
   for i in range(len(nestedlist[0])):
          list_prime.append([])
   for row in nestedlist:
          for i in range(len(row)):
                 list_prime[i].append(row[i])
   return(list_prime)
print (mytranspose(newlist))


#Average of Columns
def myaverage(nestedlist):
   avg_list = []
   a = 0
   avg = 0
   for i in newlist:
          a = sum(newlist[i])
          avg = a/len(row)
          avg_list.append(avg[i])
   return(list_prime)

print(myaverage(newlist))
Asked By: lilceex15

Answers:

Here is a simpler way to do the whole thing:

with open('C:\Python27\Fake1.txt', 'r') as f:
    data = [map(float, line.split()) for line in f]

num_rows = len(data)
num_cols = len(data[0])

totals = num_cols * [0.0]
for line in data:
    for index in xrange(num_cols):
        totals[index] += line[index]

averages = [total / num_rows for total in totals]
print averages
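
For readers on Python 3, where map returns an iterator, xrange no longer exists and print is a function, the same loop might be sketched as below. The column_averages name and the inline sample lines are assumptions for illustration, standing in for the contents of the data file.

```python
def column_averages(rows):
    """Return the mean of each column in a rectangular list of rows."""
    num_rows = len(rows)
    num_cols = len(rows[0])
    totals = [0.0] * num_cols
    for row in rows:
        for index in range(num_cols):
            totals[index] += row[index]
    return [total / num_rows for total in totals]


# Hypothetical sample standing in for the lines of the data file.
lines = ["1 2 3", "10 20 30", "100 200 300"]
data = [[float(value) for value in line.split()] for line in lines]
print(column_averages(data))
```

This assumes every row has the same number of columns and that there is at least one row; a blank line in the file would need to be skipped before conversion.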

I would, however, recommend using NumPy for this sort of thing, since it makes the task trivial (and much faster). Here mean(0) averages along axis 0, which gives one value per column:

import numpy as np
data = np.loadtxt('C:\Python27\Fake1.txt')
print data.mean(0)
Answered By: Trevor

Say you have your list of lists

table = [[1, 2, 3],  [10, 20, 30], [100, 200, 300]]

You can transpose it with zip, unpacking the original list of lists into separate arguments (that is what the asterisk does):

transposed = zip(*table)
# [(1, 10, 100), (2, 20, 200), (3, 30, 300)]

To get the sum of each column, apply sum to every entry with the map function:

sums = map(sum, transposed)
# [111, 222, 333]
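
Under Python 3, zip and map return lazy iterators rather than lists, so the results only appear in list form once they are materialised. A minimal sketch, assuming Python 3 and the same table:

```python
table = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]

# zip(*table) yields one tuple per column; list() materialises the iterator.
transposed = list(zip(*table))
print(transposed)  # [(1, 10, 100), (2, 20, 200), (3, 30, 300)]

# map is lazy as well, so wrap it in list() to see the sums.
sums = list(map(sum, transposed))
print(sums)  # [111, 222, 333]
```

Materialising transposed also matters because an iterator can only be consumed once; reusing a bare zip object for both the sums and the averages would leave the second pass empty.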

Since the average is the sum divided by the length, we can do this with a function:

def avg(items):
    return float(sum(items)) / len(items)

Or you could do this in a lambda:

avg = lambda items: float(sum(items)) / len(items)

And use this in place of sum:

averages = map(avg, transposed)

You could put this all together into one expression like this:

table = [[1, 2, 3],  [10, 20, 30], [100, 200, 300]]
averages = map(lambda items: float(sum(items)) / len(items), zip(*table))

But that’s a little unreadable, so it’s generally clearer to break it up:

table = [[1, 2, 3],  [10, 20, 30], [100, 200, 300]]
transposed = zip(*table)
avg = lambda items: float(sum(items)) / len(items)
averages = map(avg, transposed)
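
On Python 3.4 or later, the standard library's statistics.mean could stand in for the hand-written avg. Note that this swaps in a different helper from the one used in this answer, and the list comprehension is just one way to write it:

```python
from statistics import mean

table = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]

# zip(*table) groups the values column by column;
# mean() then averages each column in turn.
averages = [mean(column) for column in zip(*table)]
print(averages)
```

statistics.mean raises StatisticsError on an empty column, whereas the avg function would raise ZeroDivisionError instead.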
Answered By: quornian