Calculating the average of a column in a list in Python
Question:
I have to calculate the average of each column in a specific nested list and then save the averages into a new list. So far, I have read the original data into a nested list and transposed it to get at the columns. I am just not sure how to code the average.
#First open the data text file
import re
f = open('C:\Python27\Fake1.txt', 'r')

#Convert to a nested list
nestedlist = []
q = f.read()
f.close()
numbers = re.split('\n', q) #Splits the \n and \t out of the list
newlist = []
for row in numbers:
    newlist.append(row.split('\t'))

#Reading in columns
def mytranspose(nestedlist):
    list_prime = []
    for i in range(len(nestedlist[0])):
        list_prime.append([])
    for row in nestedlist:
        for i in range(len(row)):
            list_prime[i].append(row[i])
    return(list_prime)
print (mytranspose(newlist))

#Average of Columns
def myaverage(nestedlist):
    avg_list = []
    a = 0
    avg = 0
    for i in newlist:
        a = sum(newlist[i])
        avg = a/len(row)
        avg_list.append(avg[i])
    return(list_prime)
print(myaverage(newlist))
Answers:
Here is a simpler way to do the whole thing:
with open(r'C:\Python27\Fake1.txt', 'r') as f:
    # Parse each non-blank line into a list of floats
    data = [[float(x) for x in line.split()] for line in f if line.strip()]
num_rows = len(data)
num_cols = len(data[0])
totals = num_cols * [0.0]
# Accumulate one running total per column
for line in data:
    for index in range(num_cols):
        totals[index] += line[index]
averages = [total / num_rows for total in totals]
print(averages)
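To try this approach without the file on disk, the same logic can be wrapped in a function and fed an in-memory sample. This is only a sketch: the name `column_averages` and the sample values are made up for illustration, and it assumes every non-blank row has the same number of whitespace-separated numeric fields.

```python
import io


def column_averages(lines):
    """Average each column of whitespace-separated numeric text lines."""
    # Parse every non-blank line into a list of floats
    data = [[float(x) for x in line.split()] for line in lines if line.strip()]
    if not data:
        return []
    num_rows = len(data)
    num_cols = len(data[0])
    # Accumulate one running total per column
    totals = [0.0] * num_cols
    for row in data:
        for index in range(num_cols):
            totals[index] += row[index]
    return [total / num_rows for total in totals]


# Hypothetical tab-separated sample standing in for the real file
sample = io.StringIO("1\t2\t3\n10\t20\t30\n100\t200\t300\n")
print(column_averages(sample))  # [37.0, 74.0, 111.0]
```

Because `split()` with no argument splits on any whitespace, the same function handles tab-separated and space-separated files alike.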
I would, however, recommend using NumPy for this sort of thing, since it makes the task trivial (and much faster):
import numpy as np
data = np.loadtxt(r'C:\Python27\Fake1.txt')
# axis=0 averages down the rows, giving one mean per column
print(data.mean(axis=0))
Say you have your list of lists:
table = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
You can transpose it with zip by passing the rows of the original list as separate arguments (which is what the asterisk does):
transposed = list(zip(*table))  # list() is only needed on Python 3
# [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
To get the sum of each column, apply sum to every entry with the built-in map function:
sums = list(map(sum, transposed))
# [111, 222, 333]
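One caveat if you run this on Python 3 (the question targets Python 2.7): there, zip and map return lazy, single-use iterators rather than lists, so the results need to be wrapped in list() before they can be printed or reused. A minimal sketch of the difference, with `lazy`, `first_pass` and `second_pass` as purely illustrative names:

```python
table = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]

# list() materializes the iterators so they can be printed and reused
transposed = list(zip(*table))
sums = list(map(sum, transposed))
print(transposed)  # [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
print(sums)        # [111, 222, 333]

# Without list(), the zip object is exhausted after a single pass
lazy = zip(*table)
first_pass = list(map(sum, lazy))
second_pass = list(map(sum, lazy))  # nothing left to iterate over
print(first_pass, second_pass)  # [111, 222, 333] []
```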
Since the average is the sum divided by the length, we can do this with a function:
def avg(items):
    return float(sum(items)) / len(items)
Or you could do this in a lambda:
avg = lambda items: float(sum(items)) / len(items)
And use this in place of sum:
averages = list(map(avg, transposed))
You could put this all together into one expression like this:
table = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
averages = list(map(lambda items: float(sum(items)) / len(items), zip(*table)))
But that’s a little unreadable, so it’s generally clearer to break it up:
table = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
transposed = zip(*table)
avg = lambda items: float(sum(items)) / len(items)
averages = list(map(avg, transposed))  # [37.0, 74.0, 111.0]
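Coming back to the code in the question: the cells read from the file are strings, so they need converting to float before they can be summed, and myaverage should loop over the columns rather than the rows. Here is a sketch that stays close to the original structure. The sample `newlist` is made up, and rows are assumed to be non-empty and of equal length:

```python
def mytranspose(nestedlist):
    """Turn a list of rows into a list of columns."""
    list_prime = [[] for _ in range(len(nestedlist[0]))]
    for row in nestedlist:
        for i in range(len(row)):
            list_prime[i].append(row[i])
    return list_prime


def myaverage(nestedlist):
    """Return the average of each column of a nested list of numeric strings."""
    avg_list = []
    for column in mytranspose(nestedlist):
        values = [float(cell) for cell in column]  # cells arrive as strings
        avg_list.append(sum(values) / len(values))
    return avg_list


# Hypothetical rows, shaped like the result of row.split('\t') in the question
newlist = [['1', '2', '3'], ['10', '20', '30'], ['100', '200', '300']]
print(myaverage(newlist))  # [37.0, 74.0, 111.0]
```

Note that splitting the file contents on '\n' usually leaves an empty last row when the file ends with a newline, so such rows would need to be filtered out before calling myaverage.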