Python: Variance of a list of defined numbers

Question:

I am trying to make a function that prints the variance of a list of defined numbers:

grades = [100, 100, 90, 40, 80, 100, 85, 70, 90, 65, 90, 85, 50.5]

So far, I have written these three functions:

def grades_sum(my_list):
    total = 0
    for grade in my_list: 
        total += grade
    return total

def grades_average(my_list):
    sum_of_grades = grades_sum(my_list)
    average = sum_of_grades / len(my_list)
    return average

def grades_variance(my_list, average):
    variance = 0
    for i in my_list:
        variance += (average - my_list[i]) ** 2
    return variance / len(my_list)

When I try to execute the code, however, it gives me the following error at the following line:

Line: variance += (average - my_list[i]) ** 2
Error: list index out of range

Apologies if my Python knowledge is limited; I am still learning. If you wish to help solve this issue, please try not to suggest extremely complicated solutions. Thank you very much.

Asked By: GiamPy

||

Answers:

When you say

 for i in my_list:

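i isn’t the index of the item; i is the item itself. The first item is 100, so my_list[i] tries to look up my_list[100], which is out of range for a 13-item list. Use i directly: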

for i in my_list:
    variance += (average - i) ** 2
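
To make the difference concrete, here is a minimal sketch of the corrected function next to enumerate(), which is the usual tool when both the position and the value are needed. The short list named small is made up for illustration and is not from the question.

```python
def grades_variance(my_list, average):
    # Iterating over a list yields its values, so each value is used directly.
    variance = 0
    for grade in my_list:
        variance += (average - grade) ** 2
    return variance / len(my_list)

small = [2, 4, 6]  # hypothetical data; its mean is 4
result = grades_variance(small, 4)
print(result)  # roughly 2.667, i.e. (4 + 0 + 4) / 3

# enumerate() yields (index, value) pairs when the position is needed too.
pairs = list(enumerate(small))
print(pairs)  # [(0, 2), (1, 4), (2, 6)]
```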
Answered By: John La Rooy

While gnibbler has solved the problem with your code, you can achieve this much more easily using built-in functions and a generator expression:

average = sum(grades) / len(grades)
variance = sum((average - value) ** 2 for value in grades) / len(grades)

It might look a little scary at first, but if you watch the video I link about list comprehensions and generator expressions, you will see they are actually really simple and useful.
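
As a rough illustration of what the generator expression does, the sketch below computes the variance twice: once with the generator expression and once with the explicit loop it stands in for. The names variance_gen and variance_loop are only for this comparison.

```python
grades = [100, 100, 90, 40, 80, 100, 85, 70, 90, 65, 90, 85, 50.5]
average = sum(grades) / len(grades)

# Generator expression: squared deviations are produced one at a time
# and fed straight into sum(), without building a temporary list.
variance_gen = sum((average - value) ** 2 for value in grades) / len(grades)

# The same computation written as an explicit loop.
total = 0
for value in grades:
    total += (average - value) ** 2
variance_loop = total / len(grades)

# The two results agree up to floating-point rounding.
print(variance_gen, variance_loop)
```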

Answered By: Gareth Latty

First, I would suggest using Python’s built-in sum function to replace your first custom function. grades_average then becomes:

def grades_average(my_list):
    sum_of_grades = sum(my_list)
    average = sum_of_grades / len(my_list)
    return average

Second, I would strongly recommend looking into the NumPy library, as it has these functions built in. numpy.mean() and numpy.var() would cover both these cases (numpy.std() returns the standard deviation, which is the square root of the variance).

If you’re interested in writing the code for yourself first, that’s totally fine too. As for your specific error, I believe @gnibbler above nailed it. If you want to loop using an index, you can restructure the line in grades_variance to be:

for i in range(0, len(my_list)):

As Lattyware noted, looping by index is not particularly "Pythonic"; the way you’re currently doing it is generally superior. This is just for your reference.
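
For reference, a minimal sketch of the whole function rewritten with an index loop might look like this. The name grades_variance_by_index is made up here to keep it apart from the original.

```python
def grades_variance_by_index(my_list, average):
    variance = 0
    # range(len(my_list)) yields valid positions 0, 1, ..., len(my_list) - 1,
    # so my_list[i] can never go out of range.
    for i in range(len(my_list)):
        variance += (average - my_list[i]) ** 2
    return variance / len(my_list)

grades = [100, 100, 90, 40, 80, 100, 85, 70, 90, 65, 90, 85, 50.5]
average = sum(grades) / len(grades)
by_index = grades_variance_by_index(grades, average)
print(by_index)  # about 334.07
```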

Answered By: Magsol

Try NumPy. By default, np.var computes the population variance (it divides by n).

import numpy as np
variance = np.var(grades)
Answered By: robinfang

Python 3.4 and later include a statistics module in the standard library, which does this.

import statistics
grades = [100, 100, 90, 40, 80, 100, 85, 70, 90, 65, 90, 85, 50.5]
print(statistics.pvariance(grades))
# => 334.07100591715977

https://docs.python.org/3/library/statistics.html#statistics.pvariance
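
Note that the same module also offers statistics.variance, which computes the sample variance. A small sketch of the difference, using the grades from the question (the variable names population and sample are just for illustration):

```python
import statistics

grades = [100, 100, 90, 40, 80, 100, 85, 70, 90, 65, 90, 85, 50.5]

# Population variance: sum of squared deviations divided by n.
population = statistics.pvariance(grades)

# Sample variance: the same sum divided by n - 1 instead.
sample = statistics.variance(grades)

print(population)
print(sample)  # larger than the population variance by a factor of n / (n - 1)
```

Which one is appropriate depends on whether the list is the whole population or a sample drawn from it.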

Answered By: zengr

The code below computes the average of the values:

def grades_average(my_list):
    sum_of_grades = sum(my_list)
    average = sum_of_grades / len(my_list)
    return average

Variance is the average of the squared differences from the mean.
The code below computes the variance of the values:

def grades_variance(my_list, average):
    variance = 0
    for i in my_list:
        variance += (average - i) ** 2
    return variance / len(my_list)
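
To see the formula at work, here is a small worked example that goes step by step. The list named data is made up so that the numbers come out round.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, not the grades

# Step 1: the mean.
mean = sum(data) / len(data)

# Step 2: the difference of each value from the mean.
deviations = [x - mean for x in data]

# Step 3: square each difference.
squared = [d ** 2 for d in deviations]

# Step 4: average the squared differences.
variance = sum(squared) / len(squared)

print(mean)      # 5.0
print(squared)   # [9.0, 1.0, 1.0, 1.0, 0.0, 0.0, 4.0, 16.0]
print(variance)  # 4.0
```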
Answered By: Bharatwaja

I suppose you would like the sample variance, i.e. the unbiased estimator of the variance, which divides by n - 1 instead of n. I think this function might do the job. It prints the sample variance and the mean of a list n.

n = [5, 3, 1, 2, 4]

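def variance1337(n):
    var1 = []
    mean1 = sum(n)/len(n)
    # Collect the squared deviation of each value from the mean
    for xs in n:
        var1.append((xs - mean1) ** 2)
    # Dividing by len(n) - 1 gives the sample variance
    print(sum(var1)/(len(n) - 1))
    print(mean1)

variance1337(n)  # prints 2.5, then 3.0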
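
If the values are needed later rather than only printed, a variant that returns them may be handier. This is just a sketch; sample_variance is a made-up name, and it assumes the list has at least two values (otherwise the division by n - 1 fails). The result is checked against statistics.variance from the standard library.

```python
import statistics

def sample_variance(values):
    # Hypothetical helper: returns (sample variance, mean) instead of printing.
    mean = sum(values) / len(values)
    squared = [(x - mean) ** 2 for x in values]
    return sum(squared) / (len(values) - 1), mean

n = [5, 3, 1, 2, 4]
var, mean = sample_variance(n)
print(var, mean)                # 2.5 3.0
print(statistics.variance(n))   # the standard library's sample variance
```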
Answered By: Don Juan

The code below computes the variance using a custom function:

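def variance(val):
    total_sum = sum(val)
    average = total_sum / len(val)
    # Collect the squared deviation of each value from the average
    a = []
    for i in val:
        a.append((i - average) ** 2)
    # Population variance: the mean of the squared deviations
    return sum(a) / len(a)

val = [2.18, 2.22, 2.24, 1.62, 1.32, 1.85, 1.85, 2.70, 3.60, 4.60, 1.38, 2.34, 2.71]
print(variance(val))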
Answered By: TRINADH NAGUBADI