Mean of Sampleset and powered Sampleset

Question:

I am working on an ICA implementation which is based on the assumption that all source signals are independent. So I looked into the basic concepts of dependence vs. correlation and tried to reproduce this example on sample data:

from numpy import *
from numpy.random import *
k =  1000
s = 10000
mn = 0
mnPow = 0
for i in arange(k):  # k iterations, so that dividing by k below gives a true average
    a = randn(s)
    a = a-mean(a)
    mn = mn + mean(a)
    mnPow = mnPow + mean(a**3)
print("Mean X: ", mn/k)
print("Mean X^3: ", mnPow/k)

But I couldn’t reproduce the last step of this example, E(X^3) = 0:

>> Mean X:  -1.11174580826e-18
>> Mean X^3:  -0.00125229267144

I would consider the first value to be zero, but the second value seems too large, doesn’t it? Since I subtract the mean of a, I expected the mean of a^3 to be zero as well. Does the problem lie in

  1. the random number generator,
  2. the precision of the numerical values, or
  3. my misunderstanding of the concepts of mean and expected value?
Asked By: Milla Well


Answers:

Probably just rounding error?

Have you tried computing the covariances and averaging those?

Answered By: Ruben

The sample mean itself is a random variable. While its expected value here is zero, the particular realizations will fluctuate around that expected value.

When I run the following many times:

from numpy import *
from numpy.random import *
k =  1000
s = 10000
mn = 0
mnPow = 0
for i in arange(k):
    a = randn(s)
    mn += mean(a)
    mnPow += mean(a**3)
print("Mean X: ", mn/k)
print("Mean X^3: ", mnPow/k)

I get numbers for both means that fluctuate around zero.
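To put a number on "fluctuate around zero", here is a rough sketch that compares the observed spread of mean(a**3) with what theory suggests. It relies on the fact that for a standard normal X, Var(X^3) = E[X^6] - E[X^3]^2 = 15. The use of default_rng, the seed, and the variable names are my own assumptions and not part of the code above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
k = 1000     # number of repetitions, as in the question
s = 10000    # samples per repetition, as in the question

# One value of mean(a**3) per repetition (no centering, as in the code above).
cube_means = np.array([np.mean(rng.standard_normal(s) ** 3) for _ in range(k)])

# For a standard normal X: Var(X^3) = E[X^6] - E[X^3]^2 = 15 - 0 = 15.
predicted_std_single = np.sqrt(15.0 / s)          # spread of a single mean(a**3)
predicted_std_average = np.sqrt(15.0 / (k * s))   # spread of the average over k runs

print("observed std of mean(a**3):   ", cube_means.std(ddof=1))
print("predicted std of mean(a**3):  ", predicted_std_single)
print("average over k runs:          ", cube_means.mean())
print("predicted std of that average:", predicted_std_average)
```

With k = 1000 and s = 10000 the predicted standard deviation of the final average is sqrt(15 / 10^7), roughly 0.0012. The value of about -0.00125 reported in the question is of the same order of magnitude, so it looks like ordinary sampling noise rather than a problem with the generator or with precision.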

EDIT:

If you plot the density of these sample means, the distribution itself looks Gaussian:
[Image: density plot of the sample means]

Note that I’ve removed a = a-mean(a) from your code since it’s erroneous. With it, mn accumulates mean(a - mean(a)), which is mathematically zero because the sample mean is linear in the same way that expectation is:

E[x - E[x]] = E[x] - E[E[x]] = E[x] - E[x] = 0

It comes out slightly non-zero only because of floating-point rounding errors.
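The difference between the two quantities can be seen on a single sample. The following is a minimal sketch (generator, seed and variable names are assumptions of mine): centering forces the mean of the centered data to zero up to rounding, but it does nothing of the sort for the mean of the cubes, because cubing is not a linear operation.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for repeatability
a = rng.standard_normal(10000)
centered = a - np.mean(a)

# Linear in the data: exactly zero in theory, tiny rounding residue in practice.
mean_centered = np.mean(centered)

# Not linear in the data: centering does not force this to zero,
# so what remains is a genuine sampling fluctuation.
mean_cubed = np.mean(centered ** 3)

print("mean(centered):   ", mean_centered)
print("mean(centered**3):", mean_cubed)
```

The first number should sit near machine precision, while the second is many orders of magnitude larger and changes from one sample to the next.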

Answered By: NPE