Mean of Sampleset and powered Sampleset
Question:
I am working on an ICA implementation which is based on the assumption that all source signals are independent. So I looked into the basic concepts of dependence vs. correlation and tried to reproduce this example on sample data:
from numpy import *
from numpy.random import *
k = 1000
s = 10000
mn = 0
mnPow = 0
for i in arange(1,k):
    a = randn(s)
    a = a-mean(a)
    mn = mn + mean(a)
    mnPow = mnPow + mean(a**3)
print "Mean X: ", mn/k
print "Mean X^3: ", mnPow/k
But I couldn't reproduce the last step of this example, E(X^3) = 0:
>> Mean X: -1.11174580826e-18
>> Mean X^3: -0.00125229267144
I would consider the first value to be zero, but the second value is too large, isn't it? Since I subtract the mean of a, I expected the mean of a^3 to be zero as well. Does the problem lie in
- the random number generator,
- the precision of the numerical values
- in my misunderstanding of the concepts of mean and expected value?
Answers:
Probably just rounding error?
Have you tried computing the covariances and averaging those?
The sample mean itself is a random variable. While its expected value here is zero, the particular realizations will fluctuate around that expected value.
When I run the following many times:
from numpy import *
from numpy.random import *
k = 1000
s = 10000
mn = 0
mnPow = 0
for i in arange(k):
    a = randn(s)
    mn += mean(a)
    mnPow += mean(a**3)
print "Mean X: ", mn/k
print "Mean X^3: ", mnPow/k
I get numbers for both means that fluctuate around zero.
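To put a number on the size of that fluctuation: for a standard normal X, Var(X^3) = E[X^6] - E[X^3]^2 = 15, so the average of k*s cubed samples should have a standard deviation of about sqrt(15/(k*s)), roughly 0.0012 for k = 1000 and s = 10000. That is the same magnitude as the -0.00125 reported in the question. Here is a minimal sketch of that check, written for Python 3; the function name and the seed are my own choices, not part of the original code:

```python
import numpy as np

def expected_std_of_mean_cube(k, s):
    """Theoretical std of the mean of k*s cubed standard-normal samples."""
    # For X ~ N(0, 1): Var(X^3) = E[X^6] - E[X^3]^2 = 15 - 0
    return np.sqrt(15.0 / (k * s))

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
k, s = 1000, 10000

# Average the per-batch sample means of X^3, as in the loop above
mean_cube = np.mean([np.mean(rng.standard_normal(s) ** 3) for _ in range(k)])
sigma = expected_std_of_mean_cube(k, s)

print("Mean X^3:    ", mean_cube)
print("Expected std:", sigma)
```

If the printed mean is within a few multiples of the expected standard deviation, it is consistent with E(X^3) = 0.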
EDIT:
If you plot the density of these means, they themselves look Gaussian.
Note that I've removed a = a-mean(a) from your code since it's erroneous. With it, mn accumulates mean(a - mean(a)), which is mathematically zero due to linearity of expectation:
E[x - E[x]] = E[x] - E[E[x]] = E[x] - E[x] = 0
The only reason it comes out as slightly non-zero is due to rounding errors.
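To see the difference between the two effects side by side, here is a small sketch (Python 3, with an arbitrary seed and variable names of my own). Centering forces the first sample moment to zero by construction, so what remains is rounding error, whereas the third sample moment of the centered data is not constrained and still shows ordinary sampling fluctuation:

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for repeatability
a = rng.standard_normal(10000)
centered = a - np.mean(a)

# Zero by construction; anything left over is floating-point rounding
mean_centered = np.mean(centered)

# Not forced to zero; fluctuates from sample to sample
mean_cube = np.mean(centered ** 3)

print("mean(a - mean(a)):     ", mean_centered)
print("mean((a - mean(a))**3):", mean_cube)
```

The first number should come out many orders of magnitude smaller than the second.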