Vectorized implementation for `numpy.random.multivariate_normal`
Question:
I am trying to use numpy.random.multivariate_normal
to generate multiple samples where each sample is drawn from a multivariate Normal distribution with a different mean
and cov
. For example, if I would like to draw 2 samples, I tried
from numpy import random as rand
means = np.array([[-1., 0.], [1., 0.]])
covs = np.array([np.identity(2) for k in xrange(2)])
rand.multivariate_normal(means, covs)
but this results in ValueError: mean must be 1 dimensional
. Do I have to do a for loop for this? I thought that for functions like rand.binomial
this is possible.
Answers:
As @hpaulj suggested, you can generate samples from the standard multivariate normal distribution, and then use, say, einsum
and/or broadcasting to transform the samples. The scaling is done by multiplying the standard sample points by the square root of the covariance matrix. In the following, I use scipy.linalg.sqrtm
to compute the matrix square root, and numpy.einsum
to do the matrix multiplication.
import numpy as np
from scipy.linalg import sqrtm
import matplotlib.pyplot as plt
# Sequence of means
means = np.array([[-15., 0.], [15., 0.], [0., 0.]])
# Sequence of covariance matrices. Must be the same length as means.
covs = np.array([[[ 3, -1],
[-1, 2]],
[[ 1, 2],
[ 2, 5]],
[[ 1, 0],
[ 0, 1]]])
# Number of samples to generate for each (mean, cov) pair.
nsamples = 4000
# Compute the matrix square root of each covariance matrix.
sqrtcovs = np.array([sqrtm(c) for c in covs])
# Generate samples from the standard multivariate normal distribution.
dim = len(means[0])
u = np.random.multivariate_normal(np.zeros(dim), np.eye(dim),
size=(len(means), nsamples,))
# u has shape (len(means), nsamples, dim)
# Transform u.
v = np.einsum('ijk,ikl->ijl', u, sqrtcovs)
m = np.expand_dims(means, 1)
t = v + m
# t also has shape (len(means), nsamples, dim).
# t[i] holds the nsamples sampled from the distribution with mean means[i]
# and covariance cov[i].
plt.subplot(2, 1, 1)
plt.plot(t[...,0].ravel(), t[...,1].ravel(), '.', alpha=0.02)
plt.axis('equal')
plt.xlim(-25, 25)
plt.ylim(-8, 8)
plt.grid()
# Make another plot, where we generate the samples by passing the given
# means and covs to np.random.multivariate_normal. This plot should look
# the same as the first plot.
plt.subplot(2, 1, 2)
p0 = np.random.multivariate_normal(means[0], covs[0], size=nsamples)
p1 = np.random.multivariate_normal(means[1], covs[1], size=nsamples)
p2 = np.random.multivariate_normal(means[2], covs[2], size=nsamples)
plt.plot(p0[:,0], p0[:,1], 'b.', alpha=0.02)
plt.plot(p1[:,0], p1[:,1], 'g.', alpha=0.02)
plt.plot(p2[:,0], p2[:,1], 'r.', alpha=0.02)
plt.axis('equal')
plt.xlim(-25, 25)
plt.ylim(-8, 8)
plt.grid()
This method might not be any faster that looping over the means
and covs
arrays and calling multivariate_normal
once for each pair (mean, cov). The case where this method would give the most benefit is when you have many different means and covariances and are generating a small number of samples per pair. And even then, it might not be faster, because the script uses a Python loop over the covs
array to call sqrtm
for each covariance matrix. If performance is critical, test with your actual data.
Since I did not find the answer anywhere and I needed just one calculation of pdf(X)
for each mean, std
pair.
I vectorized directly from the formula (so it only works for pdf
(but other functions can be written similarly):
lpi = (2*np.pi)**3
def vectorized_normal_pdf(X, means, stds):
ndev = (X - means)/stds
exp = (ndev[:,None,:] @ (X - means)[:,:,None]).squeeze()
return np.exp(-0.5*exp)/np.sqrt(lpi*stds.prod(axis=1))
where,
all X
, means
, stds
are of shape [N, d]
(N multivariate data points with d values each)
and the output is [N]
I have verified that it gives the correct answer (within a small error of 1e-14, I don’t know why it is not equal, maybe they are using some numerical stability thing by adding a small epsilon in division)
and is much faster (with size of just 10^4 I got ~4300x speedup):
X = np.random.rand(10000, 3)
means = np.random.rand(10000, 3)
stds = np.random.rand(10000, 3)
>>> %timeit norm_pdf(X, means, stds)
684 µs ± 12.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit [multivariate_normal(means[i], stds[i]).pdf(X[i]) for i in range(10000)]
2.94 s ± 207 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> ( (res1 - res2) < 1e-14 ).all()
True
This is crucial for applications like Gaussian Mixture Models applied to images, as we need multiple Gaussians for each pixel process, so for a very small/low-res 240*320 image it is 76800 Gaussians.
Note that this does not handle covariance matrices (yet) but usually you can just use stds instead of the whole matrix
I am trying to use numpy.random.multivariate_normal
to generate multiple samples where each sample is drawn from a multivariate Normal distribution with a different mean
and cov
. For example, if I would like to draw 2 samples, I tried
from numpy import random as rand
means = np.array([[-1., 0.], [1., 0.]])
covs = np.array([np.identity(2) for k in xrange(2)])
rand.multivariate_normal(means, covs)
but this results in ValueError: mean must be 1 dimensional
. Do I have to do a for loop for this? I thought that for functions like rand.binomial
this is possible.
As @hpaulj suggested, you can generate samples from the standard multivariate normal distribution, and then use, say, einsum
and/or broadcasting to transform the samples. The scaling is done by multiplying the standard sample points by the square root of the covariance matrix. In the following, I use scipy.linalg.sqrtm
to compute the matrix square root, and numpy.einsum
to do the matrix multiplication.
import numpy as np
from scipy.linalg import sqrtm
import matplotlib.pyplot as plt
# Sequence of means
means = np.array([[-15., 0.], [15., 0.], [0., 0.]])
# Sequence of covariance matrices. Must be the same length as means.
covs = np.array([[[ 3, -1],
[-1, 2]],
[[ 1, 2],
[ 2, 5]],
[[ 1, 0],
[ 0, 1]]])
# Number of samples to generate for each (mean, cov) pair.
nsamples = 4000
# Compute the matrix square root of each covariance matrix.
sqrtcovs = np.array([sqrtm(c) for c in covs])
# Generate samples from the standard multivariate normal distribution.
dim = len(means[0])
u = np.random.multivariate_normal(np.zeros(dim), np.eye(dim),
size=(len(means), nsamples,))
# u has shape (len(means), nsamples, dim)
# Transform u.
v = np.einsum('ijk,ikl->ijl', u, sqrtcovs)
m = np.expand_dims(means, 1)
t = v + m
# t also has shape (len(means), nsamples, dim).
# t[i] holds the nsamples sampled from the distribution with mean means[i]
# and covariance cov[i].
plt.subplot(2, 1, 1)
plt.plot(t[...,0].ravel(), t[...,1].ravel(), '.', alpha=0.02)
plt.axis('equal')
plt.xlim(-25, 25)
plt.ylim(-8, 8)
plt.grid()
# Make another plot, where we generate the samples by passing the given
# means and covs to np.random.multivariate_normal. This plot should look
# the same as the first plot.
plt.subplot(2, 1, 2)
p0 = np.random.multivariate_normal(means[0], covs[0], size=nsamples)
p1 = np.random.multivariate_normal(means[1], covs[1], size=nsamples)
p2 = np.random.multivariate_normal(means[2], covs[2], size=nsamples)
plt.plot(p0[:,0], p0[:,1], 'b.', alpha=0.02)
plt.plot(p1[:,0], p1[:,1], 'g.', alpha=0.02)
plt.plot(p2[:,0], p2[:,1], 'r.', alpha=0.02)
plt.axis('equal')
plt.xlim(-25, 25)
plt.ylim(-8, 8)
plt.grid()
This method might not be any faster that looping over the means
and covs
arrays and calling multivariate_normal
once for each pair (mean, cov). The case where this method would give the most benefit is when you have many different means and covariances and are generating a small number of samples per pair. And even then, it might not be faster, because the script uses a Python loop over the covs
array to call sqrtm
for each covariance matrix. If performance is critical, test with your actual data.
Since I did not find the answer anywhere and I needed just one calculation of pdf(X)
for each mean, std
pair.
I vectorized directly from the formula (so it only works for pdf
(but other functions can be written similarly):
lpi = (2*np.pi)**3
def vectorized_normal_pdf(X, means, stds):
ndev = (X - means)/stds
exp = (ndev[:,None,:] @ (X - means)[:,:,None]).squeeze()
return np.exp(-0.5*exp)/np.sqrt(lpi*stds.prod(axis=1))
where,
all X
, means
, stds
are of shape [N, d]
(N multivariate data points with d values each)
and the output is [N]
I have verified that it gives the correct answer (within a small error of 1e-14, I don’t know why it is not equal, maybe they are using some numerical stability thing by adding a small epsilon in division)
and is much faster (with size of just 10^4 I got ~4300x speedup):
X = np.random.rand(10000, 3)
means = np.random.rand(10000, 3)
stds = np.random.rand(10000, 3)
>>> %timeit norm_pdf(X, means, stds)
684 µs ± 12.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit [multivariate_normal(means[i], stds[i]).pdf(X[i]) for i in range(10000)]
2.94 s ± 207 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> ( (res1 - res2) < 1e-14 ).all()
True
This is crucial for applications like Gaussian Mixture Models applied to images, as we need multiple Gaussians for each pixel process, so for a very small/low-res 240*320 image it is 76800 Gaussians.
Note that this does not handle covariance matrices (yet) but usually you can just use stds instead of the whole matrix