Discrepancy between log_prob and manual calculation

Question:

I want to define a multivariate normal distribution with mean [1, 1, 1] and a covariance matrix with 0.3 on the diagonal. I then want to calculate the log-likelihood at the data point [2, 3, 4].

Using torch.distributions:

import torch
import torch.distributions as td

input_x = torch.tensor([2., 3., 4.])
loc = torch.ones(3)
scale = torch.eye(3) * 0.3
mvn = td.MultivariateNormal(loc = loc, scale_tril=scale)
mvn.log_prob(input_x)
tensor(-76.9227)

From scratch

By using the formula for the log-likelihood of a multivariate normal,

log p(x) = -log( sqrt( (2π)^k · |Σ| ) ) - (1/2) (x - μ)ᵀ Σ⁻¹ (x - μ),

we obtain:

import numpy as np

first_term = (2 * np.pi * 0.3)**(3)
first_term = -np.log(np.sqrt(first_term))
x_center = input_x - loc
tmp = torch.matmul(x_center, scale.inverse())
tmp = -1/2 * torch.matmul(tmp, x_center)
first_term + tmp 
tensor(-24.2842)
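As an independent sanity check (a sketch in plain NumPy, not part of the original question), the closed-form density evaluated directly reproduces the same value:

```python
import numpy as np

x = np.array([2., 3., 4.])
mu = np.ones(3)
cov = np.eye(3) * 0.3

diff = x - mu
k = x.size
# log N(x; mu, cov) = -1/2 * [ k*log(2*pi) + log|cov| + diff^T cov^{-1} diff ]
log_pdf = -0.5 * (k * np.log(2 * np.pi)
                  + np.log(np.linalg.det(cov))
                  + diff @ np.linalg.inv(cov) @ diff)
print(log_pdf)  # approximately -24.2842
```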

where I used the fact that |Σ| = 0.3³ (the determinant of a diagonal matrix is the product of its diagonal entries), so (2π)³ · |Σ| = (2π · 0.3)³.

My question is – what’s the source of this discrepancy?

Asked By: John


Answers:

You are passing the covariance matrix as scale_tril instead of covariance_matrix. From the docs of PyTorch's MultivariateNormal:

scale_tril (Tensor) – lower-triangular factor of covariance, with positive-valued diagonal

So, replacing scale_tril with covariance_matrix would yield the same results as your manual attempt.

In [1]: mvn = td.MultivariateNormal(loc = loc, covariance_matrix=scale)
In [2]: mvn.log_prob(input_x)
Out[2]: tensor(-24.2842)
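To see where the original -76.9227 came from: interpreting scale as a Cholesky factor means the effective covariance is scale @ scale.T = diag(0.09), not diag(0.3). A quick sketch (assuming the same loc, scale, and input as above) confirms the two readings agree:

```python
import torch
import torch.distributions as td

loc = torch.ones(3)
scale = torch.eye(3) * 0.3
x = torch.tensor([2., 3., 4.])

# Treating scale as scale_tril implies covariance scale @ scale.T = diag(0.09)
as_tril = td.MultivariateNormal(loc=loc, scale_tril=scale)
as_cov = td.MultivariateNormal(loc=loc, covariance_matrix=scale @ scale.T)

lp_tril = as_tril.log_prob(x)
lp_cov = as_cov.log_prob(x)
print(lp_tril, lp_cov)  # both approximately -76.9227
```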

However, according to the docs, using scale_tril is more efficient:

…Using scale_tril will be more efficient:

You can compute the lower Cholesky factor using torch.linalg.cholesky:

In [3]: mvn = td.MultivariateNormal(loc = loc, scale_tril=torch.linalg.cholesky(scale))
In [4]: mvn.log_prob(input_x)
Out[4]: tensor(-24.2842)
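For a diagonal covariance like this one, the Cholesky factor is simply the element-wise square root of the diagonal, so the factor can also be written down directly (a small sketch, not from the original answer):

```python
import torch
import torch.distributions as td

cov = torch.eye(3) * 0.3

# Cholesky factor of a diagonal matrix is diag(sqrt(diagonal))
L = torch.linalg.cholesky(cov)
sqrt_diag = torch.eye(3) * 0.3**0.5
print(torch.allclose(L, sqrt_diag))  # True

mvn = td.MultivariateNormal(loc=torch.ones(3), scale_tril=sqrt_diag)
lp = mvn.log_prob(torch.tensor([2., 3., 4.]))
print(lp)  # approximately -24.2842
```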
Answered By: ndrwnaguib