Handling matrix multiplication in log space in Python

Question:

I am implementing a Hidden Markov Model and thus am dealing with very small probabilities. I am handling the underflow by representing variables in log space (so x → log(x)) which has the side effect that multiplication is now replaced by addition and addition is handled via numpy.logaddexp or similar.

Is there an easy way to handle matrix multiplication in log space?

Asked By: BrandonHoughton


Answers:

This is the best way I could come up with to do it.

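import numpy as np
from scipy.special import logsumexp

def log_space_product(A, B):
    # Astack[k, i, j] = A[i, k]; Bstack[k, i, j] = B[k, j]
    Astack = np.stack([A]*A.shape[0]).transpose(2,1,0)
    Bstack = np.stack([B]*B.shape[1]).transpose(1,0,2)
    # Reduce over k (axis 0): log(sum_k exp(A[i, k] + B[k, j]))
    return logsumexp(Astack+Bstack, axis=0)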

The inputs A and B are the elementwise logs of the matrices A0 and B0 you want to multiply, and the function returns the elementwise log of A0B0. The idea is that the i,j spot in log(A0B0) is the log of the dot product of the ith row of A0 and the jth column of B0. In log space, that is the logsumexp of the ith row of A plus the jth column of B.

In the code, Astack is built so the i,j spot is a vector containing the ith row of A, and Bstack is built so the i,j spot is a vector containing the jth column of B. Thus Astack + Bstack is a 3D tensor whose i,j spot is the ith row of A plus the jth column of B. Taking logsumexp with axis = 0 then gives the desired result.
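As a sanity check on this reasoning, here is a minimal sketch that compares the log-space result against a direct computation on small square matrices, where exp does not underflow. The random test matrices, their 3x3 size, and the seed are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.special import logsumexp

def log_space_product(A, B):
    # Same stacking as described above (square inputs assumed).
    Astack = np.stack([A] * A.shape[0]).transpose(2, 1, 0)
    Bstack = np.stack([B] * B.shape[1]).transpose(1, 0, 2)
    return logsumexp(Astack + Bstack, axis=0)

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
A0 = rng.random((3, 3))         # ordinary (non-log) matrices with positive entries
B0 = rng.random((3, 3))

log_product = log_space_product(np.log(A0), np.log(B0))
direct = np.log(A0 @ B0)        # safe here because the entries are not tiny

print(np.allclose(log_product, direct))
```

The two results should agree up to floating-point error; the log-space version is the one that stays usable when the entries of A0 and B0 are too small to represent directly.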

Answered By: Erik Parkinson

Erik’s response doesn’t seem to work for some non-square matrices (e.g. an n×m matrix times an m×r matrix), because the two stacked arrays no longer have matching shapes. Here is a version that takes that into account:

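import numpy as np
from scipy.special import logsumexp

def log_space_product(A, B):
    # A is (n, m), B is (m, r); both stacks end up with shape (n, r, m)
    Astack = np.stack([A]*B.shape[1]).transpose(1,0,2)
    Bstack = np.stack([B]*A.shape[0]).transpose(0,2,1)
    return logsumexp(Astack+Bstack, axis=2)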

where the i, j spot of Astack contains the i-th row of A and the i, j spot of Bstack contains the j-th column of B.

This works because np.stack([A] * B.shape[1]) has shape (r, n, m), which is transposed into (n, r, m), and np.stack([B] * A.shape[0]) has shape (n, m, r), which is also transposed into (n, r, m). We want their first two dimensions to be (n, r) because the result matrix needs to be of shape (n, r); the last axis, of length m, is the one that logsumexp reduces over.
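The same (n, r, m) layout can also be produced with NumPy broadcasting instead of np.stack, which avoids building the repeated copies explicitly. The following is only a sketch of that alternative, not the method from the answers here; the function name log_space_product_broadcast and the test shapes (2x3 times 3x4) are hypothetical, chosen for illustration.

```python
import numpy as np
from scipy.special import logsumexp

def log_space_product_broadcast(A, B):
    # A: (n, m) log-matrix, B: (m, r) log-matrix.
    # A[:, None, :] has shape (n, 1, m); B.T[None, :, :] has shape (1, r, m).
    # Broadcasting gives shape (n, r, m) with entry [i, j, k] = A[i, k] + B[k, j].
    return logsumexp(A[:, None, :] + B.T[None, :, :], axis=2)

rng = np.random.default_rng(0)  # arbitrary seed
A0 = rng.random((2, 3))         # n = 2, m = 3
B0 = rng.random((3, 4))         # m = 3, r = 4

result = log_space_product_broadcast(np.log(A0), np.log(B0))
expected = np.log(A0 @ B0)

print(result.shape)
print(np.allclose(result, expected))
```

The intermediate array still has n*r*m entries, so for large matrices it may be worth looping over rows or blocks to limit memory use.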

Took a while to figure out myself. Hope this helps anyone implementing a HMM!

Answered By: Andi Liu