Compute correlations of several vectors

Question:

I have several pairs of vectors (arranged as the columns of two matrices) and I want to compute the vector of their pairwise correlation coefficients. Better yet, I would like the angles between them, but since the correlation coefficient is the cosine of the angle between the mean-centered vectors, I am using numpy.corrcoef:

np.array([np.corrcoef(m1[:,i],m2[:,i])[0,1]
          for i in range(m1.shape[1])])

I wonder if there is a way to "vectorize" this, i.e., avoid calling corrcoef several times.

Asked By: sds


Answers:

Instead of using np.corrcoef, you can write your own function that does the same thing. The calculation for the correlation coefficient of two vectors is quite simple:

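Linear correlation coefficient:

$$ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}} $$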

Applying that here:

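import numpy as np

def vec_corrcoef(X, Y, axis=1):
    # axis is the dimension along which each vector runs (axis=1: one vector per row)
    Xm = np.mean(X, axis=axis, keepdims=True)
    Ym = np.mean(Y, axis=axis, keepdims=True)
    # Numerator: sum of the products of the deviations from the mean
    N = np.sum((X - Xm) * (Y - Ym), axis=axis)
    # Denominator: square root of the product of the summed squared deviations
    D = np.sqrt(np.sum((X - Xm)**2, axis=axis) * np.sum((Y - Ym)**2, axis=axis))
    return N / D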

To test (zip(m1, m2) iterates over rows, which matches the default axis=1):

m1 = np.random.random((100, 10))
m2 = np.random.random(m1.shape)

a = vec_corrcoef(m1, m2)
b = [np.corrcoef(v1, v2)[0, 1] for v1, v2 in zip(m1, m2)]

print(np.allclose(a, b)) # True
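
The question stores each vector as a column (m1[:,i]), whereas the test above compares rows. A minimal sketch of the column-wise case, assuming the same function is simply called with axis=0, and assuming the angle wanted is the one between the mean-centered vectors (so it is the arccos of the correlation). The function is repeated only so the block runs on its own; the seed and the np.clip guard are illustrative choices:

```python
import numpy as np

def vec_corrcoef(X, Y, axis=1):
    # Same computation as above, with the centering done once per input.
    Xc = X - np.mean(X, axis=axis, keepdims=True)
    Yc = Y - np.mean(Y, axis=axis, keepdims=True)
    N = np.sum(Xc * Yc, axis=axis)
    D = np.sqrt(np.sum(Xc**2, axis=axis) * np.sum(Yc**2, axis=axis))
    return N / D

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for reproducibility
m1 = rng.random((100, 10))
m2 = rng.random(m1.shape)

# Each column is one vector, so reduce over axis=0: one coefficient per column.
r = vec_corrcoef(m1, m2, axis=0)

# Reference result using the loop from the question.
expected = np.array([np.corrcoef(m1[:, i], m2[:, i])[0, 1]
                     for i in range(m1.shape[1])])
print(np.allclose(r, expected))

# Angles in radians; clipping guards against rounding slightly outside [-1, 1].
angles = np.arccos(np.clip(r, -1.0, 1.0))
```

Note that a constant column has zero variance, so D is 0 there and the result is nan, which is also what np.corrcoef returns in that case.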
Answered By: Pranav Hosangadi