How to get the 1st Principal Component by PCA using Python?

Question:

I have a set of 2D vectors stored as an n*2 matrix.

I wish to get the 1st principal component, i.e. the vector that indicates the direction with the largest variance.

I have found fairly detailed documentation on this from Rice University.

Based on this, I have imported the data and done the following:

import numpy as np

# Convert a list-of-lists into a numpy array.
# aListOfLists holds the data points as a regular list-of-lists matrix.
dataMatrix = np.array(aListOfLists)
myPCA = PCA(dataMatrix)   # make a new PCA object from a numpy array object

Then how may I get the 3D vector that is the 1st Principal Component?

Asked By: Sibbs Gambling

Answers:

PCA on 2D data gives only 2D vectors, not 3D ones.

Look at the picture in the Wikipedia article on PCA:
starting with a point cloud (dataMatrix) like that one, and using matplotlib.mlab.PCA,
myPCA.Wt[0] is the first PC, the long one in the picture.

Answered By: denis

It isn’t obvious from your example that you are using matplotlib.mlab.PCA but if so, the documentation states that the returned object has an attribute Wt, which is “the weight vector for projecting a numdims point or array into PCA space”.

PCA returns the eigenvalues in descending order (you can tell by looking at the fracs attribute of the returned object). So the first principal component (first eigenvector) will be the first row of Wt.
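
If matplotlib.mlab.PCA is not available to you (it was removed from matplotlib in version 3.1), the same idea can be sketched with plain NumPy. This is a minimal sketch, not mlab's implementation: the function name first_principal_component and the sample data are made up for illustration, and the data are only centered, not rescaled, so the numbers may differ from what mlab.PCA reports if it standardizes the columns.

```python
import numpy as np

def first_principal_component(data):
    """Return (unit vector of largest variance, fraction of variance it explains).

    Hypothetical helper for illustration; not part of any library.
    """
    X = np.asarray(data, dtype=float)
    X = X - X.mean(axis=0)                  # center each column
    cov = np.cov(X, rowvar=False)           # 2x2 covariance matrix for n*2 input
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder so the largest eigenvalue comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvecs[:, 0], eigvals[0] / eigvals.sum()

# Made-up sample data standing in for aListOfLists.
aListOfLists = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]]
pc1, frac = first_principal_component(aListOfLists)
print(pc1.shape)   # (2,)
```

The sign of the returned vector is arbitrary: pc1 and -pc1 describe the same direction.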

As noted by @denis, your eigenvectors will be 2D (not 3D) since your input data are 2D.
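
A quick way to convince yourself of the dimensionality is to run an SVD on centered n*2 data and look at the shapes. This is only a sketch with randomly generated points (the variable names are made up); the rows of Vt play the role that the rows of Wt play above.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(100, 2))         # made-up n*2 data, n = 100
centered = points - points.mean(axis=0)    # center each column

# Rows of Vt are the principal directions, ordered by descending singular value.
_, s, Vt = np.linalg.svd(centered, full_matrices=False)

print(Vt.shape)     # (2, 2): two components, each with two entries
print(Vt[0].shape)  # (2,): the first principal component is a 2D vector
```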

Answered By: bogatron