How to get weights for PCA

Question:

I am trying to find the weights of PCA using scikit-learn. However, none of my methods are working.

Code:

import pandas as pd
import numpy as np
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
# load dataset into Pandas DataFrame
df = pd.read_csv(url, names=['sepal length','sepal width','petal length','petal width','target'])

from sklearn.preprocessing import StandardScaler
features = ['sepal length', 'sepal width', 'petal length', 'petal width']
# Separating out the features
x = df.loc[:, features].values
# Standardizing the features
x = StandardScaler().fit_transform(x)
from sklearn.decomposition import PCA
pca = PCA(n_components=1)
principalComponents = pca.fit_transform(x)

Finding weights

Method 1

weights = pca.components_*np.sqrt(pca.explained_variance_)
# recovering original data
pca_recovered = np.dot(weights, x)
### This output does not match the PCA output

Method 2

# Standardising the weights then recovering
weights1 = weights/np.sum(weights)
pca_recovered = np.dot(weights1, x)
### This output does not match the PCA output

Please help if I am doing something wrong here, or if something is missing in the package.

Asked By: Mithilesh Kumar


Answers:

Instead of

weights = pca.components_*np.sqrt(pca.explained_variance_)

if I simply use

weights = pca.components_

the projection matches the PCA output. Maybe I made a calculation error the first time I tried.
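
A minimal sketch of the check described above, assuming the copy of the iris data bundled with scikit-learn (`load_iris`) in place of the UCI download. The variable names `scores`, `projected`, and `loadings` are illustrative only. It also suggests why Method 1 did not match: multiplying by `np.sqrt(pca.explained_variance_)` rescales the weights, and `np.dot(weights, x)` has the operands in the wrong order for a `(1, 4)` by `(150, 4)` product.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled iris data stands in for the UCI file.
x = StandardScaler().fit_transform(load_iris().data)

pca = PCA(n_components=1)
scores = pca.fit_transform(x)        # shape (150, 1)

weights = pca.components_            # shape (1, 4): one row per component
projected = x.dot(weights.T)         # shape (150, 1)

# x is already centered by StandardScaler, so the projection should match.
matches = np.allclose(projected, scores)

# Scaling by sqrt(explained_variance_) rescales the weights, so the
# resulting projection no longer reproduces the PCA scores.
loadings = pca.components_ * np.sqrt(pca.explained_variance_)
loadings_match = np.allclose(x.dot(loadings.T), scores)

print(matches, loadings_match)
```

Note the order of the product: `x.dot(weights.T)` multiplies samples by features, whereas `np.dot(weights, x)` from the question is not shape-compatible.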

Answered By: Mithilesh Kumar

Use

weight = pca.components_

but note that, in general, the output of

x.dot(pca.components_.T)

is different from

pca.fit_transform(x)

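because PCA centers the data (subtracts each feature's mean) before projecting; it does not standardize it.
To match, subtract the column mean of the projection: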

tmp = x.dot(pca.components_.T)
tmp-tmp.mean(axis=0)

and you will get the same output.
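
The centering step can be sketched on data that has not been standardized. This is only an illustration: `x_raw` is synthetic, hypothetical data rather than the iris features, and the comparison relies on the fitted `pca.mean_` attribute that scikit-learn's `PCA` exposes.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical uncentered data: every feature has a mean near 5.
x_raw = rng.normal(loc=5.0, scale=2.0, size=(100, 4))

pca = PCA(n_components=2)
scores = pca.fit_transform(x_raw)

# Plain projection without centering.
tmp = x_raw.dot(pca.components_.T)
uncentered_match = np.allclose(tmp, scores)

# Centering after projecting, as in the answer above.
centered_after = np.allclose(tmp - tmp.mean(axis=0), scores)

# Equivalent: center the data with the fitted mean before projecting.
centered_before = np.allclose(
    (x_raw - pca.mean_).dot(pca.components_.T), scores
)

print(uncentered_match, centered_after, centered_before)
```

In the question's setup, `x` has already passed through `StandardScaler`, so its column means are close to zero and the plain projection already agrees with `fit_transform` up to floating-point error.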

Answered By: 马荣康