pca

How to get weights for PCA

Question: I am trying to find the weights of a PCA using scikit-learn. However, none of the methods are working. Code:

import pandas as pd
import numpy as np
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
# load dataset into Pandas DataFrame
df = pd.read_csv(url, names=['sepal length', 'sepal width', 'petal length', 'petal width', 'target'])
from sklearn.preprocessing import StandardScaler
features …

Total answers: 2

R internal handling of sparse matrices

Question: I have been comparing the performance of several PCA implementations in both Python and R, and noticed an interesting behavior: while it seems impossible to compute the PCA of a sparse matrix in Python (the only approach would be scikit-learn’s TruncatedSVD, yet it does not support the mean-centering …

Total answers: 1

Feature/Variable importance after a PCA analysis

Question: I have performed a PCA on my original dataset and, from the compressed dataset produced by the PCA, I have selected the number of PCs I want to keep (they explain almost 94% of the variance). Now I am struggling with the identification of the …

Total answers: 3

PCA on sklearn – how to interpret pca.components_

Question: I ran PCA on a data frame with 10 features using this simple code:

pca = PCA()
fit = pca.fit(dfPca)

The result of pca.explained_variance_ratio_ shows:

array([ 5.01173322e-01, 2.98421951e-01, 1.00968655e-01, 4.28813755e-02, 2.46887288e-02, 1.40976609e-02, 1.24905823e-02, 3.43255532e-03, 1.84516942e-03, 4.50314168e-16])

I believe that means that the first PC explains 52% …

Total answers: 2

Plot PCA loadings and loading in biplot in sklearn (like R's autoplot)

Question: I saw this tutorial in R with autoplot. They plotted the loadings and loading labels:

autoplot(prcomp(df), data = iris, colour = 'Species', loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3)

https://cran.r-project.org/web/packages/ggfortify/vignettes/plot_pca.html

I prefer Python 3 with matplotlib, scikit-learn, …

Total answers: 5

PCA projection and reconstruction in scikit-learn

Question: I can perform PCA in scikit-learn with the code below. X_train has 279180 rows and 104 columns.

from sklearn.decomposition import PCA
pca = PCA(n_components=30)
X_train_pca = pca.fit_transform(X_train)

Now, when I want to project the eigenvectors onto feature space, I must do the following:

""" Projection """
comp = pca.components_  # 30x104 …

Total answers: 2
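
The projection and reconstruction asked about above can be sketched with NumPy alone. This is only an illustrative sketch under stated assumptions, not the asker's code or an accepted answer: the data is a small synthetic stand-in for X_train, and the names X, k, components, X_proj, and X_rec are hypothetical. In scikit-learn, the rows of pca.components_ play the role of `components` here, and pca.inverse_transform performs the reconstruction step.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))   # synthetic stand-in for X_train (assumption)
k = 3                           # number of components to keep (assumption)

# Center the data; PCA is defined on mean-centered features.
mean = X.mean(axis=0)
Xc = X - mean

# Rows of Vt are the principal axes, sorted by decreasing singular value.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:k]             # shape (k, n_features)

# Projection: coordinates of each sample in the k-dimensional PC space.
X_proj = Xc @ components.T      # shape (n_samples, k)

# Reconstruction: map back to feature space and restore the mean.
X_rec = X_proj @ components + mean

# Mean squared reconstruction error caused by the discarded components.
rec_error = np.mean((X - X_rec) ** 2)
```

Keeping all components (k equal to the number of features) reproduces X up to floating-point error, which is a quick sanity check for this kind of code.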

Python scikit learn pca.explained_variance_ratio_ cutoff

Question: When choosing the number of principal components (k), we choose k to be the smallest value such that, for example, 99% of the variance is retained. However, in Python's scikit-learn, I am not 100% sure whether pca.explained_variance_ratio_ = 0.99 is equivalent to “99% of variance is retained”. Could anyone …

Total answers: 3
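
One common way to read the cutoff in the question above is to accumulate the per-component variance ratios and pick the smallest k whose cumulative sum reaches the threshold. The sketch below does this with NumPy on synthetic data; the data, the 0.99 threshold, and the variable names are assumptions for illustration. In scikit-learn, pca.explained_variance_ratio_ is an array with one entry per component (not a single number), so the same cumulative sum applies to it.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data whose columns have very different variances (assumption).
scales = np.array([10, 5, 3, 2, 1, 0.5, 0.2, 0.1])
X = rng.normal(size=(500, 8)) * scales

# Singular values of the centered data give the variance per component.
Xc = X - X.mean(axis=0)
S = np.linalg.svd(Xc, compute_uv=False)
explained_variance = S**2 / (X.shape[0] - 1)

# Fraction of total variance per component, then its running total.
ratio = explained_variance / explained_variance.sum()
cumulative = np.cumsum(ratio)

# Smallest k such that the first k components retain at least 99%.
threshold = 0.99
k = int(np.searchsorted(cumulative, threshold) + 1)
print(k, cumulative[k - 1])
```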

Obtain eigen values and vectors from sklearn PCA

Question: How can I get the eigenvalues and eigenvectors of the PCA application?

from sklearn.decomposition import PCA
clf = PCA(0.98, whiten=True)  # conserve 98% variance
X_train = clf.fit_transform(X_train)
X_test = clf.transform(X_test)

I can’t find it in the docs. 1. I am “not” able to comprehend the different results here. Edit:

def pca_code(data):
    # raw_implementation
    var_per = .98 …

Total answers: 3
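
For the eigenvalue question above, the following NumPy sketch shows how the eigenvalues and eigenvectors of the covariance matrix relate to the SVD of the centered data. It is not the asker's pca_code function, which is truncated in the excerpt; the data and names are hypothetical. In scikit-learn, the fitted attributes explained_variance_ and components_ correspond to `eigvals` and the rows of `eigvecs.T` below, up to the sign of each vector.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic correlated data (assumption): random features mixed linearly.
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))

Xc = X - X.mean(axis=0)

# Route 1: eigendecomposition of the sample covariance matrix (ddof=1).
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
order = np.argsort(eigvals)[::-1]            # reorder to descending
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]                  # columns are eigenvectors

# Route 2: SVD of the centered data gives the same quantities.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_eigvals = S**2 / (X.shape[0] - 1)        # eigenvalues of the covariance
svd_eigvecs = Vt                             # rows are eigenvectors
```

The two routes agree only up to sign, because an eigenvector multiplied by -1 is still an eigenvector; comparing absolute values is one simple way to check them against each other.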

Recovering features names of explained_variance_ratio_ in PCA with sklearn

Question: I’m trying to recover, from a PCA done with scikit-learn, which features are selected as relevant. A classic example with the IRIS dataset:

import pandas as pd
import pylab as pl
from sklearn import datasets
from sklearn.decomposition import PCA
# load dataset
iris = datasets.load_iris()
df …

Total answers: 5

raise LinAlgError("SVD did not converge") LinAlgError: SVD did not converge in matplotlib pca determination

Question: Code:

import numpy
from matplotlib.mlab import PCA
file_name = "store1_pca_matrix.txt"
ori_data = numpy.loadtxt(file_name, dtype='float', comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0)
result = PCA(ori_data)

This is my code. Though my input matrix is devoid of NaN and inf, I …

Total answers: 10