PCA
In sklearn.decomposition.PCA, why are components_ negative?
As you figured out in your answer, the results of a singular value decomposition (SVD) are not unique in terms of singular vectors. Indeed, if the SVD of X is \sum_{i=1}^r s_i u_i v_i^\top, with the s_i ordered in decreasing fashion, then you can see that you can change the sign (i.e., "flip") of … Read more
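The sign ambiguity is easy to verify numerically: flipping the sign of a matched pair u_i, v_i leaves the product s_i u_i v_i^\top, and hence the reconstruction of X, unchanged. A minimal sketch (random data, NumPy's SVD):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))

# Economy-size SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Flip the sign of the first singular-vector pair (u_1, v_1):
U2, Vt2 = U.copy(), Vt.copy()
U2[:, 0] *= -1
Vt2[0, :] *= -1

# Both factorizations reconstruct X equally well.
assert np.allclose(U @ np.diag(s) @ Vt, X)
assert np.allclose(U2 @ np.diag(s) @ Vt2, X)
```

This is why two PCA runs (or two libraries) can return components_ that differ only in sign: both answers are equally valid decompositions.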
How to implement ZCA Whitening? Python
Here is a Python function for generating the ZCA whitening matrix: def zca_whitening_matrix(X): """ Function to compute ZCA whitening matrix (aka Mahalanobis whitening). INPUT: X: [M x N] matrix. Rows: Variables. Columns: Observations. OUTPUT: ZCAMatrix: [M x M] matrix """ # Covariance matrix [column-wise variables]: Sigma = (X - mu)' * (X - mu) / N sigma = np.cov(X, … Read more
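The excerpt is cut off, so here is a self-contained sketch of the same idea, following the docstring's layout (rows = variables, columns = observations); the eps regularizer and the exact function body are assumptions, not the original answer's code. ZCA whitening builds W = U diag(1/sqrt(S + eps)) U^T from the eigendecomposition of the covariance matrix:

```python
import numpy as np

def zca_whitening_matrix(X, eps=1e-5):
    """ZCA (Mahalanobis) whitening matrix for X with rows = variables,
    columns = observations. Returns an [M x M] matrix W such that
    W @ (X - mean) has approximately identity covariance."""
    Sigma = np.cov(X)                    # [M x M] covariance matrix
    U, S, _ = np.linalg.svd(Sigma)       # Sigma is symmetric: Sigma = U diag(S) U^T
    # W = U diag(1/sqrt(S + eps)) U^T; eps guards against tiny eigenvalues
    return U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 500))        # 3 variables, 500 observations
W = zca_whitening_matrix(X)
Xw = W @ (X - X.mean(axis=1, keepdims=True))
# np.cov(Xw) is now close to the 3x3 identity matrix.
```

Unlike PCA whitening, ZCA keeps the whitened data as close as possible to the original variables, which is why it is popular for image preprocessing.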
Principal component analysis in Python
Months later, here's a small class PCA, and a picture: #!/usr/bin/env python """ a small class for Principal Component Analysis Usage: p = PCA( A, fraction=0.90 ) In: A: an array of e.g. 1000 observations x 20 variables, 1000 rows x 20 columns fraction: use principal components that account for e.g. 90 % of the … Read more
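Since the excerpt stops at the docstring, here is a hypothetical re-implementation of a class with that interface (the method names and internals below are assumptions, not the original code): keep just enough principal components to explain the requested fraction of the variance.

```python
import numpy as np

class PCA:
    """Keep enough principal components to explain `fraction` of the variance.
    A: observations x variables, e.g. 1000 rows x 20 columns."""
    def __init__(self, A, fraction=0.90):
        self.mean = A.mean(axis=0)
        U, d, Vt = np.linalg.svd(A - self.mean, full_matrices=False)
        var = d**2 / (d**2).sum()              # variance explained per PC
        # smallest number of PCs whose cumulative variance reaches `fraction`
        self.npc = int(np.searchsorted(np.cumsum(var), fraction) + 1)
        self.Vt = Vt[:self.npc]                # principal axes, npc x nvariables
        self.fraction_explained = np.cumsum(var)[self.npc - 1]

    def transform(self, A):
        return (A - self.mean) @ self.Vt.T     # project onto the kept PCs

rng = np.random.default_rng(2)
A = rng.standard_normal((1000, 20))
p = PCA(A, fraction=0.90)
scores = p.transform(A)                        # 1000 x p.npc
```

The `searchsorted` on the cumulative variance is what turns "90 % of the variance" into a concrete number of components.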
Add legend to scatter plot (PCA)
I recently proposed an easy way to add a legend to a scatter, see GitHub PR. This is still being discussed. In the meantime you need to manually create your legend from the unique labels in y. For each of them you’d create a Line2D object with the same marker as is used in the … Read more
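A minimal sketch of that manual approach (the data, labels, and colormap here are placeholders): build one Line2D proxy per unique label, colored the same way the scatter colors its points, and pass the proxies to ax.legend.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")                      # headless backend for this example
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

rng = np.random.default_rng(3)
X = rng.standard_normal((60, 2))           # e.g. the first two PCA scores
y = rng.integers(0, 3, size=60)            # class labels

fig, ax = plt.subplots()
sc = ax.scatter(X[:, 0], X[:, 1], c=y, cmap="viridis")

# One Line2D proxy per unique label, colored like the scatter points:
labels = np.unique(y)
norm = plt.Normalize(labels.min(), labels.max())
handles = [Line2D([0], [0], marker="o", linestyle="",
                  color=sc.cmap(norm(lab)), label=str(lab))
           for lab in labels]
ax.legend(handles=handles, title="class")
fig.savefig("pca_scatter.png")
```

In recent Matplotlib versions, `ax.legend(*sc.legend_elements(), title="class")` produces the same legend without building the proxies by hand.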
PCA in matlab selecting top n components
Foreword: I think you are falling prey to the XY problem. Trying to find 153,600 dimensions in your data is completely non-physical; please ask about the problem (X) and not your proposed solution (Y) in order to get a meaningful answer. I will use this post only to tell you why PCA is not … Read more
MATLAB is running out of memory but it should not be
For a data matrix of size n-by-p, PRINCOMP will return a coefficient matrix of size p-by-p where each column is a principal component expressed using the original dimensions, so in your case you will create an output matrix of size: 1036800*1036800*8 bytes ~ 7.8 TB. Consider using PRINCOMP(X,'econ') to return only the PCs with significant … Read more
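The same trap exists outside MATLAB. A sketch of the economy-size fix in NumPy (the sizes below are illustrative, not the 1036800-dimensional case): with full_matrices=False, the SVD of an n-by-p matrix stores only min(n, p) right singular vectors, so the p-by-p matrix is never materialized.

```python
import numpy as np

n, p, k = 50, 10_000, 5        # few observations, many variables, few PCs wanted
rng = np.random.default_rng(4)
X = rng.standard_normal((n, p))
Xc = X - X.mean(axis=0)

# Economy-size SVD: Vt has shape (min(n, p), p), never (p, p).
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
coeff = Vt[:k].T               # p x k loadings, analogous to PRINCOMP(X,'econ')
print(coeff.shape)             # (10000, 5)
```

With n << p this needs O(n*p) memory instead of the O(p^2) that a full coefficient matrix would cost.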
Recovering features names of explained_variance_ratio_ in PCA with sklearn
This information is included in the pca attribute components_. As described in the documentation, pca.components_ is an array of shape [n_components, n_features], so to see how the components are linearly related to the different features you have to inspect its entries. Note: each coefficient is the weight (loading) of a particular feature in a particular component. import pandas as pd import … Read more
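The excerpt's code is truncated; a minimal sketch of the idea, using the iris dataset purely as a stand-in, labels the rows and columns of components_ so each loading is attached to a feature name:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

data = load_iris()
pca = PCA(n_components=2).fit(data.data)

# Rows = components, columns = the original feature names:
df = pd.DataFrame(pca.components_,
                  index=["PC1", "PC2"],
                  columns=data.feature_names)
print(df)
```

Reading a row of this DataFrame tells you which original features dominate that principal component.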
Matlab – PCA analysis and reconstruction of multi dimensional data
Here’s a quick walkthrough. First we create a matrix of your hidden variables (or “factors”). It has 100 observations and there are two independent factors. >> factors = randn(100, 2); Now create a loadings matrix. This is going to map the hidden variables onto your observed variables. Say your observed variables have four features. Then … Read more
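The same walkthrough can be sketched in NumPy (the loadings matrix below is made up for illustration): generate two hidden factors, map them onto four observed features, then check that PCA on the observed data reconstructs it from just two components.

```python
import numpy as np

rng = np.random.default_rng(5)
factors = rng.standard_normal((100, 2))     # 100 observations, 2 hidden factors

# Loadings map the 2 hidden factors onto 4 observed features:
loadings = np.array([[1.0, 0.0],
                     [0.8, 0.2],
                     [0.0, 1.0],
                     [0.3, 0.7]])
noise = 0.05 * rng.standard_normal((100, 4))
observed = factors @ loadings.T + noise      # 100 x 4 observed data

# PCA via SVD of the centered data; keep the top 2 principal axes:
mean = observed.mean(axis=0)
U, s, Vt = np.linalg.svd(observed - mean, full_matrices=False)
scores = (observed - mean) @ Vt[:2].T        # project onto top 2 PCs
recon = scores @ Vt[:2] + mean               # reconstruct in original space
print(np.abs(recon - observed).max())        # small: only the noise is lost
```

Because the data is (noisy) rank 2 by construction, two components recover essentially all of it, which is the point of the original walkthrough.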