Chapter 4: Dimensionality Reduction: PCA and SVD

1. PCA
   1. How is PCA dimensionality reduction implemented?
   2. Code
2. SVD
   1. SVD
3. Dimensionality reduction and feature selection are both feature engineering techniques. What is the difference?

1. PCA

PCA measures information content by the variance of the samples along a direction: the larger the variance, the more information that direction carries.
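As a standard reference (my own restatement, not reproduced from the original figure), the sample variance of a feature x with mean x̄ over n samples is

$$\operatorname{Var}(x) = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 .$$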

1. How is PCA dimensionality reduction implemented?

class sklearn.decomposition.PCA

  • Parameters:
    n_components: target dimensionality after reduction; should be less than the number of original features. For visualization it is usually 2 or 3.
    copy: whether to copy the input data before fitting (default True); if False, the data passed to fit may be overwritten.
    whiten: whether to scale the projected components so that each has unit variance.
    svd_solver: which SVD solver to use ('auto', 'full', 'arpack', or 'randomized').
    random_state: random seed, used when svd_solver is 'arpack' or 'randomized'.
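
A minimal sketch of a typical instantiation (the parameter values are illustrative choices, not prescribed by the text):

from sklearn.decomposition import PCA

# copy=True, whiten=False and svd_solver='auto' are the defaults
pca = PCA(n_components=2, copy=True, whiten=False,
          svd_solver='auto', random_state=42)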

Principle of dimensionality reduction, using two dimensions as an example:

If all points lie on the line y = x, the data is stored in two dimensions. Rotate the coordinate system 45° clockwise and the points all fall on one coordinate axis, so the data becomes one-dimensional: only that new axis is needed, and the second coordinate is y' = 0 for every point. The relative distances between the points on this line are unchanged, i.e. the original information is retained. (A small sketch of this rotation follows.)
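A minimal numpy sketch of this idea (my own illustration, not from the original post): rotating points that lie on y = x by 45° sends all of the spread into the first coordinate.

import numpy as np

# Points lying exactly on the line y = x (stored in two dimensions)
points = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])

# Rotation matrix for a 45-degree clockwise rotation
theta = -np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

rotated = points @ R.T
print(rotated)
# The first column carries all the variation; the second column is
# (numerically) 0, so one dimension is enough to describe the data.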

2. Code

from matplotlib import pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
import pandas as pd

# Load the dataset
iris = load_iris()
y = iris.target
x = iris.data
print(x.shape)  # (150, 4)
pd.DataFrame(x)  # in the feature matrix, dimensionality = number of features; reduce from 4 to 2

# Dimensionality reduction
pca = PCA(n_components=2)
x_new = pca.fit_transform(x)

print(y)  # y contains three classes: 0, 1 and 2
colors = ["red", "black", "orange"]
print(iris.target_names)

plt.figure(figsize=(20, 8))
for i in [0, 1, 2]:
    plt.scatter(x_new[y == i, 0], x_new[y == i, 1],
                alpha=.7, c=colors[i], label=iris.target_names[i])
plt.legend()
plt.title('PCA of IRIS dataset')
plt.show()

How to check how much information each component carries:

pca = PCA().fit(x)
print(pca.explained_variance_ratio_)

explained_variance_ratio_ returns, for each principal component, the share of the total variance (information) it explains. Summing the entries shows what proportion of all the information in the original 4-dimensional features is retained.
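
A short follow-up sketch (my own, not from the original): the cumulative sum of explained_variance_ratio_ is a common way to decide how many components to keep.

import numpy as np

ratios = pca.explained_variance_ratio_
print(ratios)             # for iris, roughly [0.92, 0.05, 0.017, 0.005]
print(np.cumsum(ratios))  # cumulative share of information retained
# keeping 2 components already preserves roughly 97-98% of the variance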

2. SVD

PCA and SVD are two different dimensionality reduction algorithms, but both follow the same overall process described above. They differ only in the matrix decomposition they use and in the indicator they use to measure information content.

PCA uses variance as its measure of information content and eigenvalue decomposition to find the new space V. To reduce dimensionality, it decomposes the feature matrix X through a sequence of matrix operations (for example, forming the covariance matrix) into several matrices, one of which is diagonal: its diagonal entries are the variances along the new directions.
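Written out (a standard formulation, assumed here rather than quoted from the post): with the feature matrix X centered column-wise, PCA forms the covariance matrix and eigendecomposes it,

$$C = \frac{1}{n-1} X^{\top} X, \qquad C = V \Lambda V^{\top},$$

where the columns of V are the new directions and the diagonal matrix Λ holds the variances along them; the directions with the largest variances are kept as principal components.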

After dimensionality reduction, each new feature vector found by PCA is called a "principal component". The discarded feature vectors are considered to carry very little information, and that information is likely to be mostly noise.

1. SVD

SVD stands for singular value decomposition. Instead of building a covariance matrix, it decomposes the feature matrix X directly, and it measures information content by the singular values.
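
A minimal sketch (standard material, my own illustration rather than the post's): SVD factors the centered feature matrix as X = UΣVᵀ, and numpy computes it in one call.

import numpy as np
from sklearn.datasets import load_iris

x = load_iris().data
x_centered = x - x.mean(axis=0)

# Singular value decomposition of the centered feature matrix
U, s, Vt = np.linalg.svd(x_centered, full_matrices=False)
print(s)        # singular values, in decreasing order
print(Vt[:2])   # first two rows span the same 2-D space PCA projects onto

In fact, scikit-learn's PCA computes its projection through an SVD internally, which is why it exposes the svd_solver parameter.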

3. Dimensionality reduction and feature selection are both feature engineering techniques. What is the difference?

There are three kinds of methods in feature engineering: feature extraction, feature creation and feature selection. Feature selection picks the most informative features from the existing ones, and the selected features remain interpretable.

Dimensionality reduction algorithms, by contrast, compress the existing features. The features obtained after reduction are not any of the features in the original feature matrix; they are new features built by combining the original ones in some way. Generally speaking, **before the new feature matrix is generated, we cannot know how the dimensionality reduction algorithm constructed the new feature vectors, and after it is generated, the new feature matrix is not readable**: we cannot tell which original features each new feature combines. Although the new features contain the information of the original data, they no longer mean what the original data meant. Dimensionality reduction algorithms are therefore a kind of feature creation (or feature construction).

As you can imagine, PCA is generally not suited to models that explore the relationship between features and labels (such as linear regression), because a relationship between the label and new features that cannot be interpreted is not meaningful. In linear regression models, we therefore use feature selection, as in the sketch below.
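A minimal sketch of the feature-selection alternative (the scorer and k are illustrative choices, not taken from the original; with a regression target one would use f_regression instead of f_classif):

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

x, y = load_iris(return_X_y=True)

# Keep the 2 original features most related to the label;
# unlike PCA, the surviving columns are still interpretable features.
selector = SelectKBest(score_func=f_classif, k=2)
x_selected = selector.fit_transform(x, y)
print(selector.get_support())  # boolean mask over the original features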


Origin blog.csdn.net/qq_53982314/article/details/131244829