

What does PCA refer to in dimensionality reduction technology

2025-01-18 Update From: SLTechnology News&Howtos



This article explains what PCA refers to in dimensionality reduction, a topic that many people do not understand well, and briefly compares it with a few other common methods.

Dimensionality reduction is the process of removing redundant or unimportant variables and keeping only the main variables that carry the information. It is usually achieved in one of two ways:

One is feature selection (Feature Selection), which keeps a subset of the original variables, and the other is feature extraction (Feature Extraction), which builds a smaller set of new variables from the original ones.
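The difference between the two approaches can be sketched in a few lines of NumPy. This is only an illustration on made-up data: the variance threshold of 0.1 and the random weight matrix `W` are arbitrary assumptions, not a recommended recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 100 samples, 4 features; the last feature is nearly constant noise.
X = rng.normal(size=(100, 4))
X[:, 3] = 0.01 * rng.normal(size=100)

# Feature selection: keep a subset of the ORIGINAL columns.
# Here we (arbitrarily) keep columns whose variance exceeds 0.1.
keep = X.var(axis=0) > 0.1
X_selected = X[:, keep]

# Feature extraction: build NEW features as combinations of the originals.
# W is a hypothetical 4x2 weight matrix; PCA is one principled way to choose it.
W = rng.normal(size=(4, 2))
X_extracted = X @ W

print(keep)               # which original columns survived
print(X_selected.shape)   # (100, 3): three of the original features
print(X_extracted.shape)  # (100, 2): two new features
```

After selection, each remaining column is still one of the original variables. After extraction, each column is a mixture of all four.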

In practical work we often encounter big data. Such data not only has a large sample size but often also has a large number of variables, sometimes thousands of them; these variables are also called features. Some features are not important at all and provide no useful information; they are just noise. In this case, it is very important to reduce the number of features.

For example, in image processing and analysis there are usually many images, such as a large set of images taken from different angles and different locations, and each image contains a very large number of pixels. Dimensionality reduction is therefore very important, especially when your task does not require examining every pixel in every image.

A very popular method of dimensionality reduction is principal component analysis (Principal Component Analysis, PCA), which is the first dimensionality reduction method I learned. PCA is a mapping method: it maps the original features to a new space, and each feature in the new space (a principal component) is a linear combination of the original features. The components are ordered by how much of the variance in the data they capture, so keeping only the first few can greatly reduce the number of features. For a given number of dimensions, PCA retains as much of the variance of the original data as any linear projection can, which is why it is said to preserve the original information to the maximum extent even though the dimension is greatly reduced.
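The mapping described above can be sketched with NumPy's singular value decomposition. This is a minimal illustration, not a production implementation: the function name `pca` and the toy data (5 observed features generated from 2 hidden factors) are assumptions made for the example.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    # Center each feature; PCA is defined on mean-centered data.
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of the total variance captured by each kept component.
    explained_ratio = (S ** 2)[:n_components] / np.sum(S ** 2)
    # New features = linear combinations of the original (centered) features.
    Z = Xc @ components.T
    return Z, components, explained_ratio

rng = np.random.default_rng(0)
# Toy data: 200 samples, 5 features driven by 2 hidden factors plus small noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

Z, components, ratio = pca(X, n_components=2)
print(Z.shape)      # (200, 2)
print(ratio.sum())  # share of variance kept by the two components
```

Each row of `components` holds the weights of one linear combination of the original features. On this toy data the two kept components should account for nearly all of the variance, because the data was built from two factors.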

PCA is a successful dimensionality reduction method, and it can also be used to visualize data in a high-dimensional space. But it has some limitations. Because it is a mapping method, the new features after mapping are linear combinations of the original features, so they are harder to interpret. For example, if you work with a doctor and talk about linear combinations, they may not care at all; what they want to know about is the original features.

In situations like this, a method that keeps the original features can be more suitable. In 2002, Isabelle Guyon et al. published an article entitled "Gene Selection for Cancer Classification using Support Vector Machines". They proposed a feature selection method called recursive feature elimination (RFE). This method does not apply a linear transformation as PCA does; it keeps the original features, and it also takes the relationships (interactions) between the original features into account. After this method came out it became very popular, as can be seen from how often the paper is cited.

SVM-RFE was very popular at that time, and RFE was later combined with other models, such as random forest RFE. You can search for these yourself. If you are interested, it is worth running one of them to see what its output looks like; you will understand it at a glance.
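To get a feel for the elimination loop, here is a minimal sketch in NumPy. It is not the SVM-RFE of the paper: a plain least-squares linear fit stands in for the linear SVM so that the example has no dependencies beyond NumPy. The function name `rfe_linear` and the toy data are assumptions for illustration; ranking features by squared weight follows the linear-model criterion used in SVM-RFE.

```python
import numpy as np

def rfe_linear(X, y, n_keep):
    """Recursive feature elimination with a least-squares linear model.

    Assumption: a least-squares fit replaces the linear SVM of SVM-RFE;
    the fit / rank / drop loop is the same idea.
    """
    # Standardize so that weight magnitudes are comparable across features.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    remaining = list(range(X.shape[1]))
    eliminated = []  # original feature indices, in the order they were dropped
    while len(remaining) > n_keep:
        # Refit using only the features that are still in play.
        w, *_ = np.linalg.lstsq(Xs[:, remaining], yc, rcond=None)
        # Drop the feature with the smallest squared weight.
        worst = int(np.argmin(w ** 2))
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
# Toy labels (+1 / -1): only features 0 and 3 influence the class.
score = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=300)
y = np.where(score > 0, 1.0, -1.0)

kept, dropped = rfe_linear(X, y, n_keep=2)
print(sorted(kept))  # indices of the surviving original features
print(dropped)       # elimination order of the other features
```

The output is a list of original feature indices, not new combined features, which is what makes the result easier to explain to a domain expert.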

Another particularly popular dimensionality reduction method is t-distributed Stochastic Neighbor Embedding (t-SNE), which is a nonlinear dimensionality reduction method. Like PCA, it is a feature extraction method rather than a feature selection method: it produces a small number of new coordinates instead of picking a subset of the original features. These coordinates can in principle be used as input to a machine learning model, but in practice t-SNE is most often used for data visualization.
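A typical visualization-oriented use looks roughly like the following sketch, which assumes scikit-learn is installed and uses its `TSNE` class. The toy data (two clusters in 20 dimensions) and the perplexity value of 20 are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Toy data: two well-separated clusters of 50 points each in 20 dimensions.
cluster_a = rng.normal(loc=0.0, size=(50, 20))
cluster_b = rng.normal(loc=5.0, size=(50, 20))
X = np.vstack([cluster_a, cluster_b])

# perplexity must be smaller than the number of samples; 20 is arbitrary here.
tsne = TSNE(n_components=2, perplexity=20, init="pca", random_state=0)
embedding = tsne.fit_transform(X)

print(embedding.shape)  # (100, 2): one 2-D point per sample, ready to plot
```

The two output columns are new coordinates computed for these particular samples. scikit-learn's `TSNE` offers `fit_transform` but no separate `transform` for unseen data, which is one reason it is used mainly for plotting rather than as a preprocessing step.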

The picture comes from here (http://www.nlpca.org/pca-principal-component-analysis-matlab.html)

