In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-22 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >
Share
Shulou(Shulou.com)06/01 Report--
How to understand the distance discrimination in R language classification algorithm, I believe that many inexperienced people do not know what to do about it. Therefore, this paper summarizes the causes and solutions of the problem. Through this article, I hope you can solve this problem.
1. Analysis of the principle of distance discrimination
A judgment is made according to the distance between the sample to be determined and the known class sample. According to the known class sample information, the distance discriminant function is established, and then the attribute data of each sample to be determined is calculated generation by generation to get the distance value, according to which the sample is judged into the sample cluster of the category with the smallest distance value.
K-nearest neighbor algorithm is the most widely used distance discriminant method. His idea is that if most of the K most similar / adjacent samples in the feature space belong to a certain category, then the sample also belongs to this category.
The three solid sample points in the figure are surrounded by three known types of sample points represented by circular, triangular and square hollow points respectively. Now let's take Kraft 5, that is, circle the five sample points closest to the sample points to be classified, and then check their categories. Among the five points, there are more samples in which category the unknown sample belongs to. Easily available unknown samples (from left to right) belong to circle, triangle and square in turn.
When discriminating by K-nearest neighbor method, because it mainly depends on the information of the surrounding limited adjacent samples, rather than the method of discriminating class domain, it is more suitable than other methods for the sample set with more overlapping or overlapping class domains.
two。 Application in R language
In the K nearest neighbor (K-Nearest Neighbor,KNN) algorithm, we mainly use the class packet.
The knn (train,test,cl,k=1,1=0,prob=FALSE,use.all=TRUE) function.
On the other hand, in the weighted k-nearest neighbor Weighted KKNN, we mainly use the
Kknn (formula=formula (train), train,test,na.action=na.omit (), Kendall 7, ordered= distancestry 2) Kernel = "optimal", ykernel=NULL,scale=TRUE,contrasts=c ('unordered'= "contr.dummy", ordered= "contrl.rodinal")) function.
3. Discriminant analysis with iris dataset as an example
1) apply the model and observe the output
Library (kknn) fit_pre_kknn=kknn (Species~.,data_train,data_test [,-5], KJN 5) fit_pre_ KKN [1: length (fit_pre_kknn)]
2) testing the accuracy of the model
Table (data_test$Species, fit_pre_kknn$fitted.values) sum (as.numeric (as.numeric (fit_pre_kknn$fitted.values)! = as.numeric (data_test$Species) / nrow (data_test)
After reading the above, have you mastered how to understand the distance discrimination method in R language classification algorithm? If you want to learn more skills or want to know more about it, you are welcome to follow the industry information channel, thank you for reading!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.