In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
This article Xiaobian for you to introduce in detail "Python based on KNN algorithm to achieve Iris data set classification", the content is detailed, the steps are clear, the details are handled properly, I hope this "Python based on KNN algorithm how to achieve Iris dataset classification" article can help you solve doubts, following the editor's ideas slowly in depth, together to learn new knowledge.
KNN model theory K nearest neighbor classification algorithm is not only a mature method in theory, but also one of the simplest machine learning algorithms. The idea of this method is that if most of the k most similar samples in the feature space (that is, the nearest neighbor in the feature space) belong to a certain category, then the sample also belongs to this category. In the KNN algorithm, the selected neighbors are all objects that have been correctly classified. In the decision-making of classification, this method only determines the category of the sample to be divided according to the category of the nearest sample or samples. Although the KNN method depends on the limit theorem in principle, it is only related to a very small number of adjacent samples in category decision-making. Because the KNN method mainly depends on the limited adjacent samples, rather than the method of discriminating the class domain, the KNN method is more suitable than other methods for the sample set with more cross or overlap of the class domain. KNN algorithm flow 1. Prepare the data and preprocess the data; 2. Select the appropriate data structure to store training data and test tuples; 3. Set parameters; 4. Maintains a priority queue of size k from largest to smallest by distance, which is used to store the nearest neighbor training tuple. Randomly select k tuples from the training tuple as the initial nearest neighbor tuple, calculate the distance from the test tuple to the k tuple, and store the training tuple label and distance in the priority queue. 5. Traversing the training tuple set, the distance between the current training tuple and the test tuple is calculated, and the distance L is compared with the maximum distance in the priority queue Lmax;6. To make a comparison. If L > = Lmax, the tuple is discarded and the next tuple is traversed. If L
< Lmax,删除优先级队列中最大距离的元组,将当前训练元组存入优先级队列;7. 遍历完毕,计算优先级队列中k 个元组的多数类,并将其作为测试元组的类别;8. 测试元组集测试完毕后计算误差率,继续设定不同的k值重新进行训练,最后取误差率最小的k 值。数据集准备Iris(鸢尾花)数据集是多重变量分析的数据集。数据集包含150行数据,分为3类,每类50行数据。每行数据包含4个属性:Sepal Length(花萼长度)、Sepal Width(花萼宽度)、Petal Length(花瓣长度)和Petal Width(花瓣宽度)。可通过这4个属性预测鸢尾花卉属于三个种类(Setosa,Versicolour,Virginica)中的哪一类。数据预处理 import pandas as pd df = pd.read_csv('iris.csv')# 读入数据 df.columns = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'] df.head()# 查看前5条数据数据集结果如下图所示:Df.describe () # View the data information and describe the dataset. The result is as follows:
K-nearest neighbor algorithm
Import numpy as np # uses K-nearest neighbor algorithm to cross-validate iris data from sklearn.datasets import load_iris import matplotlib.pyplot as plt # download dataset iris = load_iris () data = iris.data [: : 2] target = iris.target print (data.shape) # (150150) print (data [: 10]) print (target [: 10]) label= np.array (target) index_0 = np.where (label==0) plt.scatter (data [index _ 0], data [index _ 0], marker='x',color ='b' Index_1 = np.where (label==1) plt.scatter (index_1 = np.where (label==1) plt.scatter [index1], data [index1], marker='o',color = '1century plt.scatter =' 1) index_2 = np.where (label==2) plt.scatter (np.where [index2) plt.scatter], data = 'glossary page1] S = 15) plt.xlabel ('X1') plt.ylabel (' X2') plt.legend (loc = 'upper left') plt.show () here This article "how to classify iris data sets based on Python algorithm based on KNN" has been introduced. If you want to master the knowledge points of this article, you still need to practice and use it yourself. If you want to know more about related articles, welcome to follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.