
How to understand and implement KNN algorithm

2025-01-16 Update From: SLTechnology News&Howtos


This article explains how to understand and implement the KNN algorithm, which many readers find hard to grasp. The summary below is meant to make it easier to follow.

KNN introduction

The nearest-neighbor algorithm, or k-nearest neighbor (kNN, k-NearestNeighbor) classification algorithm, is one of the simplest classification methods in data mining. "K nearest neighbors" means the k closest neighbors: each sample can be represented by the k samples nearest to it. We apply the same idea subconsciously in everyday life. Judging whether a person is rich or poor from the area they live in, or from their friends, is kNN thinking.

KNN classifies a sample by measuring the distance between feature values. The idea is that if most of the k most similar samples in the feature space (that is, its k nearest neighbors) belong to a certain category, the sample belongs to that category too. k is usually an integer no greater than 20. The neighbors are all drawn from samples whose classes are already known, so the decision for a new sample depends only on the classes of its nearest sample or samples.
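The voting rule described above can be sketched in a few lines. This is a minimal illustration rather than the article's implementation: the function name `majority_vote` and the sample data are made up, and the neighbors are assumed to arrive as (distance, label) pairs.

```python
import heapq
from collections import Counter

def majority_vote(neighbors, k):
    """Return the most common label among the k closest neighbors.

    neighbors: iterable of (distance, label) pairs (hypothetical input format).
    """
    # Keep only the k pairs with the smallest distance.
    k_nearest = heapq.nsmallest(k, neighbors, key=lambda pair: pair[0])
    # Count how many of those k neighbors carry each label.
    votes = Counter(label for _, label in k_nearest)
    return votes.most_common(1)[0][0]

# Made-up distances from one query point to five labeled samples.
sample_neighbors = [(0.4, 'A'), (0.1, 'B'), (0.9, 'A'), (0.2, 'B'), (0.3, 'A')]
result = majority_vote(sample_neighbors, k=3)
print(result)
```

With k=3 the two closest samples are both 'B', so 'B' wins; with k=5 all samples vote and 'A' wins 3 to 2, which shows how the choice of k can change the answer.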

In KNN, the distance between objects serves as a measure of their dissimilarity, which avoids having to match objects against each other directly. The distance used is generally the Euclidean distance.

Implementation of the KNN algorithm
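For reference, the Euclidean distance between two points is the square root of the sum of squared differences of their coordinates. A minimal sketch using only the standard library follows; the helper name `euclidean_distance` is made up for this example.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of features."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of features")
    # Sum the squared per-feature differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

d = euclidean_distance([0, 0], [3, 4])
print(d)  # 5.0
```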

For the principles, the main reference is Liu Jianping (Pinard)'s blog post "K nearest neighbor method (KNN) principle summary". His posts offer real insight into each algorithm; when Li Hang's "Statistical Learning Methods" leaves you stuck, reading his blog often makes things click. He notes that scikit-learn implements only brute-force search (brute-force), the KD tree (KDTree), and the ball tree (BallTree), so his post covers the principles of just those three and leaves out other methods such as the BBF tree and the MVP tree. Readers who want a deeper understanding of the algorithm should read his article.

Hands-on code

This part walks through a hands-on example and then explains some details of the implementation.

The following code imports the libraries the program needs:

from numpy import *
import operator

The following code generates the test data:

def createDataSet():
    # Four 2-D sample points and their class labels
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels

group, labels = createDataSet()

Output:

In [2]: group
Out[2]:
array([[1. , 1.1],
       [1. , 1. ],
       [0. , 0. ],
       [0. , 0.1]])

In [3]: labels
Out[3]: ['A', 'A', 'B', 'B']

The following code implements classification with KNN:

def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # tile repeats inX once per sample so it can be subtracted from the whole data set
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat ** 2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances ** 0.5
    # Indices of the samples, ordered from nearest to farthest
    sortedDistIndicies = distances.argsort()
    print(sortedDistIndicies)
    classCount = {}
    for i in range(k):
        voteLabel = labels[sortedDistIndicies[i]]
        # dict.get returns the value for the given key. Unlike dict['key'], which raises
        # an error when the key is missing, it returns a default: None unless specified, 0 here.
        classCount[voteLabel] = classCount.get(voteLabel, 0) + 1
    print(classCount)
    # Python 3: iteritems became items (Python 2 used classCount.iteritems())
    # items yields the (key, value) pairs of the dict
    # The key parameter of sorted takes a function. operator.itemgetter does not fetch a
    # value itself; it builds a function that fetches the value from each object.
    # operator.itemgetter(1) picks the second element of each pair in classCount.items(), i.e. the count
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    print(sortedClassCount)
    return sortedClassCount[0][0]

Given an input point, the function returns its class:

In [7]: classify0([0, 0.2], group, labels, 2)
[3 2 1 0]
{'B': 2}
[('B', 2)]
Out[7]: 'B'
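The tile call in classify0 builds an explicit copy of the query point for every sample. As a hedged alternative (not what the article's code does), NumPy broadcasting gives the same differences without the copy. The sketch below assumes NumPy is installed and uses the names `data` and `query`, which are made up here, with the same sample points and query as the example.

```python
import numpy as np

data = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
query = np.array([0.0, 0.2])

# Explicit copy of the query, one row per sample (the tile approach).
diff_tiled = np.tile(query, (data.shape[0], 1)) - data

# Broadcasting: NumPy stretches the 1-D query across the rows by itself.
diff_broadcast = query - data

same = np.array_equal(diff_tiled, diff_broadcast)

# Euclidean distance of every sample to the query, then nearest-first indices.
distances = np.sqrt((diff_broadcast ** 2).sum(axis=1))
nearest_first = distances.argsort()
print(same, nearest_first)
```

Both forms produce the same difference matrix, so the choice is mostly about readability and avoiding a temporary array.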

A closer look at the hands-on code

The argsort function

The argsort() function sorts the elements of an array in ascending order and returns their indices rather than the values themselves.

Example:

import numpy as np
a = np.array([2, 0, 4, 1, 2, 4, 5])
a.argsort()

The output is the indices ordered from the smallest value to the largest:

Out[12]: array([1, 3, 0, 4, 2, 5, 6], dtype=int64)

That is, the result holds positions in the original array, listed in ascending order of the values at those positions.
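To make the idea concrete without NumPy, the same index order can be produced with the built-in sorted. This is only an analogue for illustration, not how argsort is implemented; the values are chosen to reproduce the index order shown above.

```python
values = [2, 0, 4, 1, 2, 4, 5]

# Sort the positions 0..n-1 by the value stored at each position.
# sorted is stable, so equal values keep their original relative order.
order = sorted(range(len(values)), key=lambda i: values[i])
print(order)  # [1, 3, 0, 4, 2, 5, 6]

# Reading the values back in that order yields the sorted sequence.
in_order = [values[i] for i in order]
print(in_order)
```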

Sort interpretation

dict.get vs dict['key']

a = {'name': 'wang'}

dict['key'] output:

a['age']
Out[16]: KeyError: 'age'

dict.get output:

a.get('age')
a.get('age', 10)
Out[17]: 10

dict['key'] can only fetch a value whose key exists; if the key is missing, a KeyError is raised.

dict.get(key, default=None) returns a default value when the key is missing: the default you pass in if you give one, otherwise None.
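This behavior is what makes dict.get convenient for counting, as in the vote tally of classify0. A small sketch follows; the label list is made up, and collections.Counter is shown only as a standard-library alternative that the article itself does not use.

```python
from collections import Counter

labels_seen = ['B', 'A', 'B', 'B', 'A']  # made-up labels for illustration

# Counting with dict.get: a missing key counts as 0 instead of raising KeyError.
counts = {}
for label in labels_seen:
    counts[label] = counts.get(label, 0) + 1

# collections.Counter does the same bookkeeping in one call.
counter = Counter(labels_seen)
print(counts, dict(counter) == counts)
```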

The sort and sorted functions in Python

Sorting a list with its sort method changes the list itself; sorted does not.

a = [1, 2, 1, 4, 3, 5]
a.sort()
a
Out[18]: [1, 1, 2, 3, 4, 5]

The sort method changed the order of a.

a = [1, 2, 1, 4, 3, 5]
sorted(a)
a
Out[19]: [1, 2, 1, 4, 3, 5]

sorted did not change the order of a.
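A related detail that often trips people up: list.sort() returns None, so its result should not be assigned, while sorted() returns a new list and accepts any iterable. A short sketch with made-up values:

```python
scores = [3, 1, 2]

# list.sort() reorders the list in place and returns None.
returned = scores.sort()

# sorted() leaves its argument alone and returns a new list.
original = [3, 1, 2]
copy_sorted = sorted(original)

# sorted() accepts any iterable, not just lists.
letters = sorted("bca")
print(returned, scores, original, copy_sorted, letters)
```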

The sorted function

sorted(iterable, cmp, key, reverse) (Python 2 usage)

Python 3's sorted has removed support for cmp.

list1 = [('david', 90), ('mary', 90), ('sara', 80), ('lily', 95)]
sorted(list1, cmp=lambda x, y: cmp(x[0], y[0]))

TypeError: 'cmp' is an invalid keyword argument for this function
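If an old comparison function has to be kept, the standard library's functools.cmp_to_key wraps it so it can be passed as key in Python 3. A minimal sketch follows; the helper `compare_names` is made up, and it stands in for Python 2's built-in cmp, which no longer exists.

```python
from functools import cmp_to_key

list1 = [('david', 90), ('mary', 90), ('sara', 80), ('lily', 95)]

def compare_names(x, y):
    """Old-style comparison: negative, zero, or positive, like Python 2's cmp."""
    return (x[0] > y[0]) - (x[0] < y[0])

# cmp_to_key adapts the two-argument comparison into a key function.
by_name = sorted(list1, key=cmp_to_key(compare_names))
print(by_name)
```

For a simple case like this a plain key function is shorter; cmp_to_key is mainly useful when porting comparison logic that is hard to express as a key.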

Sorting with the key function

sorted(list1, key=lambda list1: list1[0])
Out[23]: [('david', 90), ('lily', 95), ('mary', 90), ('sara', 80)]

list1[0] means sorting by the first element of each tuple in the list.

sorted(list1, key=lambda list1: list1[1])
Out[24]: [('sara', 80), ('david', 90), ('mary', 90), ('lily', 95)]

list1[1] means sorting by the second element of each tuple in the list.
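The lambda above and operator.itemgetter, which classify0 uses, are interchangeable as key functions. The sketch below checks that on the same list and also takes the top entry with reverse=True, mirroring how classify0 picks the label with the most votes.

```python
from operator import itemgetter

list1 = [('david', 90), ('mary', 90), ('sara', 80), ('lily', 95)]

# Two equivalent ways to sort by the second element of each tuple.
by_score_lambda = sorted(list1, key=lambda item: item[1])
by_score_getter = sorted(list1, key=itemgetter(1))

# Highest score first, then take the first entry.
top = sorted(list1, key=itemgetter(1), reverse=True)[0]
print(by_score_lambda == by_score_getter, top)
```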

Three sorted interview questions

1) Using the key function

students = [('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]
sorted(students, key=lambda s: s[2])  # sort by age
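A natural follow-up is sorting on more than one field. The sketch below assumes each tuple is (name, grade, age), which is an interpretation of the data rather than something the article states, and uses a tuple as the key.

```python
students = [('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]

# A tuple key sorts by grade first, then by age within the same grade.
by_grade_then_age = sorted(students, key=lambda s: (s[1], s[2]))

# Negating a numeric field flips its direction: oldest first within a grade.
by_grade_oldest_first = sorted(students, key=lambda s: (s[1], -s[2]))

print(by_grade_then_age)
print(by_grade_oldest_first)
```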

2) Sorting a string of mixed characters

'asdf234GDSdsf23': this is a string-sorting task; collation: lowercase
