How to recognize handwritten digits in KNN algorithm

2025-01-17 Update From: SLTechnology News&Howtos

This article shows how to use the KNN (k-nearest neighbors) algorithm to recognize handwritten digits.

1. Handwritten digit datasets

A handwritten digit dataset is an image dataset whose samples depict the digits 0 through 9. We can use the KNN algorithm to recognize these digits.

MNIST is a widely used handwritten digit dataset that contains 60,000 training samples and 10,000 test samples.

sklearn also ships with a smaller handwritten digit dataset:

It contains 1,797 samples in total, each an 8×8-pixel image of a digit from 0 to 9.

Each sample consists of 65 numbers:

The first 64 numbers are the feature data (pixel intensities), in the range [0, 16].

The last number is the target (the digit's label), in the range [0, 9].

Let's take a look at five samples:

    0,0,5,13,9,1,0,0,0,0,13,15,10,15,5,0,0,3,15,2,0,11,8,0,0,4,12,0,0,8,8,0,0,5,8,0,0,9,8,0,0,4,11,0,1,12,7,0,0,2,14,5,10,12,0,0,0,0,6,13,10,0,0,0,0
    0,0,0,12,13,5,0,0,0,0,0,11,16,9,0,0,0,0,3,15,16,6,0,0,0,7,15,16,16,2,0,0,0,0,1,16,16,3,0,0,0,0,1,16,16,6,0,0,0,0,1,16,16,6,0,0,0,0,0,11,16,10,0,0,1
    0,0,0,4,15,12,0,0,0,0,3,16,15,14,0,0,0,0,8,13,8,16,0,0,0,0,1,6,15,11,0,0,0,1,8,13,15,1,0,0,0,9,16,16,5,0,0,0,0,3,13,16,16,11,5,0,0,0,0,3,11,16,9,0,2
    0,0,7,15,13,1,0,0,0,8,13,6,15,4,0,0,0,2,1,13,13,0,0,0,0,0,2,15,11,1,0,0,0,0,0,1,12,12,1,0,0,0,0,0,1,10,8,0,0,0,8,4,5,14,9,0,0,0,7,13,13,9,0,0,3
    0,0,0,1,11,0,0,0,0,0,0,7,8,0,0,0,0,0,1,13,6,2,2,0,0,0,7,15,0,9,8,0,0,5,16,10,0,16,6,0,0,4,15,16,13,16,1,0,0,0,0,3,15,10,0,0,0,0,0,2,16,4,0,0,4
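As a quick check of the layout described above, the following sketch loads the bundled dataset and rebuilds the 65-number rows (64 features followed by the target). The variable name rows is an illustrative choice, not something from the original article.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# 1797 samples, each an 8x8 image flattened into 64 features
print(digits.data.shape)    # (1797, 64)
print(digits.images.shape)  # (1797, 8, 8)

# Feature values lie in [0, 16]; targets are the digits 0..9
print(digits.data.min(), digits.data.max())
print(np.unique(digits.target))

# Append the target as a 65th column to get rows shaped like the samples above
rows = np.column_stack([digits.data, digits.target]).astype(int)
for row in rows[:5]:
    print(",".join(str(v) for v in row))
```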

To use this dataset, load it first:

    >>> from sklearn.datasets import load_digits
    >>> digits = load_digits()

View the data of the first image:

    >>> digits.images[0]
    array([[ 0.,  0.,  5., 13.,  9.,  1.,  0.,  0.],
           [ 0.,  0., 13., 15., 10., 15.,  5.,  0.],
           [ 0.,  3., 15.,  2.,  0., 11.,  8.,  0.],
           [ 0.,  4., 12.,  0.,  0.,  8.,  8.,  0.],
           [ 0.,  5.,  8.,  0.,  0.,  9.,  8.,  0.],
           [ 0.,  4., 11.,  0.,  1., 12.,  7.,  0.],
           [ 0.,  2., 14.,  5., 10., 12.,  0.,  0.],
           [ 0.,  0.,  6., 13., 10.,  0.,  0.,  0.]])

We can draw this image with matplotlib:

    >>> import matplotlib.pyplot as plt
    >>> plt.imshow(digits.images[0])
    >>> plt.show()

The plotted image shows the digit 0.

2. The sklearn implementation of the KNN algorithm

The neighbors module of the sklearn library implements KNN-related algorithms, including:

The KNeighborsClassifier class, used for classification problems

The KNeighborsRegressor class, used for regression problems

The constructors of these two classes are essentially the same. Here we focus on the KNeighborsClassifier class, whose prototype is:

    KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=None, **kwargs)

Let's look at the meaning of the most important parameters:

n_neighbors: the K value in KNN, that is, the number of neighbors that vote on each prediction. The default value of 5 is commonly used.

weights: determines the weight of each neighbor, in one of three ways:

weights='uniform': all neighbors have the same weight.

weights='distance': each neighbor's weight is the reciprocal of its distance, so closer neighbors have more influence.

A custom function: a callable that maps distances to weights. Defining your own function is generally unnecessary.
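The difference between the two built-in weighting schemes can be illustrated with a small hand-written voting function. This is only a sketch of the idea, not sklearn's internal code; knn_vote and the sample distances and labels are hypothetical, and distances are assumed to be strictly positive.

```python
from collections import defaultdict

def knn_vote(distances, labels, weights="uniform"):
    """Return the winning label among the k nearest neighbors."""
    scores = defaultdict(float)
    for d, label in zip(distances, labels):
        if weights == "uniform":
            scores[label] += 1.0       # every neighbor counts equally
        else:                          # "distance"
            scores[label] += 1.0 / d   # closer neighbors count more (assumes d > 0)
    return max(scores, key=scores.get)

# Three neighbors: one very close "3" and two farther "8"s
distances = [0.5, 2.0, 2.5]
labels = [3, 8, 8]

print(knn_vote(distances, labels, "uniform"))   # 8: two votes against one
print(knn_vote(distances, labels, "distance"))  # 3: 1/0.5 = 2.0 beats 1/2.0 + 1/2.5 = 0.9
```

With uniform weights the majority wins, while with distance weights a single very close neighbor can outvote several distant ones.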

algorithm: sets the algorithm used to find the nearest neighbors, in one of four ways:

algorithm='auto': automatically chooses an appropriate algorithm based on the data.

algorithm='kd_tree': uses a KD tree, a data structure for multi-dimensional space that makes data retrieval efficient. KD trees suit data with fewer dimensions, generally no more than 20; beyond that, efficiency drops.

algorithm='ball_tree': uses a ball tree, which, like the KD tree, is a multi-dimensional data structure. Ball trees are better suited to data with many dimensions.

algorithm='brute': brute-force search. Unlike the KD tree, it scans the training set linearly instead of building a tree structure for fast retrieval. The disadvantage is that it is very inefficient when the training set is large.

leaf_size: the leaf size used when constructing the KD tree or ball tree, that is, roughly the number of points stored in each leaf node. The default is 30. Adjusting leaf_size affects the speed of tree construction and searching.
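The following sketch tries each algorithm setting on the sklearn digits dataset. The split parameters (test_size=0.25, random_state=33) and the scores dictionary are choices made for this illustration. The search algorithm changes how neighbors are found rather than which points are nearest, so the accuracies should be essentially the same; what differs is mainly speed and memory use.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
train_x, test_x, train_y, test_y = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=33)

scores = {}
for algorithm in ("auto", "kd_tree", "ball_tree", "brute"):
    # leaf_size only matters for the tree-based algorithms
    knn = KNeighborsClassifier(n_neighbors=5, algorithm=algorithm, leaf_size=30)
    knn.fit(train_x, train_y)
    scores[algorithm] = knn.score(test_x, test_y)  # mean accuracy on the test set
    print(algorithm, round(scores[algorithm], 4))
```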

3. Constructing the KNN classifier

First load the dataset:

    from sklearn.datasets import load_digits

    digits = load_digits()
    data = digits.data      # feature set
    target = digits.target  # target set

Split the dataset into a training set (75%) and a test set (25%):

    from sklearn.model_selection import train_test_split

    train_x, test_x, train_y, test_y = train_test_split(
        data, target, test_size=0.25, random_state=33)
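To make the 75%/25% split concrete, this self-contained sketch prints the resulting set sizes. With 1,797 samples, a test fraction of 0.25 gives 449.25, which train_test_split rounds up to 450 test samples, leaving 1,347 for training. The names n_samples and n_test are introduced only for this illustration.

```python
import math

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
train_x, test_x, train_y, test_y = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=33)

n_samples = len(digits.data)          # 1797
n_test = math.ceil(0.25 * n_samples)  # 449.25 rounded up to 450

print(train_x.shape, test_x.shape)    # (1347, 64) (450, 64)
print(n_samples - n_test, n_test)     # 1347 450
```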

Construct the KNN classifier:

    from sklearn.neighbors import KNeighborsClassifier

    # use the default parameters
    knn = KNeighborsClassifier()

Fit the model:

    knn.fit(train_x, train_y)

Predict on the test set:

    predict_y = knn.predict(test_x)

Compute the model's accuracy:

    from sklearn.metrics import accuracy_score

    score = accuracy_score(test_y, predict_y)
    print(score)  # 0.98

The resulting accuracy is about 98%, which is quite good.
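To see what the classifier does for a single prediction, the sketch below asks the fitted model for the nearest training samples of one test image and compares their labels with the predicted digit. It repeats the loading and splitting steps so that it runs on its own; the names sample, neighbor_labels, and predicted are illustrative.

```python
from collections import Counter

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
train_x, test_x, train_y, test_y = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=33)

knn = KNeighborsClassifier().fit(train_x, train_y)

sample = test_x[:1]                          # keep a 2-D shape: (1, 64)
distances, indices = knn.kneighbors(sample)  # 5 nearest training samples
neighbor_labels = train_y[indices[0]]        # their digit labels
predicted = knn.predict(sample)[0]

print("distances:", distances[0].round(2))
print("neighbor labels:", neighbor_labels)
print("vote counts:", Counter(neighbor_labels.tolist()))
print("predicted:", predicted, "actual:", test_y[0])
```

With the default uniform weights, the predicted digit is the label that occurs most often among these neighbors.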
