How to use the KNN algorithm to recognize handwritten digits in Python-OpenCV

2025-07-13 Update. From: SLTechnology News&Howtos, Shulou (Shulou.com)

Introduction to the MNIST handwritten digit dataset

For completeness, we start with the training data used by the algorithm, which is the MNIST collection of handwritten digits. The MNIST dataset comes from the US National Institute of Standards and Technology (NIST), and the training set consists of digits handwritten by 250 different people. The training set contains 60000 images and the test set contains 10000 images; each image has its own label and measures 28 × 28 pixels. Many machine learning libraries provide a way to load MNIST; here it is loaded with the keras library:

    # Import the libraries
    import keras
    import numpy as np
    import matplotlib.pyplot as plt

    # Load the data
    (train_dataset, train_labels), (test_dataset, test_labels) = keras.datasets.mnist.load_data()
    train_labels = np.array(train_labels, dtype=np.int32)

    # Print the dataset shapes
    print(train_dataset.shape, test_dataset.shape)

    # Preview the first 40 images
    for i in range(40):
        plt.subplot(4, 10, i + 1)
        plt.imshow(train_dataset[i], cmap='gray')
        plt.title(train_labels[i], fontsize=10)
        plt.axis('off')
    plt.show()

Baseline model: recognizing handwritten digits with the KNN algorithm

After loading the dataset, we try to use the KNN classifier to recognize digits. In the baseline approach, we use the raw pixel values as features, so each image descriptor has 28 × 28 = 784 elements.

First, load all digit images with keras. To walk through the whole training process, we split the loaded training set into a training part and a test part, each accounting for 50%:
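As a small illustration of the descriptor size, the sketch below flattens a synthetic 28 × 28 array into a 784-element float32 vector. The array is an assumed stand-in for one MNIST image, not real data.

```python
import numpy as np

# Synthetic stand-in for one MNIST image (assumption: 28 x 28, uint8)
img = np.zeros((28, 28), dtype=np.uint8)
img[4:24, 13:15] = 255  # a crude vertical stroke, 20 rows by 2 columns

# Flatten to a 1-D descriptor and convert to float32
descriptor = np.float32(img.flatten())
print(descriptor.shape)  # (784,)
```

Each image becomes one row of this form, which is why the training matrix passed to the classifier later has 784 columns.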

    # Load the dataset
    (train_dataset, train_labels), (test_dataset, test_labels) = keras.datasets.mnist.load_data()
    train_labels = np.array(train_labels, dtype=np.int32)

    # Use the raw image pixels as the descriptor
    def raw_pixels(img):
        return img.flatten()

    # Shuffle the data
    shuffle = np.random.permutation(len(train_dataset))
    train_dataset, train_labels = train_dataset[shuffle], train_labels[shuffle]

    # Compute the descriptor for each image; here the descriptor is the raw pixels
    raw_descriptors = []
    for img in train_dataset:
        raw_descriptors.append(np.float32(raw_pixels(img)))
    raw_descriptors = np.squeeze(raw_descriptors)

    # Split the data into training and test parts (50% each):
    # 30000 digits train the classifier and 30000 digits test it
    partition = int(0.5 * len(raw_descriptors))
    raw_descriptors_train, raw_descriptors_test = np.split(raw_descriptors, [partition])
    labels_train, labels_test = np.split(train_labels, [partition])
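The shuffle-and-split step can be checked on toy data. The sketch below uses ten made-up one-element descriptors whose value equals their label, so it is easy to confirm that one shared permutation keeps rows and labels paired. The fixed seed and the variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for repeatability

# Toy data: 10 "descriptors" whose single element encodes their label
descriptors = np.arange(10, dtype=np.float32).reshape(10, 1)
labels = np.arange(10, dtype=np.int32)

# One permutation indexes both arrays, so rows and labels stay paired
order = rng.permutation(len(descriptors))
descriptors, labels = descriptors[order], labels[order]

# Split at 50%: the first half trains, the second half tests
partition = int(0.5 * len(descriptors))
train_x, test_x = np.split(descriptors, [partition])
train_y, test_y = np.split(labels, [partition])
print(train_x.shape, test_x.shape)
```

Passing a one-element list to np.split cuts the array at that index, which yields exactly two parts.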

Now we can train the KNN model using the knn.train() method and test it with the get_accuracy() function:

    import cv2

    # Train the KNN model
    knn = cv2.ml.KNearest_create()
    knn.train(raw_descriptors_train, cv2.ml.ROW_SAMPLE, labels_train)

    # Test the KNN model
    k = 5
    ret, result, neighbours, dist = knn.findNearest(raw_descriptors_test, k)

    # Compute the accuracy from the true and predicted labels
    def get_accuracy(predictions, labels):
        acc = (np.squeeze(predictions) == labels).mean()
        return acc * 100

    acc = get_accuracy(result, labels_test)
    print("Accuracy: {}".format(acc))

We can see that when K = 5, the KNN model can achieve 96.48% accuracy, but we can still improve it to achieve higher performance.
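To make the classification step less of a black box, here is a minimal brute-force KNN written in NumPy. It is only a sketch of the general technique (squared Euclidean distance plus majority vote) on made-up two-dimensional points; it is not OpenCV's implementation, and the helper name knn_predict is hypothetical.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k):
    """Brute-force KNN: majority vote among the k closest training rows."""
    # Squared Euclidean distance between every test row and every training row
    diff = test_x[:, None, :] - train_x[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    # Indices of the k nearest training rows for each test row
    nearest = np.argsort(dist, axis=1)[:, :k]
    # Majority vote over the neighbours' labels
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def get_accuracy(predictions, labels):
    return (np.squeeze(predictions) == labels).mean() * 100

# Two well-separated toy clusters (hypothetical data, not MNIST)
train_x = np.float32([[0, 0], [0, 1], [1, 0], [9, 9], [9, 10], [10, 9]])
train_y = np.int32([0, 0, 0, 1, 1, 1])
test_x = np.float32([[0.5, 0.5], [9.5, 9.5]])
test_y = np.int32([0, 1])

pred = knn_predict(train_x, train_y, test_x, k=3)
print(pred, get_accuracy(pred, test_y))
```

With 784-dimensional MNIST descriptors the idea is the same, only the rows are longer and the distance matrix is much larger.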

Improved model 1: the effect of the parameter K on the accuracy of handwritten digit recognition

We already know that in the KNN algorithm, an important parameter that affects the performance of the algorithm is K, so we can first try to use different K values to see its impact on the accuracy of handwritten digit recognition.

To compare the accuracy of the model with different K values, we first need to create a dictionary to store the accuracy of testing different K values:

    from collections import defaultdict

    results = defaultdict(list)

Next, call the knn.findNearest() method with different values of K and store the results in the dictionary:

    # K takes the values 1 to 9
    for k in range(1, 10):
        ret, result, neighbours, dist = knn.findNearest(raw_descriptors_test, k)
        acc = get_accuracy(result, labels_test)
        print("{}".format("%.2f" % acc))
        results['50'].append(acc)

Finally, plot the results:

    ax = plt.subplot(1, 1, 1)
    ax.set_xlim(0, 10)
    dim = np.arange(1, 10)

    for key in results:
        ax.plot(dim, results[key], linestyle='--', marker='o', label="50%")

    plt.legend(loc='upper left', title="% training")
    plt.title('Accuracy of the K-NN model varying K')
    plt.xlabel("number of k")
    plt.ylabel("accuracy")
    plt.show()

The running result of the program is shown in the following figure:

As the figure shows, the accuracy varies with K, so in practice the best performance can be obtained by tuning K for the application.
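Besides reading the best K off the plot, it can be picked programmatically from the results dictionary. The sketch below fills the dictionary with hypothetical accuracies (illustrative numbers, not measurements from this article) and reports the K with the highest value.

```python
from collections import defaultdict

results = defaultdict(list)

# Hypothetical accuracies for K = 1..9 (illustrative numbers, not measured)
for acc in [96.1, 95.8, 96.4, 96.3, 96.5, 96.2, 96.3, 96.0, 96.1]:
    results['50'].append(acc)

ks = list(range(1, 10))
for key, accs in results.items():
    best_acc = max(accs)
    best_k = ks[accs.index(best_acc)]  # first K that reaches the maximum
    print("{}% training: best K = {} ({:.2f}%)".format(key, best_k, best_acc))
```

Because defaultdict(list) creates an empty list on first access, the append call works without initializing the key beforehand.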

Improved model 2: the influence of the amount of training data on the accuracy of handwritten digit recognition

In machine learning, training a classifier on more data usually improves the performance of the model, because the classifier can better learn the structure of the features. For a KNN classifier, increasing the number of training samples also increases the probability of finding a correct match for the test data in the feature space.

Next, we modify the percentage of images used to train and test the model to observe the effect of the amount of training data on the accuracy of handwritten digit recognition:

    # Fractions of the data used for training
    split_values = np.arange(0.1, 1, 0.1)
    results = defaultdict(list)

    # Create the model
    knn = cv2.ml.KNearest_create()

    # Effect of the amount of training data on recognition accuracy
    for split_value in split_values:
        # Split the data into training and test parts
        partition = int(split_value * len(raw_descriptors))
        raw_descriptors_train, raw_descriptors_test = np.split(raw_descriptors, [partition])
        labels_train, labels_test = np.split(train_labels, [partition])

        # Train the KNN model
        print('Training KNN model - raw pixels as features')
        knn.train(raw_descriptors_train, cv2.ml.ROW_SAMPLE, labels_train)

        # Test different K values for each partition
        for k in range(1, 10):
            ret, result, neighbours, dist = knn.findNearest(raw_descriptors_test, k)
            acc = get_accuracy(result, labels_test)
            print("{}".format("%.2f" % acc))
            results[int(split_value * 100)].append(acc)
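The loop above moves the partition index through the shuffled data. As a quick check of the resulting set sizes, the sketch below computes the number of training and test images for each percentage, assuming the 60000-image MNIST training set. It uses integer percentages instead of np.arange, a deliberate variation that avoids floating-point rounding when computing the index.

```python
total = 60000  # number of images in the MNIST training set

splits = {}
for percent in range(10, 100, 10):
    # Integer arithmetic avoids floating-point rounding in the partition index
    partition = total * percent // 100
    splits[percent] = (partition, total - partition)

for percent, (n_train, n_test) in splits.items():
    print("{}% training: {} train / {} test".format(percent, n_train, n_test))
```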

The share of digit images used for training is 10%, 20%, ..., 90%, and the share used for testing is correspondingly 90%, 80%, ..., 10%. Finally, plot the results:

    ax = plt.subplot(1, 1, 1)
    ax.set_xlim(0, 10)
    dim = np.arange(1, 10)

    for key in results:
        ax.plot(dim, results[key], linestyle='--', marker='o', label=str(key) + "%")

    plt.legend(loc='upper left', title="% training")
    plt.title('Accuracy of the KNN model varying both k and the percentage of images to train/test')
    plt.xlabel("number of k")
    plt.ylabel("accuracy")
    plt.show()

As can be seen from the above picture, with the increase of the number of training images, the accuracy will also increase. Therefore, when the conditions permit, the performance of the model can be improved by increasing the amount of training data.

Although we can see that the accuracy has reached more than 97%, we can't stop there.

Improved model 3: the effect of preprocessing on the accuracy of handwritten digit recognition

In the examples above, we always used the raw pixel values as features to train the classifier. In machine learning, the input data is often preprocessed before training to improve the performance of the classifier, so next we apply preprocessing and observe its effect on the accuracy of handwritten digit recognition.

The preprocessing function deskew() is as follows:

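    # Side length of the square images (28 for MNIST)
    SIZE_IMAGE = train_dataset.shape[1]

    def deskew(img):
        # Image moments, including the central moments mu11 and mu02
        m = cv2.moments(img)
        # Nearly zero vertical spread: nothing to correct
        if abs(m['mu02']) < 1e-2:
            return img.copy()
        skew = m['mu11'] / m['mu02']
        # Shear matrix that removes the estimated slant
        M = np.float32([[1, skew, -0.5 * SIZE_IMAGE * skew], [0, 1, 0]])
        img = cv2.warpAffine(img, M, (SIZE_IMAGE, SIZE_IMAGE), flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)
        return img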

The deskew() function removes the slant of a digit by using its second-order moments. More specifically, a measure of the skew can be calculated from the ratio of two central moments (mu11/mu02). The calculated skew is then used to build an affine transformation that removes the slant of the digit. Next, compare the images before and after preprocessing:

    for i in range(10):
        # First row: original image
        plt.subplot(2, 10, i + 1)
        plt.imshow(train_dataset[i], cmap='gray')
        plt.title(train_labels[i], fontsize=10)
        plt.axis('off')
        # Second row: deskewed image
        plt.subplot(2, 10, i + 11)
        plt.imshow(deskew(train_dataset[i]), cmap='gray')
        plt.axis('off')
    plt.show()

The original digit images are shown in the first row of the following figure, and the preprocessed digit images are shown in the second row:

Through the application of this preprocessing, the accuracy of recognition is improved, and the accuracy curve is shown below.

We can see that the accuracy of the preprocessed classifier can even be close to 98%. Considering that we only use a simple KNN model, the effect is very good, but we can further improve the performance of the model.
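To see what the mu11/mu02 ratio used in deskew() measures, the sketch below computes both central moments with NumPy for two synthetic strokes: an upright one and one slanted at 45 degrees. The helper central_moments is hypothetical and follows the usual definition of central moments with pixel intensity as the weight; the strokes are made-up data. The resulting matrix has the same form as in deskew(); with the WARP_INVERSE_MAP flag, it tells warpAffine to sample each output row y from the source shifted horizontally by skew * (y - size / 2), which is a shear about the middle row.

```python
import numpy as np

def central_moments(img):
    """Return (mu11, mu02) of a grayscale image, using pixel values as weights."""
    img = img.astype(np.float64)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    x_bar = (xs * img).sum() / m00
    y_bar = (ys * img).sum() / m00
    mu11 = ((xs - x_bar) * (ys - y_bar) * img).sum()
    mu02 = (((ys - y_bar) ** 2) * img).sum()
    return mu11, mu02

size = 28

# Upright stroke: a single vertical line, so there is no slant
upright = np.zeros((size, size), dtype=np.uint8)
upright[4:24, 14] = 255

# Slanted stroke: x grows by one pixel per row
slanted = np.zeros((size, size), dtype=np.uint8)
for y in range(4, 24):
    slanted[y, y] = 255

mu11, mu02 = central_moments(upright)
skew_upright = mu11 / mu02
mu11, mu02 = central_moments(slanted)
skew_slanted = mu11 / mu02
print(skew_upright, skew_slanted)

# Shear matrix of the same form as in deskew(), built for the slanted stroke
M = np.float32([[1, skew_slanted, -0.5 * size * skew_slanted], [0, 1, 0]])
print(M)
```

A skew near zero means the digit is already upright, while a larger magnitude means each row has to be shifted further to straighten it.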

Improved model 4: using advanced descriptors as image features to improve the accuracy of the KNN algorithm

In the examples above, we have been using the raw pixel values as the feature descriptor. A common approach in machine learning is to use more advanced descriptors, so next we use the Histogram of Oriented Gradients (HOG) as the image feature to improve the accuracy of the KNN algorithm.

A feature descriptor is a representation of an image that simplifies it by extracting useful information describing basic characteristics such as shape, color or texture. In general, a feature descriptor converts an image into a feature vector of length n. HOG is a popular feature descriptor in computer vision.

Next, define the get_hog () function to get the HOG descriptor:

    (train_dataset, train_labels), (test_dataset, test_labels) = keras.datasets.mnist.load_data()
    SIZE_IMAGE = train_dataset.shape[1]
    train_labels = np.array(train_labels, dtype=np.int32)

    def get_hog():
        # Window 28x28, block 8x8, block stride 4x4, cell 8x8, 9 orientation bins
        hog = cv2.HOGDescriptor((SIZE_IMAGE, SIZE_IMAGE), (8, 8), (4, 4), (8, 8), 9, 1, -1, 0, 0.2, 1, 64, True)
        print("hog descriptor size: {}".format(hog.getDescriptorSize()))
        return hog
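The descriptor length reported by get_hog() can be checked by hand from the window, block, stride, cell and bin settings. The sketch below assumes the standard HOG layout, in which every block position contributes one histogram per cell; the helper hog_descriptor_size is hypothetical and handles only square sizes.

```python
def hog_descriptor_size(win, block, stride, cell, nbins):
    """Length of a HOG descriptor for square window, block, stride and cell sizes."""
    # Number of block positions along one axis of the window
    blocks_per_axis = (win - block) // stride + 1
    # Number of cells inside one block
    cells_per_block = (block // cell) ** 2
    return blocks_per_axis ** 2 * cells_per_block * nbins

# Settings used in get_hog(): 28x28 window, 8x8 block, stride 4, 8x8 cell, 9 bins
size = hog_descriptor_size(win=28, block=8, stride=4, cell=8, nbins=9)
print(size)  # 324
```

With these settings there are 6 × 6 block positions, one cell per block and 9 bins per cell, so each image is described by 324 values instead of 784 raw pixels.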

Then use the HOG features to train the KNN model:

    hog = get_hog()

    # Compute the HOG descriptor of each deskewed image
    hog_descriptors = []
    for img in train_dataset:
        hog_descriptors.append(hog.compute(deskew(img)))
    hog_descriptors = np.squeeze(hog_descriptors)

The accuracy of the trained model is shown in the following figure:

The improvement process above shows that a good way to build a machine learning model is to start with a basic baseline model that solves the problem, and then iterate on it by adding better preprocessing, more advanced feature descriptors, or other machine learning techniques. Finally, if conditions permit, more data can be collected for training and testing the model.

Complete code

The final complete code is shown below, and the rest of the code in the improvement process can be obtained by simply modifying the following code as explained above:

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    from collections import defaultdict
    import keras

    (train_dataset, train_labels), (test_dataset, test_labels) = keras.datasets.mnist.load_data()
    SIZE_IMAGE = train_dataset.shape[1]
    train_labels = np.array(train_labels, dtype=np.int32)

    def get_accuracy(predictions, labels):
        acc = (np.squeeze(predictions) == labels).mean()
        return acc * 100

    def raw_pixels(img):
        return img.flatten()

    def deskew(img):
        m = cv2.moments(img)
        if abs(m['mu02']) < 1e-2:
            return img.copy()
        skew = m['mu11'] / m['mu02']
        M = np.float32([[1, skew, -0.5 * SIZE_IMAGE * skew], [0, 1, 0]])
        img = cv2.warpAffine(img, M, (SIZE_IMAGE, SIZE_IMAGE), flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)
        return img

    def get_hog():
        hog = cv2.HOGDescriptor((SIZE_IMAGE, SIZE_IMAGE), (8, 8), (4, 4), (8, 8), 9, 1, -1, 0, 0.2, 1, 64, True)
        print("hog descriptor size: {}".format(hog.getDescriptorSize()))
        return hog

    # Shuffle the data
    shuffle = np.random.permutation(len(train_dataset))
    train_dataset, train_labels = train_dataset[shuffle], train_labels[shuffle]

    # Advanced image descriptor
    hog = get_hog()
    hog_descriptors = []
    for img in train_dataset:
        hog_descriptors.append(hog.compute(deskew(img)))
    hog_descriptors = np.squeeze(hog_descriptors)

    # Data partition
    split_values = np.arange(0.1, 1, 0.1)
    results = defaultdict(list)

    # Create the KNN model
    knn = cv2.ml.KNearest_create()

    for split_value in split_values:
        partition = int(split_value * len(hog_descriptors))
        hog_descriptors_train, hog_descriptors_test = np.split(hog_descriptors, [partition])
        labels_train, labels_test = np.split(train_labels, [partition])

        print('Training KNN model - HOG features')
        knn.train(hog_descriptors_train, cv2.ml.ROW_SAMPLE, labels_train)

        # Store the accuracy for each K
        for k in range(1, 10):
            ret, result, neighbours, dist = knn.findNearest(hog_descriptors_test, k)
            acc = get_accuracy(result, labels_test)
            print("{}".format("%.2f" % acc))
            results[int(split_value * 100)].append(acc)

    fig = plt.figure(figsize=(12, 5))
    plt.suptitle("k-NN handwritten digits recognition", fontsize=14, fontweight='bold')

    ax = plt.subplot(1, 1, 1)
    ax.set_xlim(0, 10)
    dim = np.arange(1, 10)

    for key in results:
        ax.plot(dim, results[key], linestyle='--', marker='o', label=str(key) + "%")

    plt.legend(loc='upper left', title="% training")
    plt.title('Accuracy of the k-NN model varying both k and the percentage of images to train/test with pre-processing '
              'and HoG features')
    plt.xlabel("number of k")
    plt.ylabel("accuracy")
    plt.show()
