How to realize the handwritten digit recognition function of MNIST in Python


This article shows how to implement MNIST handwritten digit recognition in Python. The content is simple and clear, and I hope it helps resolve your doubts; let's work through it step by step.

Dataset introduction

The MNIST dataset is a classic dataset in the field of machine learning. It consists of 60,000 training samples and 10,000 test samples; each sample is a 28 x 28 pixel grayscale image of a handwritten digit, and the dataset is built into Keras. This article uses the Keras neural network API (Keras Chinese documentation) that ships with TensorFlow to build the network.

Before we begin, recall the general workflow of machine learning (√ means the step is used in this article, × means it is not):

1. Define questions and collect datasets (√)

2. Select indicators to measure success (√)

3. Determine the method of evaluation (√)

4. Prepare data (√)

5. Develop a better model than the benchmark (×)

6. Expand the scale of the model (×)

7. Model regularization and adjustment parameters (×)

On the selection of activation function and loss function of the last layer

Now let's get into the main content.

1. Data preprocessing

First, import the data using the mnist.load_data() function.

Let's take a look at its source declaration:

def load_data(path='mnist.npz'):
    """Loads the [MNIST dataset](http://yann.lecun.com/exdb/mnist/).

    This is a dataset of 60,000 28x28 grayscale images of the 10 digits,
    along with a test set of 10,000 images.
    More info can be found at the [MNIST homepage](http://yann.lecun.com/exdb/mnist/).

    Arguments:
        path: path where to cache the dataset locally
            (relative to `~/.keras/datasets`).

    Returns:
        Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`.

        **x_train, x_test**: uint8 arrays of grayscale image data with shapes
            (num_samples, 28, 28).

        **y_train, y_test**: uint8 arrays of digit labels (integers in range 0-9)
            with shapes (num_samples,).
    """

As you can see, it contains the download link for the dataset as well as a statement of the dataset's size, shapes, and data types, and the function returns two tuples holding four numpy arrays.
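As a quick sanity check (my own addition, not from the original article), a minimal sketch that loads the data and inspects the shapes and one sample; the shapes in the comments follow from the docstring above:

from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)   # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)     # (10000, 28, 28) (10000,)
print(x_train.dtype)                  # uint8, pixel values in 0-255

# Display the first training image together with its label
plt.imshow(x_train[0], cmap='gray')
plt.title('label: %d' % y_train[0])
plt.show()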

Import the dataset, reshape it to the desired shape, and then normalize it.

to_categorical(), which is built into Keras, performs one-hot encoding: each label is represented as an all-zero vector in which only the element at the label's index is 1.

E.g. with col=10:

[0, 1, 9] -> [[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]]

We can implement it manually:

def one_hot(sequences, col):
    resuts = np.zeros((len(sequences), col))
    # for i, sequence in enumerate(sequences):
    #     resuts[i, sequence] = 1
    for i in range(len(sequences)):
        for j in range(len(sequences[i])):
            resuts[i, sequences[i][j]] = 1
    return resuts
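As a quick check (my own addition, assuming the one_hot reconstruction above), the manual version agrees with Keras' built-in to_categorical when each sample's label is wrapped in a list:

import numpy as np
from tensorflow.keras.utils import to_categorical

labels = [0, 1, 9]
keras_encoded = to_categorical(labels, num_classes=10)    # works on the flat label list
manual_encoded = one_hot([[l] for l in labels], col=10)   # manual version expects index lists
print(np.array_equal(keras_encoded, manual_encoded))      # True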

Here is the preprocessing code:

def data_preprocess():
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()
    train_images = train_images.reshape((60000, 28, 28, 1))
    train_images = train_images.astype('float32') / 255
    # print(train_images[0])
    test_images = test_images.reshape((10000, 28, 28, 1))
    test_images = test_images.astype('float32') / 255
    train_labels = to_categorical(train_labels)
    test_labels = to_categorical(test_labels)
    return train_images, train_labels, test_images, test_labels

2. Network building

What we build here is a convolutional neural network: a simple linear stack of convolution, pooling, and fully connected layers. We know that stacking multiple linear layers still implements a linear operation, and adding more such layers does not expand the hypothesis space (the set of all possible linear transformations from input data to output data), so nonlinearity, i.e. an activation function, must be added. relu is the most commonly used activation function; you can also use prelu or elu (see the short sketch below).
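As an illustration of that last point (not part of the model actually used in this article), here is how the first block could use PReLU, which has learnable slopes and is therefore added as its own layer rather than passed as the activation= string, while elu can simply be given by name:

from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), input_shape=(28, 28, 1)))  # no activation here
model.add(layers.PReLU())                                      # learnable leaky slope
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='elu'))         # ELU passed as a string

The model used in this article sticks with relu, as shown next.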

def build_module():
    model = models.Sequential()
    # Layer 1: convolution layer; the first layer needs to specify input_shape
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    # Layer 2: max pooling layer
    model.add(layers.MaxPooling2D((2, 2)))
    # Layer 3: convolution layer
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    # Layer 4: max pooling layer
    model.add(layers.MaxPooling2D((2, 2)))
    # Layer 5: convolution layer
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    # Layer 6: Flatten layer, flattens the 3D tensor into a vector
    model.add(layers.Flatten())
    # Layer 7: fully connected layer
    model.add(layers.Dense(64, activation='relu'))
    # Layer 8: softmax layer for classification
    model.add(layers.Dense(10, activation='softmax'))
    return model

Use model.summary() to view the structure of the network we just built:
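The printout itself is not reproduced here, but the layer output shapes can be worked out by hand from the layer definitions (the exact formatting of summary() depends on the Keras version):

model = build_module()
model.summary()
# conv2d            (None, 26, 26, 32)    320 params
# max_pooling2d     (None, 13, 13, 32)
# conv2d_1          (None, 11, 11, 64)    18,496 params
# max_pooling2d_1   (None, 5, 5, 64)
# conv2d_2          (None, 3, 3, 64)      36,928 params
# flatten           (None, 576)
# dense             (None, 64)            36,928 params
# dense_1           (None, 10)            650 params
# Total params: 93,322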

3. Network configuration

After the network is built, there is still one key step: configuration. For example: the optimizer (the specific method by which gradient descent updates the parameters), the loss function (which measures the distance between the produced value and the target value), the evaluation metrics, and so on. These are configured by passing arguments to model.compile().

Let's take a look at the source analysis of model.compile ():

def compile(self,
            optimizer='rmsprop',
            loss=None,
            metrics=None,
            loss_weights=None,
            weighted_metrics=None,
            run_eagerly=None,
            steps_per_execution=None,
            **kwargs):
    """Configures the model for training.

About the optimizer

Optimizer: string (optimizer name) or optimizer instance.

String format: for example, using the optimizer's default parameters:

model.compile(optimizer='rmsprop', loss='mean_squared_error')

Passing parameters via an optimizer instance:

rmsprop = keras.optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)
model.compile(optimizer=rmsprop, loss='mean_squared_error')

It is recommended to use the default parameters of the optimizer (except for the learning rate lr, which can be adjusted freely)

Parameters:

lr: float >= 0. Learning rate.
rho: float >= 0. Decay rate of the moving average of the squared gradient in RMSProp.
epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
decay: float >= 0. Learning rate decay applied after each parameter update.

Similarly, there are many optimizers, such as SGD, Adagrad, Adadelta, Adam, Adamax, Nadam, etc.
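For example, a sketch (my own addition, not from the original article) that passes an Adam instance with a custom learning rate; note that recent TensorFlow/Keras versions use the argument name learning_rate rather than lr:

from tensorflow.keras import optimizers

adam = optimizers.Adam(learning_rate=1e-3)
model.compile(optimizer=adam,
              loss='categorical_crossentropy',
              metrics=['accuracy'])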

On the loss function

The choice depends on the specific task; generally speaking, the loss function should describe the task well. For example:

1. Regression problem

We want the network's output value to be as close as possible to the ground truth, so a loss that describes a distance, such as L1 loss or MSE loss, is appropriate.

2. Classification problem

We want the network's output category to match the ground truth, so a loss that describes a category distribution, such as cross entropy, is appropriate.

For specific common choices, please see the selection of loss function at the beginning of the article.
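To make the choice concrete, a short sketch (illustrative only; the regression model is hypothetical) showing how the loss follows the task:

# Classification: the 10-class MNIST problem in this article, with one-hot labels
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Regression: a hypothetical model with a single continuous output
# regression_model.compile(optimizer='rmsprop',
#                          loss='mse',        # L2 / MSE distance loss
#                          metrics=['mae'])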

About evaluation metrics

For commonly used metrics, see the list above. Now let's talk about custom evaluation functions: they are passed in at compile time. Such a function takes (y_true, y_pred) as input arguments and returns a single tensor as output.

import keras.backend as K

def mean_pred(y_true, y_pred):
    return K.mean(y_pred)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred])

4. Network training and testing

1. Training (fitting)

Use model.fit(); it accepts a long list of parameters:

def fit(self,
        x=None,
        y=None,
        batch_size=None,
        epochs=1,
        verbose=1,
        callbacks=None,
        validation_split=0.,
        validation_data=None,
        shuffle=True,
        class_weight=None,
        sample_weight=None,
        initial_epoch=0,
        steps_per_epoch=None,
        validation_steps=None,
        validation_batch_size=None,
        validation_freq=1,
        max_queue_size=10,
        workers=1,
        use_multiprocessing=False):

The source code of this function runs to more than 300 lines; a detailed walkthrough will have to wait for another article.

We feed the training data in mini-batches of 64 samples and iterate over all of the data for 5 epochs.

model.fit(train_images, train_labels, epochs=5, batch_size=64)
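A variant not used in this article, sketched here as an assumption-labeled example: holding out 20% of the training data with validation_split also tracks validation metrics, so history.history additionally contains 'val_loss' and 'val_accuracy':

history = model.fit(train_images, train_labels,
                    epochs=5, batch_size=64,
                    validation_split=0.2)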

2. Testing

Use the model.evaluate() function:

test_loss, test_acc = model.evaluate(test_images, test_labels)

The return declaration about the test function:

Returns:
    Scalar test loss (if the model has a single output and no metrics)
    or list of scalars (if the model has multiple outputs and/or metrics).
    The attribute `model.metrics_names` will give you the display labels
    for the scalar outputs.

5. Plot the changes of loss and accuracy over epochs

model.fit() returns a History object, whose history member records all the data about the training process.

We use matplotlib.pyplot to draw, as detailed in the complete code below.

Returns:
    A `History` object. Its `History.history` attribute is a record of
    training loss values and metrics values at successive epochs, as well
    as validation loss values and validation metrics values (if applicable).

def draw_loss(history):
    loss = history.history['loss']
    epochs = range(1, len(loss) + 1)
    plt.subplot(1, 2, 1)  # first plot
    plt.plot(epochs, loss, 'bo', label='Training loss')
    plt.title("Training loss")
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.subplot(1, 2, 2)  # second plot
    accuracy = history.history['accuracy']
    plt.plot(epochs, accuracy, 'bo', label='Training accuracy')
    plt.title("Training accuracy")
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.suptitle("Train data")
    plt.legend()
    plt.show()

6. Complete code

from tensorflow.keras.datasets import mnist
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
import numpy as np

def data_preprocess():
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()
    train_images = train_images.reshape((60000, 28, 28, 1))
    train_images = train_images.astype('float32') / 255
    # print(train_images[0])
    test_images = test_images.reshape((10000, 28, 28, 1))
    test_images = test_images.astype('float32') / 255
    train_labels = to_categorical(train_labels)
    test_labels = to_categorical(test_labels)
    return train_images, train_labels, test_images, test_labels

# Build the network
def build_module():
    model = models.Sequential()
    # Layer 1: convolution layer
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    # Layer 2: max pooling layer
    model.add(layers.MaxPooling2D((2, 2)))
    # Layer 3: convolution layer
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    # Layer 4: max pooling layer
    model.add(layers.MaxPooling2D((2, 2)))
    # Layer 5: convolution layer
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    # Layer 6: Flatten layer, flattens the 3D tensor into a vector
    model.add(layers.Flatten())
    # Layer 7: fully connected layer
    model.add(layers.Dense(64, activation='relu'))
    # Layer 8: softmax layer for classification
    model.add(layers.Dense(10, activation='softmax'))
    return model

def draw_loss(history):
    loss = history.history['loss']
    epochs = range(1, len(loss) + 1)
    plt.subplot(1, 2, 1)  # first plot
    plt.plot(epochs, loss, 'bo', label='Training loss')
    plt.title("Training loss")
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.subplot(1, 2, 2)  # second plot
    accuracy = history.history['accuracy']
    plt.plot(epochs, accuracy, 'bo', label='Training accuracy')
    plt.title("Training accuracy")
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.suptitle("Train data")
    plt.legend()
    plt.show()

if __name__ == '__main__':
    train_images, train_labels, test_images, test_labels = data_preprocess()
    model = build_module()
    print(model.summary())
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    history = model.fit(train_images, train_labels, epochs=5, batch_size=64)
    draw_loss(history)
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print('test_loss =', test_loss, '    test_acc =', test_acc)

Changes of loss and accuracy during iterative training

Because the dataset is relatively simple, even this casually designed neural network reaches about 99.2% accuracy on the test set.

That is all of the content of this article on implementing MNIST handwritten digit recognition in Python. Thank you for reading! I hope what has been shared here is helpful; if you would like to learn more, you are welcome to keep following the development channel.
