Python Neural Network: Handwritten Digit Recognition with a TensorFlow CNN

2025-03-27 · SLTechnology News & Howtos


This article explains the tutorial "Python Neural Network: Handwritten Digit Recognition with a TensorFlow CNN". The content is straightforward and easy to follow; work through it step by step with the editor to study the topic in depth.

Contents

Basic theory

1. Train the CNN convolutional neural network

1. Load data

2. Change the data dimension

3. Normalization

4. One-hot encoding

5. Build the CNN convolutional neural network

5-1. First layer: the first convolution layer

5-2. Second layer: second convolution layer

5-3. Flattening

5-4. Layer 3: the first fully connected layer

5-5. Layer 4: the second fully connected layer (output layer)

6. Compilation

7. Training

8. Save the model

Code

2. Recognize your own handwritten digits (images)

1. Load data

2. Load the trained model

3. Load your own handwritten digit image and set the size

4. Convert to grayscale

5. Invert to white digits on a black background and normalize

6. Convert to four-dimensional data

7. Predict

8. Display the image

Effect display

Code

Basic theory

The first layer: convolution layer.

The second layer: convolution layer.

The third layer: fully connected layer.

The fourth layer: the output layer.

The original handwritten digit images are 28 × 28 pixels and black-and-white, so the number of channels is 1 and the input data is 28 × 28 × 1. For a color image the number of channels would be 3.
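To see these dimensions concretely, here is a minimal check using the Keras MNIST loader that the training code below relies on (the print statements are mine, not part of the original tutorial):

import tensorflow as tf

(train_data, train_target), (test_data, test_target) = tf.keras.datasets.mnist.load_data()
print(train_data.shape)                     # (60000, 28, 28): grayscale images, channel axis not yet explicit
print(train_data.min(), train_data.max())   # 0 255: raw pixel values before normalization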

The network is a four-layer convolutional neural network. When counting the layers of a neural network, only layers with weights are counted; pooling layers are not counted as separate layers (here the pooling operation is treated as part of the convolution layer).
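One way to see this counting convention for yourself is to print the model summary after building the network in section 5 below; this is only a sketch of what to look for, not part of the original code:

model.summary()
# Conv2D and Dense layers have trainable weights -- 4 weighted layers in total --
# while MaxPooling2D and Flatten report 0 parameters, which is why they are
# not counted as separate layers.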

Convolving with multiple filters produces multiple feature maps, which amounts to extracting several features from the image at the same time.

The more feature maps there are, the more features the network extracts. If the number of feature maps is set too low, the network tends to underfit; if it is set too high, it tends to overfit, so it must be set to an appropriate value.

1. Train the CNN convolutional neural network

1. Load data

# 1. Load data
mnist = tf.keras.datasets.mnist
(train_data, train_target), (test_data, test_target) = mnist.load_data()

2. Change the data dimension

Note: in TensorFlow, the data must be changed into a 4-dimensional format for convolution.

The four dimensions are: number of samples, image height, image width, and number of channels.

# 2. Change the data dimension
train_data = train_data.reshape(-1, 28, 28, 1)
test_data = test_data.reshape(-1, 28, 28, 1)

3. Normalization

# 3. Normalization (helps speed up training)
train_data = train_data / 255.0
test_data = test_data / 255.0

4. One-hot encoding

# 4. One-hot encoding
train_target = tf.keras.utils.to_categorical(train_target, num_classes=10)
test_target = tf.keras.utils.to_categorical(test_target, num_classes=10)   # 10 classes

5. Build the CNN convolutional neural network

# 5. Build the CNN convolutional neural network
model = Sequential()

5-1. First layer: the first convolution layer

The first convolution layer: convolution layer + pooling layer.

# 5-1. First layer: convolution layer + pooling layer
# First convolution layer
model.add(Convolution2D(input_shape=(28, 28, 1),
                        filters=32,
                        kernel_size=5,
                        strides=1,
                        padding='same',
                        activation='relu'))
# input shape, number of filters, kernel size, stride, padding mode (same), activation function
# First pooling layer (max pooling)
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))
# pooling window size, stride, padding mode

5-2. Second layer: the second convolution layer

# 5-2. Second layer: convolution layer + pooling layer
# Second convolution layer
model.add(Convolution2D(64, 5, strides=1, padding='same', activation='relu'))
# 64: number of filters, 5: convolution window size
# Second pooling layer
model.add(MaxPooling2D(2, 2, 'same'))

5-3. Flattening

Flattening changes data of shape (64, 7, 7, 64) into shape (64, 7 × 7 × 64), that is, (64, 3136).
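As a quick check of the arithmetic behind that shape (a sketch of the reasoning, assuming a batch size of 64 as in the training step): each 2 × 2 max-pooling with stride 2 halves the spatial size, and the second convolution layer produces 64 feature maps.

# 28x28 input, two stride-2 poolings: 28 -> 14 -> 7
# 64 feature maps of size 7x7 flatten to 7*7*64 values per sample
print(7 * 7 * 64)   # 3136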

Flatten:

# 5-3. Flattening: (64, 7, 7, 64) -> (64, 3136)
model.add(Flatten())

5-4. Layer 3: the first fully connected layer

# 5-4. Layer 3: the first fully connected layer
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.5))

5-5. Layer 4: the second fully connected layer (output layer)

# 5-5. Layer 4: the second fully connected layer (output layer)
model.add(Dense(10, activation='softmax'))   # 10: number of output neurons

6. Compilation

Set the optimizer, loss function, and evaluation metric.

# 6. Compile
model.compile(optimizer=Adam(lr=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# optimizer (Adam), loss function (cross-entropy), metrics

7. Training

# 7. Training
model.fit(train_data, train_target, batch_size=64, epochs=10,
          validation_data=(test_data, test_target))

8. Save the model

# 8. Save the model
model.save('mnist.h6')
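A note on the compile step above: in recent TensorFlow/Keras releases the lr argument of Adam is deprecated in favour of learning_rate, so if you see a warning or error you can use the equivalent call below (a small adjustment, not in the original article):

model.compile(optimizer=Adam(learning_rate=1e-4),   # same 1e-4 learning rate, newer argument name
              loss='categorical_crossentropy',
              metrics=['accuracy'])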

Effect:

Epoch 1/10
938/938 [==============================] - 142s 151ms/step - loss: 0.3319 - accuracy: 0.9055 - val_loss: 0.0895 - val_accuracy: 0.9728
Epoch 2/10
938/938 [==============================] - 158s 169ms/step - loss: 0.0911 - accuracy: 0.9721 - val_loss: 0.0515 - val_accuracy: 0.9830
Epoch 3/10
938/938 [==============================] - 146s 156ms/step - loss: 0.0629 - accuracy: 0.9807 - val_loss: 0.0389 - val_accuracy: 0.9874
Epoch 4/10
938/938 [==============================] - 120s 128ms/step - loss: 0.0498 - accuracy: 0.9848 - val_loss: 0.0337 - val_accuracy: 0.9889
Epoch 5/10
938/938 [==============================] - 119s 127ms/step - loss: 0.0424 - accuracy: 0.9869 - val_loss: 0.0273 - val_accuracy: 0.9898
Epoch 6/10
938/938 [==============================] - 129s 138ms/step - loss: 0.0338 - accuracy: 0.9897 - val_loss: 0.0270 - val_accuracy: 0.9907
Epoch 7/10
938/938 [==============================] - 124s 133ms/step - loss: 0.0302 - accuracy: 0.9904 - val_loss: 0.0234 - val_accuracy: 0.9917
Epoch 8/10
938/938 [==============================] - 132s 140ms/step - loss: 0.0264 - accuracy: 0.9916 - val_loss: 0.0240 - val_accuracy: 0.9913
Epoch 9/10
938/938 [==============================] - 139s 148ms/step - loss: 0.0233 - accuracy: 0.9926 - val_loss: 0.0235 - val_accuracy: 0.9919
Epoch 10/10
938/938 [==============================] - 139s 148ms/step - loss: 0.0208 - accuracy: 0.9937 - val_loss: 0.0215 - val_accuracy: 0.9924

After 10 epochs of training, the validation accuracy reaches about 99%, which is quite good.
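If you want to confirm the final accuracy outside the fit() log, a minimal sketch is to evaluate the saved model on the preprocessed test set (it assumes test_data and test_target as prepared in the training code, and the model file name used there):

from tensorflow.keras.models import load_model

model = load_model('mnist.h6')                                  # model file saved in step 8
loss, acc = model.evaluate(test_data, test_target, verbose=0)
print('test accuracy:', acc)                                    # should be close to val_accuracy above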

Code

# Handwritten digit recognition -- CNN neural network training
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Convolution2D, MaxPooling2D, Flatten
from tensorflow.keras.optimizers import Adam

# 1. Load data
mnist = tf.keras.datasets.mnist
(train_data, train_target), (test_data, test_target) = mnist.load_data()

# 2. Change the data dimension
train_data = train_data.reshape(-1, 28, 28, 1)
test_data = test_data.reshape(-1, 28, 28, 1)
# Note: in TensorFlow, the data must be 4-dimensional for convolution
# The four dimensions are: number of samples, image height, image width, number of channels

# 3. Normalization (helps speed up training)
train_data = train_data / 255.0
test_data = test_data / 255.0

# 4. One-hot encoding
train_target = tf.keras.utils.to_categorical(train_target, num_classes=10)
test_target = tf.keras.utils.to_categorical(test_target, num_classes=10)   # 10 classes

# 5. Build the CNN convolutional neural network
model = Sequential()

# 5-1. First layer: convolution layer + pooling layer
# First convolution layer
model.add(Convolution2D(input_shape=(28, 28, 1), filters=32, kernel_size=5,
                        strides=1, padding='same', activation='relu'))
# input shape, number of filters, kernel size, stride, padding mode (same), activation function
# First pooling layer (max pooling)
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))
# pooling window size, stride, padding mode

# 5-2. Second layer: convolution layer + pooling layer
# Second convolution layer
model.add(Convolution2D(64, 5, strides=1, padding='same', activation='relu'))
# 64: number of filters, 5: convolution window size
# Second pooling layer
model.add(MaxPooling2D(2, 2, 'same'))

# 5-3. Flattening: (64, 7, 7, 64) -> (64, 3136)
model.add(Flatten())

# 5-4. Layer 3: the first fully connected layer
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.5))

# 5-5. Layer 4: the second fully connected layer (output layer)
model.add(Dense(10, activation='softmax'))   # 10: number of output neurons

# 6. Compile
model.compile(optimizer=Adam(lr=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# optimizer (Adam), loss function (cross-entropy), metrics

# 7. Training
model.fit(train_data, train_target, batch_size=64, epochs=10,
          validation_data=(test_data, test_target))

# 8. Save the model
model.save('mnist.h6')

2. Recognize your own handwritten digits (images)

1. Load data

# 1. Load data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

A sample image from the dataset:
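Since the sample image itself is not reproduced here, you can view one with matplotlib (a minimal sketch of my own, reusing x_train and y_train from the loading step above):

import matplotlib.pyplot as plt

plt.imshow(x_train[0], cmap='gray')   # first training image
plt.title(str(y_train[0]))            # its label
plt.axis('off')
plt.show()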

2. Load the trained model

# 2. Load the trained model
model = load_model('mnist.h6')

3. Load your own handwritten digit image and set the size

# 3. Load your own handwritten digit image and set the size
img = Image.open('6.jpg')
# Set the size (consistent with the dataset images)
img = img.resize((28, 28))

4. Convert to grayscale

# 4. Convert to grayscale
gray = np.array(img.convert('L'))   # .convert('L'): convert to grayscale

You can see that the result is quite different from the dataset images, which are white digits on a black background, so we invert it:

5. Invert to white digits on a black background and normalize

The images in the MNIST dataset are all white digits on a black background, with values between 0 and 1.

# 5. Invert to white digits on a black background and normalize the data
gray_inv = (255 - gray) / 255.0

6. Convert to four-dimensional data

The CNN needs four-dimensional data for prediction.

# 6. Convert to 4-dimensional data (needed for CNN prediction)
image = gray_inv.reshape((1, 28, 28, 1))

7. Predict

# 7. Predict
prediction = model.predict(image)            # predict
prediction = np.argmax(prediction, axis=1)   # take the class with the highest probability
print('Prediction result:', prediction)

8. Display the image

# 8. Display
# Set up the plt figure
f, ax = plt.subplots(3, 3, figsize=(7, 7))
# Show a dataset image
ax[0][0].set_title('train_model')
ax[0][0].axis('off')
ax[0][0].imshow(x_train[18], 'gray')
# Show the original image
ax[0][1].set_title('img')
ax[0][1].axis('off')
ax[0][1].imshow(img, 'gray')
# Show the grayscale image (black digit on white background)
ax[0][2].set_title('gray')
ax[0][2].axis('off')
ax[0][2].imshow(gray, 'gray')
# Show the inverted grayscale image (white digit on black background)
ax[1][0].set_title('gray')
ax[1][0].axis('off')
ax[1][0].imshow(gray_inv, 'gray')
plt.show()

Effect display

Code

# Recognize your own handwritten digits (image prediction)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.keras.models import load_model
import matplotlib.pyplot as plt
from PIL import Image
import numpy as np

# 1. Load data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# 2. Load the trained model
model = load_model('mnist.h6')

# 3. Load your own handwritten digit image and set the size
img = Image.open('5.jpg')
# Set the size (same as the dataset images)
img = img.resize((28, 28))

# 4. Convert to grayscale
gray = np.array(img.convert('L'))   # .convert('L'): convert to grayscale

# 5. Invert to white digits on a black background and normalize the data
gray_inv = (255 - gray) / 255.0

# 6. Convert to 4-dimensional data (needed for CNN prediction)
image = gray_inv.reshape((1, 28, 28, 1))

# 7. Predict
prediction = model.predict(image)            # predict
prediction = np.argmax(prediction, axis=1)   # take the class with the highest probability
print('Prediction result:', prediction)

# 8. Display
# Set up the plt figure
f, ax = plt.subplots(2, 2, figsize=(5, 5))
# Show a dataset image
ax[0][0].set_title('train_model')
ax[0][0].axis('off')
ax[0][0].imshow(x_train[18], 'gray')
# Show the original image
ax[0][1].set_title('img')
ax[0][1].axis('off')
ax[0][1].imshow(img, 'gray')
# Show the grayscale image (black digit on white background)
ax[1][0].set_title('gray')
ax[1][0].axis('off')
ax[1][0].imshow(gray, 'gray')
# Show the inverted grayscale image (white digit on black background)
ax[1][1].set_title(f'predict: {prediction}')
ax[1][1].axis('off')
ax[1][1].imshow(gray_inv, 'gray')
plt.show()

Thank you for reading. This concludes the tutorial "Python Neural Network: Handwritten Digit Recognition with a TensorFlow CNN". After working through it you should have a deeper understanding of the method; the details still need to be verified in practice.
