
Analysis of the Principles of CNN Convolutional Neural Networks and Their Application to Image Recognition

2025-03-26 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 06/01 Report --

Today I will talk to you about the principles of CNN convolutional neural networks and a worked example of their application to image recognition. Many people may not be familiar with the topic, so to help you understand it better, the editor has summarized the following content. I hope you get something out of this article.

CNN Notes: An Accessible Understanding of Convolutional Neural Networks -- Understanding the Relationship Between Input Channels and Convolution Kernel Channels
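The key point of that note is that a convolution kernel in tf.nn.conv2d has shape [height, width, in_channels, out_channels]: its third dimension must match the number of input channels, and each output feature map sums the per-channel convolutions. The following minimal sketch (written in the same TensorFlow 1.x style as the walkthrough below; the sizes are chosen purely for illustration) convolves a 3-channel input with two such kernels and checks the resulting shape:

import numpy as np
import tensorflow as tf

# a batch of 8x8 images with 3 input channels (e.g. RGB)
x = tf.placeholder(tf.float32, [None, 8, 8, 3])
# kernel shape [5, 5, 3, 2]: 5x5 patches, 3 channels to match the input, 2 output feature maps
W = tf.Variable(tf.truncated_normal([5, 5, 3, 2], stddev=0.1))
y = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(y, feed_dict={x: np.zeros([4, 8, 8, 3], dtype=np.float32)})
    print(out.shape)  # (4, 8, 8, 2): SAME padding keeps 8x8, the channel count becomes 2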

# coding=utf-8
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)  # read the MNIST image data set
sess = tf.InteractiveSession()  # create the session

# 1. Function declarations
def weight_variable(shape):
    # truncated normal distribution with mean 0 and standard deviation 0.1
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # create a tensor of the given shape with every value initialized to 0.1
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    # convolution with stride 1 in every direction; SAME padding pads the edges with zeros,
    # so the output keeps the input's spatial size
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # pool the convolution result with a 2x2 kernel and stride 2, zero-padded at the edges,
    # taking the maximum value; the amount of data is reduced by a factor of four
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# 2. Define the input and output structure
# placeholders; None means the number of input images can vary, 28*28 is the image resolution
xs = tf.placeholder(tf.float32, [None, 28 * 28])
# 10 classes (digits 0-9), corresponding to the output classification result
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
# reshape xs into 28x28x1; the images are grayscale, so there is a single channel;
# -1 means a variable number of images, used as the training input
x_image = tf.reshape(xs, [-1, 28, 28, 1])

# 3. Build the network, i.e. the forward computation
## First convolutional layer ##
# the first two parameters are the kernel (patch) size, the third is the number of input
# channels, the fourth is the number of kernels, i.e. how many feature maps are produced
W_conv1 = weight_variable([5, 5, 1, 32])
# one bias per kernel
b_conv1 = bias_variable([32])
# convolve the image with the kernels and add the bias; convolution result 28x28x32
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
# pooling result 14x14x32
h_pool1 = max_pool_2x2(h_conv1)

## Second convolutional layer ##
# 32 input channels, 64 output feature maps
W_conv2 = weight_variable([5, 5, 32, 64])
# 64 biases
b_conv2 = bias_variable([64])
# note that h_pool1 is the pooled result from above; convolution result 14x14x64
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
# pooling result 7x7x64
h_pool2 = max_pool_2x2(h_conv2)
# the original image is 28x28; after the first round it shrinks to 14x14 (32 feature maps),
# and after the second round to 7x7 (64 feature maps)
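# A quick optional sanity check (a small sketch, not required by the tutorial itself):
# the shapes noted in the comments above can be read off the graph statically with
# get_shape(); the batch dimension prints as ? because it is unknown until data is fed.
print(h_conv1.get_shape())  # (?, 28, 28, 32)
print(h_pool1.get_shape())  # (?, 14, 14, 32)
print(h_conv2.get_shape())  # (?, 14, 14, 64)
print(h_pool2.get_shape())  # (?, 7, 7, 64)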
## Third layer: fully connected ##
# the weight matrix is two-dimensional: the first dimension is 7*7*64 (the flattened
# convolution output, which can be thought of as a single row of 7*7*64 values),
# the second dimension is the 1024 units of this layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
# 1024 biases
b_fc1 = bias_variable([1024])
# reshape the pooled result of layer 2 into a single row of 7*7*64 values
# [n_samples, 7, 7, 64] -> [n_samples, 7*7*64]
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
# matmul performs ordinary matrix multiplication, unlike the sliding-window
# multiplication of tf.nn.conv2d; the result is 1x1024 per sample
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
# dropout reduces overfitting by zeroing the output of some units and scaling the kept
# ones by 1/keep_prob (with keep_prob = 0.5 the kept values are multiplied by 2);
# keep_prob is fed through the placeholder defined above, so it can be set to 1.0 for evaluation
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

## Fourth layer: output ##
# a 1024x10 weight matrix, matching the 10 classes of the initial ys
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
# the result is 1x10 per sample; softmax and sigmoid are both logistic classifiers,
# softmax for multi-class and sigmoid for binary classification
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# 4. Define the loss (the error to minimize) and choose the optimizer
# cross entropy as the loss function
cross_entropy = -tf.reduce_sum(ys * tf.log(y_conv))
# the optimizer minimizes cross_entropy over the data that is fed in
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# 5. Train and evaluate
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(ys, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
tf.global_variables_initializer().run()
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={xs: batch[0], ys: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={xs: batch[0], ys: batch[1], keep_prob: 0.5})
print("test accuracy %g" % accuracy.eval(
    feed_dict={xs: mnist.test.images, ys: mnist.test.labels, keep_prob: 1.0}))

After reading the above, do you have a better understanding of the principles of CNN convolutional neural networks and their application to image recognition? If you want to learn more, please follow the industry information channel. Thank you for your support.
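One final practical note on the loss defined in the walkthrough: because y_conv is a softmax output, -tf.reduce_sum(ys * tf.log(y_conv)) returns NaN as soon as any predicted probability is exactly zero. A common, numerically safer variant (shown here only as a sketch; it reuses the h_fc1_drop, W_fc2, b_fc2 and ys tensors defined above) lets TensorFlow apply the softmax and the cross entropy in a single op:

# Sketch of a numerically stable loss: compute the pre-softmax logits and let
# tf.nn.softmax_cross_entropy_with_logits combine softmax and cross entropy.
logits = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=ys, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# the class probabilities used for the accuracy check are still the softmax of the logits
y_conv = tf.nn.softmax(logits)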
