
How to Train an MNIST Handwritten Digit Recognition Model with TensorFlow


Today I will talk to you about how TensorFlow trains an MNIST handwritten digit recognition model. Many people may not know much about it, so to help you understand it better, the editor has summarized the following content. I hope you can get something out of this article.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

INPUT_NODE = 784     # input layer nodes = picture pixels = 28x28 = 784
OUTPUT_NODE = 10     # output layer nodes = number of picture categories
LAYER1_NODE = 500    # number of hidden layer nodes; there is only one hidden layer

BATCH_SIZE = 100     # number of examples in one training batch; the smaller it is, the closer
                     # training is to stochastic gradient descent, the larger it is, the closer
                     # it is to full-batch gradient descent

LEARNING_RATE_BASE = 0.8      # base learning rate
LEARNING_RATE_DECAY = 0.99    # learning rate decay rate
REGULARIZATION_RATE = 0.0001  # regularization term coefficient
TRAINING_STEPS = 30000        # number of training steps
MOVING_AVG_DECAY = 0.99       # moving average decay rate


# Auxiliary function: given the input and all parameters of the neural network,
# compute the forward propagation result.
def inference(input_tensor, avg_class, weights1, biases1, weights2, biases2):
    # When no moving average class is provided, use the current parameter values directly.
    if avg_class == None:
        # Forward propagation result of the hidden layer
        layer1 = tf.nn.relu(tf.matmul(input_tensor, weights1) + biases1)
        # Forward propagation result of the output layer
        return tf.matmul(layer1, weights2) + biases2
    else:
        # First compute the moving averages of the variables, then the forward propagation result.
        layer1 = tf.nn.relu(tf.matmul(input_tensor, avg_class.average(weights1)) +
                            avg_class.average(biases1))
        return tf.matmul(layer1, avg_class.average(weights2)) + avg_class.average(biases2)


# Training process
def train(mnist):
    x = tf.placeholder(tf.float32, [None, INPUT_NODE], name='x-input')
    y_ = tf.placeholder(tf.float32, [None, OUTPUT_NODE], name='y-input')

    # Hidden layer parameters
    weights1 = tf.Variable(tf.truncated_normal([INPUT_NODE, LAYER1_NODE], stddev=0.1))
    biases1 = tf.Variable(tf.constant(0.1, shape=[LAYER1_NODE]))

    # Output layer parameters
    weights2 = tf.Variable(tf.truncated_normal([LAYER1_NODE, OUTPUT_NODE], stddev=0.1))
    biases2 = tf.Variable(tf.constant(0.1, shape=[OUTPUT_NODE]))

    # Forward propagation result without the parameter moving averages (avg_class=None)
    y = inference(x, None, weights1, biases1, weights2, biases2)

    # Variable recording the number of training steps; marked as not trainable
    global_step = tf.Variable(0, trainable=False)

    # Given the moving average decay rate and the training step counter,
    # initialize the moving average class
    variable_avgs = tf.train.ExponentialMovingAverage(MOVING_AVG_DECAY, global_step)

    # Apply the moving average to all trainable variables (the network parameters)
    variables_avgs_op = variable_avgs.apply(tf.trainable_variables())

    # Forward propagation result using the moving averages
    avg_y = inference(x, variable_avgs, weights1, biases1, weights2, biases2)

    # Cross entropy as the loss function
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=y, labels=tf.argmax(y_, 1))
    cross_entropy_mean = tf.reduce_mean(cross_entropy)

    # L2 regularization loss
    regularizer = tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)
    regularization = regularizer(weights1) + regularizer(weights2)
    loss = cross_entropy_mean + regularization

    # Exponentially decayed learning rate
    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,
        global_step,                            # current training step
        mnist.train.num_examples / BATCH_SIZE,  # steps needed to pass over all training data once
        LEARNING_RATE_DECAY)

    # Optimize the loss function
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        loss, global_step=global_step)

    # Back propagation updates the network parameters and their moving averages at the same time
    with tf.control_dependencies([train_step, variables_avgs_op]):
        train_op = tf.no_op(name='train')

    # Check whether the forward propagation result of the moving-average model is correct
    correct_prediction = tf.equal(tf.argmax(avg_y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # Initialize the session and start training
    with tf.Session() as sess:
        tf.global_variables_initializer().run()

        # Validation data, used to judge the stopping condition and training progress
        validate_feed = {x: mnist.validation.images, y_: mnist.validation.labels}

        # Test data, used as the final evaluation criterion
        test_feed = {x: mnist.test.images, y_: mnist.test.labels}

        # Iteratively train the neural network
        for i in range(TRAINING_STEPS):
            if i % 1000 == 0:
                validate_acc = sess.run(accuracy, feed_dict=validate_feed)
                print("After %d training step(s), validation accuracy using average "
                      "model is %g" % (i, validate_acc))

            xs, ys = mnist.train.next_batch(BATCH_SIZE)
            sess.run(train_op, feed_dict={x: xs, y_: ys})

        # Final accuracy of the model on the test set after training
        test_acc = sess.run(accuracy, feed_dict=test_feed)
        print("After %d training steps, test accuracy using average model "
              "is %g" % (TRAINING_STEPS, test_acc))


# Main program entry
def main(argv=None):
    mnist = input_data.read_data_sets("/tmp/data", one_hot=True)
    train(mnist)


# TensorFlow main program entry
if __name__ == '__main__':
    tf.app.run()

The output is as follows:

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
After 0 training step(s), validation accuracy using average model is 0.0462
After 1000 training step(s), validation accuracy using average model is 0.9784
After 2000 training step(s), validation accuracy using average model is 0.9806
After 3000 training step(s), validation accuracy using average model is 0.9798
After 4000 training step(s), validation accuracy using average model is 0.9814
After 5000 training step(s), validation accuracy using average model is 0.9826
After 6000 training step(s), validation accuracy using average model is 0.9828
After 7000 training step(s), validation accuracy using average model is 0.9832
After 8000 training step(s), validation accuracy using average model is 0.9838
After 9000 training step(s), validation accuracy using average model is 0.983
After 10000 training step(s), validation accuracy using average model is 0.9836
After 11000 training step(s), validation accuracy using average model is 0.9822
After 12000 training step(s), validation accuracy using average model is 0.983
After 13000 training step(s), validation accuracy using average model is 0.983
After 14000 training step(s), validation accuracy using average model is 0.9844
After 15000 training step(s), validation accuracy using average model is 0.9832
After 16000 training step(s), validation accuracy using average model is 0.9844
After 17000 training step(s), validation accuracy using average model is 0.9842
After 18000 training step(s), validation accuracy using average model is 0.9842
After 19000 training step(s), validation accuracy using average model is 0.9838
After 20000 training step(s), validation accuracy using average model is 0.9834
After 21000 training step(s), validation accuracy using average model is 0.9828
After 22000 training step(s), validation accuracy using average model is 0.9834
After 23000 training step(s), validation accuracy using average model is 0.9844
After 24000 training step(s), validation accuracy using average model is 0.9838
After 25000 training step(s), validation accuracy using average model is 0.9834
After 26000 training step(s), validation accuracy using average model is 0.984
After 27000 training step(s), validation accuracy using average model is 0.984
After 28000 training step(s), validation accuracy using average model is 0.9836
After 29000 training step(s), validation accuracy using average model is 0.9842
After 30000 training steps, test accuracy using average model is 0.9839
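The most distinctive part of the program above is the moving average kept over the network parameters. As a side note, here is a minimal sketch, assuming the same TensorFlow 1.x graph-mode API as the main program, of how tf.train.ExponentialMovingAverage maintains a shadow value for a single variable; the numbers in the comments follow from its decay formula min(decay, (1 + num_updates) / (10 + num_updates)).

import tensorflow as tf

# A single variable v and a step counter, mirroring global_step in the main program
v = tf.Variable(0, dtype=tf.float32)
step = tf.Variable(0, trainable=False)

# Moving average class with decay 0.99, controlled by the step counter
ema = tf.train.ExponentialMovingAverage(0.99, step)
maintain_avg_op = ema.apply([v])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Initially the shadow value equals the variable: prints [0.0, 0.0]
    print(sess.run([v, ema.average(v)]))

    # Assign v = 5 and update the shadow value; with step = 0 the effective decay is
    # min(0.99, (1 + 0) / (10 + 0)) = 0.1, so the shadow becomes 0.1 * 0 + 0.9 * 5 = 4.5
    sess.run(tf.assign(v, 5))
    sess.run(maintain_avg_op)
    print(sess.run([v, ema.average(v)]))  # prints [5.0, 4.5]

In the full program, variables_avgs_op plays the role of maintain_avg_op, and avg_y is evaluated with the shadow values instead of the raw weights, which usually makes the model somewhat more robust on unseen data.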

After reading the above, do you have a better understanding of how TensorFlow trains an MNIST handwritten digit recognition model? If you want to learn more, please follow the industry information channel. Thank you for your support.
