How to Implement Digit Recognition with PyTorch

2025-07-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains in detail how to implement handwritten digit recognition on the MNIST dataset with PyTorch. The editor finds it quite practical and shares it here for reference; I hope you get something out of it.

Python implementation code:

import cv2
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Download the training set (root is the path from the original; adjust it for your system)
train_dataset = datasets.MNIST(root='E:mnist', train=True,
                               transform=transforms.ToTensor(), download=True)
# Download the test set
test_dataset = datasets.MNIST(root='E:mnist', train=False,
                              transform=transforms.ToTensor(), download=True)

# dataset: the dataset to load
# batch_size: number of images in each batch
# shuffle=True: shuffle the data before packing it into batches
# The value is truncated in the source text; 64 matches the shapes in the comments below
batch_size = 64

# Create the data iterators
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True)


# Convolution layers use torch.nn.Conv2d
# Activation layers use torch.nn.ReLU
# Pooling layers use torch.nn.MaxPool2d
# Fully connected layers use torch.nn.Linear
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(1, 6, 3, 1, 2), nn.ReLU(),
                                   nn.MaxPool2d(2, 2))
        self.conv2 = nn.Sequential(nn.Conv2d(6, 16, 5), nn.ReLU(),
                                   nn.MaxPool2d(2, 2))
        self.fc1 = nn.Sequential(nn.Linear(16 * 5 * 5, 120),
                                 nn.BatchNorm1d(120), nn.ReLU())
        # The final layer must have 10 outputs, one per digit 0-9
        self.fc2 = nn.Sequential(nn.Linear(120, 84), nn.BatchNorm1d(84),
                                 nn.ReLU(), nn.Linear(84, 10))

    def forward(self, x):
        x = self.conv1(x)
        # after the convolution: torch.Size([64, 6, 30, 30])
        # after max pooling:     torch.Size([64, 6, 15, 15])
        x = self.conv2(x)
        # torch.Size([64, 16, 5, 5])
        # Flatten everything except the batch dimension
        x = x.view(x.size()[0], -1)
        x = self.fc1(x)
        x = self.fc2(x)
        return x


def test_image_data(images, labels):
    # Combine a batch of images into a single picture
    # (make_grid produces three channels; the padding value defaults to 0)
    # images: torch.Size([64, 1, 28, 28])
    img = torchvision.utils.make_grid(images)
    # img: torch.Size([3, 242, 242])
    # Move the channel dimension to the end
    img = img.numpy().transpose(1, 2, 0)
    # img shape: (242, 242, 3)
    # Reduce the image contrast
    std = [0.5, 0.5, 0.5]
    mean = [0.5, 0.5, 0.5]
    img = img * std + mean
    # print(labels)
    cv2.imshow('win2', img)
    key_pressed = cv2.waitKey(0)


# Select the device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Learning rate. The value is garbled in the source text ("0.00...");
# 0.001, Adam's default, is assumed here
LR = 0.001
# Initialize the network
net = LeNet().to(device)
# The loss function is cross entropy
criterion = nn.CrossEntropyLoss()
# The optimizer is Adam, an adaptive optimization algorithm
optimizer = optim.Adam(net.parameters(), lr=LR)
EPOCHS = 1

if __name__ == '__main__':
    for epoch in range(EPOCHS):
        print("GPU:", torch.cuda.is_available())
        # Switch (back) to training mode so BatchNorm uses batch statistics
        net.train()
        sum_loss = 0.0
        for i, data in enumerate(train_loader):
            inputs, labels = data
            # inputs: torch.Size([64, 1, 28, 28])
            # Move the data to the selected device (GPU memory if available)
            inputs, labels = inputs.to(device), labels.to(device)
            # Reset the gradients to zero
            optimizer.zero_grad()
            # Forward pass
            outputs = net(inputs)
            # Compute the loss
            loss = criterion(outputs, labels)
            # Backpropagation
            loss.backward()
            # Update the parameters
            optimizer.step()
            sum_loss += loss.item()
            if i % 100 == 99:
                print('[%d, %d] loss: %.03f' % (epoch + 1, i + 1, sum_loss / 100))
                sum_loss = 0.0

        # Switch the model to evaluation mode
        net.eval()
        correct = 0
        total = 0
        # No gradients are needed during evaluation
        with torch.no_grad():
            for data_test in test_loader:
                _images, _labels = data_test
                # Move the data to the selected device
                images, labels = _images.to(device), _labels.to(device)
                # Predictions: torch.Size([64, 10])
                output_test = net(images)
                # Index of the largest score in each row
                _, predicted = torch.max(output_test, 1)
                # Image visualization
                # print("predicted:", predicted)
                # test_image_data(_images, _labels)
                # Number of samples evaluated
                total += labels.size(0)
                # Number of correct predictions
                correct += (predicted == labels).sum()
        print("correct1:", correct)
        print("Test acc: {0}".format(correct.item() / total))

That concludes this article on implementing digit recognition in PyTorch. I hope it helps you learn something new; if you found it useful, please share it so more people can see it.
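The tensor shapes noted in the code comments can be checked by hand with the standard output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1. The following is a minimal sketch in plain Python, not part of the original code; the helper names `conv_out` and `grid_side` are hypothetical and introduced only for illustration, and `grid_side` assumes make_grid's default settings of 8 images per row and 2 pixels of padding.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square conv or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def grid_side(n_cells, cell, padding=2):
    """Side length of a grid of n_cells images, each padded as make_grid does by default."""
    return n_cells * (cell + padding) + padding


# Trace one side of a 28x28 MNIST image through the two convolution blocks.
after_conv1 = conv_out(28, kernel=3, stride=1, padding=2)  # Conv2d(1, 6, 3, 1, 2)
after_pool1 = conv_out(after_conv1, kernel=2, stride=2)    # MaxPool2d(2, 2)
after_conv2 = conv_out(after_pool1, kernel=5)              # Conv2d(6, 16, 5)
after_pool2 = conv_out(after_conv2, kernel=2, stride=2)    # MaxPool2d(2, 2)

# Number of inputs the first fully connected layer must accept.
flat_features = 16 * after_pool2 * after_pool2

print(after_conv1, after_pool1, after_conv2, after_pool2)  # 30 15 11 5
print(flat_features)                                       # 400
print(grid_side(8, 28))                                    # 242
```

The result of 400 is why the first linear layer is declared as `nn.Linear(16 * 5 * 5, 120)`. If the kernel size or padding of either convolution changes, that input size has to be recomputed the same way.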

