In this issue, the editor brings you a walkthrough of how to train an image classification model in PyTorch and TensorFlow. The article is rich in content and approaches the topic from a professional point of view; I hope you get something out of it after reading.
Author | PULKIT SHARMA    Compiled by | Flin    Source | analyticsvidhya
Introduction
Image classification is one of the most important applications of computer vision. Its applications range from object classification in self-driving cars to blood cell recognition in the medical industry, from defective item identification in manufacturing to systems that can classify whether a person is wearing a mask. Image classification is used in one way or another in all of these industries. How do they do it? Which framework do they use?
We will learn how to build a basic image classification model in PyTorch and TensorFlow. We'll start with a brief overview of both frameworks. Then we will use the MNIST handwritten digit dataset and build an image classification model with a CNN (convolutional neural network) in both PyTorch and TensorFlow.
Overview of PyTorch
Overview of TensorFlow
Understand the problem statement: MNIST
Implementation of a Convolutional Neural Network (CNN) in PyTorch
Implementation of a Convolutional Neural Network (CNN) in TensorFlow
Overview of PyTorch
PyTorch is becoming more and more popular in the deep learning community and is widely used by deep learning practitioners. PyTorch is a Python package that provides tensor computation. Tensors are multidimensional arrays, like NumPy's ndarrays, except that they can also run on a GPU.
A unique feature of PyTorch is that it uses dynamic computation graphs. PyTorch's Autograd package builds computation graphs from tensors and computes gradients automatically, rather than requiring graphs with predefined, fixed operations. PyTorch thus gives us a framework to build computation graphs as we go, and even change them at run time. This is particularly useful when we don't know in advance how much memory a neural network will need.
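To make this concrete, here is a minimal sketch (my own illustration, not from the original article) of Autograd building a graph on the fly and computing a gradient from it:

import torch

# the graph is built dynamically as the operations execute
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x       # each operation extends the graph at run time
y.backward()             # autograd walks the graph to compute dy/dx
print(x.grad)            # dy/dx = 2x + 3, so tensor(7.) at x = 2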
You can use PyTorch to tackle a variety of deep learning challenges, including:
Image (detection, classification, etc.)
Text (classification, generation, etc.)
Reinforcement learning
Overview of TensorFlow
TensorFlow was developed by researchers and engineers on the Google Brain team. It is by far the most commonly used software library in the field of deep learning (although other libraries are rapidly catching up).
One of the biggest reasons TensorFlow is so popular is that it supports multiple languages for creating deep learning models, such as Python, C++ and R. It also provides detailed documentation and guides.
TensorFlow contains many components. The following are two outstanding representatives:
TensorBoard: helps visualize data effectively using data flow graphs
TensorFlow Serving: useful for the rapid deployment of new algorithms / experiments
TensorFlow is currently at version 2.0, which was officially released in September 2019. We will implement our CNN in 2.0 as well.
If you want to learn more about this new version of TensorFlow, check out the TensorFlow 2.0 deep learning tutorial: https://www.analyticsvidhya.com/blog/2020/03/tensorflow-2-tutorial-deep-learning
I hope you now have a basic understanding of both PyTorch and TensorFlow. Now let's try to build a deep learning model with each of these two frameworks and understand their inner workings. Before that, let's take a look at the problem statement we will solve in this article.
Understand the problem statement: MNIST
Before we begin, let's take a look at the dataset. In this article, we will solve the popular MNIST problem: a digit recognition task in which we must classify images of handwritten digits into one of 10 classes, the digits 0 through 9.
In the MNIST dataset, we have images of digits taken from a variety of scanned documents, standardized in size and centered. Each image is a 28 x 28 pixel square (784 pixels in total). A standard split of the dataset is used to evaluate and compare models: 60,000 images are used to train a model and a separate set of 10,000 images is used to test it.
Now that we know the dataset, let's build an image classification model using a CNN in PyTorch and TensorFlow, starting with the PyTorch implementation. We will implement the models in Google Colab, which provides a free GPU to run these deep learning models.
I hope you are familiar with convolutional neural networks (CNN). If not, please feel free to refer to the following article:
A comprehensive tutorial for learning convolutional neural networks from scratch: https://www.analyticsvidhya.com/blog/2018/12/guide-convolutional-neural-network-cnn
Implementation of a Convolutional Neural Network (CNN) in PyTorch
Let's import all the libraries first:
# importing the libraries
import numpy as np
import torch
import torchvision
import matplotlib.pyplot as plt
from time import time
from torchvision import datasets, transforms
from torch import nn, optim
We also check the version of PyTorch on Google Colab:
# version of pytorch
print(torch.__version__)
Therefore, I am using version 1.5.1 of PyTorch. If you use any other version, you may receive some warnings or errors, so you can update to this version of PyTorch. We will perform some transformations on the image, such as normalizing pixel values, so let's define these transformations as well:
# transformations to be applied on images
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                               ])
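As a quick aside (my own check, not part of the original walkthrough): ToTensor() scales pixel values to [0, 1], and Normalize((0.5,), (0.5,)) then maps them to [-1, 1] via (x - 0.5) / 0.5. You can verify this on a dummy image:

# verifying the effect of the normalization on a fake image in [0, 1]
sample = torch.rand(1, 28, 28)
normalized = transforms.Normalize((0.5,), (0.5,))(sample)
print(normalized.min().item(), normalized.max().item())  # both within [-1, 1]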
Now, let's load the training and test sets for the MNIST dataset:
# defining the training and testing set
trainset = datasets.MNIST('./data', download=True, train=True, transform=transform)
testset = datasets.MNIST('./', download=True, train=False, transform=transform)
Next, I define the training and test loader, which will help us load the training and test sets in batches. I define the batch size as 64:
# defining trainloader and testloader
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True)
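As a sanity check (my addition, not the article's code): with a batch size of 64, the loaders split the data into ceil(60000 / 64) = 938 training batches and ceil(10000 / 64) = 157 test batches per epoch:

# number of batches per epoch
print(len(trainloader))  # 938
print(len(testloader))   # 157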
First, let's take a look at the summary of the training set:
# shape of training data
dataiter = iter(trainloader)
images, labels = dataiter.next()

print(images.shape)
print(labels.shape)
So, in each batch, we have 64 images, each of size 28 x 28, and for each image we have a corresponding label. Let's visualize a training image and see what it looks like:
# visualizing the training images
plt.imshow(images[0].numpy().squeeze(), cmap='gray')
It is an image of the number 0. Similarly, let's visualize the test set image:
# shape of validation data
dataiter = iter(testloader)
images, labels = dataiter.next()

print(images.shape)
print(labels.shape)
In the test set, we also have batches of size 64. Now let's define the architecture.
Defining the model architecture
We will use the CNN model here. So let's define and train the model:
# defining the model architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        self.cnn_layers = nn.Sequential(
            # Defining a 2D convolution layer
            nn.Conv2d(1, 4, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(4),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            # Defining another 2D convolution layer
            nn.Conv2d(4, 4, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(4),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

        self.linear_layers = nn.Sequential(
            nn.Linear(4 * 7 * 7, 10)
        )

    # Defining the forward pass
    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.view(x.size(0), -1)
        x = self.linear_layers(x)
        return x
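A side note on the in_features of the linear layer (my own explanation, not from the article): each 3 x 3 convolution uses padding=1, so it preserves the spatial size, while each 2 x 2 max pooling layer halves it, taking 28 x 28 down to 14 x 14 and then to 7 x 7. With 4 output channels, the flattened feature vector therefore has 4 * 7 * 7 = 196 elements. A quick check:

# sanity-checking the feature map shape going into the linear layer
dummy = torch.zeros(1, 1, 28, 28)
print(Net().cnn_layers(dummy).shape)  # torch.Size([1, 4, 7, 7])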
We also define the optimizer and loss function, and then we will look at a summary of the model:
# defining the model
model = Net()
# defining the optimizer
optimizer = optim.Adam(model.parameters(), lr=0.01)
# defining the loss function
criterion = nn.CrossEntropyLoss()
# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()

print(model)
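One detail worth noting (my note, not the article's): nn.CrossEntropyLoss applies log-softmax internally, which is why this network ends in a plain linear layer with no softmax, unlike the softmax-terminated Keras model later in this article. A minimal sketch of how the loss is called:

# CrossEntropyLoss takes raw logits and integer class labels
logits = torch.randn(64, 10)           # a fake batch of model outputs
targets = torch.randint(0, 10, (64,))  # fake integer labels in [0, 9]
print(nn.CrossEntropyLoss()(logits, targets).item())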
So we have two convolutional layers, which will help extract features from the images. These features are passed to the fully connected layer, which classifies the images into their respective classes. Now that our model architecture is ready, let's train this model for ten epochs:
for i in range(10):
    running_loss = 0
    for images, labels in trainloader:

        if torch.cuda.is_available():
            images = images.cuda()
            labels = labels.cuda()

        # Training pass
        optimizer.zero_grad()

        output = model(images)
        loss = criterion(output, labels)

        # This is where the model learns by backpropagating
        loss.backward()

        # And optimizes its weights here
        optimizer.step()

        running_loss += loss.item()
    else:
        # the else branch of a for loop runs once the loop completes
        print("Epoch {} - Training loss: {}".format(i+1, running_loss/len(trainloader)))
You will see that the training loss decreases as the epochs progress, which means that our model is learning from the training set. Let's check the model's performance on the test set:
# getting predictions on test set and measuring the performance
correct_count, all_count = 0, 0
for images, labels in testloader:
    for i in range(len(labels)):
        if torch.cuda.is_available():
            images = images.cuda()
            labels = labels.cuda()
        img = images[i].view(1, 1, 28, 28)
        with torch.no_grad():
            logps = model(img)

        ps = torch.exp(logps)
        probab = list(ps.cpu()[0])
        pred_label = probab.index(max(probab))
        true_label = labels.cpu()[i]
        if true_label == pred_label:
            correct_count += 1
        all_count += 1

print("Number Of Images Tested =", all_count)
print("\nModel Accuracy =", (correct_count/all_count))
As a result, we tested a total of 10,000 images, and the model's accuracy in predicting the labels of the test images is around 96%.
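As an aside, the per-image loop above can be replaced by a batched evaluation. The following sketch (my rewrite, not the article's code) computes the same accuracy with argmax over whole batches:

# alternative, batched evaluation of test accuracy
correct, total = 0, 0
model.eval()
with torch.no_grad():
    for images, labels in testloader:
        if torch.cuda.is_available():
            images, labels = images.cuda(), labels.cuda()
        preds = model(images).argmax(dim=1)   # predicted class per image
        correct += (preds == labels).sum().item()
        total += labels.size(0)
print("Model Accuracy =", correct / total)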
This is how you can build a convolutional neural network in PyTorch. In the next section, we will look at how to implement the same architecture in TensorFlow.
Implementation of a Convolutional Neural Network (CNN) in TensorFlow
Now, let's use convolutional neural networks in TensorFlow to solve the same MNIST problem. As usual, we will start by importing the library:
# importing the libraries
import tensorflow as tf

from tensorflow.keras import datasets, layers, models
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
Check the version of TensorFlow we are using:
# version of tensorflow
print(tf.__version__)
So we are using version 2.2.0 of TensorFlow. Now let's load the MNIST dataset using the datasets class of tensorflow.keras:
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data(path='mnist.npz')

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
Here, we have loaded the training and test sets of the MNIST dataset, and we have normalized the pixel values of the training and test images. Next, let's visualize a few images from the dataset:
# visualizing a few images
plt.figure(figsize=(10, 10))
for i in range(9):
    plt.subplot(3, 3, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap='gray')
plt.show()
This is what our dataset looks like. We have images of handwritten numbers. Let's take a look at the shape of the training and test set:
# shape of the training and test set
(train_images.shape, train_labels.shape), (test_images.shape, test_labels.shape)
So we have 60,000 images of size 28 x 28 in the training set and 10,000 images of the same shape in the test set. Next, we will reshape the images and one-hot encode the target variable:
# reshaping the images
train_images = train_images.reshape((60000, 28, 28, 1))
test_images = test_images.reshape((10000, 28, 28, 1))

# one hot encoding the target variable
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

Defining the model architecture
Now we will define the architecture of the model. We will use the same architecture we defined in PyTorch. So our model will be a combination of two convolutional layers, each followed by a max pooling layer, then a Flatten layer, and finally a fully connected layer with 10 neurons, since we have 10 classes.
# defining the model architecture
model = models.Sequential()
model.add(layers.Conv2D(4, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2), strides=2))
model.add(layers.Conv2D(4, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2), strides=2))
model.add(layers.Flatten())
model.add(layers.Dense(10, activation='softmax'))
Let's take a quick look at a summary of the model:
# summary of the model
model.summary()
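The summary reports 1,198 trainable parameters in total. As a quick check (my own arithmetic, not part of the original article), they break down as follows: the first Conv2D layer has 4 x (3 x 3 x 1) + 4 = 40 parameters and the second has 4 x (3 x 3 x 4) + 4 = 148. Since these convolutions use Keras's default 'valid' padding, the spatial size shrinks from 28 to 26, to 13 after pooling, to 11, and to 5 after the second pooling, so the Flatten layer outputs 5 x 5 x 4 = 100 features and the Dense layer adds 100 x 10 + 10 = 1,010 parameters; 40 + 148 + 1,010 = 1,198.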
All in all, we have two convolutional layers, two max pooling layers, a Flatten layer and a fully connected layer. Now that our model is ready, we will compile it:
# compiling the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
We are using the Adam optimizer, which you can change as well. The loss function is set to categorical cross-entropy, since we are solving a multi-class classification problem, and the metric is 'accuracy'. Now let's train the model for 10 epochs:
# training the model
history = model.fit(train_images, train_labels, epochs=10, validation_data=(test_images, test_labels))
All in all, the training loss starts at about 0.46 and drops to 0.08 after 10 epochs. The training and validation accuracies after 10 epochs are 97.31% and 97.48%, respectively.
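If you want to see this trend at a glance, a small optional addition (not in the original article; it assumes the TensorFlow 2.x history key names 'accuracy' and 'val_accuracy') plots the curves stored in the history object:

# plotting the training and validation accuracy over the 10 epochs
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()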
So, this is how we can train a CNN in TensorFlow.
The above is how to train an image classification model in PyTorch and TensorFlow, as shared by the editor. If you happen to have similar doubts, the analysis above should help you resolve them. If you want to know more, you are welcome to follow the industry information channel.