PyTorch from Zero: What Are the Basics of Building Models?

2025-04-18 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article introduces the basics of building models in PyTorch for readers starting from zero. Many people get stuck on these points when working through real cases, so the sections below walk through each of them with a small runnable example.

1. Building a neural network

In PyTorch, neural networks are generally built on the Module class, which makes model construction flexible. Module is the model-construction class provided by the nn module. It is the base class of all neural network modules, and we can subclass it to define the model we want.

Below, we subclass Module to build a multi-layer perceptron. The MLP class overrides the __init__ and forward methods of Module, which create the model parameters and define the forward computation, respectively. Forward computation is also called forward propagation.

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 09:43:21 2021
@author: 86493
"""
import torch
from torch import nn

class MLP(nn.Module):
    # declare the layers that hold model parameters;
    # two fully connected layers are declared here
    def __init__(self, **kwargs):
        # call the constructor of the parent class Module for the necessary
        # initialization, so that other arguments can be passed when
        # constructing an instance
        super(MLP, self).__init__(**kwargs)
        self.hidden = nn.Linear(784, 256)
        self.act = nn.ReLU()
        self.output = nn.Linear(256, 10)

    # define the forward computation of the model, that is,
    # how to compute the model output from the input x
    def forward(self, x):
        o = self.act(self.hidden(x))
        return self.output(o)

X = torch.rand(2, 784)
net = MLP()
print(net)
print('-' * 60)
print(net(X))

The result is:

MLP(
  (hidden): Linear(in_features=784, out_features=256, bias=True)
  (act): ReLU()
  (output): Linear(in_features=256, out_features=10, bias=True)
)
tensor([[ 0.1836,  0.1946,  0.0924, -0.1163, -0.2914, -0.1103, -0.0839, -0.1274,
          0.1618, -0.0601],
        [ 0.0738,  0.2369,  0.0225, -0.1514, -0.3787, -0.0551, -0.0836, -0.0496,
          0.1481,  0.0139]], grad_fn=<...>)

Note:

(1) The MLP class above does not need to define a backward function. Autograd automatically generates the backward computation needed for backpropagation.

(2) Passing the data X to the net object (an instance of MLP) performs one forward computation. net(X) invokes the __call__ method that MLP inherits from its parent class Module, and that method calls the forward method we defined in MLP to carry out forward propagation.

(3) The Module class is not named Layer or Model because it is a freely composable component: a subclass can be a layer (such as nn.Linear), a model (such as the MLP class here), or a part of a model.
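Notes (1) and (2) can be checked directly. The following is a minimal sketch, assuming a small hypothetical module TinyNet that is not part of the original example. It checks that calling the instance gives the same result as calling forward, and that gradients appear without any hand-written backward function.

```python
import torch
from torch import nn

class TinyNet(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

torch.manual_seed(0)
tiny = TinyNet()
x = torch.rand(3, 4)

# tiny(x) goes through Module.__call__, which dispatches to forward
same = torch.equal(tiny(x), tiny.forward(x))
print(same)

# no backward method was written; autograd derives it from forward
tiny(x).sum().backward()
grad_shape = tiny.linear.weight.grad.shape
print(grad_shape)
```

In practice, prefer tiny(x) over tiny.forward(x), since __call__ also runs any registered hooks.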

2. Common layers in neural networks

Common layers include fully connected layers, convolution layers, pooling layers and recurrent layers. Let's learn to define layers with Module.

2.1 Layers without model parameters

The MyLayer class below subclasses Module to define a layer that subtracts the mean from its input; the computation is defined in the forward function. This layer contains no model parameters.

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 10:19:59 2021
@author: 86493
"""
import torch
from torch import nn

class MyLayer(nn.Module):
    def __init__(self, **kwargs):
        # call the parent constructor
        super(MyLayer, self).__init__(**kwargs)

    def forward(self, x):
        return x - x.mean()

# test: instantiate the layer and run the forward computation
layer = MyLayer()
layer1 = layer(torch.tensor([1, 2, 3, 4, 5], dtype=torch.float))
print(layer1)

The result is:

tensor([-2., -1.,  0.,  1.,  2.])

2.2 Layers with model parameters

You can also define a custom layer that has model parameters. These parameters are learned through training.

The Parameter class is a subclass of Tensor. If a Tensor is a Parameter, it is automatically added to the model's parameter list, so when we write a custom layer with model parameters we should define those parameters as Parameter. Besides defining a Parameter directly, we can use ParameterList and ParameterDict to define a list and a dictionary of parameters, respectively.
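The registration behaviour described above can be seen in a short sketch. The Scale class and its attribute names are hypothetical, chosen only for illustration: the attribute wrapped in nn.Parameter shows up in named_parameters(), while the plain tensor attribute does not.

```python
import torch
from torch import nn

class Scale(nn.Module):  # hypothetical layer, for illustration only
    def __init__(self):
        super().__init__()
        # wrapped in nn.Parameter: registered and returned by parameters()
        self.weight = nn.Parameter(torch.ones(3))
        # plain tensor attribute: not registered as a parameter
        self.offset = torch.zeros(3)

    def forward(self, x):
        return x * self.weight + self.offset

scale = Scale()
names = [name for name, _ in scale.named_parameters()]
print(names)
print(isinstance(scale.weight, torch.Tensor))
```

Only registered parameters are handed to an optimizer through parameters(), so a plain tensor attribute like offset would not be trained.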

PS: the examples below use torch.mm, which multiplies two matrices, for example:

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 10:56:03 2021
@author: 86493
"""
import torch

a = torch.randn(2, 3)
b = torch.randn(3, 2)
print(torch.mm(a, b))
# same effect
print(torch.matmul(a, b))
# tensor([[1.8368, 0.4065],
#         [2.7972, 2.3096]])
# tensor([[1.8368, 0.4065],
#         [2.7972, 2.3096]])

(1) Code example 1

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 10:33:04 2021
@author: 86493
"""
import torch
from torch import nn

class MyListDense(nn.Module):
    def __init__(self):
        super(MyListDense, self).__init__()
        # three 4x4 parameters initialized with randn
        self.params = nn.ParameterList(
            [nn.Parameter(torch.randn(4, 4)) for i in range(3)])
        self.params.append(nn.Parameter(torch.randn(4, 1)))

    def forward(self, x):
        for i in range(len(self.params)):
            # mm is matrix multiplication
            x = torch.mm(x, self.params[i])
        return x

net = MyListDense()
print(net)

Printed as follows:

MyListDense(
  (params): ParameterList(
      (0): Parameter containing: [torch.FloatTensor of size 4x4]
      (1): Parameter containing: [torch.FloatTensor of size 4x4]
      (2): Parameter containing: [torch.FloatTensor of size 4x4]
      (3): Parameter containing: [torch.FloatTensor of size 4x1]
  )
)

(2) Code example 2

This time, use a parameter dictionary:

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 11:03:29 2021
@author: 86493
"""
import torch
from torch import nn

class MyDictDense(nn.Module):
    def __init__(self):
        super(MyDictDense, self).__init__()
        self.params = nn.ParameterDict({
            'linear1': nn.Parameter(torch.randn(4, 4)),
            'linear2': nn.Parameter(torch.randn(4, 1))
        })
        # add a new entry
        self.params.update({'linear3': nn.Parameter(torch.randn(4, 2))})

    def forward(self, x, choice='linear1'):
        return torch.mm(x, self.params[choice])

net = MyDictDense()
print(net)

Printed as follows:

MyDictDense(
  (params): ParameterDict(
      (linear1): Parameter containing: [torch.FloatTensor of size 4x4]
      (linear2): Parameter containing: [torch.FloatTensor of size 4x1]
      (linear3): Parameter containing: [torch.FloatTensor of size 4x2]
  )
)

2.3 Two-dimensional convolution layer

A two-dimensional convolution layer computes the cross-correlation between the input and the convolution kernel, then adds a scalar bias to produce the output. The model parameters of a convolution layer are the convolution kernel and the scalar bias. When training the model, we usually initialize the kernel randomly and then update the kernel and bias iteratively.

A convolution layer whose convolution window has shape $p \times q$ is called a $p \times q$ convolution layer. Likewise, a $p \times q$ convolution, or a $p \times q$ convolution kernel, means that the height and width of the kernel are $p$ and $q$, respectively.

(1) Padding can increase the height and width of the output. It is often used to give the output the same height and width as the input.

(2) Stride can reduce the height and width of the output, for example to only $1/n$ of the height and width of the input ($n$ being an integer greater than 1).
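The effect of padding and stride on the output size can be written as a formula. The sketch below assumes the usual relation floor((n - k + 2p) / s) + 1 for input size n, kernel size k, padding p on each side and stride s (dilation 1). The helper name conv_output_size is hypothetical.

```python
def conv_output_size(n, k, p=0, s=1):
    """Output length along one dimension.

    n: input size, k: kernel size, p: padding on each side, s: stride.
    """
    return (n - k + 2 * p) // s + 1

# padding chosen so that the output size equals the input size
print(conv_output_size(8, 3, p=1))        # 8
# a stride greater than 1 shrinks the output
print(conv_output_size(8, 3, p=1, s=2))   # 4
```

The same function is applied separately to the height and to the width when the kernel, padding or stride differ between the two dimensions.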

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 11:20:57 2021
@author: 86493
"""
import torch
from torch import nn

# convolution operation (2D cross-correlation)
def corr2d(X, K):
    h, w = K.shape
    X, K = X.float(), K.float()
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i: i + h, j: j + w] * K).sum()
    return Y

# 2D convolution layer
class Conv2D(nn.Module):
    def __init__(self, kernel_size):
        super(Conv2D, self).__init__()
        self.weight = nn.Parameter(torch.randn(kernel_size))
        self.bias = nn.Parameter(torch.randn(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
print(conv2d)

This prints:

Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

Padding means adding elements (usually zeros) on both sides of the input along its height and width.

Next example: create a 2D convolution layer whose kernel has height and width 3, and set the padding on each side of the input height and width to 1. Given an input with height and width 8, the output height and width are also 8.

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 11:54:29 2021
@author: 86493
"""
import torch
from torch import nn

# helper that applies a convolution layer to a 2D input,
# adding and then removing the batch and channel dimensions
def comp_conv2d(conv2d, X):
    # (1, 1) stands for batch size and number of channels
    X = X.view((1, 1) + X.shape)
    Y = conv2d(X)
    # drop the first two dimensions we do not care about: batch and channel
    return Y.view(Y.shape[2:])

# note: one row or column is padded on each side,
# so 2 rows and 2 columns are added in total
conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
X = torch.rand(8, 8)
endshape = comp_conv2d(conv2d, X).shape
print(endshape)

# use a kernel with height 5 and width 3; the padding on each side
# of the height and width is 2 and 1, respectively
conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(5, 3), padding=(2, 1))
endshape2 = comp_conv2d(conv2d, X).shape
print(endshape2)

The result is the same for both layers:

torch.Size([8, 8])

Stride

In the two-dimensional cross-correlation operation, the convolution window starts at the top left of the input array and slides over it from left to right and from top to bottom. The number of rows and columns the window moves on each slide is called the stride.

# stride
conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(3, 5), padding=(0, 1), stride=(3, 4))
endshape3 = comp_conv2d(conv2d, X).shape
print(endshape3)  # torch.Size([2, 2])

2.4 Pooling layer

A pooling layer computes its output from the elements in a fixed-shape window (the pooling window) of the input, one window at a time. Unlike a convolution layer, which computes the cross-correlation between the input and a kernel, a pooling layer directly computes the maximum or the average of the elements in the pooling window. These operations are called max pooling and average pooling, respectively.

In two-dimensional max pooling, the pooling window starts at the top left of the input array and slides over it from left to right and from top to bottom. At each position, the maximum of the input subarray inside the window becomes the element at the corresponding position of the output array.

Let's implement the forward computation of the pooling layer in the pool2d function.

Max pooling:

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 18:49:27 2021
@author: 86493
"""
import torch
from torch import nn

def pool2d(X, pool_size, mode='max'):
    p_h, p_w = pool_size
    Y = torch.zeros((X.shape[0] - p_h + 1, X.shape[1] - p_w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            if mode == 'max':
                Y[i, j] = X[i: i + p_h, j: j + p_w].max()
            elif mode == 'avg':
                Y[i, j] = X[i: i + p_h, j: j + p_w].mean()
    return Y

X = torch.Tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
end = pool2d(X, (2, 2))  # max pooling is the default
# end = pool2d(X, (2, 2), mode='avg')
print(end)

tensor([[4., 5.],
        [7., 8.]])

Average pooling:

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 18:49:27 2021
@author: 86493
"""
import torch
from torch import nn

def pool2d(X, pool_size, mode='max'):
    p_h, p_w = pool_size
    Y = torch.zeros((X.shape[0] - p_h + 1, X.shape[1] - p_w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            if mode == 'max':
                Y[i, j] = X[i: i + p_h, j: j + p_w].max()
            elif mode == 'avg':
                Y[i, j] = X[i: i + p_h, j: j + p_w].mean()
    return Y

X = torch.FloatTensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
# end = pool2d(X, (2, 2))  # max pooling is the default
end = pool2d(X, (2, 2), mode='avg')
print(end)

The results are shown below. Note that in avg mode (average pooling) you should not write X = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]]): the integer literals produce a Long tensor, and the error "Can only calculate the mean of floating types. Got Long instead." is reported. Change tensor to Tensor or FloatTensor (torch.Tensor is an alias for torch.FloatTensor).

tensor([[2., 3.],
        [5., 6.]])
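For comparison, the built-in pooling layers can reproduce both results. This is a sketch, not the article's own code: it assumes nn.MaxPool2d and nn.AvgPool2d with stride=1, because their default stride equals the window size while the hand-written pool2d moves one element at a time. The names X4, max_out and avg_out are chosen here for illustration.

```python
import torch
from torch import nn

# the float dtype is set explicitly so that average pooling works
X = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]], dtype=torch.float)
# built-in layers expect (batch, channel, height, width)
X4 = X.view(1, 1, 3, 3)

# stride=1 matches the sliding step of the hand-written pool2d
max_out = nn.MaxPool2d(kernel_size=2, stride=1)(X4).view(2, 2)
avg_out = nn.AvgPool2d(kernel_size=2, stride=1)(X4).view(2, 2)
print(max_out)
print(avg_out)
```

Passing dtype=torch.float to torch.tensor is another way to avoid the Long tensor error mentioned above.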

3. LeNet model example

The typical training process of a neural network is as follows:

1. Define a neural network that contains some learnable parameters (or weights)

2. Iterate over the input dataset

3. Process the input through the network

4. Calculate the loss (the distance between the output and the correct answer)

5. Backpropagate the gradients to the parameters of the network

6. Update the weights of the network, generally with a simple rule: weight = weight - learning_rate * gradient
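The six steps above can be sketched on a toy problem. Everything in this example is an assumption made for illustration: the data follow y = 2x + 1, the network is a single nn.Linear layer, the loss is nn.MSELoss, and the learning rate and number of epochs are arbitrary.

```python
import torch
from torch import nn

torch.manual_seed(0)
# toy data (assumed for illustration): targets follow y = 2x + 1
inputs = torch.linspace(-1, 1, 20).unsqueeze(1)
targets = 2 * inputs + 1

model = nn.Linear(1, 1)            # step 1: a network with learnable parameters
loss_fn = nn.MSELoss()
learning_rate = 0.1
losses = []

for epoch in range(100):           # step 2: iterate over the data
    pred = model(inputs)           # step 3: forward pass
    loss = loss_fn(pred, targets)  # step 4: compute the loss
    model.zero_grad()              # clear gradients left from the last step
    loss.backward()                # step 5: backpropagate
    with torch.no_grad():          # step 6: weight = weight - learning_rate * gradient
        for param in model.parameters():
            param -= learning_rate * param.grad
    losses.append(loss.item())

print(losses[0], losses[-1])
```

In real projects the manual update in step 6 is usually replaced by an optimizer such as torch.optim.SGD, which applies the same rule.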

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 19:21:19 2021
@author: 86493
"""
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    # layers with learnable parameters must be created in the constructor __init__
    def __init__(self):
        super(LeNet, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 convolution kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # 2x2 max pooling
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # for a square window, a single number is enough
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # flatten
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        # drop the batch dimension and keep all the others
        size = x.size()[1:]
        num_features = 1
        # multiply the remaining dimensions together
        for s in size:
            num_features *= s
        return num_features

net = LeNet()
print(net)

# the learnable parameters of a model are returned by net.parameters()
params = list(net.parameters())
print("len of params:", len(params))
# print("params:\n", params)
print(params[0].size())  # conv1 weight
print('-' * 60)

# a random 32x32 input
input = torch.randn(1, 1, 32, 32)
out = net(input)
print("output of the network is:", out)
print('-' * 60)

# backpropagate with random gradients
net.zero_grad()  # zero the gradient buffers
end = out.backward(torch.randn(1, 10))
print(end)  # None

The printed results are:

LeNet(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
len of params: 10
torch.Size([6, 1, 5, 5])
output of the network is: tensor([[ 0.0904,  0.0866,  0.0851, -0.0176,  0.0198,  0.0530,  0.0815,  0.0284,
         -0.0216, -0.0425]], grad_fn=<...>)
None
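The value in_features=400 of fc1 follows from the 32 x 32 input used above. The sketch below traces the spatial size through the layers, assuming no padding, convolution stride 1, and pooling whose stride equals the window size (the defaults used in the LeNet code). The helper names conv_out and pool_out are hypothetical.

```python
def conv_out(n, k, p=0, s=1):
    # output size of a convolution along one dimension
    return (n - k + 2 * p) // s + 1

def pool_out(n, k):
    # max pooling with window k; the stride defaults to the window size
    return (n - k) // k + 1

size = 32                    # input height and width
size = conv_out(size, 5)     # conv1: 32 -> 28
size = pool_out(size, 2)     # 2x2 max pool: 28 -> 14
size = conv_out(size, 5)     # conv2: 14 -> 10
size = pool_out(size, 2)     # 2x2 max pool: 10 -> 5
flat_features = 16 * size * size
print(size, flat_features)   # 5 400
```

The parameter count of 10 is likewise explained by the five layers, each contributing one weight tensor and one bias tensor.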

Three reminders:

(1) You only need to define the forward function. The backward function, which computes the derivatives, is defined automatically by autograd. Any tensor operation can be used in the forward function.

(2) Call net.zero_grad() before backward to clear the gradient buffers of all parameters; otherwise gradients accumulate across calls.

(3) torch.nn only supports mini-batches. The whole torch.nn package expects a mini-batch of samples as input, not a single sample. For example, nn.Conv2d accepts a 4-dimensional tensor of shape nSamples x nChannels x Height x Width. If you have a single sample, use input.unsqueeze(0) to add a "fake" batch dimension.
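Reminder (3) looks like this in practice. It is a minimal sketch with an assumed single-channel 32 x 32 sample; the variable names are chosen here for illustration.

```python
import torch
from torch import nn

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)

# a single sample: channels x height x width, with no batch dimension
sample = torch.randn(1, 32, 32)
# add a batch dimension of size 1 at position 0
batch = sample.unsqueeze(0)
out = conv(batch)
print(batch.shape)
print(out.shape)
```

The output keeps the batch dimension; out.squeeze(0) removes it again if a single-sample result is needed.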

torch.Tensor: a multidimensional array that supports automatic differentiation operations such as backward(), and also holds the gradient of the tensor.

nn.Module: a neural network module. It is a convenient way to encapsulate parameters, with helpers for moving them to the GPU, exporting, loading, and so on.

nn.Parameter: a kind of tensor that is automatically registered as a parameter when assigned to a Module as an attribute.

autograd.Function: implements the forward and backward definitions of an autograd operation. Every Tensor operation creates at least one Function node, which connects to the functions that created the Tensor and encodes its history.
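To make the last entry concrete, here is a minimal sketch of a custom torch.autograd.Function. The MySquare class is hypothetical and only illustrates how forward and backward are paired; it is not taken from the original article.

```python
import torch

class MySquare(torch.autograd.Function):  # hypothetical example Function
    @staticmethod
    def forward(ctx, x):
        # save the input; it is needed to compute the gradient
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # d(x^2)/dx = 2x, chained with the incoming gradient
        return grad_output * 2 * x

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = MySquare.apply(x)
y.sum().backward()
print(x.grad)
```

Custom Functions are called through apply rather than by instantiating the class. For ordinary tensor operations this is unnecessary, since autograd already provides the backward pass.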

4. AlexNet model example

# -*- coding: utf-8 -*-
"""
Created on Sat Oct 16 21:00:39 2021
@author: 86493
"""
import torch
from torch import nn

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.conv = nn.Sequential(
            # in_channels, out_channels, kernel_size, stride, padding
            nn.Conv2d(1, 96, 11, 4),
            nn.ReLU(),
            # kernel_size, stride
            nn.MaxPool2d(3, 2),
            # shrink the convolution window, use padding=2 so that the input
            # and output have the same height and width, and increase the
            # number of output channels
            nn.Conv2d(96, 256, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2d(3, 2),
            # three consecutive convolution layers with an even smaller window;
            # except for the last one, the number of output channels grows further.
            # Note: no pooling layer follows the first two of these convolution
            # layers to reduce the height and width
            nn.Conv2d(256, 384, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(384, 384, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(384, 256, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(3, 2)
        )
        # the fully connected layers have several times more outputs than in LeNet;
        # dropout layers are used to reduce overfitting
        self.fc = nn.Sequential(
            nn.Linear(256 * 5 * 5, 4096),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(),
            nn.Dropout(0.5),
            # output layer: Fashion-MNIST will be used later, so the number
            # of classes is set to 10 instead of 1000
            nn.Linear(4096, 10),
        )

    def forward(self, img):
        feature = self.conv(img)
        output = self.fc(feature.view(img.shape[0], -1))
        return output

net = AlexNet()
print(net)

You can see the structure of the network:

AlexNet(
  (conv): Sequential(
    (0): Conv2d(1, 96, kernel_size=(11, 11), stride=(4, 4))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(96, 256, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU()
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(256, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU()
    (8): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU()
    (10): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU()
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc): Sequential(
    (0): Linear(in_features=6400, out_features=4096, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU()
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=10, bias=True)
  )
)
