

Understanding Network Parameter Initialization in PyTorch


Many newcomers are unclear about how network parameters are initialized in basic PyTorch. To help with this, the following walks through parameter access, initialization, and sharing in detail; I hope you will find it useful.

Parameter access and traversal:

Since Sequential inherits from Module, we can access all of a model's parameters with Module's parameters() or named_parameters() methods.
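The snippets below assume a small example model like the following; the original article never shows the definition of net, so this two-layer network, the input X, and the scalar output Y are assumptions added for illustration:

import torch
from torch import nn
from torch.nn import init

# Assumed example model; the article does not define net itself.
net = nn.Sequential(
    nn.Linear(4, 3),
    nn.ReLU(),
    nn.Linear(3, 1),
)

X = torch.rand(2, 4)   # assumed input batch
Y = net(X).sum()       # scalar output, used for backward() later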

For example, for a network built with Sequential, you can iterate over all of its parameters directly with a for loop:

for name, param in net.named_parameters():
    print(name, param.size())

You can also index into the network to access the parameters of a single layer, since the network itself is built up layer by layer:

for name, param in net[0].named_parameters():
    print(name, param.size(), type(param))
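With the assumed net above, this per-layer loop would print something like the following (the layer prefix such as "0." only appears when iterating over the whole network):

# weight torch.Size([3, 4]) <class 'torch.nn.parameter.Parameter'>
# bias torch.Size([3]) <class 'torch.nn.parameter.Parameter'>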

Once we have the parameters of a given layer, we can use the data and grad attributes to access their values and gradients:

weight_0 = list(net[0].parameters())[0]
print(weight_0.data)
print(weight_0.grad)   # before backpropagation the gradient is None
Y.backward()
print(weight_0.grad)

Parameter initialization:

When iterating over the parameters of each layer with a for loop, we can set initial values for the weights w and biases b as follows:

# init is torch.nn.init, imported in the setup above
for name, param in net.named_parameters():
    if 'weight' in name:
        init.normal_(param, mean=0, std=0.01)
        print(name, param.data)

for name, param in net.named_parameters():
    if 'bias' in name:
        init.constant_(param, val=0)
        print(name, param.data)
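An equivalent idiom, not shown in the original article, is to pass an initialization function to Module.apply, which visits every submodule:

def init_linear(m):
    # initialize only the fully connected layers
    if isinstance(m, nn.Linear):
        init.normal_(m.weight, mean=0, std=0.01)
        init.constant_(m.bias, val=0)

net.apply(init_linear)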

Of course, we can also write a custom initialization function. The one below draws each weight uniformly from [-10, 10] and then zeroes every entry whose absolute value is less than 5:

def init_weight_(tensor):
    with torch.no_grad():
        tensor.uniform_(-10, 10)
        tensor *= (tensor.abs() >= 5).float()

for name, param in net.named_parameters():
    if 'weight' in name:
        init_weight_(param)
        print(name, param.data)
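With the assumed net, a quick check of the effect (a hedged sketch): after this initialization every weight entry should be either exactly 0 or have magnitude of at least 5:

for name, param in net.named_parameters():
    if 'weight' in name:
        ok = ((param.data == 0) | (param.data.abs() >= 5)).all()
        print(name, bool(ok))  # expected: True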

Note the use of torch.no_grad() here.

Inside a no_grad() block, operations are not recorded for backpropagation, so the update does not affect gradient computation; this is also commonly used when part of a network's parameters are kept fixed.

For more detail, see the PyTorch documentation on no_grad().
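As a minimal sketch of why the no_grad() wrapper matters (an illustration added here, not from the original article): an in-place update of a leaf tensor that requires grad is only allowed when autograd is not recording:

w = torch.randn(3, requires_grad=True)
# w.fill_(1.0)  # RuntimeError: a leaf Variable that requires grad
#               # is being used in an in-place operation
with torch.no_grad():
    w.fill_(1.0)        # allowed: the update is not tracked by autograd
print(w.requires_grad)  # True: subsequent computations are still tracked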

Shared parameters:

One way to share parameters is to define a custom Module class whose forward calls the same layer instance multiple times.

As shown in the code in the previous section:

class FancyMLP(nn.Module):
    def __init__(self, **kwargs):
        super(FancyMLP, self).__init__(**kwargs)
        # untrainable parameter (a constant)
        self.rand_weight = torch.rand((20, 20), requires_grad=False)
        self.linear = nn.Linear(20, 20)

    def forward(self, x):
        x = self.linear(x)
        # use the constant parameter created above, along with the relu
        # and mm functions from nn.functional
        x = nn.functional.relu(torch.mm(x, self.rand_weight.data) + 1)
        # reuse the fully connected layer; this is equivalent to two
        # fully connected layers sharing parameters
        x = self.linear(x)
        # control flow: call item() to get a Python scalar for comparison
        while x.norm().item() > 1:
            x /= 2
        if x.norm().item() < 0.8:
            x *= 10
        return x.sum()
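A quick usage check (the input shape is assumed from the layer sizes; the original excerpt does not show it):

fancy = FancyMLP()
X2 = torch.rand(2, 20)   # assumed input matching the 20-unit layers
print(fancy(X2))         # a scalar tensor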

As you can see, this calls the same Linear instance twice within one network, so in effect the two fully connected layers share parameters.

So note that if the layers passed into a Sequential are the same Module instance, they share parameters as well.
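A minimal sketch of this (assumed shapes, not from the original article): passing the same Linear instance to Sequential twice means both positions refer to one set of parameters, and gradients from both uses accumulate in it:

linear = nn.Linear(1, 1)
shared = nn.Sequential(linear, linear)
# parameters() deduplicates shared tensors: 2 entries (weight, bias), not 4
print(len(list(shared.parameters())))            # 2
shared(torch.ones(1, 1)).sum().backward()
print(shared[0].weight is shared[1].weight)      # True: one shared tensor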
