How to resolve a mismatch between a pre-trained model loaded in PyTorch and your own model

2025-01-22 Update. From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/01

This article explains what to do when a pre-trained model loaded in PyTorch does not match your own model. The steps are simple and quick to apply, so let's take a look.

Finding where two ordered dictionaries differ

Both the model's parameters (its state_dict) and the contents of the pth file are ordered dictionaries (OrderedDict). Converting their keys to lists lets you walk both in a for loop and find where they differ.
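As a self-contained illustration of that comparison, the sketch below runs it on two toy networks instead of a real checkpoint. The helper name `shape_mismatches` and the toy layers are hypothetical and only stand in for the pth file and your own model.

```python
import torch.nn as nn


def shape_mismatches(dict_a, dict_b):
    """Compare two state dicts position by position and list the entries whose shapes differ."""
    keys_a, keys_b = list(dict_a.keys()), list(dict_b.keys())
    mismatches = []
    for idx, (ka, kb) in enumerate(zip(keys_a, keys_b)):
        shape_a, shape_b = tuple(dict_a[ka].shape), tuple(dict_b[kb].shape)
        if shape_a != shape_b:
            mismatches.append((idx, ka, shape_a, kb, shape_b))
    return mismatches


# Toy stand-ins (assumption): the "checkpoint" ends in 2 outputs, the "own model" in 3.
checkpoint_like = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2)).state_dict()
own_model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 3))

result = shape_mismatches(checkpoint_like, own_model.state_dict())
for row in result:
    print(row)
```

Reporting the index, both key names, and both shapes makes it easier to see which layer of your own network needs to change than a bare error flag does.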

    import torch

    model = ResNet18(1)  # ResNet18 is your own network class
    model_dict1 = torch.load('resnet18.pth')   # weights from the pth file
    model_dict2 = model.state_dict()           # weights of your own model
    model_list1 = list(model_dict1.keys())
    model_list2 = list(model_dict2.keys())
    len1 = len(model_list1)
    len2 = len(model_list2)
    minlen = min(len1, len2)
    for n in range(minlen):
        if model_dict1[model_list1[n]].shape != model_dict2[model_list2[n]].shape:
            err = 1  # shapes differ at position n; set a breakpoint or print here

Precautions for building your own model

When building the network, it must match the pth file: the order of the dictionary entries, the weight sizes (shape), and the variable names all have to be exactly the same as in the pth file. If only the variable names differ, you can use a similar loop to reassign the weights to the model.

    import torch

    model = ResNet18(1)  # ResNet18 is your own network class
    model_dict1 = torch.load('resnet18.pth')   # weights from the pth file
    model_dict2 = model.state_dict()           # weights of your own model
    model_list1 = list(model_dict1.keys())
    model_list2 = list(model_dict2.keys())
    len1 = len(model_list1)
    len2 = len(model_list2)
    minlen = min(len1, len2)
    for n in range(minlen):
        if model_dict1[model_list1[n]].shape != model_dict2[model_list2[n]].shape:
            continue
        # copy the pre-trained weight into the model's dictionary under the model's own key
        model_dict2[model_list2[n]] = model_dict1[model_list1[n]]
    model.load_state_dict(model_dict2)
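The positional reassignment above can be tried without a checkpoint file. The following is a minimal sketch under stated assumptions: the classes `Pretrained` and `Mine` are hypothetical toy networks whose layers have the same shapes but different attribute names, and `source` plays the role of the dictionary returned by torch.load.

```python
import torch
import torch.nn as nn


class Pretrained(nn.Module):
    """Toy stand-in for the network that produced the pth file."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3)
        self.fc = nn.Linear(4, 2)


class Mine(nn.Module):
    """Same layer shapes, different variable names."""

    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(3, 4, kernel_size=3)
        self.head = nn.Linear(4, 2)


source = Pretrained().state_dict()  # stands in for torch.load('resnet18.pth')
model = Mine()
target = model.state_dict()

# Pair the keys by position and copy only where the shapes agree.
for src_key, dst_key in zip(source.keys(), target.keys()):
    if source[src_key].shape == target[dst_key].shape:
        target[dst_key] = source[src_key].clone()

model.load_state_dict(target)
print(torch.equal(model.stem.weight, source["conv.weight"]))
```

Only the values of existing keys are overwritten, so the dictionary that goes into load_state_dict still has exactly the key names the model expects.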

For the complete code, see the article on building a resnet18 network and loading torchvision's own weights.

The new improved code

    model_dict1 = torch.load('yolov5.pth')
    model_dict2 = model.state_dict()  # model: your own network instance
    model_list1 = list(model_dict1.keys())
    model_list2 = list(model_dict2.keys())
    len1 = len(model_list1)
    len2 = len(model_list2)
    m, n = 0, 0
    while True:
        if m >= len1 or n >= len2:
            break
        layername1, layername2 = model_list1[m], model_list2[n]
        w1, w2 = model_dict1[layername1], model_dict2[layername2]
        if w1.shape != w2.shape:
            n += 1  # skip this model entry; use m += 1 instead to skip the pth entry
            continue
        model_dict2[layername2] = model_dict1[layername1]
        m += 1
        n += 1
    model.load_state_dict(model_dict2)

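When the shapes do not match, m or n has to advance by 1 before the continue statement inside the shape check; otherwise the loop never terminates. Whether to advance m (skip an entry of the pth file) or n (skip an entry of your own model) depends on your situation.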
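The two-index walk can be wrapped in a function so that the choice of which side to skip becomes a parameter. This is a sketch under stated assumptions, not the article's code: the name `align_and_copy`, the `skip` argument, and the toy networks (a model with an extra BatchNorm1d layer that the checkpoint lacks) are all hypothetical.

```python
import torch
import torch.nn as nn


def align_and_copy(src, dst, skip="dst"):
    """Walk two state dicts with separate indices and copy where shapes agree.

    On a shape mismatch only one index advances: skip="dst" skips the model's
    entry, skip="src" skips the checkpoint's entry. Returns the number of
    tensors copied into dst.
    """
    src_keys, dst_keys = list(src.keys()), list(dst.keys())
    m = n = copied = 0
    while m < len(src_keys) and n < len(dst_keys):
        w1, w2 = src[src_keys[m]], dst[dst_keys[n]]
        if w1.shape != w2.shape:
            if skip == "dst":
                n += 1
            else:
                m += 1
            continue
        dst[dst_keys[n]] = w1.clone()
        copied += 1
        m += 1
        n += 1
    return copied


# Toy setup (assumption): the model has an extra normalization layer in the middle.
checkpoint = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2)).state_dict()
model = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8), nn.Linear(8, 2))

state = model.state_dict()
copied = align_and_copy(checkpoint, state, skip="dst")
model.load_state_dict(state)
print(copied, torch.equal(model[2].weight, checkpoint["1.weight"]))
```

Matching on shape alone can pair unrelated tensors that merely happen to have the same size, so it is worth printing the matched key names once and checking them by eye before trusting the loaded weights.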

Addendum: PyTorch pitfalls. Using the features from some layers of a pre-trained VGG model raises errors such as a tensor mismatch

Look at the code:

    # To take the output of the second fully connected layer of VGG19, build a class
    # that contains all the convolution layers of VGG and the classifier layers up to
    # the second fully connected layer, with their corresponding parameters.
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision import models

    import common  # the author's own module that provides MeanShift


    class Classification_att(nn.Module):
        def __init__(self, rgb_range):
            super(Classification_att, self).__init__()
            self.vgg19 = models.vgg19(pretrained=True)
            vgg = models.vgg19(pretrained=True).features
            conv_modules = [m for m in vgg]
            self.vgg_conv = nn.Sequential(*conv_modules[:37])
            classfi = models.vgg19(pretrained=True).classifier
            classif_modules = [n for n in classfi]
            self.vgg_class = nn.Sequential(*classif_modules[:4])
            vgg_mean = (0.485, 0.456, 0.406)
            vgg_std = (0.229 * rgb_range, 0.224 * rgb_range, 0.225 * rgb_range)
            self.sub_mean = common.MeanShift(rgb_range, vgg_mean, vgg_std)
            # freeze the pre-trained layers
            for p in self.vgg_conv.parameters():
                p.requires_grad = False
            for p in self.vgg_class.parameters():
                p.requires_grad = False
            self.classifi = nn.Sequential(
                nn.Linear(4096, 1024),
                nn.ReLU(True),
                nn.Linear(1024, 256),
                nn.ReLU(True),
                nn.Linear(256, 64),
            )

        def forward(self, x):
            x = F.interpolate(x, size=[224, 224], scale_factor=None,
                              mode='bilinear', align_corners=False)
            x = self.sub_mean(x)
            x = self.vgg_conv(x)
            x = self.vgg_class(x)  # this line raises the error: the tensors do not match

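The reason is that the output of a convolution layer cannot be fed directly into a fully connected layer, even if the total number of elements is the same. The convolution output is a 4-D tensor of shape (N, C, H, W), while nn.Linear operates on the last dimension only, so the tensor must first be flattened to shape (N, C*H*W).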

Looking at the PyTorch source code of VGG, the forward pass is:

    x = self.features(x)
    x = self.avgpool(x)
    x = torch.flatten(x, 1)
    x = self.classifier(x)

Your own code does not have torch.flatten(x, 1), so it is missing one step: x = torch.flatten(x, 1).

Just add it back!
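The effect of the missing step can be reproduced on a small scale without downloading VGG weights. In this sketch the layers `features` and `classifier` are tiny stand-ins with illustrative sizes, not the real VGG19 modules; only the shapes matter.

```python
import torch
import torch.nn as nn

# Tiny stand-ins for VGG's convolutional part and its first fully connected layer.
features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d((7, 7)),
)
classifier = nn.Linear(8 * 7 * 7, 16)

x = torch.randn(2, 3, 32, 32)
feat = features(x)  # 4-D tensor of shape (2, 8, 7, 7)

failed = False
try:
    classifier(feat)  # Linear sees a last dimension of 7, not 8 * 7 * 7
except RuntimeError as exc:
    failed = True
    print("without flatten:", type(exc).__name__)

out = classifier(torch.flatten(feat, 1))  # (2, 392) goes in, (2, 16) comes out
print(tuple(out.shape))
```

In the Classification_att class the same one-line change applies: call torch.flatten(x, 1) between self.vgg_conv(x) and self.vgg_class(x).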

This is the end of the article on how to resolve a mismatch between a pre-trained model loaded in PyTorch and your own model. Thank you for reading!
