This article walks through the process of converting a Pytorch model to Caffe and then to an om model. The content is fairly detailed; interested readers can use it as a reference, and I hope it is helpful.
Standard network
Baseline: PytorchToCaffe
The main code layout is as follows:
PytorchToCaffe
+-- Caffe
|   +-- caffe.proto
|   +-- layer_param.py
+-- example
|   +-- resnet_pytorch_2_caffe.py
+-- pytorch_to_caffe.py
For direct use, refer to example/resnet_pytorch_2_caffe.py. If all operations in your network are already implemented in the Baseline, you can convert directly to a Caffe model, along the lines of the sketch below.
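As a concrete illustration, here is a minimal conversion sketch in the spirit of example/resnet_pytorch_2_caffe.py. It assumes the trans_net, save_prototxt and save_caffemodel entry points exposed by pytorch_to_caffe.py in the Baseline; the model and input shape are illustrative, so adjust them to your own network and load trained weights before converting.

import torch
from torchvision.models import resnet
import pytorch_to_caffe

name = 'resnet18'
net = resnet.resnet18().eval()              # eval mode so BatchNorm uses running statistics
dummy_input = torch.ones([1, 3, 224, 224])  # tracing input with the deployment shape

pytorch_to_caffe.trans_net(net, dummy_input, name)              # trace the network and build the Caffe net
pytorch_to_caffe.save_prototxt('{}.prototxt'.format(name))      # network definition
pytorch_to_caffe.save_caffemodel('{}.caffemodel'.format(name))  # weights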
Adding custom operations
If you run into an operation that has not been implemented yet, there are two cases to consider.
A corresponding operation exists in Caffe
Take arg_max as an example of how to add an operation.
First, check the parameters of the corresponding layer in Caffe. caffe.proto defines the layers and parameters of the corresponding Caffe version; you can see that ArgMax defines three parameters, out_max_val, top_k and axis:
message ArgMaxParameter {
  // If true produce pairs (argmax, maxval)
  optional bool out_max_val = 1 [default = false];
  optional uint32 top_k = 2 [default = 1];
  // The axis along which to maximise -- may be negative to index from the
  // end (e.g., -1 for the last axis).
  // By default ArgMaxLayer maximizes over the flattened trailing dimensions
  // for each index of the first / num dimension.
  optional int32 axis = 3;
}
These are consistent with the parameters listed in the Caffe operator boundary.
layer_param.py builds an instance of the parameter class during the actual conversion and passes the operation's parameters from Pytorch to Caffe:
def argmax_param(self, out_max_val=None, top_k=None, dim=1):
    argmax_param = pb.ArgMaxParameter()
    if out_max_val is not None:
        argmax_param.out_max_val = out_max_val
    if top_k is not None:
        argmax_param.top_k = top_k
    if dim is not None:
        argmax_param.axis = dim
    self.param.argmax_param.CopyFrom(argmax_param)
The Rp class is defined in pytorch_to_caffe.py to implement the mapping from Pytorch operations to Caffe operations:
class Rp(object):
    def __init__(self, raw, replace, **kwargs):
        self.obj = replace
        self.raw = raw

    def __call__(self, *args, **kwargs):
        if not NET_INITTED:
            return self.raw(*args, **kwargs)
        for stack in traceback.walk_stack(None):
            if 'self' in stack[0].f_locals:
                layer = stack[0].f_locals['self']
                if layer in layer_names:
                    log.pytorch_layer_name = layer_names[layer]
                    print(layer_names[layer])
                    break
        out = self.obj(self.raw, *args, **kwargs)
        return out
When adding an operation, replace the original operation with the Rp class:
torch.argmax = Rp(torch.argmax, torch_argmax)
Next, implement the operation itself:
def torch_argmax(raw, input, dim=1):
    x = raw(input, dim=dim)
    layer_name = log.add_layer(name='argmax')
    top_blobs = log.add_blobs([x], name='argmax_blob'.format(type))
    layer = caffe_net.Layer_param(name=layer_name, type='ArgMax',
                                  bottom=[log.blobs(input)], top=top_blobs)
    layer.argmax_param(dim=dim)
    log.cnet.add_layer(layer)
    return x
This completes the conversion of the argmax operation from Pytorch to Caffe. A quick way to exercise the new mapping is sketched below.
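As a smoke test (a hypothetical example, not part of the Baseline), a tiny module that ends in torch.argmax can be traced with the same entry points as before, once the Rp replacement above has been registered in pytorch_to_caffe.py. The module, class count and input shape are illustrative only.

import torch
import torch.nn as nn
import pytorch_to_caffe

class SegHead(nn.Module):
    # illustrative head: a convolution followed by a per-pixel argmax over classes
    def __init__(self, num_classes=19):
        super(SegHead, self).__init__()
        self.conv = nn.Conv2d(3, num_classes, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.argmax(self.conv(x), dim=1)

net = SegHead().eval()
pytorch_to_caffe.trans_net(net, torch.ones([1, 3, 224, 224]), 'seg_head')
pytorch_to_caffe.save_prototxt('seg_head.prototxt')
pytorch_to_caffe.save_caffemodel('seg_head.caffemodel')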
There is no directly corresponding operation in Caffe
If the operation to be converted does not have a direct corresponding layer implementation in Caffe, there are two main solutions:
1) Decompose the unsupported operation into operations that Pytorch and Caffe both support:
For example, nn.InstanceNorm2d: instance normalization is implemented with BatchNorm during conversion, affine=True and track_running_stats=True are not supported, and the layer is exported with the default use_global_stats: false. However, use_global_stats must be true for the om conversion, so the model reaches Caffe but is not friendly to the subsequent om step.
InstanceNorm normalizes each Channel of the featuremap separately, so nn.InstanceNorm2d can be implemented by hand as:
class InstanceNormalization(nn.Module):
    def __init__(self, dim, eps=1e-5):
        super(InstanceNormalization, self).__init__()
        self.gamma = nn.Parameter(torch.FloatTensor(dim))
        self.beta = nn.Parameter(torch.FloatTensor(dim))
        self.eps = eps
        self._reset_parameters()

    def _reset_parameters(self):
        self.gamma.data.uniform_()
        self.beta.data.zero_()

    def __call__(self, x):
        n = x.size(2) * x.size(3)
        t = x.view(x.size(0), x.size(1), n)
        mean = torch.mean(t, 2).unsqueeze(2).unsqueeze(3).expand_as(x)
        var = torch.var(t, 2).unsqueeze(2).unsqueeze(3).expand_as(x)
        gamma_broadcast = self.gamma.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
        beta_broadcast = self.beta.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
        out = (x - mean) / torch.sqrt(var + self.eps)
        out = out * gamma_broadcast + beta_broadcast
        return out
However, when checking against the HiLens Caffe operator boundary, it turned out that the om model conversion does not support sum or mean operations outside the Channel dimension. To avoid those operations, we can re-implement nn.InstanceNorm2d using only supported operators:
class InstanceNormalization(nn.Module):
    def __init__(self, dim, eps=1e-5):
        super(InstanceNormalization, self).__init__()
        self.gamma = torch.FloatTensor(dim)
        self.beta = torch.FloatTensor(dim)
        self.eps = eps
        self.adavg = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        n, c, h, w = x.shape
        mean = nn.Upsample(scale_factor=h)(self.adavg(x))
        var = nn.Upsample(scale_factor=h)(self.adavg((x - mean).pow(2)))
        gamma_broadcast = self.gamma.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
        beta_broadcast = self.beta.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
        out = (x - mean) / torch.sqrt(var + self.eps)
        out = out * gamma_broadcast + beta_broadcast
        return out
After verification this is equivalent to the original operation, and the model can then be converted to a Caffe model. A sketch of such an equivalence check follows.
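One way to check the equivalence claim is to compare the re-implemented module against nn.InstanceNorm2d with the affine transform switched off. This is only a hedged sketch: gamma and beta are set to neutral values by hand, and the input is square because the forward above upsamples by scale_factor=h.

import torch
import torch.nn as nn

dim = 8
x = torch.randn(2, dim, 16, 16)   # square spatial size, matching the scale_factor=h assumption

custom = InstanceNormalization(dim)
custom.gamma = torch.ones(dim)    # neutral scale
custom.beta = torch.zeros(dim)    # neutral shift

reference = nn.InstanceNorm2d(dim, eps=1e-5, affine=False)
print(torch.allclose(custom(x), reference(x), atol=1e-4))  # expected: True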
2) Compose the operation out of operations that already exist in Caffe:
While converting Pytorch to Caffe, it turns out that an operation involving a constant, such as featuremap + 6, causes a "blob not found" problem during conversion. Let's first look at how the add operation is converted in pytorch_to_caffe.py:
def _add(input, *args):
    x = raw__add__(input, *args)
    if not NET_INITTED:
        return x
    layer_name = log.add_layer(name='add')
    top_blobs = log.add_blobs([x], name='add_blob')
    if log.blobs(args[0]) == None:
        log.add_blobs([args[0]], name='extra_blob')
    else:
        layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                      bottom=[log.blobs(input), log.blobs(args[0])],
                                      top=top_blobs)
        layer.param.eltwise_param.operation = 1  # sum is 1
        log.cnet.add_layer(layer)
    return x
You can see that when the blob does not exist, we only need to modify the log.blobs(args[0]) == None branch. A natural idea is to use a Scale layer to implement the constant add:
def _add(input, *args):
    x = raw__add__(input, *args)
    if not NET_INITTED:
        return x
    layer_name = log.add_layer(name='add')
    top_blobs = log.add_blobs([x], name='add_blob')
    if log.blobs(args[0]) == None:
        layer = caffe_net.Layer_param(name=layer_name, type='Scale',
                                      bottom=[log.blobs(input)], top=top_blobs)
        layer.param.scale_param.bias_term = True
        weight = torch.ones((input.shape[1]))
        bias = torch.tensor(args[0]).squeeze().expand_as(weight)
        layer.add_data(weight.cpu().data.numpy(), bias.cpu().data.numpy())
        log.cnet.add_layer(layer)
    else:
        layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                      bottom=[log.blobs(input), log.blobs(args[0])],
                                      top=top_blobs)
        layer.param.eltwise_param.operation = 1  # sum is 1
        log.cnet.add_layer(layer)
    return x
Similarly, a simple multiplication by a constant such as featuremap * 6 can be handled in the same way; a hypothetical sketch follows.
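The following is a hypothetical sketch of the analogous change for constant multiplication, mirroring the modified _add above: the constant goes into the Scale layer's weight and the bias stays zero. The raw__mul__ hook name and the Eltwise PROD code are assumptions in the spirit of the Baseline, not verified against it.

def _mul(input, *args):
    x = raw__mul__(input, *args)
    if not NET_INITTED:
        return x
    layer_name = log.add_layer(name='mul')
    top_blobs = log.add_blobs([x], name='mul_blob')
    if log.blobs(args[0]) == None:
        # constant operand: emit a Scale layer whose per-channel weight is the constant
        layer = caffe_net.Layer_param(name=layer_name, type='Scale',
                                      bottom=[log.blobs(input)], top=top_blobs)
        layer.param.scale_param.bias_term = True
        weight = torch.ones(input.shape[1]) * args[0]   # broadcast the constant over channels
        bias = torch.zeros(input.shape[1])
        layer.add_data(weight.cpu().data.numpy(), bias.cpu().data.numpy())
        log.cnet.add_layer(layer)
    else:
        # blob operand: element-wise product of two feature maps
        layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                      bottom=[log.blobs(input), log.blobs(args[0])],
                                      top=top_blobs)
        layer.param.eltwise_param.operation = 0  # PROD
        log.cnet.add_layer(layer)
    return x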
Pitfalls
Pooling: Pytorch defaults to ceil_mode=False while Caffe defaults to ceil_mode=True, which can change output dimensions. If sizes do not match, check whether the Pooling parameters are correct. In addition, although it is not mentioned in the documentation, a pooling with kernel_size > 32 can be converted but raises an error at inference; splitting the Pooling into two layers avoids this, as sketched below.
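A hedged illustration of the pooling split (the 64x64 global average pool and the 8x8 sub-kernels are made-up sizes): averaging twice over equal sub-blocks gives the same result as one large kernel, so the oversized pooling can be replaced by two stacked layers.

import torch
import torch.nn as nn

big_pool = nn.AvgPool2d(kernel_size=64)   # kernel_size > 32: converts but errors at inference
split_pool = nn.Sequential(
    nn.AvgPool2d(kernel_size=8),          # 64x64 -> 8x8
    nn.AvgPool2d(kernel_size=8),          # 8x8 -> 1x1
)

x = torch.randn(1, 16, 64, 64)
print(torch.allclose(big_pool(x), split_pool(x), atol=1e-6))  # expected: True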
Upsample: in the om operator boundary, the Upsample layer's scale_factor parameter must be an int, not a size. If the existing model uses the size parameter, the Pytorch-to-Caffe conversion finishes normally, but the Upsample parameter comes out empty. In that case, consider converting size to an equivalent scale_factor (see the snippet below) or using Deconvolution instead.
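For illustration only (layer sizes assumed): a size-based Upsample can usually be rewritten with an integer scale_factor when the target size is an integer multiple of the input size.

import torch.nn as nn

up_by_size = nn.Upsample(size=(128, 128), mode='nearest')   # size parameter is lost after conversion
up_by_scale = nn.Upsample(scale_factor=2, mode='nearest')   # scale_factor survives the om boundary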
ConvTranspose2d: Pytorch adds the output_padding parameter to the output size, but Caffe does not, so the Caffe output feature map comes out smaller. In that case, make the featuremap after deconvolution slightly larger and trim it with a Crop layer so that it matches the size of the corresponding Pytorch layer. Also, deconvolution inference in om is slow; it is best avoided and replaced by Upsample + Convolution, as sketched below.
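A hedged sketch of the suggested replacement (channel counts and kernel sizes are illustrative, and the replacement branch has different weights, so it must be retrained or re-fitted rather than copied over):

import torch.nn as nn

deconv = nn.ConvTranspose2d(64, 64, kernel_size=4, stride=2, padding=1)  # slow at om inference
upsample_conv = nn.Sequential(                                            # preferred substitute
    nn.Upsample(scale_factor=2, mode='nearest'),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
)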
Pad: Pytorch has many kinds of pad operations, but Caffe can only do symmetric pads on the H and W dimensions. If the Pytorch network contains an asymmetric pad such as F.pad(x, (1, 2, 1, 2), "constant", 0), handle it as follows:
If the asymmetrically padded layer does not cause a dimension mismatch later on, first evaluate how the pad affects the results; some tasks are barely affected by it, in which case no change is needed.
If there is a dimension mismatch, consider padding fully with the larger parameter and then applying a Crop, or merging consecutive pads such as (0, 0, 1, 1) and (1, 1, 0, 0) into a single (1, 1, 1, 1), depending on the network structure.
If the pad is on the Channel dimension, such as F.pad(x, (0, 0, 0, 0, 0, channel_pad), "constant", 0), consider concatenating the output of a zero-weight convolution onto the featuremap:
zero = nn.Conv2d(in_channels, channel_pad, kernel_size=3, padding=1, bias=False)
nn.init.constant_(zero.weight, 0)
pad_tensor = zero(x)
x = torch.cat([x, pad_tensor], dim=1)
Some operations can be converted to Caffe, but om does not support everything in standard Caffe. If the model is to be converted on to om, check each operator against the boundary document.
That covers the process of converting a Pytorch model to Caffe and then to an om model. I hope the content above is of some help; if you found the article useful, feel free to share it with others.