How to use Batch Normalization folding to speed up model inference
Overview
How to remove batch normalization layers to speed up a neural network.
Introduction
Batch Normalization is a technique that normalizes the inputs of each layer, making the training process faster and more stable. In practice, it is an extra layer that we usually add after the computational layer (e.g. a convolution) and before the nonlinearity. It consists of two steps:
First, normalize the batch by subtracting its mean and dividing by its standard deviation. Then scale the result by γ and shift it by β, the learnable parameters of the batch normalization layer, which are there in case the network does not want the data to have zero mean and unit standard deviation.
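In formula form, for a batch input x with mean μ and variance σ² (ε being a small constant added for numerical stability):

$$\hat{x} = \gamma \cdot \frac{x - \mu}{\sqrt{\sigma^{2} + \epsilon}} + \beta$$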
Batch normalization is very effective for training neural networks, which is why it is so widely used. But how useful is it at inference time?
Once training is over, each batch normalization layer has a fixed set of γ and β, as well as μ and σ, the running mean and standard deviation computed with an exponentially weighted average during training. This means that during inference, batch normalization is just a simple linear transformation applied to the output of the previous layer, which is usually a convolution.
Since the convolution is also a linear transformation, the two operations can be merged into a single linear transformation! This removes some unnecessary parameters and also reduces the number of operations performed at inference time.
How do we do this in practice?
With a little mathematical manipulation, we can easily rearrange the convolution so that it absorbs the batch normalization. As a reminder, a convolution of an input x followed by batch normalization can be expressed as:
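In standard notation, with W and b the convolution weight and bias, and μ, σ², γ, β, ε as above:

$$z = W \cdot x + b$$
$$y = \gamma \cdot \frac{z - \mu}{\sqrt{\sigma^{2} + \epsilon}} + \beta$$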
So, if we rearrange the W and b of the convolution to take the batch normalization parameters into account, as follows:
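Using the same notation:

$$W_{fold} = \gamma \cdot \frac{W}{\sqrt{\sigma^{2} + \epsilon}}$$
$$b_{fold} = \gamma \cdot \frac{b - \mu}{\sqrt{\sigma^{2} + \epsilon}} + \beta$$

The folded convolution then computes $y = W_{fold} \cdot x + b_{fold}$, which is exactly the output of the original convolution followed by batch normalization.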
We can get rid of the batch normalization layer and still get the same result!
Note: usually there is no bias in the layer just before a batch normalization layer, since it would be useless and a waste of parameters: any constant added by the bias is cancelled out by the batch normalization.
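As a concrete illustration, here is a minimal PyTorch sketch of the folding for a 2D convolution. It assumes the convolution is immediately followed by its BatchNorm2d layer; fold_conv_bn is a hypothetical helper name, not code from the original post:

```python
import torch
import torch.nn as nn

def fold_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Return a Conv2d whose weight and bias absorb the following BatchNorm2d."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels,
                      kernel_size=conv.kernel_size, stride=conv.stride,
                      padding=conv.padding, dilation=conv.dilation,
                      groups=conv.groups, bias=True)

    # Parameters and running statistics gathered during training
    gamma, beta = bn.weight.data, bn.bias.data
    mean, var, eps = bn.running_mean, bn.running_var, bn.eps
    std = torch.sqrt(var + eps)

    # W_fold = gamma * W / std  (broadcast over the output-channel axis)
    fused.weight.data = conv.weight.data * (gamma / std).reshape(-1, 1, 1, 1)

    # b_fold = gamma * (b - mu) / std + beta
    b = conv.bias.data if conv.bias is not None else torch.zeros_like(mean)
    fused.bias.data = gamma * (b - mean) / std + beta
    return fused
```

The fused layer can then replace the original Conv2d/BatchNorm2d pair at inference time (recent PyTorch versions also ship a similar utility, torch.nn.utils.fusion.fuse_conv_bn_eval).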
How well does this work?
We will try two common architectures:
VGG16 with batch norm
ResNet50
For demonstration purposes, we use the ImageNet dataset and PyTorch. Both networks are trained for 5 epochs, and we look at how the number of parameters and the inference time change.
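As a minimal sketch of the setup, assuming the torchvision model definitions (the 5-epoch ImageNet training itself is omitted here):

```python
import torch
from torchvision.models import vgg16_bn, resnet50

# vgg16_bn is the VGG16 variant that includes batch normalization layers.
vgg = vgg16_bn(num_classes=1000).eval()
resnet = resnet50(num_classes=1000).eval()
```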
1. VGG16
Let's start by training VGG16 for 5 epochs (the final accuracy does not matter here). We then record the number of parameters and the initial inference time for a single image.
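A minimal sketch of how both quantities can be measured (count_params and time_inference are hypothetical helper names):

```python
import time
import torch

def count_params(model: torch.nn.Module) -> int:
    """Total number of parameters in the model."""
    return sum(p.numel() for p in model.parameters())

@torch.no_grad()
def time_inference(model: torch.nn.Module, runs: int = 100) -> float:
    """Average single-image inference time in milliseconds."""
    x = torch.randn(1, 3, 224, 224)   # one ImageNet-sized image
    model(x)                          # warm-up
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    return (time.perf_counter() - start) / runs * 1000

print(count_params(vgg), "parameters")
print(f"{time_inference(vgg):.2f} ms per image")
```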
Next, we apply batch normalization folding to the network, as in the sketch below, and measure again.
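Folding VGG16 amounts to walking its feature extractor and fusing every Conv2d that is immediately followed by a BatchNorm2d, reusing the fold_conv_bn helper sketched earlier (fold_sequential is a hypothetical name):

```python
import torch.nn as nn

def fold_sequential(seq: nn.Sequential) -> nn.Sequential:
    """Fuse every Conv2d immediately followed by a BatchNorm2d."""
    mods = list(seq)
    fused, skip_next = [], False
    for i, m in enumerate(mods):
        if skip_next:              # this BatchNorm2d was already absorbed
            skip_next = False
            continue
        if (isinstance(m, nn.Conv2d) and i + 1 < len(mods)
                and isinstance(mods[i + 1], nn.BatchNorm2d)):
            fused.append(fold_conv_bn(m, mods[i + 1]))
            skip_next = True
        else:
            fused.append(m)
    return nn.Sequential(*fused)

vgg.features = fold_sequential(vgg.features)   # VGG16's convolutional stack
```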
The result: 8,448 parameters are removed and, even better, inference is almost 0.4 milliseconds faster. Most importantly, this is completely lossless: there is absolutely no change in accuracy.
Let's see what it looks like in the case of ResNet50!
2. ResNet50
Similarly, we start by training it for 5 epochs, then record the initial number of parameters and the inference time for a single image. We then fold the batch normalization layers, as sketched below.
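In torchvision's ResNet50 the conv/bn pairs are stored as named attributes of each Bottleneck block (conv1/bn1, conv2/bn2, conv3/bn3, plus the downsample shortcut), so the folding can be applied block by block. A sketch reusing fold_conv_bn (fold_bottleneck is a hypothetical name; the network's stem conv1/bn1 can be folded the same way):

```python
import torch.nn as nn
from torchvision.models.resnet import Bottleneck

def fold_bottleneck(block: Bottleneck) -> None:
    """Fold every conv/bn pair of a Bottleneck block in place."""
    for conv_name, bn_name in [("conv1", "bn1"), ("conv2", "bn2"), ("conv3", "bn3")]:
        conv, bn = getattr(block, conv_name), getattr(block, bn_name)
        setattr(block, conv_name, fold_conv_bn(conv, bn))
        setattr(block, bn_name, nn.Identity())   # the BN layer is no longer needed
    if block.downsample is not None:             # 1x1 conv + bn on the shortcut
        block.downsample = nn.Sequential(
            fold_conv_bn(block.downsample[0], block.downsample[1]))

for m in resnet.modules():
    if isinstance(m, Bottleneck):
        fold_bottleneck(m)
```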
This time, 26,560 parameters are removed and, even more impressively, inference time is reduced by 1.5 ms, again with no degradation in performance at all.
That is how Batch Normalization folding can be used to speed up model inference: it removes parameters and shortens inference time with no loss of accuracy.