Full Analysis from Inception v1 to Inception v4
This article presents a full analysis of the Inception family, from Inception v1 to Inception v4. The content is fairly detailed; interested readers can use it as a reference, and I hope it is helpful to you.
The following describes the main members of the Inception family, including Inception v1, Inception v2, Inception v3, Inception v4, and Inception-ResNet. These are among the most computationally and parameter-efficient of all convolutional architectures.
The Inception network is an important milestone in the history of CNN classifiers. Before Inception appeared, most popular CNNs simply stacked more and more convolutional layers, making the network deeper and deeper in the hope of better performance.
For example, AlexNet, GoogLeNet, VGG-Net, ResNet, and so on all improve accuracy by increasing the depth of the network.
The most important feature of GoogLeNet is its use of the Inception module, which aims for an excellent local network topology: multiple convolution or pooling operations are performed on the input in parallel, and all the outputs are concatenated into a very deep feature map. Because convolutions of different sizes, such as 1 × 1, 3 × 3, or 5 × 5, and pooling operations each extract different information from the input image, processing these operations in parallel and combining all the results yields a better image representation.
Common versions of Inception are:
Inception v1
Inception v2 and Inception v3
Inception v4 and Inception-ResNet
Each version is an iterative evolution of the previous version. Understanding the upgrade of Inception network can help us to build custom classifiers and optimize the speed and accuracy.
Inception v1
Inception v1 first appeared in the paper "Going Deeper with Convolutions". The authors propose a deep convolutional neural network, Inception, which achieved the best classification and detection performance in ILSVRC 2014.
The main features of Inception v1 are: first, exploiting 1 × 1 convolution kernels to reduce parameters and improve performance; second, letting the model decide for itself how large a convolution kernel to use.
1 × 1 convolution
A 1 × 1 convolution not only reduces the number of parameters in the network but also compresses the number of channels, which greatly improves computational efficiency.
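As a rough illustration, here is a minimal sketch of how a 1 × 1 bottleneck cuts the parameter count of a 5 × 5 convolution. PyTorch is assumed (the article names no framework), and the channel counts are made up for illustration:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 256, 28, 28)          # (batch, channels, height, width)

# Direct 5x5 convolution on 256 channels: 256 * 64 * 5 * 5 = 409,600 weights.
direct = nn.Conv2d(256, 64, kernel_size=5, padding=2)

# 1x1 bottleneck to 32 channels first, then the 5x5 convolution:
# 256*32*1*1 + 32*64*5*5 = 8,192 + 51,200 = 59,392 weights (~7x fewer).
bottleneck = nn.Sequential(
    nn.Conv2d(256, 32, kernel_size=1),
    nn.Conv2d(32, 64, kernel_size=5, padding=2),
)

print(direct(x).shape, bottleneck(x).shape)  # both: torch.Size([1, 64, 28, 28])
```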
Combine convolution kernels of different sizes
Combining convolution kernels of different sizes not only enlarges the receptive field but also improves the robustness of the network. Stacking kernels of different sizes in one layer means that layer can produce the effect of several kernel sizes at once, and it also means we no longer have to choose a kernel size for that layer by hand: the network learns for itself which kind of convolution (or pooling) works best.
The basic components of the Inception module are as follows:
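As a hedged PyTorch sketch of such a module, the four parallel branches below are concatenated along the channel axis; the filter counts follow the first Inception block (3a) of GoogLeNet, but any values work:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)              # 1x1 branch
        self.b2 = nn.Sequential(                                   # 1x1 -> 3x3
            nn.Conv2d(in_ch, c3_red, kernel_size=1),
            nn.Conv2d(c3_red, c3, kernel_size=3, padding=1),
        )
        self.b3 = nn.Sequential(                                   # 1x1 -> 5x5
            nn.Conv2d(in_ch, c5_red, kernel_size=1),
            nn.Conv2d(c5_red, c5, kernel_size=5, padding=2),
        )
        self.b4 = nn.Sequential(                                   # pool -> 1x1
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1),
        )

    def forward(self, x):
        # All branches preserve the spatial size, so they concatenate cleanly.
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

m = InceptionModule(192, 64, 96, 128, 16, 32, 32)
print(m(torch.randn(1, 192, 28, 28)).shape)  # torch.Size([1, 256, 28, 28])
```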
Inception v2
Inception v2 and Inception v3 come from the same paper, "Rethinking the Inception Architecture for Computer Vision", in which the authors propose a series of refinements that increase accuracy and reduce computational complexity.
Decompose 5 × 5 convolution into two 3 × 3 convolutions
Decomposing a 5 × 5 convolution into two 3 × 3 convolution operations improves computational speed: the stacked pair needs only about (3×3 + 3×3) / (5×5) = 72% of the computation of a single 5 × 5 convolution. The following figure shows the effectiveness of this replacement.
So the upgraded Inception module is shown below:
The 5 × 5 convolution in the leftmost, earlier version of the Inception module becomes a stack of two 3 × 3 convolutions.
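A small sketch (again PyTorch, with illustrative channel counts) confirming that the stacked pair covers the same receptive field and output shape while using about 72% of the weights:

```python
import torch
import torch.nn as nn

five = nn.Conv2d(64, 64, kernel_size=5, padding=2)    # 64*64*25 = 102,400 weights
stack = nn.Sequential(                                 # 2 * 64*64*9 = 73,728 weights
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),                             # extra non-linearity for free
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
)

x = torch.randn(1, 64, 17, 17)
print(five(x).shape, stack(x).shape)   # identical output shapes
print(73728 / 102400)                  # 0.72
```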
Decompose an n × n convolution kernel into 1 × n and n × 1 convolutions
For example, a 3 × 3 convolution is equivalent to performing a 1 × 3 convolution followed by a 3 × 1 convolution, using only about (1×3 + 3×1) / (3×3) = 67% of the computation. The following figure shows the effectiveness of this replacement. The authors go further: any n × n convolution can be replaced by two layers, a 1 × n convolution followed by an n × 1 convolution, to save computation and memory.
The updated Inception module is shown in the following figure:
With n = 3, this is consistent with the previous image: the leftmost 5 × 5 convolution can be represented as two 3 × 3 convolutions, each of which can in turn be represented as a 1 × 3 followed by a 3 × 1 convolution.
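The same shape-and-weights check for the asymmetric factorization (a PyTorch sketch with illustrative channel counts):

```python
import torch
import torch.nn as nn

square = nn.Conv2d(64, 64, kernel_size=3, padding=1)      # 64*64*9 = 36,864 weights
factored = nn.Sequential(                                  # 2 * 64*64*3 = 24,576 weights
    nn.Conv2d(64, 64, kernel_size=(1, 3), padding=(0, 1)),
    nn.Conv2d(64, 64, kernel_size=(3, 1), padding=(1, 0)),
)

x = torch.randn(1, 64, 17, 17)
print(square(x).shape, factored(x).shape)  # identical output shapes
print(24576 / 36864)                       # ~0.67
```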
The filter banks in the module are expanded (made wider rather than deeper) to remove the representational bottleneck: if the module became deeper instead, the dimensions would be reduced too aggressively, causing information loss. As shown in the following figure:
Inception v3
Inception v3 incorporates all of the Inception v2 upgrades described above, and additionally uses:
the RMSProp optimizer;
factorized 7 × 7 convolutions;
BatchNorm in the auxiliary classifier;
label smoothing (a regularization term added to the loss that keeps the network from becoming overconfident in any one class, i.e., helps prevent overfitting); a minimal sketch of this follows.
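As an illustration of that last point, here is a minimal label-smoothing sketch. PyTorch is assumed; epsilon = 0.1 follows the paper, but the helper function itself is ours:

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, target, num_classes, eps=0.1):
    # Smoothed target: (1 - eps) on the true class, eps/K spread uniformly,
    # so the true class ends up with (1 - eps) + eps/K probability mass.
    smooth = torch.full_like(logits, eps / num_classes)
    smooth.scatter_(1, target.unsqueeze(1), 1.0 - eps + eps / num_classes)
    return -(smooth * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

logits = torch.randn(4, 1000)              # batch of 4, 1000 classes
target = torch.randint(0, 1000, (4,))
print(label_smoothing_loss(logits, target, 1000))
```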
Inception v2 and Inception v3 final models

Inception v4

Inception v4 and Inception-ResNet are proposed in the same paper, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning".
Inception v4 network structure
First, the stem branch; you can look directly at the structure diagram in the paper:
Then come the three main Inception modules and the Reduction modules, called A, B, and C (unlike in Inception v2, these modules really are named A, B, and C). They look very similar to their Inception v2 (or v3) counterparts.
Inception v4 introduces a dedicated "reduction block", which is used to change the width and height of the grid. Earlier versions did not have explicit reduction blocks, but they implemented the same functionality.
Reduction block A (reduces the size from 35 × 35 to 17 × 17) and reduction block B (reduces the size from 17 × 17 to 8 × 8). Here we use the same hyperparameter settings (k, l, m, n) as in the paper.
Look directly at their network structure in the paper:
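For concreteness, here is a hedged PyTorch sketch of reduction block A following the paper's (k, l, m, n) template. BatchNorm and ReLU are omitted for brevity, and the default values are the ones the paper lists for Inception v4:

```python
import torch
import torch.nn as nn

class ReductionA(nn.Module):
    def __init__(self, in_ch, k=192, l=224, m=256, n=384):
        super().__init__()
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2)           # branch 1
        self.conv = nn.Conv2d(in_ch, n, kernel_size=3, stride=2)    # branch 2
        self.stack = nn.Sequential(                                  # branch 3
            nn.Conv2d(in_ch, k, kernel_size=1),
            nn.Conv2d(k, l, kernel_size=3, padding=1),
            nn.Conv2d(l, m, kernel_size=3, stride=2),
        )

    def forward(self, x):
        # Every branch halves the 35x35 grid to 17x17; channels concatenate.
        return torch.cat([self.pool(x), self.conv(x), self.stack(x)], dim=1)

r = ReductionA(384)
print(r(torch.randn(1, 384, 35, 35)).shape)  # torch.Size([1, 1024, 17, 17])
```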
Inception-ResNet
In this paper, the authors combine the Inception architecture with residual connections. Experiments clearly show that residual connections significantly accelerate the training of Inception networks. There is also some evidence that residual Inception networks slightly outperform Inception networks of similar computational cost that lack residual connections. Using an ensemble of three residual networks and one Inception v4 model, the authors achieved a top-5 error rate of 3.08% on the test set of the ImageNet classification challenge.
(From left) Inception modules A, B, and C in Inception-ResNet. Note that the pooling layer is replaced by the residual connection, and there is an additional 1 × 1 convolution before the residual addition operation.
The pooling operation in the main Inception modules is replaced by the residual connection. However, you can still find pooling in the reduction blocks. Reduction block A is the same as the reduction block in Inception v4.
For details of the network structure of each Inception-ResNet module (A, B, and C), see the original paper.
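As a rough illustration of the residual design, here is a much-simplified PyTorch sketch (the branch structure and filter counts are ours, not the paper's exact modules): the pooling branch is gone, and a 1 × 1 convolution restores the channel count so the branch output can be added back to the input:

```python
import torch
import torch.nn as nn

class InceptionResNetBlock(nn.Module):
    def __init__(self, ch=256):
        super().__init__()
        self.b1 = nn.Conv2d(ch, 32, kernel_size=1)
        self.b2 = nn.Sequential(
            nn.Conv2d(ch, 32, kernel_size=1),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
        )
        # 1x1 projection back to `ch` channels before the residual addition.
        self.project = nn.Conv2d(64, ch, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        branch = torch.cat([self.b1(x), self.b2(x)], dim=1)
        return self.relu(x + self.project(branch))   # residual connection

blk = InceptionResNetBlock()
print(blk(torch.randn(1, 256, 35, 35)).shape)  # torch.Size([1, 256, 35, 35])
```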
Attenuation factor in deep network design
If the number of filters exceeds 1000, the residual units deeper in the network cause the network to die during training. Therefore, to increase stability, the authors scale the residual activations by a factor between 0.1 and 0.3.
The residual activations are scaled by a constant to prevent the network from dying.
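A minimal sketch of that stabilizing change (the 0.2 default is our choice from the paper's 0.1-0.3 range):

```python
import torch

def scaled_residual(x, branch_out, scale=0.2):
    # Damping the residual branch before the addition keeps very deep
    # residual Inception networks from dying early in training.
    return x + scale * branch_out

x = torch.randn(1, 256, 35, 35)
print(scaled_residual(x, torch.randn_like(x)).shape)  # torch.Size([1, 256, 35, 35])
```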
Comparison of accuracy results for the Inception-ResNet structures

This concludes the full analysis from Inception v1 to Inception v4. I hope the above content has been helpful to you.