Example Analysis of the Convolution Layer + Activation Function + Pooling Layer + Fully Connected Layer

2025-03-26 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 05/31 Report --

This article is an example-based analysis of the convolution layer, activation function, pooling layer, and fully connected layer. The editor thinks it is very practical, so it is shared here as a reference; follow along for a look.

1. Convolution layer

Convolution is an effective method for extracting image features. A square convolution kernel is usually slid over every pixel of the image. Each pixel value in the region where the image overlaps the kernel is multiplied by the weight at the corresponding position in the kernel; the products are summed and a bias is added, giving one pixel value of the output feature map.

Images are either grayscale or color, and there can be a single convolution kernel or several, so the convolution operation falls into the following three cases:

1.1 Single-channel input, single convolution kernel

Here, single channel means the input is a grayscale image, and single convolution kernel means the number of kernels is 1.

The figure above (omitted here) shows a 5x5x1 grayscale image: the 1 denotes a single channel, and 5x5 is the resolution, i.e. 5 rows and 5 columns of gray values. Convolving this 5x5x1 grayscale image with a 3x3x1 kernel and a bias term b = 1 works as follows: multiply each pixel in the overlapping region by the corresponding kernel weight, sum the products, and add the bias (please don't forget to add the bias of 1). In the figure's example this yields an output pixel value of 1.
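The single-channel case above can be sketched in a few lines of plain Python. The input, kernel, and bias values below are made up for illustration (the figure's exact numbers are not recoverable here); the point is the multiply-sum-plus-bias mechanics and the resulting output size.

```python
# Single-channel convolution: 5x5 input, 3x3 kernel, stride 1, no padding.
def conv2d_single(image, kernel, bias):
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1      # output height
    ow = len(image[0]) - kw + 1   # output width
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            s = 0.0
            for u in range(kh):
                for v in range(kw):
                    # pixel value times the kernel weight at the same offset
                    s += image[i + u][j + v] * kernel[u][v]
            out[i][j] = s + bias  # don't forget to add the bias
    return out

image = [[1, 0, 2, 1, 0],
         [0, 1, 0, 2, 1],
         [2, 0, 1, 0, 2],
         [1, 2, 0, 1, 0],
         [0, 1, 2, 0, 1]]
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]
result = conv2d_single(image, kernel, bias=1)
print(len(result), len(result[0]))  # 3 3 -- a 5x5 input with a 3x3 kernel gives a 3x3 output
```

Note that the output shrinks from 5x5 to 3x3, which is exactly what the padding discussion in section 1.4 addresses.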

1.2 Multi-channel input, single convolution kernel

In most cases the input is a color image composed of the three RGB colors, containing red, green, and blue channel data. The depth (number of channels) of the convolution kernel must equal the number of channels of the input image, so a 3x3x3 kernel is used: the last 3 matches the three channels of the input. The kernel thus has three channels, each with nine randomly initialized weights to be optimized, giving 27 weights w plus one bias b in total.

Note: this is still the single-kernel case, but one kernel can have multiple channels. By default, the number of channels of a kernel equals the number of channels of the input image.
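The 27-weights-plus-one-bias count generalizes directly: a k x k kernel over a C-channel input has k * k * C weights and one shared bias. A small sketch of the arithmetic:

```python
# Parameter count for convolution kernels over a multi-channel input:
# each kernel has kernel_size * kernel_size * in_channels weights
# plus one bias per kernel.
def conv_kernel_params(kernel_size, in_channels, num_kernels=1):
    weights = kernel_size * kernel_size * in_channels * num_kernels
    biases = num_kernels  # one shared bias per kernel
    return weights + biases

# A single 3x3 kernel over an RGB (3-channel) input: 27 weights + 1 bias = 28
print(conv_kernel_params(3, 3))
```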

1.3 Multi-channel input, multiple convolution kernels

Multi-channel input with multiple convolution kernels is the most common form in deep neural networks. The convolution process is actually very simple; take a 3-channel input and 2 kernels as an example:

(1) First, one kernel is convolved with the 3-channel input, exactly as in the multi-channel-input, single-kernel case, giving a 1-channel output output1. Then the second kernel is taken and the same operation performed, giving a second output output2.

(2) Stack output1 and output2 (which have the same size) to get a 2-channel output.

For a more intuitive understanding, the following illustration is given:

In the figure, the input X: [b, h, w, 3] denotes a 3-channel image of height h and width w (b is the batch size).

The convolution kernel W: [3, 3, 3, 2] means the kernel size is 3x3, the number of channels is 3, and the number of kernels is 2.

Summary:

(1) After a convolution operation, the number of output channels = the number of convolution kernels.

(2) The number of convolution kernels and the number of channels per kernel are different things. The number of kernels in each layer is specified when designing the network, but the number of channels per kernel usually is not: by default it equals the number of input channels, because this is a necessary condition for the convolution operation.
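The two summary rules can be checked with a small shape calculator (a sketch for the stride-1, no-padding case; the padding variants are covered in section 1.4):

```python
# Output shape of a convolution: each kernel's depth must equal the
# input's channel count (rule 2), and each kernel contributes exactly
# one output channel (rule 1).
def conv_output_shape(h, w, in_channels, kernel_size, num_kernels, stride=1):
    # in_channels fixes each kernel's depth but does not affect the
    # output shape; num_kernels becomes the output channel count.
    oh = (h - kernel_size) // stride + 1
    ow = (w - kernel_size) // stride + 1
    return (oh, ow, num_kernels)

# The section's example: 5x5 3-channel input, two 3x3 kernels
print(conv_output_shape(5, 5, 3, 3, 2))  # (3, 3, 2) -- a 2-channel output
```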

1.4 Padding

To obtain a satisfactory output size after the convolution operation, padding is often used to pad the input. By default the image is padded with zeros around its border.

(1) All-zero padding: padding='same'

With same, the original image is automatically zero-padded, and when the stride is 1 the output is guaranteed to be the same size as the input.

Output size calculation formula: output length = ceil(input length / stride)

The implementation in TensorFlow is as follows (taking 48 kernels, kernel size 3, stride 1, and 'same' padding as an example):

layers.Conv2D(48, kernel_size=3, strides=1, padding='same')

(2) No padding: padding='valid'

With valid, convolution is performed without any padding.

Output size calculation formula: output length = floor((input length - kernel size) / stride) + 1

The implementation in TensorFlow is as follows:

layers.Conv2D(48, kernel_size=3, strides=1, padding='valid')
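The two output-size formulas above can be sketched and compared directly (pure Python, mirroring the ceil/floor rules stated in the text):

```python
import math

def out_len_same(n, stride):
    # 'same' padding: output length = ceil(input length / stride)
    return math.ceil(n / stride)

def out_len_valid(n, k, stride):
    # 'valid' (no padding): floor((input length - kernel size) / stride) + 1
    return (n - k) // stride + 1

# The running 5x5 input with a 3x3 kernel, stride 1:
print(out_len_same(5, 1))      # 5 -- same size as the input
print(out_len_valid(5, 3, 1))  # 3 -- the output shrinks
# With stride 2, 'same' halves the size (rounding up):
print(out_len_same(5, 2))      # 3
```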

(3) Custom padding

Generally, padding is applied on all four sides: the number of columns pw padded on the left and right is usually the same, and likewise the number of rows ph padded on the top and bottom. As shown in the following figure:

Output size calculation formula:

output height = floor((h + 2*ph - k) / s) + 1, output width = floor((w + 2*pw - k) / s) + 1

where h and w are the height and width of the original image, k is the kernel size, s is the stride, and ph and pw are the rows/columns padded on each side.

In TensorFlow 2.0, for custom padding, the padding parameter takes the following format:

padding=[[0, 0], [top, bottom], [left, right], [0, 0]]

# For example, to pad one unit on each of the top, bottom, left, and right: padding=[[0, 0], [1, 1], [1, 1], [0, 0]]. (Note: the Keras layers.Conv2D layer only accepts padding='valid' or 'same'; an explicit padding list of this form is used with the lower-level tf.nn.conv2d, or the input can be padded beforehand with tf.pad.)

2. Pooling layer

In a convolution layer, the height and width of the feature map can be reduced by a factor of the stride parameter s, thereby reducing the number of network parameters. Besides adjusting the stride, there is also a special network layer dedicated to size reduction: the pooling layer (Pooling layer) introduced next.

The pooling layer is also based on the idea of local correlation: it produces new element values by sampling or aggregating a group of locally correlated elements. Two kinds of pooling are usually used for downsampling:

(1) Max pooling (Max Pooling)

Selects the largest element value from a set of locally correlated elements.

(2) Average pooling (Average Pooling)

Calculates and returns the average of a set of locally correlated elements.
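Both pooling variants can be sketched on a single-channel feature map (a 2x2 window with stride 2 is assumed here, the most common configuration; the values are made up for illustration):

```python
# 2x2 max pooling / average pooling with stride 2 on one feature map.
def pool2x2(fmap, mode="max"):
    oh, ow = len(fmap) // 2, len(fmap[0]) // 2
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            # gather the 2x2 locally correlated window
            window = [fmap[2 * i + u][2 * j + v]
                      for u in range(2) for v in range(2)]
            out[i][j] = max(window) if mode == "max" else sum(window) / 4
    return out

fmap = [[1, 3, 2, 4],
        [5, 7, 6, 8],
        [0, 2, 1, 3],
        [4, 6, 5, 7]]
print(pool2x2(fmap, "max"))  # [[7, 8], [6, 7]]
print(pool2x2(fmap, "avg"))  # [[4.0, 5.0], [3.0, 4.0]]
```

Either way, the 4x4 feature map is downsampled to 2x2, halving each spatial dimension without any trainable parameters.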

3. Activation function

The activation function is also an indispensable part of a neural network. There are several commonly used activation functions; for how to choose an appropriate one, see my blog post: Neural network construction: activation function summary.
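As one small illustration (ReLU is just one common choice; the post referenced above surveys the others):

```python
# ReLU, a widely used activation: passes positive values through
# unchanged and zeroes out negative ones.
def relu(x):
    return max(0.0, x)

print([relu(v) for v in [-2.0, -0.5, 0.0, 1.5]])  # [0.0, 0.0, 0.0, 1.5]
```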

4. Fully connected layer

The fully connected layer is abbreviated FC. It is called fully connected because every neuron is connected to every neuron in the adjacent layer. The figure (omitted here) shows a simple two-layer fully connected network, with input features and the prediction result as output.

The number of parameters in a fully connected layer can be computed directly: parameters per layer = (number of inputs + 1) x number of neurons, i.e. one weight per input connection plus one bias per neuron, summed over all layers.

For the two-layer fully connected network built above, training on black-and-white images with a resolution of 28x28 = 784 alone already requires nearly 400,000 parameters to be optimized. In real life, high-resolution color images have far more pixels plus three channels of red, green, and blue information. Too many parameters to optimize easily leads to overfitting, so to avoid this, in practical applications the raw image is generally not fed directly into a fully connected network.

In practical applications, convolutional features are first extracted from the raw image; the extracted features are then fed to the fully connected network, which computes the classification scores.
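The "nearly 400,000 parameters" figure can be reproduced with the per-layer formula above. The 784-500-10 architecture below is an assumption chosen to match that figure (the text does not state the hidden-layer size):

```python
# Parameter count of a fully connected network:
# each layer contributes (n_in + 1) * n_out parameters
# (n_in weights per neuron, plus one bias per neuron).
def fc_params(layer_sizes):
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical 784 -> 500 -> 10 network for 28x28 grayscale images:
# (784+1)*500 + (500+1)*10 = 392500 + 5010 = 397510, close to 400,000
print(fc_params([784, 500, 10]))
```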

Update:

In 2015, Google researchers Sergey Ioffe and Christian Szegedy designed the BN (Batch Normalization) layer based on parameter standardization. Since it was proposed, the BN layer has been widely used in deep network models; it makes the choice of network hyperparameters freer while also making the network converge faster and perform better.

Thank you for reading! This example-based analysis of the convolution layer + activation function + pooling layer + fully connected layer is shared here. I hope the above content is of some help and lets you learn more; if you think the article is good, share it so more people can see it!
