
What are the relevant knowledge points of CNN


This article introduces the main knowledge points of convolutional neural networks (CNNs). The content is detailed but easy to follow, and should have some practical reference value. Let's take a look at it together.

CNN (Convolutional Neural Network)

Convolutional neural networks are used in image recognition. Here is a rough introduction to how a CNN operates. For a long time, image recognition relied on traditional methods such as SVMs. In the 2012 ImageNet competition, Geoffrey Hinton, Ilya Sutskever, and Alex Krizhevsky from the University of Toronto presented a deep convolutional neural network called AlexNet, whose recognition error rate was as low as roughly 16%, more than 40% lower (in relative terms) than the runner-up. It can be said that this was the first time artificial intelligence approached human performance at recognizing pictures.

Incidentally, the deep learning course referred to earlier was taught by Geoffrey Hinton.

Neural networks in image recognition generally use convolutional neural networks, i.e. CNN.

hierarchy

General neural networks use a fully connected approach, which leads to too many parameters. Both CNNs and RNNs modify the network structure to achieve specific functions.

A fully connected neural network is as follows:

There can be anywhere from 0 to N hidden layers; with no hidden layer and a logistic activation function, the model reduces to logistic regression, with only two layers (input and output).

A convolutional neural network has the following layers:

1) Data input layer

2) Convolution computation layer

3) ReLU activation layer

4) Pooling layer

5) Fully connected layer

6) Output layer

data input layer

The data input layer, as in a fully connected network, is layer 0. Generally, data must be pre-processed before image processing, for example by scaling all images to a consistent size. The main processing methods are the following (a small sketch follows the list):

1) Mean subtraction. For example, an image's pixels in RGB form range from 0 to 255; subtracting the mean from each dimension gives every dimension a mean of 0.

2) Normalization, i.e., scaling the values of each dimension down to a common amplitude.

3) Dimension reduction
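As an illustration of the first two steps, here is a minimal NumPy sketch of mean subtraction and normalization. The batch shape, the [0, 255] pixel range, and per-channel statistics are assumptions made for the example:

import numpy as np

# A toy batch of 8 RGB images, 32x32 pixels, values in [0, 255].
images = np.random.randint(0, 256, size=(8, 32, 32, 3)).astype(np.float32)

# 1) Mean subtraction: compute the mean of each channel over the batch
#    and subtract it, so every dimension ends up with mean 0.
mean = images.mean(axis=(0, 1, 2), keepdims=True)
centered = images - mean

# 2) Normalization: scale each dimension down to a comparable amplitude,
#    here by dividing by the per-channel standard deviation.
std = centered.std(axis=(0, 1, 2), keepdims=True)
normalized = centered / (std + 1e-8)   # epsilon avoids division by zero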

convolution computation layer

This is arguably the key part of a CNN. First, consider the convolution operation itself. Convolution is a mathematical operation on two functions of a real variable. Without getting too complicated, it can be written as

s(t) = (x * w)(t) = Σ_a x(a) · w(t − a)

Here x is the input and w is the kernel function that transforms x into other features. The following is an illustration on a two-dimensional plane:

This shows one convolution operation, and we can clearly see that the parameters are shared: during a convolution pass the kernel weights stay fixed, while the data inside each sliding window changes.

Here are three related concepts:

1) Step size (stride): 1 in the figure above.

2) Sliding window: 2×2 in the figure above.

3) Padding (fill value). When the stride is greater than 1, the sliding window may move right past the edge where there are no values; the missing positions must then be filled, generally with 0. Another reason: even with a stride of 1, not padding makes the output dimension lower than the input dimension, so after several layers there would be no input left...

Visualized as shown below: a ring of 0s is filled in around the input.

Notice that this corresponds to just one neuron (one kernel) in the next layer; if there are n of them, then there are n convolutions.
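To make the sliding-window computation concrete, here is a minimal NumPy sketch of a single-channel 2D convolution with stride and zero padding. The input and kernel values are made up for the example; a real layer learns the kernel weights:

import numpy as np

def conv2d(x, w, stride=1, pad=0):
    # Slide kernel w over input x: the same (shared) weights are applied
    # to the data inside every window position. Like most deep learning
    # libraries, this computes cross-correlation (no kernel flip).
    x = np.pad(x, pad)                      # fill a ring of zeros
    kh, kw = w.shape
    oh = (x.shape[0] - kh) // stride + 1    # output height
    ow = (x.shape[1] - kw) // stride + 1    # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(window * w)  # one dot product per window
    return out

x = np.arange(16.0).reshape(4, 4)           # toy 4x4 input
w = np.array([[1.0, 0.0], [0.0, -1.0]])     # toy 2x2 kernel
print(conv2d(x, w, stride=1, pad=1))        # padded, gives a 5x5 output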

ReLU activation layer

After the convolution operation, a nonlinear transformation is applied through an activation function. The usual activation functions are:

1) Sigmoid

2) ReLU

3) Tanh

4) ELU

5) Maxout

6) Leaky ReLU

You can refer to: en.wikipedia.org/wiki/Activation_function

Here are some practical lessons for CNNs (a small sketch of ReLU and Leaky ReLU follows the list):

1) Try not to use sigmoid in a CNN, because sigmoid causes the vanishing gradient problem.

2) Use ReLU first.

3) Then try Leaky ReLU.

4) If that fails, use Maxout.
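For reference, a small NumPy sketch of ReLU and Leaky ReLU; the 0.01 negative slope is a common but arbitrary choice:

import numpy as np

def relu(x):
    # Passes positive values through, zeroes out negatives.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but keeps a small slope alpha for negative inputs,
    # so their gradient does not vanish entirely.
    return np.where(x > 0, x, alpha * x)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))          # [0.  0.  0.  1.5]
print(leaky_relu(z))    # [-0.02  -0.005  0.  1.5]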

pooling layer

An image has many pixels; without compression there are too many parameters, which leads to overfitting. So the image size must be compressed. How? By downsampling, as shown below:

This is a bit like the convolution computation, but without a kernel. Two downsampling methods are generally used:

1) Max

2) Average

Max is used the most, because average introduces new feature values, which is not as good.
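Here is a minimal NumPy sketch of max pooling over non-overlapping windows; the 2×2 window with stride 2 is an assumed (though very common) choice:

import numpy as np

def max_pool(x, size=2):
    # Downsample by keeping only the maximum of each size-by-size window.
    # Unlike convolution there is no kernel; the window just picks a value.
    h, w = x.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            out[i // size, j // size] = x[i:i+size, j:j+size].max()
    return out

x = np.array([[1., 3., 2., 0.],
              [4., 2., 1., 5.],
              [0., 1., 8., 6.],
              [2., 9., 7., 3.]])
print(max_pool(x))   # [[4. 5.]
                     #  [9. 8.]]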

fully connected layer

After several convolution and pooling layers, a fully connected layer is added just before the output layer to produce the final output.

output layer

Finally, the data is passed through an activation function once more to produce the output.

The overall hierarchy is as follows:

In general, a CNN structure has several convolutional layers, each followed by an activation layer, then pooling, then more convolution, and so on, with a fully connected layer producing the output at the end. Some networks, however, do not use a fully connected layer as the last layer, replacing it with a one-dimensional convolutional layer instead.
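As a sketch of this stacking pattern, here is a minimal PyTorch model for 32×32 RGB input and 10 output classes. All channel counts and layer sizes are illustrative assumptions, not a prescribed architecture:

import torch
import torch.nn as nn

# conv -> ReLU -> pool, repeated, then a fully connected output layer.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 3x32x32 -> 16x32x32
    nn.ReLU(),
    nn.MaxPool2d(2),                              # -> 16x16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 32x16x16
    nn.ReLU(),
    nn.MaxPool2d(2),                              # -> 32x8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                    # fully connected output
)

print(model(torch.randn(1, 3, 32, 32)).shape)     # torch.Size([1, 10])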

regularization

In deep learning, the optimization algorithm is basically SGD. Here, regularization is not the traditional L1 or L2 penalty; instead, neural network nodes are dropped with a certain probability, which is called dropout.

As shown in the figure below, some nodes are randomly disabled on each pass. Alternatively, the selection of active neurons can be understood as obeying a probability distribution.

Dropout can prevent overfitting for two reasons (a small sketch follows the list):

1) With fewer active neurons, there are fewer parameters, so by the VC-dimension argument from machine learning, the risk of overfitting is reduced.

2) Randomly discarding neurons forms different sub-networks, which in aggregate behave like an ensemble, much like the idea behind random forests.
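A minimal NumPy sketch of dropout as applied at training time, in the common "inverted" form; the keep probability of 0.5 is an assumed, typical value:

import numpy as np

def dropout(x, keep_prob=0.5, training=True):
    # Randomly zero each activation with probability 1 - keep_prob.
    # Scaling the survivors by 1/keep_prob ("inverted" dropout) keeps the
    # expected activation unchanged, so nothing special is needed at test time.
    if not training:
        return x
    mask = np.random.rand(*x.shape) < keep_prob
    return x * mask / keep_prob

acts = np.ones((2, 4))
print(dropout(acts))  # roughly half the entries zeroed, the rest scaled to 2.0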

typical structure

Common CNN structures are:

1) LeNet, first used for digit recognition

2) AlexNet, winner of the 2012 ImageNet competition

3) ZF Net, winner of the 2013 competition

4) GoogLeNet, 2014

5) VGG Net, 2014

6) ResNet, 2015, 152 layers.

This concludes the introduction to the relevant knowledge points of CNN. Thank you for reading! Hopefully you now have some understanding of the topic.
