This article walks through the deep convolutional neural network AlexNet with a PyTorch programming example. The editor thinks it is very practical, so it is shared here for reference; follow along and have a look.
In 2012, AlexNet appeared seemingly out of nowhere. It demonstrated for the first time that learned features can surpass hand-designed features, breaking the prevailing state of computer vision research in one stroke. AlexNet used an eight-layer convolutional neural network and won the 2012 ImageNet image recognition challenge by a wide margin.
The following figure shows the architecture from LeNet (left) to AlexNet (right).
The design concepts of AlexNet and LeNet are very similar, but there are also the following differences:
AlexNet is much deeper than the relatively small LeNet5.
AlexNet uses ReLU instead of sigmoid as its activation function.
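As a quick, hedged illustration (not from the original article), the snippet below compares the gradients of sigmoid and ReLU at a large input: sigmoid saturates and its gradient nearly vanishes, while ReLU passes the gradient through unchanged.

import torch

# Sigmoid saturates for large inputs: its gradient nearly vanishes
x = torch.tensor(8.0, requires_grad=True)
torch.sigmoid(x).backward()
print('sigmoid grad at x=8:', x.grad.item())   # about 3e-4

# ReLU passes the gradient through unchanged in its positive region
x = torch.tensor(8.0, requires_grad=True)
torch.relu(x).backward()
print('ReLU grad at x=8:', x.grad.item())      # 1.0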
Capacity control and preprocessing
AlexNet controls the model complexity of the fully connected layers through dropout, while LeNet uses only weight decay. To further expand the data, AlexNet applied a large amount of image augmentation during training, such as flipping, cropping, and color changes. This makes the model more robust, and the larger effective sample size reduces overfitting.
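As a rough sketch of such an augmentation pipeline (illustrative only, not part of the original article; it assumes torchvision is available, and the exact AlexNet recipe differs in its details):

from torchvision import transforms

# Illustrative augmentation pipeline: random crop, flip and color jitter
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),      # random cropping and rescaling
    transforms.RandomHorizontalFlip(),      # random left-right flipping
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.ToTensor(),
])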
import torch
from torch import nn
from d2l import torch as d2l

net = nn.Sequential(
    # Here we use a larger 11 x 11 window to capture objects. At the same
    # time, a stride of 4 greatly reduces the height and width of the
    # output. The number of output channels is much larger than in LeNet
    nn.Conv2d(1, 96, kernel_size=11, stride=4, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    # Make the convolution window smaller and use a padding of 2 to keep
    # the height and width of the output consistent with the input, while
    # increasing the number of output channels
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    # Use three successive convolutional layers with a smaller convolution
    # window. Except for the last convolutional layer, the number of output
    # channels is further increased. After the first two convolutional
    # layers, no pooling layer is used to reduce the height and width
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    # Here the number of outputs of the fully connected layers is several
    # times that of LeNet. Use dropout layers to mitigate overfitting
    nn.Linear(6400, 4096), nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Dropout(p=0.5),
    # Output layer. Since Fashion-MNIST is used here, the number of
    # classes is 10
    nn.Linear(4096, 10))
We construct a single-channel example with a height and width of 224 to observe the output shape of each layer. It matches the AlexNet architecture in the figure above.
X = torch.randn(1, 1, 224, 224)
for layer in net:
    X = layer(X)
    print(layer.__class__.__name__, 'Output shape:\t', X.shape)

Conv2d Output shape:     torch.Size([1, 96, 54, 54])
ReLU Output shape:       torch.Size([1, 96, 54, 54])
MaxPool2d Output shape:  torch.Size([1, 96, 26, 26])
Conv2d Output shape:     torch.Size([1, 256, 26, 26])
ReLU Output shape:       torch.Size([1, 256, 26, 26])
MaxPool2d Output shape:  torch.Size([1, 256, 12, 12])
Conv2d Output shape:     torch.Size([1, 384, 12, 12])
ReLU Output shape:       torch.Size([1, 384, 12, 12])
Conv2d Output shape:     torch.Size([1, 384, 12, 12])
ReLU Output shape:       torch.Size([1, 384, 12, 12])
Conv2d Output shape:     torch.Size([1, 256, 12, 12])
ReLU Output shape:       torch.Size([1, 256, 12, 12])
MaxPool2d Output shape:  torch.Size([1, 256, 5, 5])
Flatten Output shape:    torch.Size([1, 6400])
Linear Output shape:     torch.Size([1, 4096])
ReLU Output shape:       torch.Size([1, 4096])
Dropout Output shape:    torch.Size([1, 4096])
Linear Output shape:     torch.Size([1, 4096])
ReLU Output shape:       torch.Size([1, 4096])
Dropout Output shape:    torch.Size([1, 4096])
Linear Output shape:     torch.Size([1, 10])

Read the dataset
Here AlexNet is applied directly to Fashion-MNIST recognition, but there is a problem: the resolution of Fashion-MNIST images (28 × 28 pixels) is lower than that of ImageNet images. To solve this, we upsample them to 224 × 224 (this is usually not a wise thing to do, but we do it here so that the AlexNet architecture can be used as-is). We use the resize argument of the d2l.load_data_fashion_mnist function to perform this adjustment.
batch_size = 128
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size, resize=224)
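For readers not using the d2l package, a roughly equivalent loader built on plain torchvision might look like the sketch below (an assumption about what the helper does internally; the real d2l implementation differs in details such as worker processes):

import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms

# Rough stand-in for d2l.load_data_fashion_mnist(batch_size, resize=224)
trans = transforms.Compose([transforms.Resize(224), transforms.ToTensor()])
train_ds = torchvision.datasets.FashionMNIST(
    root='./data', train=True, transform=trans, download=True)
test_ds = torchvision.datasets.FashionMNIST(
    root='./data', train=False, transform=trans, download=True)
train_iter = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
test_iter = DataLoader(test_ds, batch_size=batch_size, shuffle=False)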
Now we can start training AlexNet. Compared with LeNet, the main change here is to use a smaller learning rate: because the network is deeper and wider and the image resolution is higher, training the convolutional network is more expensive.
lr, num_epochs = 0.01, 10
d2l.train_ch6(net, train_iter, test_iter, num_epochs, lr, d2l.try_gpu())

loss 0.330, train acc 0.879, test acc 0.877
4163.0 examples/sec on cuda:0
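For reference, the d2l training helper corresponds roughly to a standard PyTorch loop like the sketch below (an approximation, not the exact d2l code; the real helper also plots training curves and measures throughput): cross-entropy loss, SGD with the learning rate above, and test-set evaluation after each epoch.

import torch
from torch import nn

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
net.to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=lr)

for epoch in range(num_epochs):
    net.train()
    for X, y in train_iter:
        X, y = X.to(device), y.to(device)
        optimizer.zero_grad()
        loss = loss_fn(net(X), y)
        loss.backward()
        optimizer.step()
    # evaluate test accuracy after each epoch
    net.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for X, y in test_iter:
            X, y = X.to(device), y.to(device)
            correct += (net(X).argmax(dim=1) == y).sum().item()
            total += y.numel()
    print(f'epoch {epoch + 1}, test acc {correct / total:.3f}')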
Thank you for reading! This article on the PyTorch deep convolutional neural network AlexNet example ends here. I hope the above content is of some help and lets you learn more. If you think the article is good, please share it so that more people can see it!