How to Implement ResNet with the PaddlePaddle Dynamic Graph

2025-07-04 Update. From: SLTechnology News&Howtos (Shulou.com)

This article shows how to implement ResNet with the PaddlePaddle dynamic graph (the fluid.dygraph API). It covers the dataset, the idea behind residual networks, the data readers, the training code, and the ResNet-50 model itself.

Dataset:

The iChallenge-PM dataset contains fundus images of patients with pathological myopia and of patients without it. The file naming rules are as follows:

Pathological myopia (PM): the file name starts with P

Non-pathological myopia (non-PM):

High myopia: the file name starts with H

Normal eyes: the file name starts with N

We treat images of pathological myopia as positive samples, labeled 1, and images of non-pathological myopia as negative samples, labeled 0. We first select two images from the dataset and display them, then build a classifier that extracts features with a convolutional network and separates the positive and negative samples.

ResNet

ResNet won the 2015 ImageNet competition and reduced the recognition error rate to 3.6%, which is even lower than the error rate of human recognition.

The classical models that came before it show a clear trend: as deep learning developed, networks gained more layers and more complex structures. Does a deeper network always give better results? In theory it should. If the newly added layers are identity mappings and the original layers learn the same parameters as in the original model, the deeper model reproduces the original model exactly. In other words, the solutions of the original model form a subspace of the solutions of the new model, so searching the larger space should find a result at least as good. In practice, however, after the number of layers is increased, the training error often goes up instead of down.

Kaiming He et al. proposed the residual network (ResNet) to solve this problem. Its basic idea is shown in Figure 6.

Figure 6 (a): when layers are added to the network, the input x is mapped to the output $y = F(x)$.

Figure 6 (b): an improvement on Figure 6 (a), where the output is $y = F(x) + x$. Instead of learning the representation of the output feature y directly, the network learns $y - x$.

To reproduce the representation of the original model, it is enough to set all the parameters of F(x) to 0, so that $y = x$ is an identity mapping.

$F(x) = y - x$ is also called the residual term. If the mapping $x \rightarrow y$ is close to an identity mapping, the form in Figure 6 (b) is easier to learn than the complete mapping in Figure 6 (a).

Figure 6: residual block design idea
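The identity-mapping argument can be checked numerically. The following is a minimal NumPy sketch, not the article's Paddle code: it assumes a toy residual term F(x) = x @ weight + bias, and the names residual_block, weight and bias are hypothetical.

```python
import numpy as np

def residual_block(x, weight, bias):
    """Toy residual block: y = F(x) + x, with F(x) = x @ weight + bias."""
    residual = x @ weight + bias   # F(x), the residual term
    return residual + x            # the shortcut adds the input back

x = np.array([[1.0, -2.0, 3.0]])

# With all parameters of F set to zero, the block is an identity mapping.
zero_w = np.zeros((3, 3))
zero_b = np.zeros(3)
y_identity = residual_block(x, zero_w, zero_b)

# With small non-zero parameters, the output is a small perturbation of x.
rng = np.random.default_rng(0)
small_w = 0.01 * rng.standard_normal((3, 3))
y_near = residual_block(x, small_w, zero_b)

print(np.array_equal(y_identity, x))
print(np.abs(y_near - x).max())
```

Because the block only has to learn the deviation from x, a mapping that is close to the identity corresponds to parameters that are close to zero.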

The structure in Figure 6 (b), known as a residual block, is the basic unit of the residual network. The cross-layer connection lets the input x propagate forward, and gradients propagate backward, more quickly. The specific design of the residual block is shown in Figure 7; this form is also known as the bottleneck structure (BottleNeck).

Figure 7: schematic diagram of residual block structure

The following figure shows the structure of ResNet-50. It consists of 49 convolutional layers and 1 fully connected layer, hence the name ResNet-50.

Figure 8: schematic diagram of the ResNet-50 network structure
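The count of 49 convolutional layers can be reproduced with a little arithmetic. The sketch below assumes the usual counting convention: one stem convolution plus three convolutions per bottleneck block, with the 1x1 convolutions on the shortcut branches not counted. The variable names are illustrative.

```python
# Residual blocks in stages c2..c5 of ResNet-50.
blocks_per_stage = [3, 4, 6, 3]
convs_per_block = 3      # 1x1, 3x3, 1x1 in each bottleneck block
stem_convs = 1           # the initial 7x7 convolution

conv_layers = stem_convs + convs_per_block * sum(blocks_per_stage)
fc_layers = 1
total_layers = conv_layers + fc_layers

print(conv_layers, total_layers)  # 49 50
```

The same formula with [3, 4, 23, 3] and [3, 8, 36, 3] gives 101 and 152.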

First, display two sample images from the dataset:

In [2]

import os
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from PIL import Image

DATADIR = '/home/aistudio/work/palm/PALM-Training400/PALM-Training400'
# file names beginning with N are normal fundus images; those beginning with P are pathological
file1 = 'N0012.jpg'
file2 = 'P0095.jpg'

# read the images
img1 = Image.open(os.path.join(DATADIR, file1))
img1 = np.array(img1)
img2 = Image.open(os.path.join(DATADIR, file2))
img2 = np.array(img2)

# plot the two images side by side
plt.figure(figsize=(16, 8))
f = plt.subplot(121)
f.set_title('Normal', fontsize=20)
plt.imshow(img1)
f = plt.subplot(122)
f.set_title('PM', fontsize=20)
plt.imshow(img2)
plt.show()

In [4]

# check the image shapes
img1.shape, img2.shape

((2056, 2124, 3), (2056, 2124, 3))

In [3]

# define the data readers
import cv2
import random
import numpy as np

# preprocess the image data that was read
def transform_img(img):
    # scale the image down to 224x224
    img = cv2.resize(img, (224, 224))
    # the image data is read in [H, W, C] format;
    # transpose it to [C, H, W]
    img = np.transpose(img, (2, 0, 1))
    img = img.astype('float32')
    # rescale the data to the range [-1.0, 1.0]
    img = img / 255.
    img = img * 2.0 - 1.0
    return img

# define the training set data reader
def data_loader(datadir, batch_size=10, mode='train'):
    # list the files in datadir; each file is read in turn
    filenames = os.listdir(datadir)
    def reader():
        if mode == 'train':
            # shuffle the data order during training
            random.shuffle(filenames)
        batch_imgs = []
        batch_labels = []
        for name in filenames:
            filepath = os.path.join(datadir, name)
            img = cv2.imread(filepath)
            img = transform_img(img)
            if name[0] == 'H' or name[0] == 'N':
                # names starting with H are high myopia, names starting with N are normal eyes;
                # neither is pathological, so both are negative samples with label 0
                label = 0
            elif name[0] == 'P':
                # names starting with P are pathological myopia: positive samples with label 1
                label = 1
            else:
                raise ValueError('Unexpected file name: ' + name)
            # append each sample to the current batch
            batch_imgs.append(img)
            batch_labels.append(label)
            if len(batch_imgs) == batch_size:
                # once the batch holds batch_size samples,
                # yield it as one mini-batch
                imgs_array = np.array(batch_imgs).astype('float32')
                labels_array = np.array(batch_labels).astype('float32').reshape(-1, 1)
                yield imgs_array, labels_array
                batch_imgs = []
                batch_labels = []

        if len(batch_imgs) > 0:
            # fewer than batch_size samples remain; pack them into a final mini-batch
            imgs_array = np.array(batch_imgs).astype('float32')
            labels_array = np.array(batch_labels).astype('float32').reshape(-1, 1)
            yield imgs_array, labels_array

    return reader

# define the validation set data reader
def valid_data_loader(datadir, csvfile, batch_size=10, mode='valid'):
    # the training reader derives each label from the file name;
    # the validation reader takes the label of each image from csvfile
    # inspect the extracted validation labels to see what csvfile contains
    # csvfile has the following format, one sample per line:
    # column 1 is the image id, column 2 the file name, column 3 the label,
    # columns 4 and 5 are the Fovea coordinates, which the classification task does not use
    # ID,imgName,Label,Fovea_X,Fovea_Y
    # 1,V0001.jpg,0,1157.74,1019.87
    # 2,V0002.jpg,1,1285.82,1080.47
    # open the csvfile holding the validation labels and read all lines
    filelists = open(csvfile).readlines()
    def reader():
        batch_imgs = []
        batch_labels = []
        # skip the header line
        for line in filelists[1:]:
            line = line.strip().split(',')
            name = line[1]
            label = int(line[2])
            # load the image by file name and preprocess it
            filepath = os.path.join(datadir, name)
            img = cv2.imread(filepath)
            img = transform_img(img)
            # append each sample to the current batch
            batch_imgs.append(img)
            batch_labels.append(label)
            if len(batch_imgs) == batch_size:
                # once the batch holds batch_size samples,
                # yield it as one mini-batch
                imgs_array = np.array(batch_imgs).astype('float32')
                labels_array = np.array(batch_labels).astype('float32').reshape(-1, 1)
                yield imgs_array, labels_array
                batch_imgs = []
                batch_labels = []

        if len(batch_imgs) > 0:
            # fewer than batch_size samples remain; pack them into a final mini-batch
            imgs_array = np.array(batch_imgs).astype('float32')
            labels_array = np.array(batch_labels).astype('float32').reshape(-1, 1)
            yield imgs_array, labels_array

    return reader

In [5]

# check the shapes of one mini-batch
DATADIR = '/home/aistudio/work/palm/PALM-Training400/PALM-Training400'
train_loader = data_loader(DATADIR,
                           batch_size=10, mode='train')
data_reader = train_loader()
data = next(data_reader)
data[0].shape, data[1].shape

((10, 3, 224, 224), (10, 1))
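The layout and value range produced by the preprocessing step can be verified without the dataset. The sketch below is an illustration only: it uses a synthetic image in place of a fundus picture, skips the cv2.resize call, and the helper name normalize_hwc_to_chw is hypothetical.

```python
import numpy as np

def normalize_hwc_to_chw(img):
    """Convert a uint8 [H, W, C] image to float32 [C, H, W] in [-1, 1]."""
    img = np.transpose(img, (2, 0, 1)).astype('float32')
    return img / 255.0 * 2.0 - 1.0

# Synthetic 224x224 image standing in for an already resized picture.
rng = np.random.default_rng(0)
fake_img = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
fake_img[0, 0] = [0, 255, 128]   # pin the extreme pixel values

out = normalize_hwc_to_chw(fake_img)
batch = np.stack([out] * 10)     # ten samples form one mini-batch

print(out.shape, out.dtype, batch.shape)
print(out.min(), out.max())
```

A pixel value of 0 maps to -1.0 and 255 maps to 1.0, and stacking ten samples gives the (10, 3, 224, 224) batch shape shown in the cell output.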

In [6]

# convert the validation labels from Excel to CSV
!pip install xlrd
import pandas as pd
df = pd.read_excel('/home/aistudio/work/palm/PALM-Validation-GT/PM_Label_and_Fovea_Location.xlsx')
df.to_csv('/home/aistudio/work/palm/PALM-Validation-GT/labels.csv', index=False)

Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Collecting xlrd
  Downloading https://mirrors.tuna.tsinghua.edu.cn/pypi/web/packages/b0/16/63576a1a001752e34bf8ea62e367997530dc553b689356b9879339cf45a4/xlrd-1.2.0-py2.py3-none-any.whl (103kB)
Installing collected packages: xlrd
Successfully installed xlrd-1.2.0
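The validation reader splits each CSV line by hand. As an alternative sketch, the standard csv module can parse the same format by column name. The sample text below reuses the two rows quoted in the validation reader's comments; the helper read_labels is hypothetical and not part of the article's code.

```python
import csv
import io

# Two sample rows in the format described in the validation reader's comments.
sample_csv = """ID,imgName,Label,Fovea_X,Fovea_Y
1,V0001.jpg,0,1157.74,1019.87
2,V0002.jpg,1,1285.82,1080.47
"""

def read_labels(fileobj):
    """Return a list of (file name, integer label) pairs from the CSV."""
    reader = csv.DictReader(fileobj)
    return [(row['imgName'], int(row['Label'])) for row in reader]

pairs = read_labels(io.StringIO(sample_csv))
print(pairs)  # [('V0001.jpg', 0), ('V0002.jpg', 1)]
```

Reading by column name keeps working if the column order in the exported file changes, which positional indexing such as line[1] and line[2] does not.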

In [7]

# training and evaluation code
import os
import random
import paddle
import paddle.fluid as fluid
import numpy as np

DATADIR = '/home/aistudio/work/palm/PALM-Training400/PALM-Training400'
DATADIR2 = '/home/aistudio/work/palm/PALM-Validation400'
CSVFILE = '/home/aistudio/work/palm/PALM-Validation-GT/labels.csv'

# define the training process
def train(model):
    with fluid.dygraph.guard():
        print('start training...')
        model.train()
        epoch_num = 5
        # define the optimizer
        opt = fluid.optimizer.Momentum(learning_rate=0.001, momentum=0.9)
        # define the training and validation data readers
        train_loader = data_loader(DATADIR, batch_size=10, mode='train')
        valid_loader = valid_data_loader(DATADIR2, CSVFILE)
        for epoch in range(epoch_num):
            for batch_id, data in enumerate(train_loader()):
                x_data, y_data = data
                img = fluid.dygraph.to_variable(x_data)
                label = fluid.dygraph.to_variable(y_data)
                # run the forward pass to get the predicted logits
                logits = model(img)
                # compute the loss
                loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, label)
                avg_loss = fluid.layers.mean(loss)

                if batch_id % 10 == 0:
                    print("epoch: {}, batch_id: {}, loss is: {}".format(epoch, batch_id, avg_loss.numpy()))
                # backpropagate, update the weights, clear the gradients
                avg_loss.backward()
                opt.minimize(avg_loss)
                model.clear_gradients()

            # validate after each training epoch
            model.eval()
            accuracies = []
            losses = []
            for batch_id, data in enumerate(valid_loader()):
                x_data, y_data = data
                img = fluid.dygraph.to_variable(x_data)
                label = fluid.dygraph.to_variable(y_data)
                # run the forward pass to get the predicted logits
                logits = model(img)
                # binary classification: threshold the sigmoid output at 0.5
                # compute the predicted probability with sigmoid, then the loss
                pred = fluid.layers.sigmoid(logits)
                loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, label)
                # probability of the other class (label 0)
                pred2 = pred * (-1.0) + 1.0
                # concatenate the probabilities of the two classes along axis 1
                pred = fluid.layers.concat([pred2, pred], axis=1)
                acc = fluid.layers.accuracy(pred, fluid.layers.cast(label, dtype='int64'))
                accuracies.append(acc.numpy())
                losses.append(loss.numpy())
            print("[validation] accuracy/loss: {}/{}".format(np.mean(accuracies), np.mean(losses)))
            model.train()

        # save the model parameters
        fluid.save_dygraph(model.state_dict(), 'mnist')
        # save the optimizer state
        fluid.save_dygraph(opt.state_dict(), 'mnist')

# define the evaluation process
def evaluation(model, params_file_path):
    with fluid.dygraph.guard():
        print('start evaluation.')
        # load the model parameters
        model_state_dict, _ = fluid.load_dygraph(params_file_path)
        model.load_dict(model_state_dict)

        model.eval()
        # load_data is not defined in this article; it is expected to return a reader like data_loader
        eval_loader = load_data('eval')

        acc_set = []
        avg_loss_set = []
        for batch_id, data in enumerate(eval_loader()):
            x_data, y_data = data
            img = fluid.dygraph.to_variable(x_data)
            label = fluid.dygraph.to_variable(y_data)
            # compute the prediction and the accuracy
            prediction, acc = model(img, label)
            # compute the value of the loss function
            loss = fluid.layers.cross_entropy(input=prediction, label=label)
            avg_loss = fluid.layers.mean(loss)
            acc_set.append(float(acc.numpy()))
            avg_loss_set.append(float(avg_loss.numpy()))
        # compute the mean loss and accuracy over all batches
        acc_val_mean = np.array(acc_set).mean()
        avg_loss_val_mean = np.array(avg_loss_set).mean()

        print('loss={}, acc={}'.format(avg_loss_val_mean, acc_val_mean))
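The validation step turns one sigmoid output per sample into a two-column probability table before measuring accuracy. The NumPy sketch below mirrors that logic outside Paddle; the logits and labels are made-up values chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up logits and labels for four samples.
logits = np.array([[2.0], [-1.0], [0.3], [-0.2]], dtype='float32')
labels = np.array([[1], [0], [0], [1]], dtype='int64')

pred = sigmoid(logits)                          # probability of label 1
pred2 = 1.0 - pred                              # probability of label 0
probs = np.concatenate([pred2, pred], axis=1)   # shape [N, 2]

# Taking the argmax over two columns equals thresholding pred at 0.5.
predicted_class = probs.argmax(axis=1).reshape(-1, 1)
accuracy = float((predicted_class == labels).mean())

print(predicted_class.ravel(), accuracy)
```

Building the two-column table is what lets a single-logit binary classifier be scored by a multi-class accuracy function that expects one probability per class.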

The implementation of ResNet-50 is shown in the following code:

In [8]

# -*- coding: utf-8 -*-

# ResNet model code
import numpy as np
import paddle
import paddle.fluid as fluid
from paddle.fluid.layer_helper import LayerHelper
from paddle.fluid.dygraph.nn import Conv2D, Pool2D, BatchNorm, FC
from paddle.fluid.dygraph.base import to_variable

# ResNet uses BatchNorm layers; BatchNorm is added after each convolution to improve numerical stability
# define the convolution + batch normalization block
class ConvBNLayer(fluid.dygraph.Layer):
    def __init__(self,
                 name_scope,
                 num_channels,
                 num_filters,
                 filter_size,
                 stride=1,
                 groups=1,
                 act=None):
        """
        name_scope, the name of the module
        num_channels, the number of input channels of the convolution layer
        num_filters, the number of output channels of the convolution layer
        stride, the stride of the convolution layer
        groups, the number of groups for grouped convolution; the default groups=1 means no grouped convolution
        act, the activation function type; the default act=None means no activation function
        """
        super(ConvBNLayer, self).__init__(name_scope)

        # create the convolution layer
        self._conv = Conv2D(
            self.full_name(),
            num_filters=num_filters,
            filter_size=filter_size,
            stride=stride,
            padding=(filter_size - 1) // 2,
            groups=groups,
            act=None,
            bias_attr=False)

        # create the BatchNorm layer
        # (this line is missing from the scraped source; reconstructed from its use in forward)
        self._batch_norm = BatchNorm(self.full_name(), num_filters, act=act)

    def forward(self, inputs):
        y = self._conv(inputs)
        y = self._batch_norm(y)
        return y

# define the residual block
# each residual block applies three convolutions to its input and then adds a shortcut connection from the input
# if the output of the third convolution does not have the same shape as the input,
# a 1x1 convolution is applied to the input so that the shapes match
class BottleneckBlock(fluid.dygraph.Layer):
    def __init__(self,
                 name_scope,
                 num_channels,
                 num_filters,
                 stride,
                 shortcut=True):
        super(BottleneckBlock, self).__init__(name_scope)
        # create the first convolution layer, 1x1
        self.conv0 = ConvBNLayer(
            self.full_name(),
            num_channels=num_channels,
            num_filters=num_filters,
            filter_size=1,
            act='relu')
        # create the second convolution layer, 3x3
        self.conv1 = ConvBNLayer(
            self.full_name(),
            num_channels=num_filters,
            num_filters=num_filters,
            filter_size=3,
            stride=stride,
            act='relu')
        # create the third convolution layer, 1x1, with 4 times as many output channels
        self.conv2 = ConvBNLayer(
            self.full_name(),
            num_channels=num_filters,
            num_filters=num_filters * 4,
            filter_size=1,
            act=None)

        # if the output of conv2 has the same shape as the input of this block, shortcut=True
        # otherwise shortcut=False, and a 1x1 convolution reshapes the input to match conv2
        if not shortcut:
            self.short = ConvBNLayer(
                self.full_name(),
                num_channels=num_channels,
                num_filters=num_filters * 4,
                filter_size=1,
                stride=stride)

        self.shortcut = shortcut

        self._num_channels_out = num_filters * 4

    def forward(self, inputs):
        y = self.conv0(inputs)
        conv1 = self.conv1(y)
        conv2 = self.conv2(conv1)

        # if shortcut=True, add inputs to the output of conv2 directly
        # otherwise, convolve inputs first so that its shape matches the conv2 output
        if self.shortcut:
            short = inputs
        else:
            short = self.short(inputs)

        y = fluid.layers.elementwise_add(x=short, y=conv2)
        layer_helper = LayerHelper(self.full_name(), act='relu')
        return layer_helper.append_activation(y)

# define the ResNet model
class ResNet(fluid.dygraph.Layer):
    def __init__(self, name_scope, layers=50, class_dim=1):
        """
        name_scope, module name
        class_dim, the number of categories of the classification label
        """
        super(ResNet, self).__init__(name_scope)
        self.layers = layers
        supported_layers = [50, 101, 152]
        assert layers in supported_layers, \
            "supported layers are {} but input layer is {}".format(supported_layers, layers)

        if layers == 50:
            # ResNet50 contains multiple modules; modules 2 to 5 contain 3, 4, 6 and 3 residual blocks
            depth = [3, 4, 6, 3]
        elif layers == 101:
            # ResNet101 contains multiple modules; modules 2 to 5 contain 3, 4, 23 and 3 residual blocks
            depth = [3, 4, 23, 3]
        elif layers == 152:
            # ResNet152 contains multiple modules; modules 2 to 5 contain 3, 8, 36 and 3 residual blocks
            depth = [3, 8, 36, 3]

        # number of output channels of the convolutions used in the residual blocks
        num_filters = [64, 128, 256, 512]

        # the first module of ResNet: one 7x7 convolution followed by a max pooling layer
        self.conv = ConvBNLayer(
            self.full_name(),
            num_channels=3,
            num_filters=64,
            filter_size=7,
            stride=2,
            act='relu')
        self.pool2d_max = Pool2D(
            self.full_name(),
            pool_size=3,
            pool_stride=2,
            pool_padding=1,
            pool_type='max')

        # the second to fifth modules of ResNet: c2, c3, c4, c5
        self.bottleneck_block_list = []
        num_channels = 64
        for block in range(len(depth)):
            shortcut = False
            for i in range(depth[block]):
                bottleneck_block = self.add_sublayer(
                    'bb_%d_%d' % (block, i),
                    BottleneckBlock(
                        self.full_name(),
                        num_channels=num_channels,
                        num_filters=num_filters[block],
                        # the first residual block of c3, c4 and c5 uses stride=2; all others use stride=1
                        stride=2 if i == 0 and block != 0 else 1,
                        shortcut=shortcut))
                num_channels = bottleneck_block._num_channels_out
                self.bottleneck_block_list.append(bottleneck_block)
                shortcut = True

        # apply global average pooling to the output feature map of c5
        self.pool2d_avg = Pool2D(
            self.full_name(), pool_size=7, pool_type='avg', global_pooling=True)

        # stdv sets the range [-stdv, stdv] of the uniform random initialization of the fully connected layer
        import math
        stdv = 1.0 / math.sqrt(2048 * 1.0)

        # create the fully connected layer; its output size is the number of categories
        self.out = FC(self.full_name(),
                      size=class_dim,
                      param_attr=fluid.param_attr.ParamAttr(
                          initializer=fluid.initializer.Uniform(-stdv, stdv)))

    def forward(self, inputs):
        y = self.conv(inputs)
        y = self.pool2d_max(y)
        for bottleneck_block in self.bottleneck_block_list:
            y = bottleneck_block(y)
        y = self.pool2d_avg(y)
        y = self.out(y)
        return y
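To see why the model hard-codes 2048 and a 7x7 pooling window, the feature map sizes can be traced with plain Python. This sketch assumes a 224x224 input and the standard output-size formula for convolution and pooling; conv_out_size is a hypothetical helper and nothing here calls Paddle.

```python
def conv_out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 224
# Stem: 7x7 convolution with stride 2 and padding (7 - 1) // 2 = 3,
# followed by 3x3 max pooling with stride 2 and padding 1.
size = conv_out_size(size, kernel=7, stride=2, padding=3)
size = conv_out_size(size, kernel=3, stride=2, padding=1)
channels = 64
trace = [("stem", channels, size)]

depth = [3, 4, 6, 3]
num_filters = [64, 128, 256, 512]
for block, (n_blocks, filters) in enumerate(zip(depth, num_filters)):
    for i in range(n_blocks):
        stride = 2 if i == 0 and block != 0 else 1
        # Only the 3x3 convolution of a bottleneck block carries the stride.
        size = conv_out_size(size, kernel=3, stride=stride, padding=1)
        channels = filters * 4
    trace.append(("c%d" % (block + 2), channels, size))

for name, ch, sz in trace:
    print(name, ch, sz)
```

Under these assumptions c5 ends with 2048 channels on a 7x7 map, which matches the pool_size=7 argument and the 2048 used to compute stdv.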

In [9]

with fluid.dygraph.guard():
    model = ResNet("ResNet")

train(model)

start training...
epoch: 0, batch_id: 0, loss is: [0.83079195]
epoch: 0, batch_id: 10, loss is: [0.5477183]
epoch: 0, batch_id: 20, loss is: [0.87052524]
epoch: 0, batch_id: 30, loss is: [1.0255078]
[validation] accuracy/loss: 0.7450000047683716/0.5235034823417664
epoch: 1, batch_id: 0, loss is: [0.41455013]
epoch: 1, batch_id: 10, loss is: [0.54812586]
epoch: 1, batch_id: 20, loss is: [0.17374663]
epoch: 1, batch_id: 30, loss is: [0.30293828]
[validation] accuracy/loss: 0.887499988079071/0.27671539783477783
epoch: 2, batch_id: 0, loss is: [0.38499922]
epoch: 2, batch_id: 10, loss is: [0.29150736]
epoch: 2, batch_id: 20, loss is: [0.3396409]
[validation] accuracy/loss: 0.9274999499320984/0.17061272263526917
epoch: 3, batch_id: 0, loss is: [0.06969612]
epoch: 3, batch_id: 10, loss is: [0.0861987]
epoch: 3, batch_id: 20, loss is: [0.05332329]
epoch: 3, batch_id: 30, loss is: [0.46470308]
[validation] accuracy/loss: 0.9375/0.20805077254772186
epoch: 4, batch_id: 0, loss is: [0.38617897]
epoch: 4, batch_id: 10, loss is: [0.16854036]
epoch: 4, batch_id: 20, loss is: [0.05454079]
epoch: 4, batch_id: 30, loss is: [0.32432565]
[validation] accuracy/loss: 0.8600000143051147/0.3488900661468506
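In this log the validation accuracy peaks at epoch 3 and drops at epoch 4, while the training function saves parameters only once, after the last epoch. One possible refinement, sketched here as an assumption rather than as part of the article's method, is to track the best epoch and save a checkpoint when it improves. The accuracies below are rounded from the log.

```python
# Validation accuracy per epoch, rounded from the training log.
val_acc = [0.7450, 0.8875, 0.9275, 0.9375, 0.8600]

best_epoch = -1
best_acc = float('-inf')
for epoch, acc in enumerate(val_acc):
    if acc > best_acc:
        best_epoch, best_acc = epoch, acc
        # In the training loop, this is where a checkpoint could be saved.

print(best_epoch, best_acc)  # 3 0.9375
```

With only five epochs and a small validation set the difference may be noise, so this is a starting point for experimentation rather than a conclusion about the model.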

That concludes this walkthrough of implementing ResNet with the PaddlePaddle dynamic graph. If you are working on a similar problem, the analysis and code above can serve as a reference.
