How to Train a ResNet on a Local Dataset with PyTorch

Source: Shulou (Shulou.com), SLTechnology News&Howtos, Development. Updated 2025-04-06.

This article explains in detail how to train a ResNet on a local dataset with PyTorch. The steps are laid out one by one, and I hope the walkthrough answers your questions. Follow along and we will work through it together.

main.py is both the entry point of the project and the script that runs the training. The article introduces the code one file at a time.

The main.py code is as follows:

from dataset import data_dataloader  # local function that reads the data
from torch import nn                 # PyTorch nn module
from torch import optim              # PyTorch optim module
from network import Res_net          # local network definition
from train import train              # local training function


def main():
    # Build the loaders with data_dataloader: data path, mode, image size,
    # batch size and number of workers
    train_loader = data_dataloader(data_path='./data', mode='train', size=64, batch_size=24, num_workers=4)
    val_loader = data_dataloader(data_path='./data', mode='val', size=64, batch_size=24, num_workers=2)
    test_loader = data_dataloader(data_path='./data', mode='test', size=64, batch_size=24, num_workers=2)

    # Hyperparameters
    lr = 1e-4                                           # learning rate
    epochs = 10                                         # number of training epochs
    model = Res_net(2)                                  # ResNet with two output classes
    optimizer = optim.Adam(model.parameters(), lr=lr)   # optimizer
    loss_function = nn.CrossEntropyLoss()               # loss function

    # Train, validate and test
    train(model=model, optimizer=optimizer, loss_function=loss_function,
          train_data=train_loader, val_data=val_loader, test_data=test_loader,
          epochs=epochs)


if __name__ == '__main__':
    main()

The main.py flowchart is shown in figure 1:

Figure 1 main.py code flow chart

1. dataset.py (look at the overall flow of the code before reading the explanation)

The first five lines of main.py import the required modules; dataset, network and train are local files. The first few lines of main() use the data_dataloader function from dataset.py to load the training, validation and test sets. dataset.py reads our own local dataset: it collects every image path with its label and turns each picture into tensor data that PyTorch can work with.

The dataset.py file code is as follows:

import torch
import os, glob
import random
import csv
from torch.utils.data import Dataset
from PIL import Image
from torchvision import transforms
from torch.utils.data import DataLoader


# Part 1: return tensor-type data in three steps
class Dataset_self(Dataset):  # a network would inherit from nn.Module; a dataset inherits from Dataset
    # Step 1: initialize
    def __init__(self, root, mode, resize):
        # root: dataset root directory; mode: which split to use; resize: target image size
        super(Dataset_self, self).__init__()
        self.resize = resize
        self.root = root
        self.name_label = {}  # dictionary that maps each class folder to its label
        # Build the name -> label dictionary (one label per folder name)
        for name in sorted(os.listdir(os.path.join(root))):  # sorted list of entries in root
            if not os.path.isdir(os.path.join(root, name)):  # skip anything that is not a folder
                continue
            self.name_label[name] = len(self.name_label.keys())  # label = number of entries so far
        # print(self.name_label)
        self.image, self.label = self.make_csv('images.csv')  # paths and labels of all pictures
        # Split image and label into subsets (note: with cross-validation no validation
        # set is needed, only a training set and a test set)
        if mode == 'train':
            self.image, self.label = self.image[:int(0.6*len(self.image))], self.label[:int(0.6*len(self.label))]
        if mode == 'val':
            self.image, self.label = self.image[int(0.6*len(self.image)):int(0.8*len(self.image))], self.label[int(0.6*len(self.label)):int(0.8*len(self.label))]
        if mode == 'test':
            self.image, self.label = self.image[int(0.8*len(self.image)):], self.label[int(0.8*len(self.label)):]

    # Collect the picture paths and labels
    def make_csv(self, filename):
        if not os.path.exists(os.path.join(self.root, filename)):  # create the index file if it is missing
            images = []
            for image in self.name_label.keys():  # visit every class folder
                images += glob.glob(os.path.join(self.root, image, '*jpg'))  # all files ending in jpg
            # print('length: {}, the second picture is: {}'.format(len(images), images[1]))
            random.shuffle(images)  # shuffle the list of paths
            # images[0]: ./data\ants\382971067_0bfd33afe0.jpg
            with open(os.path.join(self.root, filename), mode='w', newline='') as f:  # create the file
                writer = csv.writer(f)
                for image in images:
                    name = image.split(os.sep)[-2]  # class folder of this picture
                    label = self.name_label[name]
                    writer.writerow([image, label])
                    # first row of the file: ./data\ants\382971067_0bfd33afe0.jpg,0
        images, labels = [], []
        with open(os.path.join(self.root, filename)) as f:  # read the file back
            reader = csv.reader(f)
            for row in reader:
                image, label = row
                label = int(label)
                images.append(image)
                labels.append(label)
        assert len(images) == len(labels)  # raises an error unless both lengths match
        return images, labels  # all paths and labels (paths, not the picture data itself)

    # Step 2: number of pictures (the number of labels is the same)
    def __len__(self):
        return len(self.image)

    # Step 3: read one picture and its label
    def __getitem__(self, item):  # returns a single tensor image and label
        image, label = self.image[item], self.label[item]  # image is still a file path here
        image = Image.open(image).convert('RGB')  # load the picture data
        # Process the picture with transforms and convert it to a tensor
        transf = transforms.Compose([
            transforms.Resize((int(self.resize), int(self.resize))),
            transforms.RandomRotation(15),
            transforms.CenterCrop(self.resize),
            transforms.ToTensor(),  # convert to a tensor first; normalization would follow here
        ])
        image = transf(image)
        label = torch.tensor(label)  # convert the label to a tensor
        return image, label


# Part 2: use PyTorch's DataLoader to fetch the image data in batches
def data_dataloader(data_path, mode, size, batch_size, num_workers):
    # data_path, mode and size are the arguments of Dataset_self above; batch_size is the
    # number of pictures per batch; num_workers is the number of loader worker processes
    dataset = Dataset_self(data_path, mode, size)
    # Passed by keyword: the third positional parameter of DataLoader is shuffle, not num_workers
    dataloader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)
    return dataloader


# Test
def main():
    test = Dataset_self('./data', 'train', 64)


if __name__ == '__main__':
    main()

The dataset.py flowchart is shown in figure 2:

Fig. 2 dataset.py flow chart

As the code above shows, loading a custom dataset in PyTorch means defining a Dataset object, wrapping it in a DataLoader object, and then iterating over the DataLoader to get batches of training data and labels. The file is therefore split into two parts: the custom Dataset, and the part that uses PyTorch's DataLoader to fetch the training data.

The code first imports the required Python libraries and then writes the first part, which returns a single tensor-type picture and its label in three steps.

The three steps are initialization, returning the length of the data, and reading one picture with its label. Initialization produces a file that stores the path and label of every picture, then reads that file back and splits the result into training, validation and test sets. The concrete implementation is in the code above. First, the variables resize, root and name_label are defined in the initializer so that later functions can use them:

Figure 3 initialization of parameters in Dataset_self

Then we write code to read the root directory and get the classification name and its corresponding label:

Fig. 4 acquisition of labels

The code first uses the os library to list the entries in the root directory, then stores the name of every sub-folder in the name_label dictionary, using the number of entries already in the dictionary as the label. (The first folder read gets label 0, the second gets 1, and so on.)
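The label numbering can be tried on its own, without any pictures. The snippet below is a minimal sketch rather than the article's code: the helper name `build_name_label` and the throwaway folder layout are assumptions made for the demonstration.

```python
import os
import tempfile


def build_name_label(root):
    """Map each sub-folder of root to an integer label, in sorted order."""
    name_label = {}
    for name in sorted(os.listdir(root)):
        if not os.path.isdir(os.path.join(root, name)):
            continue  # plain files such as images.csv are skipped
        name_label[name] = len(name_label)  # 0 for the first folder, 1 for the next, ...
    return name_label


# Demo on a temporary directory with two class folders and one stray file.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "bees"))
    os.mkdir(os.path.join(root, "ants"))
    open(os.path.join(root, "images.csv"), "w").close()
    demo_labels = build_name_label(root)

print(demo_labels)
```

Because the folder names are sorted before numbering, the mapping does not depend on the order in which the folders were created.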

With the label dictionary in place, we write a function that collects the path of every picture so that later steps can read them:

Fig. 5 Reading of pictures and tags

The make_csv function returns image and label (image holds the path of each picture and label holds the corresponding label).

make_csv first checks whether the file we need already exists. If it does, the file is read directly; if it does not, the file is created first, storing every picture path together with its label.

Figure 6 make_csv function

When the file does not exist (the check on the first line), the idea is to collect all picture paths in a list and then write that list to a file with the csv library. So under the if statement we start with an empty images list and iterate over the keys of name_label. name_label is a dictionary whose keys are folder names and whose values are labels; because the keys were read from the root directory with the os library, iterating over them visits each class folder in turn. The fourth line of the code in the figure above therefore goes through each folder and uses the glob library to add all of its jpg files to the images list. In that list, images[0] is: ./data\ants\382971067_0bfd33afe0.jpg

Once the list of picture paths is ready, it is shuffled, and then the file is created. For each path in images we take the class name from the path, look up the matching value in name_label, and write both to the file. The first row of the file is: ./data\ants\382971067_0bfd33afe0.jpg,0 (the relative path of the picture and its label)

What we need in the end is the path of each picture, not the file itself (the file mainly serves as a transit station for repeated debugging later). We therefore create two lists, one for the picture paths and one for the label values, and read the rows of the file into them to get the picture list and the label list.
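The write-then-read round trip through the CSV file can be sketched in isolation. This is an illustration under assumptions, not the article's make_csv: the helpers `write_index` and `read_index` are hypothetical names, the two paths are made up and need not exist, and the class name is taken with `os.path.dirname` instead of splitting on `os.sep`.

```python
import csv
import os
import tempfile


def write_index(csv_path, image_paths, name_label):
    """Write one 'path,label' row per picture; the label comes from the parent folder."""
    with open(csv_path, mode="w", newline="") as f:
        writer = csv.writer(f)
        for path in image_paths:
            class_name = os.path.basename(os.path.dirname(path))
            writer.writerow([path, name_label[class_name]])


def read_index(csv_path):
    """Read the index back into two parallel lists."""
    images, labels = [], []
    with open(csv_path, newline="") as f:
        for path, label in csv.reader(f):
            images.append(path)
            labels.append(int(label))  # csv returns strings, so convert back to int
    return images, labels


with tempfile.TemporaryDirectory() as root:
    paths = [os.path.join(root, "ants", "a1.jpg"), os.path.join(root, "bees", "b1.jpg")]
    index_file = os.path.join(root, "images.csv")
    write_index(index_file, paths, {"ants": 0, "bees": 1})
    images, labels = read_index(index_file)

print(labels)
```

Keeping the index on disk means a later run reads the same shuffled order instead of reshuffling, which is what makes the fixed train/val/test slices reproducible.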

At this point make_csv gives us image and label. Because these two lists hold all of the data, we slice them into training, validation and test sets (60%, 20% and 20% in the code above). The code is simple (if cross-validation is required, only a training set and a test set are needed), as shown in the following figure:
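The slicing arithmetic is easy to check with plain lists. The sketch below assumes the 60/20/20 boundaries from the code above and the 396-picture total mentioned in the results section; `split_60_20_20` is a hypothetical helper name.

```python
def split_60_20_20(items, mode):
    """Return the slice of items that belongs to the given split."""
    n = len(items)
    first, second = int(0.6 * n), int(0.8 * n)  # same boundaries as in Dataset_self
    if mode == "train":
        return items[:first]
    if mode == "val":
        return items[first:second]
    if mode == "test":
        return items[second:]
    raise ValueError("mode must be 'train', 'val' or 'test'")


samples = list(range(396))  # stand-in for 396 picture paths
sizes = {mode: len(split_60_20_20(samples, mode)) for mode in ("train", "val", "test")}
print(sizes)
```

Since the three slices share the same two boundaries, every sample lands in exactly one split.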

Figure 7 Partition of data sets

That completes the first step, initialization. The second step returns the number of pictures:

Figure 8 read the length of the image

This is simply a call to len(); its purpose is to report how many samples there are.

Step 3 reads the data and labels one sample at a time, so it first takes a single entry from the image and label lists. Because the image list holds file paths, the picture is opened and converted to RGB, then processed with transforms (resizing, random rotation, center crop, conversion to a tensor, and so on). Finally the label is also converted to a tensor, and the picture data and label are returned. The code is shown in the following figure:

Figure 8 reading images and tags

That is the first part: reading each picture together with its label in three steps (initializing, returning the data length, and reading a single sample). Dataset handling in PyTorch is built on these three steps. The algorithmic logic is not complicated, but many statements are involved, so the details need care.

The second part is much simpler than the first; you could even write it directly in main(). It builds the dataset with the Dataset_self class from the first part and then wraps it in PyTorch's DataLoader to get the batches used to train the model. The code is shown below:
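One detail is worth a small sketch when wrapping a dataset in DataLoader: its leading positional parameters are ordered dataset, batch_size, shuffle, so the worker count is safer passed by keyword. The example uses made-up zero tensors via TensorDataset as a stand-in for the ants and bees pictures.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten fake 3x64x64 "pictures" with alternating labels (stand-in data only).
images = torch.zeros(10, 3, 64, 64)
labels = torch.tensor([0, 1] * 5)
dataset = TensorDataset(images, labels)

# Keyword arguments avoid mixing up the positional order (dataset, batch_size, shuffle, ...).
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)

batch_shapes = [tuple(x.shape) for x, _ in loader]
print(batch_shapes)
```

With ten samples and a batch size of four, the last batch is smaller than the others unless drop_last=True is set.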

Fig. 9 acquisition of dataset

In short, dataset.py turns the pictures and labels of the local dataset into tensor data and serves them in the form the training code needs.

2. network.py

In main.py we define the hyperparameters and training components: learning rate, number of epochs, model, optimizer and loss function. The model used in this article is a small ResNet written locally. The code is as follows:

import torch
from torch import nn


# First write the residual block of ResNet
class Res_block(nn.Module):
    def __init__(self, in_num, out_num, stride):
        super(Res_block, self).__init__()
        # A 3x3 kernel with padding=1 keeps the image size; stride shrinks it by that
        # factor, which greatly reduces the amount of computation downstream
        self.cov1 = nn.Conv2d(in_num, out_num, (3, 3), stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(out_num)
        self.cov2 = nn.Conv2d(out_num, out_num, (3, 3), padding=1)
        self.bn2 = nn.BatchNorm2d(out_num)
        # 1x1 convolution on the shortcut so that x has the same shape as the main path
        self.extra = nn.Sequential(
            nn.Conv2d(in_num, out_num, (1, 1), stride=stride),
            nn.BatchNorm2d(out_num),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.cov1(x)))
        out = self.relu(self.bn2(self.cov2(out)))
        out = self.extra(x) + out
        return out


class Res_net(nn.Module):
    def __init__(self, num_class):
        super(Res_net, self).__init__()
        self.init = nn.Sequential(nn.Conv2d(3, 16, (3, 3)), nn.BatchNorm2d(16))  # pre-processing
        # The block arguments were lost in extraction; the values below are inferred from
        # the 8192 input features of linear1 (128 channels x 8 x 8 for a 64x64 input)
        self.bn1 = Res_block(16, 32, 2)
        self.bn2 = Res_block(32, 64, 2)
        self.bn3 = Res_block(64, 128, 2)
        self.bn4 = Res_block(128, 256, 2)  # defined but not used in forward
        self.fl = nn.Flatten()
        self.linear1 = nn.Linear(8192, 10)
        self.linear2 = nn.Linear(10, num_class)
        self.relu = nn.ReLU()  # needed by forward below

    def forward(self, x):
        out = self.relu(self.init(x))
        # print('init:', out.shape)
        out = self.bn1(out)
        # print('bn1:', out.shape)
        out = self.bn2(out)
        # print('bn2:', out.shape)
        out = self.bn3(out)
        # print('bn3:', out.shape)
        out = self.fl(out)
        # print('flatten:', out.shape)
        out = self.relu(self.linear1(out))
        # print('linear1:', out.shape)
        out = self.relu(self.linear2(out))  # note: a ReLU on the final logits is unusual
        # print('linear2:', out.shape)
        return out


# Test
def main():
    x = torch.randn(1, 3, 64, 64)  # the batch size was lost in extraction; 1 is assumed
    net = Res_net(2)
    out = net(x)
    print(out.shape)


if __name__ == '__main__':
    main()

The network.py flowchart is shown in figure 10:

Figure 10 network.py flow chart

The network file has two parts: first the residual block used by ResNet, then the whole network. Before going through the code, I will give my own understanding of the idea behind ResNet, the residual network (search for other material if you want the details). The main purpose of a residual network is to make deep networks trainable, in the hope that results keep improving as the network gets deeper. In a very deep network, however, some parameters may hardly be trained at all, because the gradient vanishes as it is propagated back through many layers. ResNet uses the residual block to address this vanishing-gradient problem, as shown in figure 11:

Fig. 11 residual block

To put it simply, x passes through two layers and the result is then added to x itself. In back propagation, the derivative of f(x) + x with respect to x is f'(x) + 1, so the gradient passed back to the layers before x does not vanish (the shortcut contributes at least 1). If there are n layers before the residual block, then even if the hidden layers inside the block are not trained because the gradient vanishes, at least the n layers before the block are still trained; and as long as some of the hidden layers inside the block can be trained, the accuracy of the network is likely to improve on that basis. (This explanation of ResNet may not be fully accurate; it is worth studying the topic carefully.)
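The gradient argument can be written out in one line for the scalar case. It assumes an identity shortcut, y = f(x) + x, and writes L for the loss (a symbol introduced here). In the code of this article the shortcut is a 1x1 convolution, so the constant term would be that layer's contribution rather than exactly 1.

```latex
y = f(x) + x
\quad\Longrightarrow\quad
\frac{\partial L}{\partial x}
  = \frac{\partial L}{\partial y}\,\frac{\partial y}{\partial x}
  = \frac{\partial L}{\partial y}\left(\frac{\partial f}{\partial x} + 1\right)
```

Even when the term from f is close to zero, the gradient arriving from the output still reaches x through the constant term.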

Based on the image of the residual block above, we first define the residual block, and the code is shown in figure 12 below:

Figure 12 definition of residual block

The flow chart is shown in figure 13:

Fig. 13 flow chart of residual block definition

When the residual block is written, you can write a simple Resnet network, as shown in figure 14:

Figure 14 simple Resnet network model

The code above applies an ordinary convolution layer first, then three residual blocks, and finally two linear layers. Once the residual block is defined, the rest is a matter of calling modules that ship with PyTorch. The only thing to watch is the parameter settings: in general the number of channels grows gradually through the network while the image size shrinks.

3. train.py
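The relationship between these settings and the 8192 input features of the first linear layer can be checked with the standard convolution output-size rule, floor((n + 2p - k) / s) + 1. The sketch below rests on assumptions that are not spelled out in the text: a 64x64 input, a stride of 2 in each of the three residual blocks, and 128 channels after the third block. `conv_out_size` is a hypothetical helper.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution (floor rule, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


size = conv_out_size(64, kernel=3)  # first 3x3 convolution, no padding
trace = [size]
for _ in range(3):  # three residual blocks, each assumed to use stride 2
    size = conv_out_size(size, kernel=3, stride=2, padding=1)
    trace.append(size)

channels = 128  # assumed channel count after the third block
flat_features = channels * size * size
print(trace, flat_features)
```

Under these assumptions the flattened size matches the 8192 expected by the first linear layer; a different input size or stride would require changing that number.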

train.py contains the training process for the whole model. This article wraps it in a function that main.py calls, because training almost always follows the same pattern: train on the training set, pick the best epoch on the validation set, save the network parameters, and finally test on the test set. Packaging training and validation as functions makes them easy to reuse in future projects.

The train.py code is as follows:

import torch
from torch import optim
from torch.utils.data import DataLoader
from dataset import Dataset_self
from network import Res_net
from torch import nn
from matplotlib import pyplot as plt
import numpy as np


def evaluate(model, loader):  # accuracy after each round of training
    correct = 0
    total = len(loader.dataset)
    for x, y in loader:
        logits = model(x)
        pred = logits.argmax(dim=1)  # predicted class index (0 or 1 for two classes)
        correct += torch.eq(pred, y).sum().float().item()  # correct predictions in this batch
    return correct / total


# Define the training process as a function
def train(model, optimizer, loss_function, train_data, val_data, test_data, epochs):
    # Inputs: network, optimizer, loss function, training set, validation set, test set, epochs
    best_acc, best_epoch = 0, 0  # best validation accuracy and the epoch that reached it
    train_list, val_List = [], []  # accuracy of every epoch, kept for the final plot
    for epoch in range(epochs):
        print('= round {} ='.format(epoch + 1))
        for steps, (x, y) in enumerate(train_data):  # same as: for x, y in train_data
            logits = model(x)               # forward pass
            loss = loss_function(logits, y) # loss value
            optimizer.zero_grad()           # clear old gradients, otherwise they accumulate
            loss.backward()                 # back propagation
            optimizer.step()                # parameter update
        train_acc = evaluate(model, train_data)
        train_list.append(train_acc)
        print('train_acc', train_acc)
        # if epoch % 1 == 2:  # a condition here could run validation only every few epochs
        val_acc = evaluate(model, val_data)
        print('val_acc=', val_acc)
        val_List.append((val_acc))
        if val_acc > best_acc:  # is this the best validation accuracy so far?
            best_epoch = epoch
            best_acc = val_acc
            torch.save(model.state_dict(), 'best.mdl')  # save the best parameters
    print('= split line =')
    print('best acc:', best_acc, 'best_epoch:', best_epoch)
    # Measure the accuracy of the best model on the test set
    model.load_state_dict((torch.load('best.mdl')))
    print('detect the test data')
    test_acc = evaluate(model, test_data)
    print('test_acc:', test_acc)
    train_list_file = np.array(train_list)
    np.save('train_list.npy', train_list_file)
    val_list_file = np.array(val_List)
    np.save('val_list.npy', val_list_file)

    # Plotting
    x_label = range(1, len(val_List) + 1)
    plt.plot(x_label, train_list, 'bo', label='train acc')
    # This line was garbled in the source; the line style and label are assumed
    plt.plot(x_label, val_List, 'b', label='validation acc')
    plt.title('train and validation accuracy')
    plt.xlabel('epochs')
    plt.legend()
    plt.show()


# Test
def main():
    train_dataset = Dataset_self('./data', 'train', 64)
    vali_dataset = Dataset_self('./data', 'val', 64)
    test_dataset = Dataset_self('./data', 'test', 64)
    train_loaber = DataLoader(train_dataset, 24, num_workers=4)
    val_loaber = DataLoader(vali_dataset, 24, num_workers=2)
    test_loaber = DataLoader(test_dataset, 24, num_workers=2)
    lr = 1e-4
    epochs = 5
    model = Res_net(2)
    optimizer = optim.Adam(model.parameters(), lr=lr)
    criteon = nn.CrossEntropyLoss()
    train(model, optimizer, criteon, train_loaber, val_loaber, test_loaber, epochs)


if __name__ == '__main__':
    main()

The train.py flowchart is shown in figure 15:

Figure 15 train.py flow chart

The first function in the code above computes the accuracy over one full pass of a dataset (training, validation or test). It is not complicated. For each batch we take pred, the predicted class index, from the model output logits with argmax (0 or 1 here, since there are two classes), compare pred with the labels to count the correct predictions in the batch, and add that count to correct. Finally correct is divided by the size of the dataset to give the accuracy, which is returned.
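A variant of the accuracy function that is common in PyTorch code switches the model to evaluation mode and disables gradient tracking while scoring. This is a sketch of that variant, not the article's evaluate: the name `evaluate_no_grad` is hypothetical, and the toy identity model and tensors exist only to make the arithmetic checkable.

```python
import torch
from torch import nn


def evaluate_no_grad(model, batches):
    """Fraction of correctly classified samples over an iterable of (x, y) batches."""
    model.eval()  # BatchNorm uses running statistics, dropout is disabled
    correct, total = 0, 0
    with torch.no_grad():  # no gradients are needed for scoring
        for x, y in batches:
            pred = model(x).argmax(dim=1)  # index of the largest logit per sample
            correct += (pred == y).sum().item()
            total += y.numel()
    model.train()  # switch back so later training steps behave as before
    return correct / total


# Toy check: an identity "model" whose logits are the inputs themselves.
toy_model = nn.Identity()
toy_x = torch.tensor([[2.0, 1.0], [0.0, 3.0], [5.0, 4.0], [1.0, 2.0]])
toy_y = torch.tensor([0, 1, 1, 1])
toy_acc = evaluate_no_grad(toy_model, [(toy_x, toy_y)])
print(toy_acc)
```

Counting total from the labels rather than from loader.dataset lets the same function score any iterable of batches.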

The second function, train, trains on the training set. After each epoch it evaluates on the validation set (this could also be done every two or three epochs), and whenever the validation accuracy improves it saves the network parameters and records the epoch. When all epochs are done, the saved parameters are loaded and evaluated on the test set.

train first defines the best validation accuracy and the best epoch, then creates two lists that store the training and validation accuracy of every epoch (for plotting), and then trains for the given number of epochs.

Figure 16 definition of parameters in the train function

During training you could iterate with for x, y in train_data directly (the form shown in the code comment). The code uses enumerate instead, which attaches an index to each batch; the index is steps, starting from 0 (steps is not used here). In each iteration the picture batch x is passed through the model, and loss_function computes the loss between the predictions and the correct labels. The optimizer first zeroes the gradients (otherwise they accumulate across iterations), then loss.backward() runs back propagation (the chain rule), and finally optimizer.step() updates the parameters. That is one training step.
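A single training step can be run in isolation to see that it really changes the parameters. The sketch uses a small linear layer and random tensors as stand-ins for Res_net and the picture batches; the sizes and the seed are arbitrary choices for the demonstration.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)
model = nn.Linear(4, 2)  # stand-in for Res_net: 4 features in, 2 classes out
optimizer = optim.Adam(model.parameters(), lr=1e-4)
loss_function = nn.CrossEntropyLoss()

x = torch.randn(8, 4)            # a fake batch of 8 samples
y = torch.randint(0, 2, (8,))    # fake labels in {0, 1}

before = model.weight.detach().clone()

logits = model(x)                # forward pass
loss = loss_function(logits, y)  # loss between predictions and labels
optimizer.zero_grad()            # clear gradients left over from earlier steps
loss.backward()                  # back propagation fills .grad on each parameter
optimizer.step()                 # the optimizer updates the parameters

weights_changed = not torch.equal(before, model.weight.detach())
print(loss.item(), weights_changed)
```

The order matters: zero_grad must come before backward, and step must come after it, otherwise the update uses stale or accumulated gradients.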

Fig. 17 training steps of the model

In the code that follows, the validation set is run through the model after every epoch to see how well it is training, and the parameters and epoch with the best validation accuracy across all epochs are saved. Finally the saved parameters are loaded and the accuracy on the test set is measured.
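The best-epoch bookkeeping can be separated from the training loop and checked on its own. In this sketch `pick_best` is a hypothetical helper and the accuracy values are invented for illustration; they are not results from the article.

```python
def pick_best(val_accuracies):
    """Return (best accuracy, epoch index) using the same strict '>' rule as train()."""
    best_acc, best_epoch = 0.0, 0
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:  # strict '>' keeps the earliest epoch when accuracies tie
            best_acc, best_epoch = acc, epoch
            # in the real loop, torch.save(model.state_dict(), 'best.mdl') happens here
    return best_acc, best_epoch


made_up_history = [0.55, 0.61, 0.58, 0.61]  # invented validation accuracies
result = pick_best(made_up_history)
print(result)
```

Note that the epoch index starts at 0, so the printed best_epoch is one less than the round number shown in the training log.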

Fig. 18 preservation and testing of model parameters

The last few lines plot the saved accuracies. Note that because validation runs after every training epoch, the two accuracy lists have the same length, so the length of either list can define the x-axis.

Fig. 19 drawing

4. Results and summary

The project in this article uses the ResNet model to distinguish ants from bees. The dataset has 396 pictures in total, and the training set has only a little over 200 (a very small dataset). After ten epochs, the accuracy on the training set and the validation set in each epoch is shown in the figure:

Figure 20 train and validation accuracy

The accuracy of the test set is shown in the figure:

Fig. 21 accuracy of test set

The final result is not ideal. Most likely the dataset is too small, which weakens the model's ability to generalize (the model memorizes the training set). For problems like this, we can try to improve generalization with cross-validation (which may help to some extent) or by adding more data. Improving the accuracy will be discussed in later articles.

After getting the model parameters, I randomly picked two pictures from the Internet and fed them to the model to test it:
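For readers who want to try the cross-validation idea, the index bookkeeping can be sketched with plain lists. This is an illustration only, not something the article implements: `k_fold_indices` is a hypothetical helper, and the sample count and k are arbitrary.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for fold_size in fold_sizes:
        val_idx = indices[start:start + fold_size]                 # this fold validates
        train_idx = indices[:start] + indices[start + fold_size:]  # the rest trains
        yield train_idx, val_idx
        start += fold_size


folds = list(k_fold_indices(10, 3))
val_sizes = [len(val_idx) for _, val_idx in folds]
print(val_sizes)
```

Each sample is used for validation exactly once, so the averaged validation accuracy uses all of the non-test data, which helps when the dataset is as small as this one.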

Figure 22 first test

Figure 23 second test

The first test correctly identified an ant, but the second test failed. Possibly the model has never seen a black bee, so it treats black insects as ants. In short, the model still has a lot of room for improvement.

The code for the single-picture test is attached below:

from network import Res_net
import torch
from PIL import Image
import torchvision

# Load the picture
img = '1.jpg'
img = Image.open(img).convert('RGB')  # make sure there are three channels
tf = torchvision.transforms.Compose([
    torchvision.transforms.Resize((64, 64)),  # (64, 64) so non-square pictures also fit the reshape
    torchvision.transforms.ToTensor(),
])
img = tf(img)
image = torch.reshape(img, (1, 3, 64, 64))

# Load the model
net = Res_net(2)
net.load_state_dict(torch.load('best.mdl'))
with torch.no_grad():
    out = net(image)

# Determine the class
class_cl = out.argmax(dim=1)
class_num = class_cl.numpy()
if class_num == 0:
    print('this picture is an ant')
else:
    print('this picture is a bee')

That concludes this article on training a ResNet on a local dataset with PyTorch. To master the points covered here, you still need to practice and use them yourself.
