Example Analysis of Dataset data processing in Pytorch

2025-04-24 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article walks through an example of Dataset data processing in PyTorch.

This PyTorch series is about understanding and using PyTorch to implement convolutional neural networks.

To program convolutional neural networks, you first need to understand how PyTorch handles data (this is the data preprocessing part of the model pipeline). Two classes matter here: Dataset and DataLoader. Dataset handles individual samples. It is like numbering a pile of data so that each image and its label can be retrieved by index (in labeled image processing).

DataLoader groups those samples into batches for batch processing.
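To make the split between the two classes concrete, here is a minimal sketch. It assumes a hypothetical in-memory dataset named ToyData with ten fake images; the name, the tensor shapes and the batch size are illustrative and do not come from the article.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyData(Dataset):
    """Hypothetical in-memory dataset: n fake images with integer labels."""

    def __init__(self, n=10):
        self.images = torch.zeros(n, 3, 4, 4)  # n fake RGB images of 4x4 pixels
        self.labels = torch.arange(n) % 2      # alternating labels 0 and 1

    def __getitem__(self, idx):
        # Dataset side: return ONE sample and its label by index.
        return self.images[idx], self.labels[idx]

    def __len__(self):
        return len(self.images)


toy = ToyData()
# DataLoader side: group individual samples into batches of 4.
loader = DataLoader(toy, batch_size=4, shuffle=False)
batch_shapes = [tuple(images.shape) for images, labels in loader]
print(batch_shapes)  # [(4, 3, 4, 4), (4, 3, 4, 4), (2, 3, 4, 4)]
```

The last batch is smaller because ten samples do not divide evenly by four.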

The data used in this experiment is RAF-DB, the facial expression dataset from Professor Deng Weihong's group at Beijing University of Posts and Telecommunications.

You can also build a two-class dataset by hand: put each picture in a folder named after its label.

For a simple analysis of RAF-DB, assume that it contains only Image and no real Annotation.

Its root path (the general location of the entire dataset) is then set to root_dir = "D:\data\basic"

The label path (the location of the label under the root) is set to label_dir = "Image\aligned" (or "Image\original"). Because Annotation is considered later, "Image" is included in the label path.
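The folder-per-label layout can be read back with a few lines of standard-library code. This is only a sketch: the helper name list_samples and the cat/dog folders are hypothetical, and empty placeholder files stand in for real pictures.

```python
import os
import tempfile


def list_samples(root_dir):
    """Return (file_path, label) pairs; the label is the sub-folder name."""
    samples = []
    for label in sorted(os.listdir(root_dir)):
        label_path = os.path.join(root_dir, label)
        if not os.path.isdir(label_path):
            continue  # skip stray files next to the label folders
        for name in sorted(os.listdir(label_path)):
            samples.append((os.path.join(label_path, name), label))
    return samples


# Build a throwaway two-class layout with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    for label, names in {"cat": ["a.jpg", "b.jpg"], "dog": ["c.jpg"]}.items():
        os.makedirs(os.path.join(root, label))
        for name in names:
            open(os.path.join(root, label, name), "w").close()
    samples = list_samples(root)

labels = [label for _, label in samples]
print(labels)  # ['cat', 'cat', 'dog']
```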
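How the root path and the label path combine can be checked without touching the disk. The article's code uses os.path.join; the sketch below swaps in pathlib.PureWindowsPath so that the result looks the same on any operating system. The variable names aligned_dir and original_dir are illustrative.

```python
from pathlib import PureWindowsPath

# Root path and label paths from the text, joined piece by piece.
root_dir = PureWindowsPath("D:/data/basic")
aligned_dir = root_dir / "Image" / "aligned"
original_dir = root_dir / "Image" / "original"

print(aligned_dir)        # D:\data\basic\Image\aligned
print(original_dir.name)  # original
```

The last path component is what the article later reuses as the label.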

Please refer to the following figure for understanding:

Here aligned and original are treated as labels, although they are really just the folders that hold the pictures.

Now start programming:

Using Dataset means letting a new class (MyData) inherit from Dataset, which requires overriding def __getitem__(self, item): and def __len__(self):

Of these, def __getitem__(self, item): takes the path of the image folder and the index of an image (combined into the full path of one image) and returns the image and its label. The parameter is named item by default, but it is renamed to idx here for convenience.

def __len__(self): takes the path of the image folder and returns the number of images in it.

Other methods can be freely added to your own class as needed.
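These two methods are ordinary Python protocol hooks: obj[idx] calls __getitem__ and len(obj) calls __len__. The sketch below shows that with a plain class and no PyTorch dependency. The class name Numbered and its contents are hypothetical.

```python
class Numbered:
    """Hypothetical stand-in for a Dataset: wraps a list of (sample, label) pairs."""

    def __init__(self, pairs):
        self.pairs = list(pairs)

    def __getitem__(self, idx):
        # Called for demo[idx]; returns one (sample, label) pair.
        return self.pairs[idx]

    def __len__(self):
        # Called for len(demo); returns the number of samples.
        return len(self.pairs)


demo = Numbered([("img_a", "aligned"), ("img_b", "original")])
sample, label = demo[1]
count = len(demo)
print(sample, label, count)  # img_b original 2
```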

    import os  # package for path handling
    from torch.utils.data import Dataset  # MyData inherits from this
    import cv2  # reads the pictures; opencv-python is used here, PIL also works


    class MyData(Dataset):
        def __init__(self, root_dir, label_dir):
            # Variables that are needed later are defined in __init__.
            self.root_dir = root_dir  # root path: where the data sits on the computer or server
            self.label_dir = label_dir  # label (the folder under Image doubles as the label)
            # Combine the two to get the folder that holds the pictures.
            self.path = os.path.join(self.root_dir, self.label_dir)
            # File names of all pictures in that folder.
            self.img_path = os.listdir(self.path)

        def __getitem__(self, idx):
            # Override __getitem__(self, item) to return (image, label).
            img_name = self.img_path[idx]  # name of one specific image
            # Full path of that image.
            img_item_path = os.path.join(self.root_dir, self.label_dir, img_name)
            img = cv2.imread(img_item_path)  # read the image with opencv
            label = self.label_dir  # the label (simply the aligned or original folder)
            return img, label

        def __len__(self):
            # Override __len__ to return the number of images.
            return len(self.img_path)


    root_dir = "D://data//basic"
    img_dir = "Image"

    aligned_label_dir = "aligned"
    # aligned_label_dir = "Image//aligned"
    aligned_label_dir = os.path.join(img_dir, aligned_label_dir)

    original_label_dir = "original"
    # original_label_dir = "Image//original"
    original_label_dir = os.path.join(img_dir, original_label_dir)

    # aligned_data = "D://data//basic//Image//aligned"
    aligned_data = MyData(root_dir, aligned_label_dir)
    # original_data = "D://data//basic//Image//original"
    original_data = MyData(root_dir, original_label_dir)

    data = aligned_data + original_data

    print(len(aligned_data))   # 15339
    print(len(original_data))  # 15339
    print(len(data))           # 30678

    img_1, label_1 = data[15338]
    img_2, label_2 = data[15339]

    print(label_1)  # Image\aligned
    print(label_2)  # Image\original
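Adding two Dataset objects with + gives a ConcatDataset in PyTorch, which is why index 15338 still falls in the aligned set while index 15339 is the first item of the original set. The pure-Python sketch below illustrates that index mapping. ConcatSketch is a hypothetical, simplified stand-in, not PyTorch's implementation, and tiny lists replace the real image folders.

```python
import bisect


class ConcatSketch:
    """Hypothetical, simplified illustration of indexing across joined datasets."""

    def __init__(self, *parts):
        self.parts = parts
        self.cumulative_sizes = []  # running totals, e.g. [3, 6] for two parts of 3
        total = 0
        for part in parts:
            total += len(part)
            self.cumulative_sizes.append(total)

    def __len__(self):
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def __getitem__(self, idx):
        if idx < 0 or idx >= len(self):
            raise IndexError(idx)
        # Find which part the global index falls into.
        part_idx = bisect.bisect_right(self.cumulative_sizes, idx)
        start = 0 if part_idx == 0 else self.cumulative_sizes[part_idx - 1]
        return self.parts[part_idx][idx - start]


aligned = [("img", "aligned")] * 3
original = [("img", "original")] * 3
joined = ConcatSketch(aligned, original)

print(len(joined))   # 6
print(joined[2][1])  # aligned  (last item of the first part)
print(joined[3][1])  # original (first item of the second part)
```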
