2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to inherit PyTorch's Subset class to implement a custom data split. Many people run into trouble when a split dataset loses the attributes of the original dataset object; the walkthrough below shows the problem and a simple, reusable solution.
The following are common operations for loading built-in training datasets:
from torchvision.datasets import FashionMNIST
from torchvision.transforms import Compose, ToTensor, Normalize

RAW_DATA_PATH = './rawdata'
transform = Compose([ToTensor(), Normalize((0.1307,), (0.3081,))])
train_data = FashionMNIST(root=RAW_DATA_PATH, download=True, train=True, transform=transform)
Here train_data is a dataset object with many useful attributes. We can use the following code to obtain the sample classes, the sample feature dimension, the sample labels, and other information:
classes = train_data.classes
num_features = train_data.data[0].shape[0]
train_labels = train_data.targets
print(classes)
print(num_features)
print(train_labels)
The output is as follows:
['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
28
tensor([9, 0, 0,  ..., 3, 0, 5])
However, we often need to split a validation set off of the training set (or use only part of the data for training). The first method that comes to mind is torch.utils.data.random_split. Let's take 10000 samples as the training set and use the rest as the validation set:
from torch.utils.data import random_split

k = 10000
train_data, valid_data = random_split(train_data, [k, len(train_data) - k])
Notice that if we print the types of train_data and valid_data, we see:

torch.utils.data.dataset.Subset

Each is no longer a torchvision.datasets.mnist.FashionMNIST object but a so-called Subset object! The Subset object still gives access to the underlying data, but the built-in targets and classes attributes no longer exist.
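This behavior is easy to verify without downloading FashionMNIST. The following sketch (a quick check using a tiny TensorDataset as a stand-in, not part of the original article) shows that random_split hands back Subset objects:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# A toy dataset: 10 samples with 3 features each, all-zero labels.
ds = TensorDataset(torch.zeros(10, 3), torch.zeros(10))

# Split into 7 training samples and 3 validation samples.
tr, va = random_split(ds, [7, 3])

print(type(tr).__name__)  # the wrapper type, no longer TensorDataset
print(len(tr), len(va))
```

The same type change happens with any dataset, built-in or custom, passed through random_split.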
For example, if we force access to the targets attribute of valid_data:

valid_targets = valid_data.targets

the following error is reported:

AttributeError: 'Subset' object has no attribute 'targets'
But subsequent code often assumes that the split datasets are still full-fledged dataset objects. How do we keep the code consistent?
There is a trick: define a new CustomSubset class by inheriting from Subset, so that the new class keeps the basic behavior of Subset while also exposing attributes of the original dataset class, such as targets and classes:
from torch.utils.data import Subset

class CustomSubset(Subset):
    '''A custom subset class'''
    def __init__(self, dataset, indices):
        super().__init__(dataset, indices)
        self.targets = dataset.targets  # preserve the targets attribute
        self.classes = dataset.classes  # preserve the classes attribute

    def __getitem__(self, idx):  # keeps index-based access working
        x, y = self.dataset[self.indices[idx]]
        return x, y

    def __len__(self):  # keeps len() working
        return len(self.indices)
Then the second partition method is introduced, that is, the dataset is partitioned directly by initializing the CustomSubset object (here, the shuffle step is omitted for simplicity):
import numpy as np
from copy import deepcopy

origin_data = deepcopy(train_data)
train_data = CustomSubset(origin_data, np.arange(k))
valid_data = CustomSubset(origin_data, np.arange(k, len(origin_data)))
Note: the second parameter indices of the initialization method of the CustomSubset class is the sample index, which we can create through the method np.arange ().
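If you do not want to omit the shuffle step, one possible approach (a sketch using numpy's default_rng generator, which is an addition not shown in the original article) is to permute the full index range before slicing it into the two index arrays:

```python
import numpy as np

n, k = 60000, 10000  # dataset size and training-set size, matching the article
rng = np.random.default_rng(seed=0)  # seeded for reproducibility

perm = rng.permutation(n)            # a random ordering of 0 .. n-1
train_idx = perm[:k]                 # first k shuffled indices for training
valid_idx = perm[k:]                 # remaining indices for validation

# These arrays would then be passed as the second argument:
# train_data = CustomSubset(origin_data, train_idx)
# valid_data = CustomSubset(origin_data, valid_idx)
```

Because the two slices come from one permutation, every sample lands in exactly one split.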
Then, we access the classes and targets attributes of valid_data:
print(valid_data.classes)
print(valid_data.targets)
At this point, we find that we can successfully access these properties:
['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
tensor([9, 0, 0,  ..., 3, 0, 5])
Of course, CustomSubset is not limited to adding attributes to the dataset; we can also use it to customize data preprocessing.
We modify the structure of the class as follows:
class CustomSubset(Subset):
    '''A custom subset class with customizable data transformation'''
    def __init__(self, dataset, indices, subset_transform=None):
        super().__init__(dataset, indices)
        self.targets = dataset.targets
        self.classes = dataset.classes
        self.subset_transform = subset_transform

    def __getitem__(self, idx):
        x, y = self.dataset[self.indices[idx]]
        if self.subset_transform:
            x = self.subset_transform(x)
        return x, y

    def __len__(self):
        return len(self.indices)
We can set up the data preprocessing operator before using the sample:
from torchvision import transforms

valid_data.subset_transform = transforms.Compose([
    transforms.RandomRotation((180, 180))
])
In this way, when we fetch a sample by index as follows, the operator is invoked automatically to complete the preprocessing:
print(valid_data[0])
The print result is abbreviated as follows:
(tensor([-0.4242, -0.4242, -0.4242,  ..., -0.4242, -0.4242, -0.4242]), 9)
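To see the whole pattern end to end, here is a minimal runnable sketch. It substitutes a tiny synthetic dataset for FashionMNIST and a simple doubling function for the rotation transform (TinyDataset and that transform are placeholder assumptions, not part of the article's setup), then feeds the CustomSubset to a DataLoader:

```python
import torch
from torch.utils.data import Subset, DataLoader

class TinyDataset(torch.utils.data.Dataset):
    '''Synthetic stand-in for FashionMNIST: 8 samples of 4 features each.'''
    def __init__(self):
        self.data = torch.arange(32, dtype=torch.float32).reshape(8, 4)
        self.targets = torch.tensor([0, 1, 0, 1, 0, 1, 0, 1])
        self.classes = ['even', 'odd']
    def __len__(self):
        return len(self.data)
    def __getitem__(self, idx):
        return self.data[idx], int(self.targets[idx])

class CustomSubset(Subset):
    '''Subset that keeps targets/classes and applies an optional transform.'''
    def __init__(self, dataset, indices, subset_transform=None):
        super().__init__(dataset, indices)
        self.targets = dataset.targets
        self.classes = dataset.classes
        self.subset_transform = subset_transform
    def __getitem__(self, idx):
        x, y = self.dataset[self.indices[idx]]
        if self.subset_transform:
            x = self.subset_transform(x)
        return x, y
    def __len__(self):
        return len(self.indices)

# Take the last 4 samples as a validation subset, with a toy transform.
valid_data = CustomSubset(TinyDataset(), list(range(4, 8)),
                          subset_transform=lambda x: x * 2)

# The subset works with DataLoader just like any dataset object.
loader = DataLoader(valid_data, batch_size=2, shuffle=False)
for xb, yb in loader:
    print(xb.shape, yb)
```

Because CustomSubset still satisfies the dataset protocol (__getitem__ and __len__), DataLoader batches it transparently, and the transform runs on every fetched sample.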
At this point, the study of how to inherit PyTorch's Subset class to complete a custom data split is over. Pairing theory with practice is the best way to learn, so go and try it yourself!