
How to use ImageFolder in Pytorch to ignore specific files when reading datasets

2025-01-16 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01

This article explains how to make ImageFolder in Pytorch ignore specific files, or entire class folders, when reading a dataset.

Ignore specific files when reading datasets using ImageFolder

If you know in advance which files should be ignored, simply delete them from the dataset. But if the decision has to be made dynamically while the program is running, or the filtering rules are too complex to apply by hand, ImageFolder needs to apply a custom filtering rule while reading.

ImageFolder has an optional parameter, is_valid_file, which is a callable: it is passed a str (the path of a file) and returns a bool. The file is kept when the return value is True and ignored otherwise.

For example, suppose you want to ignore every file whose name contains 'invalid' when reading.
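As a minimal sketch of such a callable, the plain function below keeps a file only when it has an image extension and is not a hidden file. The function name, the suffix list and the rule itself are illustrative assumptions, not something from torchvision. One detail worth keeping in mind: when is_valid_file is supplied, ImageFolder no longer applies its default image-extension filter, so the predicate has to reject non-image files itself.

```python
import os

# Hypothetical rule for illustration: accept a few common image extensions only.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".bmp")


def keep_image(path: str) -> bool:
    """Return True to keep the file, False to have it skipped."""
    name = os.path.basename(path).lower()
    # Skip hidden files such as .DS_Store.
    if name.startswith("."):
        return False
    # str.endswith accepts a tuple of candidate suffixes.
    return name.endswith(IMAGE_SUFFIXES)
```

It would be passed as ImageFolder('./data', is_valid_file=keep_image).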

The code is as follows:

    import platform

    from torchvision.datasets import ImageFolder


    class Check(object):
        def __init__(self, key_word: str):
            self.key_word = key_word
            # Pick the path separator that matches the operating system.
            self.separator = '\\' if platform.system() == 'Windows' else '/'

        def __call__(self, file_name: str) -> bool:
            # The last path component is the file name itself.
            folders = file_name.split(self.separator)
            # Keep the file only if the keyword does not occur in its name.
            return folders[-1].find(self.key_word) < 0


    dataset = ImageFolder('./data', is_valid_file=Check('invalid'))

Here a Check class that implements the __call__ method is defined. Compared with defining a plain function, the advantage is that the string to ignore can be specified in the constructor, and the directory separator can be chosen according to the operating system.

More complex behavior can be implemented by modifying this logic, but note that if all files of some class are filtered out, ImageFolder raises a FileNotFoundError.

If you want to ignore an entire class, use the method below.

Read only some class folders with ImageFolder

Simply inherit from ImageFolder and override its find_classes method.

    import os
    from typing import Any, Callable, Dict, List, Optional, Tuple

    from torchvision.datasets.folder import ImageFolder, default_loader


    class FilterableImageFolder(ImageFolder):
        def __init__(
                self,
                root: str,
                transform: Optional[Callable] = None,
                target_transform: Optional[Callable] = None,
                loader: Callable[[str], Any] = default_loader,
                is_valid_file: Optional[Callable[[str], bool]] = None,
                valid_classes: Optional[List[str]] = None
        ):
            # Must be set before super().__init__, which calls find_classes.
            self.valid_classes = valid_classes
            super(FilterableImageFolder, self).__init__(root, transform, target_transform, loader, is_valid_file)

        def find_classes(self, directory: str) -> Tuple[List[str], Dict[str, int]]:
            classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
            # Added filtering step: keep only the requested classes.
            # If valid_classes is None, all classes are kept.
            if self.valid_classes is not None:
                classes = [valid_class for valid_class in classes if valid_class in self.valid_classes]
            if not classes:
                raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")

            class_to_idx = {cls_name: i for i, cls_name in enumerate(classes)}
            return classes, class_to_idx
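To see what the find_classes override returns without installing torchvision, its logic can be sketched as a standalone function and run against a throwaway directory. This is only an illustration under assumptions: filtered_classes and demo are hypothetical helper names, and the three empty class folders are created just for the demonstration.

```python
import os
import tempfile
from typing import Dict, List, Tuple


def filtered_classes(directory: str, valid_classes: List[str]) -> Tuple[List[str], Dict[str, int]]:
    """Standalone mirror of the find_classes logic (hypothetical helper)."""
    classes = sorted(e.name for e in os.scandir(directory) if e.is_dir())
    classes = [c for c in classes if c in valid_classes]
    if not classes:
        raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")
    # Indices are assigned over the kept classes only.
    return classes, {name: i for i, name in enumerate(classes)}


def demo(valid_classes: List[str]) -> Tuple[List[str], Dict[str, int]]:
    # Build a temporary dataset root with three empty class folders.
    with tempfile.TemporaryDirectory() as root:
        for name in ("mouse", "cat", "dog"):
            os.mkdir(os.path.join(root, name))
        return filtered_classes(root, valid_classes)


print(demo(["cat", "dog"]))
# (['cat', 'dog'], {'cat': 0, 'dog': 1})
```

Because the index mapping is rebuilt from the kept classes, labels can differ from those of the unfiltered dataset: keeping only dog and mouse gives dog index 0 and mouse index 1, whereas the full dataset would assign them 1 and 2.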

As a usage example, suppose the dataset folder contains three classes, mouse, cat and dog, and you only want to read cat and dog.

The code is as follows:

    dataset = FilterableImageFolder('./data', valid_classes=['cat', 'dog'])

That is all for "How to use ImageFolder in Pytorch to ignore specific files when reading datasets". Thank you for reading!
