2025-03-10 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains how to read the MNIST dataset in Python. The material is short and easy to follow, so let's work through it step by step.
Introduction to dataset format
This material is widely covered online, so here is only a brief introduction. The MNIST dataset as downloaded contains four files:
t10k-images-idx3-ubyte.gz
t10k-labels-idx1-ubyte.gz
train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz
The first two are the images and labels of the test set, which contains 10000 samples. The last two belong to the training set, which contains 60000 samples. The .gz extension indicates a gzip-compressed file; decompressing it yields a binary file in .ubyte format.
The label and image files share the same storage layout. Both start with a magic number followed by the number of items; the second field is the useful one, because it gives the number of samples stored in the file. The image file then stores the number of rows and the number of columns. Note the width of each field: the header values are 32-bit integers stored high-order byte first, while the labels and pixel values that follow are 8-bit unsigned integers.
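The header layout can be tried out on a few hand-built bytes before touching the real files. This is only a sketch: the headers below are constructed in memory rather than read from disk, and the magic values 2049 (labels) and 2051 (images) are the ones the standard MNIST files carry.

```python
import struct

# Build headers by hand; '>' means big-endian, 'I' a 32-bit unsigned integer
label_header = struct.pack('>II', 2049, 60000)            # magic, item count
image_header = struct.pack('>IIII', 2051, 60000, 28, 28)  # magic, count, rows, cols

# Unpack them the same way a reader would
magic, n = struct.unpack('>II', label_header)
img_magic, num, rows, cols = struct.unpack('>IIII', image_header)

print(len(label_header), magic, n)                    # 8 2049 60000
print(len(image_header), img_magic, num, rows, cols)  # 16 2051 60000 28 28
```

The label header is 8 bytes and the image header 16 bytes, which is why a reader consumes exactly that many bytes before the 8-bit data begins.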
Read method
Reading files in .gz format
Reading the compressed files requires importing gzip.
The code to read the training set is as follows:
import gzip
import os
import struct

import numpy as np

def load_mnist_train(path, kind='train'):
    '''
    path: path to the dataset
    kind: 'train' selects the training set
    '''
    labels_path = os.path.join(path, '%s-labels-idx1-ubyte.gz' % kind)
    images_path = os.path.join(path, '%s-images-idx3-ubyte.gz' % kind)
    # Open the compressed file with gzip
    with gzip.open(labels_path, 'rb') as lbpath:
        # struct.unpack reads the first two fields: '>' means the high-order
        # byte comes first, 'I' means a 32-bit unsigned integer.
        # lbpath.read(8) reads 8 bytes, so the two values are the
        # magic number and the number of samples.
        magic, n = struct.unpack('>II', lbpath.read(8))
        # lbpath.read() reads all remaining data; np.frombuffer parses it.
        # (np.fromstring, used in older code, is deprecated for binary data.)
        labels = np.frombuffer(lbpath.read(), dtype=np.uint8)
    with gzip.open(images_path, 'rb') as imgpath:
        magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
        images = np.frombuffer(imgpath.read(), dtype=np.uint8).reshape(len(labels), 784)
    return images, labels
The code to read the test set is similar.
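As a rough illustration of what "similar" means here, the sketch below reads files with the t10k prefix, which is what the test-set files use. The function name load_mnist_gz is hypothetical, and the two tiny files it reads are synthetic stand-ins written to a temporary directory (3 blank images), not the real dataset.

```python
import gzip
import os
import struct
import tempfile

import numpy as np

def load_mnist_gz(path, kind='t10k'):
    """Hypothetical loader: kind is 'train' or 't10k' (the test set)."""
    labels_path = os.path.join(path, '%s-labels-idx1-ubyte.gz' % kind)
    images_path = os.path.join(path, '%s-images-idx3-ubyte.gz' % kind)
    with gzip.open(labels_path, 'rb') as lbpath:
        magic, n = struct.unpack('>II', lbpath.read(8))
        labels = np.frombuffer(lbpath.read(), dtype=np.uint8)
    with gzip.open(images_path, 'rb') as imgpath:
        magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
        # Flatten each image to one row of rows * cols pixels
        images = np.frombuffer(imgpath.read(), dtype=np.uint8).reshape(num, rows * cols)
    return images, labels

# Synthetic stand-in for the real test files: 3 labels, 3 blank 28x28 images
tmpdir = tempfile.mkdtemp()
with gzip.open(os.path.join(tmpdir, 't10k-labels-idx1-ubyte.gz'), 'wb') as f:
    f.write(struct.pack('>II', 2049, 3) + bytes([7, 2, 1]))
with gzip.open(os.path.join(tmpdir, 't10k-images-idx3-ubyte.gz'), 'wb') as f:
    f.write(struct.pack('>IIII', 2051, 3, 28, 28) + bytes(3 * 28 * 28))

images, labels = load_mnist_gz(tmpdir, kind='t10k')
print(images.shape, labels.tolist())  # (3, 784) [7, 2, 1]
```

With the real files in place, the same call would be expected to return 10000 samples for kind='t10k' and 60000 for kind='train'.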
Reading of uncompressed files
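If you extract the four files locally, you get files in .ubyte format, and the reading code changes slightly: the files are opened with the built-in open instead of gzip.open, and the data is read with np.fromfile.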
import os
import struct

import numpy as np

def load_mnist_train(path, kind='train'):
    '''
    path: path to the dataset
    kind: 'train' selects the training set
    '''
    labels_path = os.path.join(path, '%s-labels-idx1-ubyte' % kind)
    images_path = os.path.join(path, '%s-images-idx3-ubyte' % kind)
    # gzip is no longer needed; use the built-in open
    with open(labels_path, 'rb') as lbpath:
        # struct.unpack reads the first two fields: '>' means the high-order
        # byte comes first, 'I' means a 32-bit unsigned integer.
        # lbpath.read(8) reads 8 bytes, so the two values are the
        # magic number and the number of samples.
        magic, n = struct.unpack('>II', lbpath.read(8))
        # np.fromfile reads the remaining data from the open file
        labels = np.fromfile(lbpath, dtype=np.uint8)
    # The image file is uncompressed too, so open it the same way
    with open(images_path, 'rb') as imgpath:
        magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
        images = np.fromfile(imgpath, dtype=np.uint8).reshape(len(labels), 784)
    return images, labels
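One detail worth seeing in isolation is that np.fromfile, given an open file object, continues from the current position, so the header bytes consumed by read are not parsed again. The sketch below uses a small synthetic label file written to a temporary directory (4 labels), not the real dataset.

```python
import os
import struct
import tempfile

import numpy as np

# Write a tiny uncompressed label file: 8-byte header plus 4 label bytes
path = os.path.join(tempfile.mkdtemp(), 'train-labels-idx1-ubyte')
with open(path, 'wb') as f:
    f.write(struct.pack('>II', 2049, 4) + bytes([5, 0, 4, 1]))

with open(path, 'rb') as lbpath:
    # Consume the header first
    magic, n = struct.unpack('>II', lbpath.read(8))
    # np.fromfile picks up after the header and reads the rest as uint8
    labels = np.fromfile(lbpath, dtype=np.uint8)

print(magic, n, labels.tolist())  # 2049 4 [5, 0, 4, 1]
```

np.fromfile needs a real file on disk, which is why it suits the extracted files, while the gzip version reads the bytes into memory and parses them from there.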
After reading, you can check the length of images and labels to confirm that the read is correct.
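A minimal sketch of such a check is shown below. The helper name check_mnist is hypothetical, and the arrays are small placeholders standing in for loader output; with the real data the expected count would be 60000 for the training set and 10000 for the test set.

```python
import numpy as np

def check_mnist(images, labels, expected):
    """Hypothetical helper: True if both arrays hold `expected` samples."""
    return bool(
        images.shape == (expected, 784)   # one flattened 28x28 image per row
        and labels.shape == (expected,)   # one label per image
        and images.dtype == np.uint8
        and labels.max() <= 9             # digit classes are 0 to 9
    )

# Placeholder arrays in place of real loader output
images = np.zeros((5, 784), dtype=np.uint8)
labels = np.array([0, 1, 2, 3, 4], dtype=np.uint8)

print(check_mnist(images, labels, expected=5))      # True
print(check_mnist(images, labels[:4], expected=5))  # False
```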
Thank you for reading. That covers how to read the MNIST dataset in Python. You should now have a clearer picture of the method, though how well it fits your own setup still needs to be verified in practice.