2025-03-15 Update From: SLTechnology News&Howtos
In this article, the editor introduces in detail how transforms.ToTensor and transforms.Normalize work in pytorch. The content is laid out step by step, and I hope it can help resolve your doubts.
transforms.ToTensor
Recently, while studying pytorch, I came across the normalization of image data, as shown in the following figure:
How should we understand this code? Let's take it piece by piece, starting with transforms.ToTensor(). First look at the official definition, as shown in the following figure:
What it roughly means is that transforms.ToTensor() converts data in PIL or numpy format from the range [0, 255] to [0.0, 1.0] by dividing the raw data by 255. In addition, the shape of the original data is (H x W x C), and after transforms.ToTensor() it becomes (C x H x W). This part is not difficult, but a few examples will deepen the impression.
Import some packages first
import cv2
import numpy as np
import torch
from torchvision import transforms
Define an array to model a picture, and note that the array data type needs to be np.uint8 [as given in the official figure].
data = np.array([
    [[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]],
    [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]],
    [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]],
    [[4, 4, 4], [4, 4, 4], [4, 4, 4], [4, 4, 4], [4, 4, 4]],
    [[5, 5, 5], [5, 5, 5], [5, 5, 5], [5, 5, 5], [5, 5, 5]]], dtype='uint8')
You can take a look at data's shape, and notice that it is currently (H, W, C).
Use transforms.ToTensor () to convert data
data = transforms.ToTensor()(data)
At this point, let's take a look at the data and shape in data.
Obviously, the values are now mapped into [0, 1], and the shape of data has been transformed from (5, 5, 3) to (3, 5, 5).
Note: I don't know how you usually read 3D arrays; here is a method that works for me.
The shape of the original data is (5, 5, 3), which represents five (5, 3) two-dimensional arrays: removing the outermost [] leaves five blocks of five rows and three columns. Similarly, the shape of the transformed data is (3, 5, 5), which represents three (5, 5) two-dimensional arrays: removing the outermost [] leaves three blocks of five rows and five columns.
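To make the "remove the outermost []" reading concrete, a small numpy sketch (with arbitrary values, not the data above):

```python
import numpy as np

a = np.arange(75).reshape(5, 5, 3)   # five (5, 3) blocks
b = a.transpose(2, 0, 1)             # axes reordered to (C, H, W)
print(len(a), a[0].shape)            # 5 blocks, each (5, 3)
print(len(b), b[0].shape)            # 3 blocks, each (5, 5)
```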
transforms.Normalize
I believe that through the previous description we now have some understanding of transforms.ToTensor. Next, let's talk about transforms.Normalize. Similarly, let's first look at the official definition, as shown in the following figure:
You can see the output of this function: output[channel] = (input[channel] - mean[channel]) / std[channel]. Here [channel] means the operation is applied to each channel of the feature map separately [mean is the mean, std is the standard deviation]. Next let's look at the code in the first figure, that is, transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)).
The first parameter (0.5, 0.5, 0.5) indicates that the mean of each channel is 0.5, and the second parameter (0.5, 0.5, 0.5) indicates that the standard deviation of each channel is 0.5 [because an image usually has three channels, both vectors here are 1x3]. With these two parameters, when we pass in an image, it is transformed according to the formula above. [Note: "image" here is not precise, because this function cannot take a PIL Image directly; it should first be converted to Tensor format.]
After all that, what is this function actually for? We have already normalized the data to [0, 1] through the previous ToTensor, so what is the use of attaching a Normalize on top? In fact, what Normalize does here is transform the data to [-1, 1]. The previous data lies in [0, 1]: taking 0, output = (0 - 0.5) / 0.5 = -1; taking 1, output = (1 - 0.5) / 0.5 = 1. This unifies the data into [-1, 1]. So the question comes again: what is the benefit of unifying the data into [-1, 1]? If the data is distributed in (0, 1), the bias required, that is, the input b of the neural network, will be relatively large, while the model's b is typically initialized to 0, which leads to slow convergence of the neural network. After Normalize, the convergence speed of the model can be accelerated. [This is the explanation found most often online; I am not sure it is correct.]
At this point, do you think it's over? I would also like to talk about how the two parameters (0.5, 0.5, 0.5) and (0.5, 0.5, 0.5) above are obtained. They are the mean and standard deviation calculated from the data in the dataset, so the two values are often different for different datasets. Here is another example to help you understand the calculation process, using the same data as in the example above.
The above data has already been converted by ToTensor; now we need to compute the mean and std of each channel of the data.
# we need to enlarge the dimension of the data by adding a batch dimension;
# data in pytorch is generally (batch, C, H, W)
data = torch.unsqueeze(data, 0)
nb_samples = 0.
# create empty tensors for the three channels
channel_mean = torch.zeros(3)
channel_std = torch.zeros(3)
N, C, H, W = data.shape[:4]
# flatten the H and W of the data
data = data.view(N, C, -1)
# mean(2) averages over the flattened pixels; sum(0) accumulates over the batch
channel_mean += data.mean(2).sum(0)
# std(2) is the standard deviation over the flattened pixels; sum(0) accumulates over the batch
channel_std += data.std(2).sum(0)
# count all the batch data; here N is 1
nb_samples += N
# mean and standard deviation over the whole batch
channel_mean /= nb_samples
channel_std /= nb_samples
print(channel_mean, channel_std)
# result: tensor([0.0118, 0.0118, 0.0118]) tensor([0.0057, 0.0057, 0.0057])
Bring the above mean and std into the formula and calculate the output.
# data currently has shape (1, 3, 25), so index the channel dimension
for i in range(3):
    data[:, i] = (data[:, i] - channel_mean[i]) / channel_std[i]
print(data)
Output result:
From the results, we can see that the mean and std we calculated are not 0.5, and the final result is not within [-1, 1] either.
After reading this, the introduction to how transforms.ToTensor and transforms.Normalize work in pytorch is finished. To really master these points, you still need to practice with them yourself. If you want to read more related articles, welcome to follow the industry information channel.