This article is about how PyTorch can automatically print tensor information for each line of code. It is very practical, so it is shared here for your reference; follow along to have a look.
It introduces TorchSnooper, a debugging utility for PyTorch code. The writer is the author of TorchSnooper and one of the developers of PyTorch.
GitHub project address: https://github.com/zasdfgbnm/TorchSnooper
You may have run into this kind of trouble: you run your own PyTorch code, and PyTorch complains that the data types do not match, saying it needs a double tensor but you gave it a float; or it needs a CUDA tensor but you gave it a CPU tensor.
For example:
RuntimeError: Expected object of scalar type Double but got scalar type Float
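If you want to reproduce this class of error, here is a minimal sketch (my own illustration, not from the original article); torch.mm requires its operands to have the same dtype:

import torch

x = torch.randn(3, 3, dtype=torch.float64)  # a "double" tensor
w = torch.randn(3, 3)                       # default dtype is float32 ("float")
torch.mm(x, w)  # raises a RuntimeError about mismatched dtypes (Double vs Float)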
This kind of problem is troublesome to debug because you don't know where it starts. For example, you might create a new CPU tensor with torch.zeros on the third line of your code, then run a number of operations on it, all on CPU, without any error; only on the tenth line, when it has to interact with the CUDA tensor you passed in as input, does the error appear. To debug such an error, you sometimes have to insert print statements by hand, line by line, which is very tedious.
Or, you may have a picture in your head of what a sequence of tensor operations should produce, but PyTorch reports a shape mismatch partway through, or reports no error at all yet gives a final shape that is not what you wanted. At that point you often don't know where things started to deviate from your expectations, and again you may have to insert many print statements to find out.
TorchSnooper is a tool designed to solve exactly this problem. Installing it is simple; you only need to run the standard Python package installation command:
pip install torchsnooper
After installation, you only need to decorate the function you want to debug with @torchsnooper.snoop(). When the function executes, it will automatically print the shape, data type, device, and gradient information of the tensors produced by each line.
Here are two examples to illustrate how to use it.
Example 1
For example, we wrote a very simple function:
def myfunc(mask, x):
    y = torch.zeros(6)
    y.masked_scatter_(mask, x)
    return y
This is how we use this function:
mask = torch.tensor([0, 1, 0, 1, 1, 0], device='cuda')
source = torch.tensor([1.0, 2.0, 3.0], device='cuda')
y = myfunc(mask, source)
The above code seems to be fine, but it actually runs with an error:
RuntimeError: Expected object of backend CPU but got backend CUDA for argument #2 'mask'
What is the problem? Let's snoop! Decorate the myfunc function with @torchsnooper.snoop():
import torch
import torchsnooper

@torchsnooper.snoop()
def myfunc(mask, x):
    y = torch.zeros(6)
    y.masked_scatter_(mask, x)
    return y

mask = torch.tensor([0, 1, 0, 1, 1, 0], device='cuda')
source = torch.tensor([1.0, 2.0, 3.0], device='cuda')
y = myfunc(mask, source)
Then we run our script and see output like this:
Starting var:.. mask = tensor<(6,), int64, cuda:0>
Starting var:.. x = tensor<(3,), float32, cuda:0>
21:41:42.941668 call         5 def myfunc(mask, x):
21:41:42.941834 line         6     y = torch.zeros(6)
New var:....... y = tensor<(6,), float32, cpu>
21:41:42.943443 line         7     y.masked_scatter_(mask, x)
21:41:42.944404 exception    7     y.masked_scatter_(mask, x)
Combined with the error message, we mainly look at the device of each variable in the output to find the earliest variable that is on CPU. We notice this line:
New var:....... y = tensor<(6,), float32, cpu>
This line tells us directly that we created a new variable y and assigned a CPU tensor to it. It corresponds to y = torch.zeros(6) in the code. So we realize that torch.zeros creates its tensor on CPU by default when no device is explicitly specified. We change this line to y = torch.zeros(6, device='cuda'), and the problem on this line is fixed.
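To make this default concrete, here is a minimal sketch (my own illustration, not part of the original example):

import torch

y_cpu = torch.zeros(6)                 # created on CPU by default
y_gpu = torch.zeros(6, device='cuda')  # explicitly on GPU (requires a CUDA device)
print(y_cpu.device)  # cpu
print(y_gpu.device)  # cuda:0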
Although this line is fixed, the problem as a whole is not solved yet; the modified code still reports an error, but now the error becomes:
RuntimeError: Expected object of scalar type Byte but got scalar type Long for argument #2 'mask'
This time the error lies in the data type. This message is more informative: we can tell that the data type of our mask is wrong. Looking at the TorchSnooper output again, we notice:
Starting var:.. mask = tensor<(6,), int64, cuda:0>
Sure enough, the type of our mask is int64, not the uint8 it should be. Let's modify the definition of mask:
mask = torch.tensor([0, 1, 0, 1, 1, 0], device='cuda', dtype=torch.uint8)
Then the code runs successfully.
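One side note beyond the original article: this example dates from when masks were uint8. In recent PyTorch releases, boolean masks are preferred, so a modern equivalent of the fix would be:

# In PyTorch >= 1.2, bool is the preferred mask dtype;
# uint8 masks are deprecated and emit a warning
mask = torch.tensor([0, 1, 0, 1, 1, 0], device='cuda', dtype=torch.bool)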
Example 2
This time we are going to build a simple linear model:
model = torch.nn.Linear(2, 1)
We want to fit the plane y = x1 + 2 * x2 + 3, so we create a dataset like this:
x = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = torch.tensor([3.0, 5.0, 4.0, 6.0])
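As a quick sanity check (a tiny sketch of my own, just restating the plane equation), each target value indeed follows from y = x1 + 2 * x2 + 3:

# (0,0) -> 3, (0,1) -> 5, (1,0) -> 4, (1,1) -> 6
for (x1, x2), target in zip([(0, 0), (0, 1), (1, 0), (1, 1)], [3.0, 5.0, 4.0, 6.0]):
    assert x1 + 2 * x2 + 3 == target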
We use the most common SGD optimizer for optimization, and the complete code is as follows:
import torch

model = torch.nn.Linear(2, 1)
x = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = torch.tensor([3.0, 5.0, 4.0, 6.0])
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(10):
    optimizer.zero_grad()
    pred = model(x)
    squared_diff = (y - pred) ** 2
    loss = squared_diff.mean()
    print(loss.item())
    loss.backward()
    optimizer.step()
However, while running it, we find that the loss drops to around 1.5 and then stops decreasing. This is abnormal: the data we built are noise-free and lie exactly on the plane to be fitted, so the loss should decrease to 0.
At first glance, we don't know what the problem is. With a let's-give-it-a-try attitude, let's snoop. In this example we don't have a custom function, but we can activate TorchSnooper with a with statement instead. Putting the training loop inside the with block, the code becomes:
import torch
import torchsnooper

model = torch.nn.Linear(2, 1)
x = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = torch.tensor([3.0, 5.0, 4.0, 6.0])
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

with torchsnooper.snoop():
    for _ in range(10):
        optimizer.zero_grad()
        pred = model(x)
        squared_diff = (y - pred) ** 2
        loss = squared_diff.mean()
        print(loss.item())
        loss.backward()
        optimizer.step()
When we run the program, we see a long list of output. Scanning through it bit by bit, we notice:
New var:....... model = Linear(in_features=2, out_features=1, bias=True)
New var:....... x = tensor<(4, 2), float32, cpu>
New var:....... y = tensor<(4,), float32, cpu>
New var:....... optimizer = SGD (Parameter Group 0 dampening: 0 lr: 0....omentum: 0 nesterov: False weight_decay: 0)
02:38:02.016826 line        12     for _ in range(10):
New var:....... _ = 0
02:38:02.017025 line        13         optimizer.zero_grad()
02:38:02.017185 line        14         pred = model(x)
New var:....... pred = tensor<(4, 1), float32, cpu, grad>
02:38:02.018100 line        15         squared_diff = (y - pred) ** 2
New var:....... squared_diff = tensor<(4, 4), float32, cpu, grad>
02:38:02.018397 line        16         loss = squared_diff.mean()
New var:....... loss = tensor<(), float32, cpu, grad>
02:38:02.018674 line        17         print(loss.item())
02:38:02.018852 line        18         loss.backward()
26.979290008544922
02:38:02.057349 line        19         optimizer.step()
If we look closely at the shape of each tensor, it is not hard to find that the shape of y is (4,), while the shape of pred is (4, 1). When they are subtracted, broadcasting kicks in, and the shape of squared_diff becomes (4, 4).
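To make the accidental broadcast concrete, here is a minimal standalone sketch (my own illustration, not the article's code):

import torch

y = torch.zeros(4)        # shape (4,)
pred = torch.zeros(4, 1)  # shape (4, 1)
diff = y - pred           # (4,) broadcasts against (4, 1) to give (4, 4)
print(diff.shape)         # torch.Size([4, 4]), not the elementwise (4,) we wanted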
This is of course not what we want. The fix is easy: change the definition of pred to pred = model(x).squeeze(). Now look at the TorchSnooper output of the modified code:
New var:....... model = Linear(in_features=2, out_features=1, bias=True)
New var:....... x = tensor<(4, 2), float32, cpu>
New var:....... y = tensor<(4,), float32, cpu>
New var:....... optimizer = SGD (Parameter Group 0 dampening: 0 lr: 0....omentum: 0 nesterov: False weight_decay: 0)
02:46:23.545042 line        12     for _ in range(10):
New var:....... _ = 0
02:46:23.545285 line        13         optimizer.zero_grad()
02:46:23.545421 line        14         pred = model(x).squeeze()
New var:....... pred = tensor<(4,), float32, cpu, grad>
02:46:23.546362 line        15         squared_diff = (y - pred) ** 2
New var:....... squared_diff = tensor<(4,), float32, cpu, grad>
02:46:23.546645 line        16         loss = squared_diff.mean()
New var:....... loss = tensor<(), float32, cpu, grad>
02:46:23.546939 line        17         print(loss.item())
02:46:23.547133 line        18         loss.backward()
02:46:23.591090 line        19         optimizer.step()
Now the result looks normal, and testing confirms that the loss can decrease to very close to zero. Mission accomplished.
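As an aside (not part of the original walkthrough): the built-in MSE helper torch.nn.functional.mse_loss would have flagged this bug up front, since it warns when the input and target shapes differ instead of silently broadcasting:

import torch
import torch.nn.functional as F

pred = torch.zeros(4, 1)  # model output before the .squeeze() fix
y = torch.zeros(4)        # targets
# Emits a UserWarning that the target size differs from the input size
# and will likely lead to incorrect results due to broadcasting
loss = F.mse_loss(pred, y)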
Thank you for reading! That is the end of this article on how PyTorch can automatically print the tensor information for each line of code. I hope it helps you learn something new; if you found it useful, feel free to share it with others.