
How to use the .backward() method in PyTorch


This article mainly explains how to use the .backward() method in PyTorch. The explanation is simple and clear, and easy to learn and understand. Please follow along to study how to use the .backward() method in PyTorch.

One of the main functions and features of PyTorch is the backward() function. Let's start with some basic derivatives we already know:

Let F = a * b, where

a = 10
b = 20

∂F/∂a = b  =>  ∂F/∂a = 20
∂F/∂b = a  =>  ∂F/∂b = 10

Let's implement this in PyTorch:
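The original code listing is not reproduced on this page; a minimal sketch of what it likely looked like, using the values from the example above:

import torch

# leaf tensors that require gradients
a = torch.tensor(10.0, requires_grad=True)
b = torch.tensor(20.0, requires_grad=True)

F = a * b        # F is a scalar
F.backward()     # no gradient argument needed for a scalar output

print(a.grad)    # tensor(20.)  -> dF/da = b
print(b.grad)    # tensor(10.)  -> dF/db = a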

If a and b are vectors, the following code gives an error:
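A minimal sketch of the failing call, assuming a and b are simply made two-element vectors:

import torch

a = torch.tensor([10.0, 10.0], requires_grad=True)
b = torch.tensor([20.0, 20.0], requires_grad=True)

F = a * b        # F now has two elements, so it is not a scalar
F.backward()     # raises the RuntimeError below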

RuntimeError: grad can be implicitly created only for scalar outputs

The documentation says that when we call backward() on a tensor, if the tensor is non-scalar (that is, its data has more than one element) and requires a gradient, then the function also needs to be given a specific gradient argument.

Here F is a non-scalar tensor, so we need to pass a gradient argument with the same shape as F to the backpropagation function.
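A sketch of how the gradient argument could be passed (the exact listing from the original is not shown here):

import torch

a = torch.tensor([10.0, 10.0], requires_grad=True)
b = torch.tensor([20.0, 20.0], requires_grad=True)

F = a * b
# the gradient argument must have the same shape as F
F.backward(gradient=torch.tensor([1.0, 1.0]))

print(a.grad)    # tensor([20., 20.])
print(b.grad)    # tensor([10., 10.])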

In the code example above, the gradient argument is passed to the backward() function and the required gradients of a and b are obtained. But why do we have to pass a gradient argument to the backward() function?

To understand this, we need to understand how the .backward() function works. Quoting the documentation again:

torch.autograd is an engine for computing vector-Jacobian products. That is, given any vector v, it computes the product J @ v.T (note: @ denotes matrix multiplication).

Generally speaking, the Jacobian matrix is the matrix of all partial derivatives. If we consider a function y with an n-dimensional input vector x and an m-dimensional output, then the Jacobian matrix J contains all the partial derivatives:
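The matrix itself appears as an image in the original article; written out (in LaTeX notation), it is the m x n matrix of partial derivatives of the outputs y with respect to the inputs x:

J = \begin{bmatrix}
      \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\
      \vdots & \ddots & \vdots \\
      \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n}
    \end{bmatrix}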

v is the external gradient provided to the backward() function. Another important thing to note is that, by default, F.backward() is the same as F.backward(gradient=torch.tensor([1.])), so when the output tensor is a scalar we do not need to pass the gradient argument, as in the first example.

When the output tensor is a scalar, v has size 1, i.e. torch.tensor([1.]), which can be replaced by the value 1. In that case we get the complete Jacobian matrix, that is, J @ v.T = J.

However, when the output tensor is non-scalar, we need to pass the external gradient vector v, and the resulting gradient is the Jacobian-vector product, namely J @ v.T.

Here, for F = a * b with a = [10.0, 10.0], b = [20.0, 20.0], and v = [1., 1.], we get ∂F/∂a:
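Continuing the sketch above with these values, the printed result would be:

print(a.grad)    # tensor([20., 20.])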

So far, we have the gradients of the leaf tensors a and b.

We introduce a new variable G, which depends on F
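The original listing is missing here; a minimal sketch, assuming G is simply the sum of the elements of F (the exact definition of G in the source is not shown):

import torch

a = torch.tensor([10.0, 10.0], requires_grad=True)
b = torch.tensor([20.0, 20.0], requires_grad=True)

F = a * b        # intermediate (non-leaf) tensor
G = F.sum()      # hypothetical choice: G depends on F and is a scalar
G.backward()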

So far so good, but let's check the grad value of F, which is F.grad.
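Continuing the sketch:

print(F.grad)    # None, and the warning below is emitted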

We get None, and a warning is displayed:

The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor.

During forward propagation, the computation graph is generated automatically and dynamically. For the code example above, the dynamic graph links the leaf tensors a and b to F, and F to G.

From this computation graph, we find that the tensors a and b are leaf nodes. We can verify this with is_leaf:
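Continuing the sketch:

print(a.is_leaf)    # True
print(b.is_leaf)    # True
print(F.is_leaf)    # False
print(G.is_leaf)    # False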

By default, backward() only accumulates gradients for leaf-node tensors. Therefore F.grad has no value, because the tensor F is not a leaf-node tensor. To accumulate the gradient of a non-leaf node, we can use the retain_grad() method as follows:
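A sketch of retain_grad(), reusing the hypothetical G = F.sum() from above:

import torch

a = torch.tensor([10.0, 10.0], requires_grad=True)
b = torch.tensor([20.0, 20.0], requires_grad=True)

F = a * b
F.retain_grad()     # ask autograd to keep the gradient of this non-leaf tensor
G = F.sum()
G.backward()

print(F.grad)       # tensor([1., 1.])  -> dG/dF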

In general, our loss tensor is a scalar value and our weight parameters are leaf nodes of the computation graph, so we will not run into the error discussed above. But knowing about these special cases helps us understand PyTorch's features in more depth, in case you need them one day.

Thank you for reading. The above is the content of "how to use the .backward() method in PyTorch". After studying this article, I believe you have a deeper understanding of how to use the .backward() method in PyTorch; specific usage still needs to be verified in practice. The editor will push more related articles for you; welcome to follow!
