This article explains why model.eval() is added when testing in PyTorch. It is quite detailed and has reference value; readers who are interested should read on!
Do I need to use model.eval() when I test?
Yes. Dropout works as a regularizer that prevents overfitting during training: on each forward call, the Dropout layer randomly zeroes elements of its input.
It should be disabled during testing, since at that point you want the full model with no elements masked.
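As a minimal sketch of that behavior (the layer size and input are chosen purely for illustration), the same Dropout layer acts differently depending on the mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)   # in PyTorch, p is the probability of zeroing an element
x = torch.ones(1, 8)

drop.train()               # training mode: elements are randomly zeroed
print(drop(x))             # roughly half the entries are 0; survivors are scaled by 1/(1-p)

drop.eval()                # evaluation mode: Dropout becomes a no-op
print(drop(x))             # identical to the input
```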
When training and testing with PyTorch, be sure to switch the instantiated model between train() and eval(). In eval() mode, the framework automatically fixes BatchNorm and Dropout: instead of computing statistics from the current batch, it uses the values learned during training. Otherwise, if the test batch_size is too small, the BatchNorm layers can easily distort the output (for image models, this shows up as color distortion in the generated pictures).
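A small sketch of why this matters, using synthetic batches (the statistics below are made up for illustration): in train() mode BatchNorm keeps updating its running statistics, and in eval() mode it reuses them, so even a batch of one gives stable output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)

bn.train()                               # training mode: normalize with batch stats, update running stats
for _ in range(200):
    bn(torch.randn(32, 4) * 3 + 5)       # synthetic batches with mean ~5, std ~3
print(bn.running_mean)                   # has drifted toward ~5
print(bn.running_var)                    # has drifted toward ~9

bn.eval()                                # eval mode: use the stored running statistics
out = bn(torch.randn(1, 4) * 3 + 5)      # even batch_size=1 now gives stable output
```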
Addendum: the difference between model.eval() and with torch.no_grad() in PyTorch
When you run validation in PyTorch, you use model.eval() to switch to test mode. This is mainly used to notify the Dropout and BatchNorm layers to switch between train and eval behavior:
In train mode, the Dropout layer randomly drops activation units according to the configured parameter p (in PyTorch, p is the drop probability, so the retention probability is 1 - p), and the BatchNorm layer keeps computing and updating its running statistics such as mean and var.
In eval mode, the Dropout layer lets all activation units pass through unchanged, while the BatchNorm layer stops computing and updating mean and var and directly uses the mean and var values learned during the training phase.
This mode does not affect the gradient computation of any layer: gradients are computed and stored exactly as in training mode; there is simply no backpropagation step.
with torch.no_grad(), on the other hand, is mainly used to disable the autograd engine in order to speed things up and save GPU memory. Concretely, it stops gradient computation, saving GPU compute and memory, but it does not affect the behavior of the Dropout and BatchNorm layers.
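A quick sketch of that distinction (the layer and shapes are arbitrary): model.eval() alone still builds the autograd graph, while torch.no_grad() suppresses it.

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)
x = torch.randn(2, 3)

model.eval()
y = model(x)
print(y.requires_grad)      # True: eval() alone still builds the autograd graph

with torch.no_grad():
    y = model(x)
print(y.requires_grad)      # False: no graph is built, saving memory and compute
```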
Usage scenarios
If you do not care about GPU memory or compute time, using model.eval() alone is enough to get correct validation results. Adding with torch.no_grad() further accelerates the run and saves GPU memory (because gradients are neither computed nor stored), so you can compute faster and test with a larger batch size.
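Putting the two together, a hypothetical validation loop might look like the sketch below (the names evaluate and loader, and the classification setup, are assumptions for illustration, not code from the original article):

```python
import torch

def evaluate(model, loader, device="cpu"):
    model.eval()                       # fix Dropout / BatchNorm to inference behavior
    correct = total = 0
    with torch.no_grad():              # skip graph construction entirely
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            preds = model(inputs).argmax(dim=1)
            correct += (preds == targets).sum().item()
            total += targets.size(0)
    return correct / total             # accuracy over the whole loader
```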
Addendum: a personal understanding of PyTorch's model.train(), model.eval(), and with torch.no_grad()
1. Recently I ran into a few questions while learning PyTorch. I did not understand the difference between model.eval() and model.train() in the training and test functions; after consulting some references, I arrived at the following.
In general, our training process is as follows (see the sketch after the list):
1. After getting the data, during the training process, call model.train(): this tells our network that this phase is for training and that parameters may be updated.
2. After training is complete, make predictions. During the prediction process, call model.eval(): this tells our network that this phase is for testing, so the model's parameters are not updated at this stage.
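A hypothetical skeleton of these two phases (the function run, the loaders, and the loss/optimizer choices are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.optim as optim

def run(model, train_loader, test_loader, epochs=10):
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    for _ in range(epochs):
        model.train()                  # phase 1: training; parameters get updated
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()

    model.eval()                       # phase 2: prediction; no optimizer.step(),
    for x, y in test_loader:           # so the parameters stay fixed
        out = model(x)
```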
2. But why use with torch.no_grad() during the eval() phase?
Consulting related references:
with torch.no_grad() disables the tracking of gradients in autograd.
model.eval() changes the forward() behaviour of the module it is called on: e.g., it disables dropout and makes batch norm use its population statistics.
To sum up: in the eval phase, even though parameters are not updated, model.eval() switches the Dropout and BatchNorm layers out of their training behavior so the model predicts directly, while using no_grad additionally turns off autograd's gradient tracking (which is on by default during training). This ensures that the forward pass is a pure test that cannot change the parameters.
In addition, the reference document says that this avoids having to set the gradient flag on every parameter individually, and frees up the low-level GPU overhead, by uniformly disabling gradients in the test phase.
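As a rough sketch of that point (the model and shapes are arbitrary): freezing could be done parameter by parameter, but torch.no_grad() covers everything inside the block at once. Note the two are not perfectly equivalent; no_grad() also skips graph construction for intermediate results.

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)

# Manual alternative: disable gradient tracking parameter by parameter.
for p in model.parameters():
    p.requires_grad_(False)

# with torch.no_grad() has a similar effect for everything inside the block,
# without having to touch each parameter's flag:
with torch.no_grad():
    out = model(torch.randn(2, 3))
```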