Using ReduceLROnPlateau in PyTorch to update the learning rate

2025-04-23 Update. From: SLTechnology News&Howtos (Shulou.com)

This article introduces how to use ReduceLROnPlateau in PyTorch to update the learning rate. It covers what the scheduler's parameters mean, why choosing them is difficult, and an example based on CIFAR-10 image classification.

https://www.emperinter.info/2020/08/05/change-leaning-rate-by-reducelronplateau-in-pytorch/

Reason

> I wrote about updating the learning rate in PyTorch before. Dynamically updating the learning rate according to the number of times the loss rises or falls struck me as a fun idea. I kept getting the settings wrong for a long time, and today I finally got it working!

Analytical description

torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, verbose=False, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)

When the loss stops decreasing or the accuracy stops increasing, the scheduler reduces the learning rate. The parameters have the following meanings:

| Parameter | Meaning |
|---|---|
| mode | 'min' mode detects whether the metric has stopped decreasing; 'max' mode detects whether the metric has stopped increasing |
| factor | When the condition is triggered, lr *= factor |
| patience | Number of checks in a row without a significant decrease (or increase) that are tolerated; the reduction is triggered once this number is exceeded |
| verbose | Print a message when the condition is triggered |
| threshold | Only changes that exceed the threshold count as significant |
| threshold_mode | Two ways of computing the threshold, rel and abs. rel rule: in max mode a value above best * (1 + threshold) is significant, in min mode a value below best * (1 - threshold) is significant. abs rule: in max mode a value above best + threshold is significant, in min mode a value below best - threshold is significant |
| cooldown | After the condition is triggered, wait this many epochs before monitoring again, so that lr does not fall too fast |
| min_lr | The smallest lr allowed |
| eps | If the difference between the new and old lr is smaller than eps (1e-8 by default), the update is ignored |
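To make the parameters above more concrete, here is a minimal sketch of the scheduler reacting to a plateau. The dummy parameter, the hand-written `fake_losses` list, and the values `factor=0.5` and `patience=2` are illustrative assumptions, not settings from the article. `verbose` is left out because it only controls printing.

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

# A single dummy parameter so the optimizer has something to manage (assumption).
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)

scheduler = ReduceLROnPlateau(
    optimizer,
    mode='min',            # watch for a metric that stops decreasing
    factor=0.5,            # lr *= 0.5 when triggered
    patience=2,            # tolerate 2 checks without improvement
    threshold=1e-4,
    threshold_mode='rel',
    cooldown=0,
    min_lr=0.0,
)

# Hand-written "validation losses": two improvements, then a plateau.
fake_losses = [1.0, 0.8, 0.8, 0.8, 0.8, 0.8]
lr_history = []
for epoch, loss in enumerate(fake_losses):
    # In real training this call follows the optimizer updates of the epoch.
    scheduler.step(loss)
    lr_history.append(optimizer.param_groups[0]['lr'])
    print(epoch, loss, lr_history[-1])
```

With `patience=2`, the learning rate stays unchanged for the first two non-improving checks and is multiplied by `factor` on the third one.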

For example, in the figure (plotted by the code below), the y-axis is lr and the x-axis is the number of adjustments x. If the initial learning rate is 0.0009575 and the factor is 0.35, the learning rate after x adjustments is: lr = 0.0009575 * (0.35) ^ x

    import math
    import matplotlib.pyplot as plt
    # %matplotlib inline

    x = 0
    o = []  # x-axis data: number of adjustments
    p = []  # y-axis data: learning rate
    o.append(0)
    p.append(0.0009575)
    while x < 8:
        x += 1
        y = 0.0009575 * math.pow(0.35, x)
        o.append(x)
        p.append(y)
        print('%d: %.50f' % (x, y))

    plt.plot(o, p, c='red', label='test')  # x- and y-axis data; c: color; label
    plt.legend(loc='best')  # show the label; loc is its position ('best' lets matplotlib choose)
    plt.show()

Difficulties

> I feel that the most difficult part is choosing these parameters. The first is the initial learning rate. The MNIST example I have worked with and the image classification example below both seem to use 0.001. While training and tuning here I set it to 0.0009575; I forgot to change this value back in the last experiment, but the result turned out to be good (the first time the code ran, the loss came close to 0.001). It is also hard to estimate the multiplication factor and to decide after how many checks without a decrease (or increase) the learning rate should change. My own approach is to train first with the default constant learning rate of 0.001 (together with **TensorBoard**) and see where the problem begins, which gives the number of checks. For the factor, my personal feeling is to use the code above and pick a value that gives a smooth curve with small changes. I recommend backing up the model first so that you do not waste too much time on this kind of test!
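One way to try out candidate values before spending training time is to replay a recorded loss curve through a simplified copy of the plateau rule. The sketch below is a pure-Python approximation for intuition only (min mode with the rel threshold rule); it is not PyTorch's implementation, and `simulate_plateau_schedule` and the `recorded` losses are hypothetical names and data.

```python
def simulate_plateau_schedule(losses, initial_lr, factor, patience,
                              threshold=1e-4, cooldown=0, min_lr=0.0, eps=1e-8):
    """Return the lr in effect after each recorded loss (min mode, rel threshold)."""
    lr = initial_lr
    best = float('inf')
    bad_steps = 0       # checks in a row without a significant decrease
    cooldown_left = 0
    lrs = []
    for loss in losses:
        if loss < best * (1.0 - threshold):   # significant improvement
            best = loss
            bad_steps = 0
        else:
            bad_steps += 1
        if cooldown_left > 0:                 # ignore bad checks during cooldown
            cooldown_left -= 1
            bad_steps = 0
        if bad_steps > patience:              # trigger: lr *= factor
            new_lr = max(lr * factor, min_lr)
            if lr - new_lr > eps:             # skip negligible updates
                lr = new_lr
            cooldown_left = cooldown
            bad_steps = 0
        lrs.append(lr)
    return lrs


# Hypothetical recorded losses: three improvements followed by a plateau.
recorded = [1.0, 0.6, 0.4] + [0.4] * 10

lrs_p2 = simulate_plateau_schedule(recorded, 0.0009575, factor=0.35, patience=2)
lrs_p5 = simulate_plateau_schedule(recorded, 0.0009575, factor=0.35, patience=5)
print('patience=2 final lr:', lrs_p2[-1])
print('patience=5 final lr:', lrs_p5[-1])
```

Replaying a loss curve logged to TensorBoard this way shows how often a given patience would have fired, which can narrow the choices before a real run.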

Examples

In this example the initial learning rate is 0.0009575 and the factor is 0.35. The condition for x to change is: whenever the loss has failed to decrease 125 times in total, x increases by 1. After the first lr change (from 0.0009575 to 0.00011729; note that 0.00011729 equals 0.0009575 * 0.35 ^ 2), the loss slowly approaches 0.001 (as shown in the first picture), and the accuracy is 69%.

    import torch
    import torchvision
    import torchvision.transforms as transforms
    import matplotlib.pyplot as plt
    import numpy as np
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from datetime import datetime
    from torch.utils.tensorboard import SummaryWriter
    from torch.optim import *

    # Path for saving the model
    PATH = './cifar_net_tensorboard_net_width_200_and_chang_lr_by_decrease_0_35^x.pth'

    transform = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=0)
    testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
    testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=0)

    classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    # Assuming that we are on a CUDA machine, this should print a CUDA device:
    print(device)

    print("get some random training data")
    # get some random training images
    dataiter = iter(trainloader)
    images, labels = next(dataiter)  # next(...) instead of .next(), which newer PyTorch versions lack

    # function to show an image
    def imshow(img):
        img = img / 2 + 0.5  # unnormalize
        npimg = img.numpy()
        plt.imshow(np.transpose(npimg, (1, 2, 0)))
        plt.show()

    # show images
    imshow(torchvision.utils.make_grid(images))
    # print labels
    print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
    print("**")

    # set up TensorBoard
    # helper function to show an image
    # (used in the `plot_classes_preds` function below)
    def matplotlib_imshow(img, one_channel=False):
        if one_channel:
            img = img.mean(dim=0)
        img = img / 2 + 0.5  # unnormalize
        npimg = img.cpu().numpy()
        if one_channel:
            plt.imshow(npimg, cmap="Greys")
        else:
            plt.imshow(np.transpose(npimg, (1, 2, 0)))

    # default `log_dir` is "runs" - we'll be more specific here
    writer = SummaryWriter('runs/train')

    # get some random training images
    dataiter = iter(trainloader)
    images, labels = next(dataiter)

    # create grid of images
    img_grid = torchvision.utils.make_grid(images)

    # show images
    # matplotlib_imshow(img_grid, one_channel=True)
    imshow(img_grid)

    # write to tensorboard
    # writer.add_image('imag_classify', img_grid)

    # Tracking model training with TensorBoard
    # helper functions
    def images_to_probs(net, images):
        '''
        Generates predictions and corresponding probabilities from a trained
        network and a list of images
        '''
        output = net(images)
        # convert output probabilities to predicted class
        _, preds_tensor = torch.max(output, 1)
        # preds = np.squeeze(preds_tensor.numpy())
        preds = np.squeeze(preds_tensor.cpu().numpy())
        return preds, [F.softmax(el, dim=0)[i].item() for i, el in zip(preds, output)]

    def plot_classes_preds(net, images, labels):
        preds, probs = images_to_probs(net, images)
        # plot the images in the batch, along with predicted and true labels
        fig = plt.figure(figsize=(12, 48))
        for idx in np.arange(4):
            ax = fig.add_subplot(1, 4, idx + 1, xticks=[], yticks=[])
            matplotlib_imshow(images[idx], one_channel=True)
            ax.set_title("{0}, {1:.1f}%\n(label: {2})".format(
                classes[preds[idx]],
                probs[idx] * 100.0,
                classes[labels[idx]]),
                color=("green" if preds[idx] == labels[idx].item() else "red"))
        return fig

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.conv1 = nn.Conv2d(3, 200, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(200, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 16 * 5 * 5)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x

    net = Net()

    # visualize the net structure
    writer.add_graph(net, images)
    net.to(device)

For the full code, please see: https://www.emperinter.info/2020/08/05/change-leaning-rate-by-reducelronplateau-in-pytorch/
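The listing above ends before the training loop, so the scheduler itself does not appear in it. As a rough sketch only (not the author's code), the values from the example description (initial lr 0.0009575, factor 0.35, 125 checks without a decrease) might be wired in as follows. The tiny linear model, the random data, the SGD optimizer with momentum, and calling `scheduler.step` once per training step are all assumptions, chosen so that the sketch runs in seconds.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau

torch.manual_seed(0)

# Stand-ins (assumptions): a tiny linear classifier and random data instead of
# the article's Net and the CIFAR-10 loaders.
model = nn.Linear(8, 10)
inputs = torch.randn(64, 8)
targets = torch.randint(0, 10, (64,))

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.0009575, momentum=0.9)
scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.35, patience=125)

lr_log = []
for step in range(600):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    # Feed the monitored metric to the scheduler after the optimizer update.
    scheduler.step(loss.item())
    lr_log.append(optimizer.param_groups[0]['lr'])

print('final lr:', lr_log[-1])
```

Logging `optimizer.param_groups[0]['lr']` next to the loss, for instance with `writer.add_scalar`, makes it easy to see in TensorBoard when each reduction happened.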
