2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
Many people who are new to training deep learning models for artificial intelligence in Python do not know what to do when the results are poor. This article summarizes the common causes of the problem and their solutions; I hope it helps you solve it.
First, if performance on the training set is poor
1. Try a new activation function
ReLU: Rectified Linear Unit
The function is shown in the following figure: when z < 0, a = 0; when z > 0, a = z, which means that for positive inputs the activation function is a linear pass-through of the input. Because the output is 0 for negative inputs, some neurons contribute nothing after the computation and are effectively removed, making the neural network thinner.
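There are other variants of this function, as shown in the following figure. They differ mainly in how they handle z less than 0, for example by giving negative inputs a small non-zero slope instead of 0 (Leaky ReLU).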
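As a minimal sketch of the two shapes described above, the functions below implement ReLU and one common variant, Leaky ReLU, for a single scalar input. The function names and the slope of 0.01 are illustrative assumptions, not something the article specifies.

```python
def relu(z):
    """ReLU: a = z when z > 0, otherwise a = 0."""
    return z if z > 0 else 0.0


def leaky_relu(z, slope=0.01):
    """Leaky ReLU: same as ReLU for z > 0, small non-zero slope for z < 0.

    The default slope of 0.01 is an assumed, illustrative value.
    """
    return z if z > 0 else slope * z


# Compare the two functions on a negative, a zero and a positive input.
for z in (-2.0, 0.0, 3.0):
    print(z, relu(z), leaky_relu(z))
```

The only difference is the branch for negative inputs: ReLU discards them, while the leaky variant keeps a small signal so the neuron is not removed entirely.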
Maxout: the general form of the above functions
Put simply, Maxout outputs whichever of its inputs is largest, which lets the network learn its activation function by itself. With different parameters, it reproduces each of the functions described above. As shown in the following figure, the first unit applied to the input gives the blue line. If the parameters of the second unit are both 0, it gives a line along the x-axis, and the larger of the two is ReLU. When the parameters of the second unit are not 0, other shapes result.
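The idea can be sketched for a scalar input as the maximum over a few linear pieces w * x + b. The function name `maxout` and the specific parameter values are assumptions chosen for illustration.

```python
def maxout(x, pieces):
    """Return the largest of several linear pieces w * x + b.

    `pieces` is a list of (w, b) pairs, one pair per unit in the group.
    """
    return max(w * x + b for w, b in pieces)


# First unit is the line z = x; second unit has both parameters set to 0,
# so it is the line along the x-axis. The maximum of the two is ReLU.
relu_like = [(1.0, 0.0), (0.0, 0.0)]

# With non-zero parameters in the second unit the shape changes;
# these values happen to give the absolute value function.
abs_like = [(1.0, 0.0), (-1.0, 0.0)]

for x in (-3.0, 2.0):
    print(x, maxout(x, relu_like), maxout(x, abs_like))
```

In a real network the (w, b) pairs are trained like any other weights, which is why the shape of the activation is learned rather than fixed.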
2. Adaptive learning rate
① Adagrad
When computing the update, Adagrad divides the learning rate by the square root of the sum of the squares of all the gradients computed so far.
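A minimal sketch of this update for a single scalar weight follows. The toy loss f(w) = w^2, the function name `adagrad`, and the values of `lr` and `eps` are assumptions made for illustration only.

```python
import math


def adagrad(grad_fn, w, lr=1.0, steps=100, eps=1e-8):
    """Minimize a scalar function with Adagrad, given its gradient."""
    sum_sq = 0.0  # running sum of all squared gradients seen so far
    for _ in range(steps):
        g = grad_fn(w)
        sum_sq += g * g
        # The effective learning rate shrinks as squared gradients pile up.
        w -= lr / (math.sqrt(sum_sq) + eps) * g
    return w


# Toy problem (assumed): f(w) = w**2, so the gradient is 2 * w.
w_final = adagrad(lambda w: 2.0 * w, w=5.0)
print(w_final)
```

Because `sum_sq` only ever grows, the step size can only decrease over time, which is the behavior RMSProp was designed to relax.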
② RMSProp
RMSProp is an advanced version of Adagrad. Adagrad sums the squares of all previous gradients with equal weight, so recent gradients count no more than old ones in this coefficient. RMSProp instead squares the current gradient and takes a weighted average of two terms: the previously accumulated value and the current squared gradient.
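The weighting of the two terms can be sketched as a single update step. The names `rmsprop_step`, `mean_sq` and `alpha`, the default values, and the toy loss f(w) = w^2 are all illustrative assumptions.

```python
import math


def rmsprop_step(w, g, mean_sq, lr=0.1, alpha=0.9, eps=1e-8):
    """Apply one RMSProp update and return the new (w, mean_sq)."""
    # Weighted average: alpha on the accumulated value,
    # (1 - alpha) on the current squared gradient.
    mean_sq = alpha * mean_sq + (1.0 - alpha) * g * g
    w = w - lr * g / (math.sqrt(mean_sq) + eps)
    return w, mean_sq


# Toy problem (assumed): f(w) = w**2, so the gradient is 2 * w.
w, mean_sq = 5.0, 0.0
for _ in range(200):
    w, mean_sq = rmsprop_step(w, 2.0 * w, mean_sq)
print(w)
```

A larger `alpha` gives old gradients more influence; a smaller one makes the coefficient follow the most recent gradients more closely.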
③ Momentum
Gradient descent with momentum added.
In the following figure, v is the movement from the previous step. The new movement is lambda times the previous movement minus the learning rate times the current gradient. In effect, v is a weighted sum of all the gradients computed in the past.
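A minimal sketch of this rule for one scalar weight is shown below. The names `momentum_descent` and `lam` (for lambda), the default values, and the toy loss f(w) = w^2 are assumptions for illustration.

```python
def momentum_descent(grad_fn, w, lr=0.1, lam=0.9, steps=200):
    """Gradient descent with momentum on a scalar function."""
    v = 0.0  # movement of the previous step; starts at zero
    for _ in range(steps):
        g = grad_fn(w)
        # New movement: lam times the previous movement,
        # minus the learning rate times the current gradient.
        v = lam * v - lr * g
        w += v
    return w


# Toy problem (assumed): f(w) = w**2, so the gradient is 2 * w.
w_final = momentum_descent(lambda w: 2.0 * w, w=5.0)
print(w_final)
```

Unrolling the loop shows why v is a weighted sum of past gradients: each earlier gradient is still present in v, scaled down by one more factor of `lam` for every step that has passed.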
④ Adam
Adam combines RMSProp and Momentum.
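One way to see the combination is the sketch below, which keeps a momentum-style average of gradients (`m`) and an RMSProp-style average of squared gradients (`v`). It follows the commonly published form of Adam with bias correction; the function name, the learning rate of 0.1, and the toy loss f(w) = w^2 are illustrative assumptions.

```python
import math


def adam(grad_fn, w, lr=0.1, beta1=0.9, beta2=0.999, steps=500, eps=1e-8):
    """Minimize a scalar function with Adam, given its gradient."""
    m = 0.0  # Momentum part: running average of gradients
    v = 0.0  # RMSProp part: running average of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy problem (assumed): f(w) = w**2, so the gradient is 2 * w.
w_final = adam(lambda w: 2.0 * w, w=5.0)
print(w_final)
```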
Second, if performance on the test set is poor
1. Early stopping
Use a validation set to stop training early: stop once the loss on the validation set no longer improves.
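The stopping rule can be sketched on a list of per-epoch validation losses. The `patience` parameter (how many epochs without improvement to tolerate) and the loss values are assumptions added for illustration; the article itself does not specify a stopping criterion.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights should be kept.

    Training stops once `patience` epochs pass without a new best
    validation loss.
    """
    best_epoch = 0
    best_loss = float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_epoch


# Hypothetical validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.50]
print(early_stopping_epoch(val_losses, patience=2))
```

In a real training loop the model weights would be saved whenever a new best loss is seen, and restored after stopping.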
2. Regularization
As with other machine learning algorithms, L1 and L2 regularization can be used; they are not described in detail here.
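For orientation only, here is a sketch of how the two penalties change a single gradient descent step on one weight. The function names, the penalty strength `lam`, and the convention of an L2 penalty of (lam / 2) * w^2 are assumptions.

```python
def l2_step(w, grad, lr=0.1, lam=0.01):
    """One step with an L2 penalty (lam / 2) * w**2 added to the loss."""
    # The penalty's gradient is lam * w, so large weights shrink faster.
    return w - lr * (grad + lam * w)


def l1_step(w, grad, lr=0.1, lam=0.01):
    """One step with an L1 penalty lam * |w| added to the loss."""
    # The penalty's (sub)gradient is lam * sign(w): a constant-size pull to 0.
    sign = (w > 0) - (w < 0)
    return w - lr * (grad + lam * sign)


# With a zero data gradient, only the penalty moves the weight.
for w in (1.0, 10.0):
    print(w, l2_step(w, grad=0.0), l1_step(w, grad=0.0))
```

L2 shrinks each weight in proportion to its size, while L1 subtracts the same amount regardless of size, which tends to push small weights all the way to zero.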
3. Dropout
In each training update, every neuron, including the input units, is dropped with probability p%. The result is a thinner neural network, as shown in the figure below, and this thinner network is what gets trained. Before the next update, resample which neurons to drop. (The effect is similar to an ensemble method such as random forests.)
Dropout is not applied at test time. If the dropout probability during training is p%, then on the test set all the weights are multiplied by 1 - p%.
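Both halves of the procedure can be sketched with plain lists. The function names are hypothetical, `p` is written as a fraction (0.5 for 50%), and the fixed random seed is only there to make the example repeatable.

```python
import random


def dropout_train(activations, p, rng):
    """Training time: zero each activation with probability p."""
    return [0.0 if rng.random() < p else a for a in activations]


def scale_weights_for_test(weights, p):
    """Test time: keep every neuron and multiply all weights by (1 - p)."""
    return [w * (1.0 - p) for w in weights]


rng = random.Random(0)  # seeded so the sampled mask is repeatable
activations = [1.0] * 10

# Each call resamples which neurons are dropped.
print(dropout_train(activations, 0.5, rng))
print(dropout_train(activations, 0.5, rng))

# At test time nothing is dropped; the weights are scaled instead.
print(scale_weights_for_test([2.0, -4.0], 0.5))
```

The scaling keeps the expected input to the next layer the same in both phases: during training only about a fraction 1 - p of the neurons are active, so at test time, with all of them active, each weight is reduced by that same factor.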
That covers the main methods for training deep learning models for artificial intelligence in Python. Thank you for reading!
© 2024 shulou.com SLNews company. All rights reserved.