What does pruning optimization in TensorFlow refer to?

2025-01-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains what pruning optimization in TensorFlow refers to. Many readers may not be familiar with it, so the editor has summarized the key points below.

Compressing AI models has long been a deep field of study. China's DeePhi Tech, for example, relied on efficient compression algorithms so that more lightweight computing terminals could also enjoy the benefits of AI.

The artificial neural networks used by mainstream AI are, as the name implies, complex and densely interconnected structures. Different kinds of samples, such as images, speech and text, are split up and encoded in different ways and fed to the network's most basic units, the neurons. Machine learning, and deep learning in particular, then amounts to finding the connections among these seemingly unrelated neurons until they form a dense mesh of logic.

However, the more neurons a network has, the larger it becomes and the more space is needed to store it. On some terminals, especially mobile smart devices, storage is a scarce resource. If devices with limited storage are to run AI workloads as well, the neural network model cannot be too large.

Interestingly, our brains go through a similar process as they develop, called synaptic pruning. Some research suggests that when the brain is not pruned effectively during development, the result may be associated with autism: autistic brains have been reported to have far more neural connections in the prefrontal lobe than neurotypical brains.

Filling a gap in the TensorFlow model optimization toolkit

To solve this problem, many developers, DeePhi among them, have proposed different compression methods, and pruning has proved to be one of the most efficient. Pruning removes the branches of a neural network that have little effect on the overall logic, or that are unnecessary altogether, so that the size of the model can be kept under control. The key is to prune in the right places: efficiency should improve and storage requirements should fall, while the model's results remain correct.

Google released a model optimization toolkit in 2018 that uses post-training quantization to significantly reduce the storage requirements of machine learning models and improve execution efficiency.

Quantization works by reducing the precision of the parameters. Put simply, a model that originally used 32-bit floating-point values is stored and computed with 8-bit integers instead. The optimized model still has to be compared against the original and calibrated to avoid serious distortion.
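To make the 32-bit to 8-bit idea concrete, here is a minimal sketch in plain Python. It assumes a simple symmetric scheme with one shared scale per tensor; the function names and sample weights are illustrative only, and this is not the toolkit's actual implementation.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale (assumed symmetric scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Illustrative weights, not taken from any real model.
weights = [0.82, -0.31, 0.004, -1.27, 0.56]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The "distortion" mentioned above: the largest round-trip error.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each value now fits in one byte instead of four, at the cost of a rounding error of at most half a quantization step. Comparing `restored` with `weights` is a toy version of the comparison step described above.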

The newly launched pruning API is an addition to this model optimization toolkit. As described above, it uses an algorithm to trim unnecessary values from the weight tensor, minimizing the unnecessary connections in the neural network.
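As a rough illustration of trimming values from a weight tensor, the sketch below zeroes the smallest-magnitude half of a flat list of weights. The helper name `prune_to_sparsity` and the sample values are hypothetical, and treating "unnecessary" as "smallest absolute value" is the assumption made here.

```python
def prune_to_sparsity(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest absolute value."""
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indices = set(order[:num_to_prune])
    # The mask records which connections survive (1) and which are removed (0).
    mask = [0 if i in pruned_indices else 1 for i in range(len(weights))]
    pruned = [w if m else 0.0 for w, m in zip(weights, mask)]
    return pruned, mask


# Illustrative weights, flattened to one dimension for simplicity.
weights = [0.9, -0.02, 0.4, 0.01, -0.7, 0.05, -0.3, 0.003, 0.6, -0.08]
pruned, mask = prune_to_sparsity(weights, sparsity=0.5)
print(pruned)
print(mask)
```

The tensor keeps its shape; pruned connections are simply stored as zeros, which is why a separate compression step is needed before the file size shrinks.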

The pruning API is compatible with the earlier quantization optimization. In other words, developers can apply both methods to the same model to compress it further.

In the tool's documentation, Google also demonstrates how a 90% sparse MNIST model can be compressed from 12 MB to 2 MB.

In further model compression experiments, the pruning API's compression rate ranged from 50% to 90%, depending on the sparsity level and the accuracy to be retained.
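Why does a sparse model end up smaller on disk? The following sketch offers one plausible illustration: it serializes the same number of 32-bit weights with and without 90% of them zeroed, then compares the sizes after a general-purpose compressor. Using `zlib` and random weights is an assumption for demonstration; it does not reproduce the MNIST figures above.

```python
import random
import struct
import zlib

random.seed(0)
num_weights = 10_000
# Illustrative random weights standing in for a trained layer.
dense = [random.uniform(-1.0, 1.0) for _ in range(num_weights)]

# Zero the 90% of weights with the smallest magnitude.
threshold = sorted(abs(w) for w in dense)[int(num_weights * 0.9)]
sparse = [w if abs(w) >= threshold else 0.0 for w in dense]


def compressed_size(values):
    """Serialize as 32-bit floats and return the zlib-compressed size in bytes."""
    raw = struct.pack(f"{len(values)}f", *values)
    return len(zlib.compress(raw))


dense_bytes = compressed_size(dense)
sparse_bytes = compressed_size(sparse)
print(dense_bytes, sparse_bytes)
```

Both lists occupy the same number of bytes before compression. The saving only appears afterwards, because long runs of zeros compress far better than arbitrary floating-point values.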

Working principle

The new compression tool is a weight pruning API built on Keras. It uses a simple but widely applicable algorithm that iteratively removes connections during training based on their magnitude.

The developer specifies the final target sparsity (for example, 90%), a schedule for pruning (for example, start at step 2000, stop at step 10000, and prune every 100 steps), and optionally the structure to prune (for example, individual values, or blocks of values of a certain shape).
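The three settings described above can be pictured as a small configuration object. The sketch below is hypothetical: the class `PruningPlan`, its field names, and the choice to treat the end step as inclusive are assumptions for illustration, not the API's own types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PruningPlan:
    """Hypothetical container for the settings described in the text."""
    final_sparsity: float = 0.9   # target fraction of weights set to zero
    begin_step: int = 2000        # first training step at which pruning runs
    end_step: int = 10000         # last pruning step (assumed inclusive)
    frequency: int = 100          # prune every `frequency` steps
    block_size: tuple = (1, 1)    # (1, 1) means individual values, not blocks

    def is_pruning_step(self, step: int) -> bool:
        """Return True when pruning should run at this training step."""
        if step < self.begin_step or step > self.end_step:
            return False
        return (step - self.begin_step) % self.frequency == 0


plan = PruningPlan()
pruning_steps = [s for s in range(12000) if plan.is_pruning_step(s)]
print(len(pruning_steps), pruning_steps[0], pruning_steps[-1])
```

With the example values from the text, pruning would run at steps 2000, 2100, and so on up to 10000, and training outside that window would proceed without any pruning.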

As training proceeds, the pruning routine is scheduled to run and removes the weights with the lowest magnitude (that is, those closest to zero) until the specified sparsity target is reached. Each time the routine runs, the current sparsity target is recalculated; it starts at 0% and rises until it reaches the final target sparsity at the end of the pruning schedule.
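The ramp from 0% to the final sparsity can be sketched as follows. The text does not say how the target grows between the start and end steps, so the cubic curve used here (`power=3`) is an assumption, as are the function names and the random weights. The loop also omits the actual training updates that would happen between pruning steps.

```python
import random


def target_sparsity(step, begin_step=2000, end_step=10000,
                    initial_sparsity=0.0, final_sparsity=0.9, power=3):
    """Current sparsity target, rising from initial to final (assumed polynomial ramp)."""
    if step <= begin_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    progress = (step - begin_step) / (end_step - begin_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** power


def prune_lowest_magnitude(weights, sparsity):
    """Zero the weights closest to zero until the given fraction is zero."""
    num_zero = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    zeroed = set(order[:num_zero])
    return [0.0 if i in zeroed else w for i, w in enumerate(weights)]


random.seed(1)
weights = [random.gauss(0, 1) for _ in range(1000)]  # illustrative weights
history = []
for step in range(2000, 10001, 100):
    # A real training loop would update the surviving weights here.
    sparsity = target_sparsity(step)
    weights = prune_lowest_magnitude(weights, sparsity)
    history.append((step, sparsity))

final_zero_fraction = weights.count(0.0) / len(weights)
print(target_sparsity(6000), final_zero_fraction)
```

Raising the target gradually rather than jumping straight to 90% gives the remaining weights time to adapt between pruning steps, which is the usual motivation for a scheduled approach.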
