Stop chasing SOTA, that's just "fine-tuning"! Science takes aim at paper inflation.

2025-01-28 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

2020-06-06 14:58:59 | Jin Lei, reporting from Aofei Temple

Qubit report | official account QbitAI

Is AI algorithm development really advancing that rapidly?

To find out, researchers from MIT systematically compared 81 AI algorithms, and the results were startling:

There is no clear evidence that, over the past decade, these algorithms have meaningfully improved performance on the task.

In response to similar problems, Science recently issued an article saying:

Progress in some areas of artificial intelligence is eye-catching, but it is not real progress.

So, what's going on here?

"It's fine-tuning, not core innovation."

The subjects of the MIT study were 81 pruning algorithms.

Put simply, these algorithms "prune" connections in a neural network to make it more efficient.
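For intuition, here is a minimal sketch of magnitude pruning, the simplest member of this family. The code is illustrative (plain NumPy, a toy weight matrix), not any particular paper's method:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return np.where(np.abs(weights) > threshold, weights, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))       # a toy weight matrix
pruned = magnitude_prune(w, 0.5)
print(np.mean(pruned == 0))       # 0.5: half the connections removed
```

Most of the 81 papers are variations on this theme: different criteria for which connections to remove, and different schedules for when to remove them.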

However, the development of these algorithms has gone much as Matthew Hutson, the author of the Science article, describes:

Many researchers build on an existing method, make some "fine-tuning", and then claim their algorithm is superior.

The MIT researchers therefore conducted a meta-analysis of these algorithms and proposed ShrinkBench, a framework to promote the standardized evaluation of pruning algorithms.

A genuinely good algorithm should stand up to testing. So what were the results?

The first test: pruning vs architecture

Using ImageNet, the researchers plotted accuracy against compression/speedup for pruned models, alongside the same metrics for unpruned models of different architectures; the results are shown in the figure below.

It is easy to see that pruning a given architecture improves its time/space-versus-accuracy tradeoff, and sometimes even improves accuracy.

But the effect of pruning is usually no better than simply switching to a different architecture.

The second test: comparison with "peer" algorithms

This dimension was examined because the researchers found that many works wave the banner of "SOTA" while comparing against an incomplete set of baselines.

Most obviously, comparisons with algorithms proposed before 2010, or even with other algorithms also claimed to be SOTA, are often missing, as shown in the figure below.

The third test: the combination of datasets and architectures

Of the 81 papers, the combination of ImageNet and VGG-16 was the most common, and three of the six most common combinations involved MNIST.

However, MNIST differs greatly from other mainstream image-classification datasets: its images are grayscale, most of their pixels are zero, and even a simple model can classify it with over 99% accuracy.

The fourth test: metrics

Papers report a wide variety of metrics; rather than enumerate them, let the chart below speak for itself.

There is also a series of further confounds, such as data preprocessing and hyperparameter-tuning strategy, that can shift the results.
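Two metrics that do recur across pruning papers are compression ratio and theoretical speedup. As a hedged sketch (the numbers below are illustrative, roughly VGG-16-sized, and are not taken from the paper):

```python
def compression_ratio(total_params: int, remaining_params: int) -> float:
    """How many times smaller the pruned model is, by parameter count."""
    return total_params / remaining_params

def theoretical_speedup(total_flops: int, remaining_flops: int) -> float:
    """Upper bound on inference speedup implied by the reduction in FLOPs."""
    return total_flops / remaining_flops

# Illustrative numbers: a 138M-parameter model pruned to a tenth of its size.
print(compression_ratio(138_000_000, 13_800_000))          # 10.0
print(theoretical_speedup(15_500_000_000, 5_000_000_000))  # 3.1
```

Note that the two numbers need not move together: removing parameters from small layers barely reduces FLOPs, which is one reason papers reporting only one of the two metrics are hard to compare.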

Davis Blalock, the study's first author, said:

These improvements are "fine-tuning" rather than the "core innovation" the researchers claim, and some of them may not exist at all.

As a result, the MIT researchers built ShrinkBench, a tool that facilitates the development and standardized evaluation of neural-network pruning methods.

ShrinkBench provides standardized, extensible functionality for training, pruning, fine-tuning, computing metrics, and plotting, all using standardized pretrained models and datasets.
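To make the idea of standardized evaluation concrete, here is a minimal, self-contained sketch: one toy linear model, one pruning method, and one accuracy metric swept over a fixed sparsity grid. Everything here is illustrative; the names and setup are not ShrinkBench's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: two well-separated Gaussian blobs in 20 dimensions.
X = np.vstack([rng.normal(-1, 1, (200, 20)), rng.normal(1, 1, (200, 20))])
y = np.array([0] * 200 + [1] * 200)

# Toy model: a least-squares linear classifier.
w = np.linalg.lstsq(X, 2.0 * y - 1.0, rcond=None)[0]

def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights))[k - 1]
    return np.where(np.abs(weights) > threshold, weights, 0.0)

def accuracy(weights):
    return float(np.mean((X @ weights > 0) == (y == 1)))

# Standardized sweep: every method sees the same data, metric, and sparsity grid.
for sparsity in (0.0, 0.5, 0.9):
    acc = accuracy(magnitude_prune(w, sparsity))
    print(f"sparsity={sparsity:.1f}  accuracy={acc:.3f}")
```

The point is not the toy model but the protocol: if competing pruning methods were all dropped into the `magnitude_prune` slot of a loop like this, their numbers would finally be comparable.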

As another author, John Guttag, said:

If you can't measure something, it's hard to make it better.

Well, from now on, claiming even a small advance in pruning algorithms may no longer be so easy.

Science takes aim at paper inflation

Recently, Science also published an article on this flood of low-quality papers, arguing that progress in many branches of artificial intelligence is on shaky ground:

In 2019, a meta-analysis of the information-retrieval algorithms used in search engines found that the actual high-water mark was set back in 2009.

Also in 2019, another study reproduced seven neural-network recommendation systems, and found that six of them failed to outperform simpler non-neural algorithms developed years earlier.

In February this year, Zico Kolter, a computer scientist at Carnegie Mellon University, published a paper on arXiv showing that the early adversarial-training method PGD, strengthened with simple tricks, could match the supposedly newer and more sophisticated methods.

In March this year, Kevin Musgrave, a computer scientist at Cornell University, published a paper on arXiv studying loss functions. On an image-retrieval task, he compared more than a dozen algorithms on an equal footing and found that, contrary to the researchers' claims, accuracy had not improved since 2006.
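The PGD method mentioned above is itself remarkably simple. As a hedged illustration (plain NumPy, with a made-up gradient standing in for a real model's), one L-infinity PGD step looks like:

```python
import numpy as np

def pgd_step(x, grad, x_orig, epsilon=0.03, alpha=0.01):
    """One projected gradient step of an L-infinity PGD adversarial attack."""
    x = x + alpha * np.sign(grad)                        # ascend the loss
    x = np.clip(x, x_orig - epsilon, x_orig + epsilon)   # project into the eps-ball
    return np.clip(x, 0.0, 1.0)                          # stay in valid pixel range

# Toy example: a random "image" and a placeholder gradient instead of a real model.
rng = np.random.default_rng(0)
x0 = rng.uniform(0, 1, size=(4, 4))
x = x0.copy()
for _ in range(10):
    fake_grad = rng.normal(size=x.shape)  # stands in for d(loss)/dx
    x = pgd_step(x, fake_grad, x0)
print(np.max(np.abs(x - x0)) <= 0.03)     # True: perturbation stays bounded
```

Kolter's point was that this decade-old loop, carefully tuned, already matches much of what later papers claimed as progress.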

As Musgrave says:

There have always been these waves of hype.

On the other hand, some algorithms have endured: LSTM, proposed in 1997, went on to drive major breakthroughs in language translation.

Properly trained, an LSTM performs on par with algorithms developed two decades later.

Similarly, GAN, proposed in 2014, greatly improved image generation; a 2018 report found that, given enough compute, the original GAN method can match later methods.

In Kolter's view, researchers are keener to invent a new algorithm and tweak it until it claims SOTA than to tune an existing one, which looks far less novel.

So what lies behind today's flood of papers?

One factor is the evaluation problem the MIT researchers point to: different datasets, different tuning methods, and different performance metrics and baselines make fair comparison impossible.

Another is the explosive growth of the AI field itself. The number of papers far outstrips the number of experienced reviewers, and reviewers should insist on proper comparisons against reasonable, scientific baselines.

Worse than paper inflation, though, is outright fraud.

Think this is the whole of the academic mess?

No. There is also a wave of data fabrication.

On May 20, netizens abroad exposed a major case of academic fraud:

Eight articles, by different authors, from different hospitals, covering different cancer types and different protein expression, somehow reported exactly the same results, and all eight were published.

Weibo user "Chenguang", a postdoctoral fellow with a doctorate in nutrition at the Diabetes Center of the UAB School of Medicine, commented:

Such brazen fabrication is simply suffocating.

Sadder still, the papers' authors are all from China.

And judging from the author lists, they range from front-line doctors to chief and deputy chief physicians and a hospital deputy director, and many of the papers were funded by the National Natural Science Foundation of China.

Faking on this scale takes real effort.

Netizens also said:

This breaks through everything I knew about academic fraud.

Coincidentally, not long ago on Zhihu, it emerged that a professor at Nanjing University of Posts and Telecommunications had published 300 IEEE papers in three and a half years, which briefly became a trending topic.

His student, "classmate Huang", fabricated a thesis and passed himself off as a Peking University student.

……

What do you think of such an academic mess?

Portal:

ShrinkBench project address:

https://github.com/jjgo/shrinkbench

Address of ShrinkBench paper:

https://arxiv.org/abs/2003.03033

Reference links:

https://www.sciencemag.org/news/2020/05/eye-catching-advances-some-ai-fields-are-not-real

https://weibo.com/roger1130?referflag=0000015010&from=feed&loc=nickname&is_hot=1#_rnd1591086111501

https://twitter.com/MicrobiomDigest/status/1266140721716719616

https://www.zhihu.com/question/397548354/answer/1248933002

https://www.toutiao.com/i6835125799020921355/


© 2024 shulou.com SLNews company. All rights reserved.
