2025-04-12 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
With fewer than 10 days left in 2018, a look back at this year's progress in computer vision (CV) shows no revolutionary technical breakthrough.
The leading companies have focused not only on improving the accuracy of existing algorithms but, even more, on expanding their business footprint.
Every company seems to be racing against the clock to find new application scenarios before this visible "match point" arrives.
Still, some technical advances deserve an article of their own, and zero-shot learning (ZSL) is today's example.
After all, once application scenarios have been developed to their limit, everyone returns to the starting line of technology.
What is ZSL?
Zero-shot learning (ZSL) is one of the most challenging problems in machine recognition. In 2009, Lampert et al. proposed the Animals with Attributes dataset and a classic attribute-based learning algorithm, after which the topic began to attract widespread attention.
It matters because its way of thinking is very different from that of traditional image recognition tasks.
In principle, ZSL aims to give computers a human-like ability to reason, so that they can identify a new thing they have never seen before.
For example, we tell a child who has never seen a zebra: "A zebra is an animal that looks like a horse and has black and white stripes." The child can then easily pick out the zebra at the zoo.
A traditional image recognition algorithm, however, usually has to be fed a sufficient number of zebra samples before it can recognize a zebra. Moreover, a classifier trained on zebras cannot identify other species.
ZSL, by contrast, can identify a new thing from a feature description alone, without training on a single sample of it, which is undoubtedly one step closer to human intelligence.
So how does this impressive trick actually work?
Put simply, high-dimensional semantic features are used in place of the low-dimensional features of the samples, so that the trained model is transferable.
For example, the high-dimensional semantics of a zebra might be "the shape of a horse, the colors of a panda, the stripes of a tiger". Although finer details are missing, these high-dimensional features are enough to classify a "zebra", so the machine can predict it successfully.
This addresses a long-standing problem in image recognition: how a machine should learn to recognize something that has never appeared in its existing datasets.
It sounds cool and smart, and in fact it is!
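The zebra example can be made concrete with a minimal sketch of attribute-based zero-shot classification. It is only loosely in the spirit of the attribute-based approach mentioned above, not a reproduction of any published algorithm. The attribute names, the class signatures, and the probability values are all hypothetical, and the per-attribute probabilities are assumed to come from detectors trained only on seen classes such as horse, panda, and tiger.

```python
import math

# Hypothetical attribute vocabulary (illustrative only).
ATTRIBUTES = ["horse_shaped", "black_and_white", "striped"]

# Hypothetical class signatures: 1 means the class has the attribute.
# "zebra" has no training images; it is described by attributes alone.
CLASS_SIGNATURES = {
    "horse": [1, 0, 0],
    "panda": [0, 1, 0],
    "tiger": [0, 0, 1],
    "zebra": [1, 1, 1],
}


def class_score(attr_probs, signature):
    """Log-likelihood of a class signature given per-attribute probabilities."""
    score = 0.0
    for p, has_attr in zip(attr_probs, signature):
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # clamp to avoid log(0)
        score += math.log(p if has_attr == 1 else 1.0 - p)
    return score


def zero_shot_predict(attr_probs, signatures=CLASS_SIGNATURES):
    """Return the class whose attribute description best matches the image."""
    return max(signatures, key=lambda name: class_score(attr_probs, signatures[name]))


# Assumed detector outputs for a zebra photo: horse-shaped, black-and-white, striped.
zebra_like = [0.9, 0.8, 0.95]
print(zero_shot_predict(zebra_like))
```

The point of the design is that the classifier never needs a zebra image: adding a new class only requires adding a new row to the signature table.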
Where does ZSL's sense of superiority come from?
At CVPR 2018, a top conference in the CV field, a paper on learning discriminative features for zero-shot recognition was considered to represent the state of the art in the field.
The attention it received is largely due to the outstanding performance of zero-shot learning (ZSL) on object recognition tasks in recent years.
Because ZSL takes on a more demanding real-world setting than earlier approaches, progress on it can directly influence the effectiveness of other image recognition work.
Most existing recognition technologies rely on supervised learning, which creates demand for ever larger datasets; Google has said it trains on 300 million images. Moreover, each domain needs its own dataset.
As a result, the data labeling workload becomes enormous, and many new things cannot be labeled at all. Efficiency and cost on the deployment side have thus become an "unbearable weight" for the industry.
What can be done? Researchers have to find ways to make machines "do more with less".
Take the research of Tencent AI Lab as an example: its "Diverse Image Annotation" makes full use of the semantic relationships between tags and uses a small number of diverse tags to express as much image information as possible, in order to achieve automatic tagging.
ZSL is even more extreme: it tries to get "something from nothing" without a single sample, and this extreme challenge has brought new vitality to the technical community.
First of all, ZSL reduces the dependence of existing algorithms on datasets and eases the pressure of annotation, which helps make machine vision technology more approachable and more efficient to deploy.
In addition, as industry demand for lower compute cost grows, ZSL points toward a feasible solution.
More importantly, ZSL not only addresses visual problems but also complements the development of NLP. Recognizing something from a fuzzy, high-dimensional semantic description requires more than simple classification: the machine must understand some higher-level human knowledge, such as the style of an artwork or a particular emotion. Finding this semantic connection means combining machine vision with NLP to solve problems, and the technical possibilities that ZSL opens up are very interesting.
People say that "data is the fuel of AI", so is AI doomed once the fuel runs out? ZSL says it can keep going anyway, which is quite a show.
From 0 to 1: what is the difference between ZSL and OSL?
At this point, readers who follow technology trends may have noticed that zero-shot learning and one-shot learning (OSL), a small-sample approach, seem to play very similar roles in their final applications.
For example, both target high-level cognitive problems. Give OSL a single picture of a zebra and it can effectively tell zebras apart from other animals. It too depends on the ability to learn, classify, and reason from very few labels.
On the application side, because neither relies on large datasets, both can help industrial AI recognition cut costs and improve efficiency.
In theory, since the zero-sample case is a subset of the small-sample case, can we directly apply a ZSL model to the OSL problem?
In fact, we can. After all, handling "never seen" is technically harder than handling "seen once"; "from 0 to 1" is the more difficult leap.
However, the two cannot simply be swapped or equated, and research on each is significant in its own right.
The biggest difference is that ZSL is about transferring knowledge across similar semantics, while OSL is about semantic completion, that is, how to learn more features from a single sample.
In practice, these different core abilities give each of them its own signature strength.
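One way to picture the difference is to look at where each method gets its class prototype. The sketch below is a simplified illustration under stated assumptions, not how any particular ZSL or OSL system is implemented: the three-dimensional feature vectors, the prototype values, and the helper names `euclidean` and `nearest_class` are all hypothetical. In the one-shot case the zebra prototype is the feature vector of a single labeled photo; in the zero-shot case it is built from a semantic description, with no zebra photo at all.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_class(query, prototypes):
    """Return the class whose prototype is closest to the query features."""
    return min(prototypes, key=lambda name: euclidean(query, prototypes[name]))


# Hypothetical feature axes: [horse_shaped, black_and_white, striped].

# One-shot: each prototype is the feature vector of ONE labeled example.
one_shot_prototypes = {
    "horse": [0.9, 0.1, 0.1],
    "zebra": [0.8, 0.9, 0.9],  # features of the single zebra photo
}

# Zero-shot: the zebra prototype comes from a description only
# ("horse-shaped, black and white, striped"); no zebra photo is used.
zero_shot_prototypes = {
    "horse": [1.0, 0.0, 0.0],
    "zebra": [1.0, 1.0, 1.0],
}

query = [0.85, 0.7, 0.95]  # assumed features of a new zebra image
print(nearest_class(query, one_shot_prototypes))
print(nearest_class(query, zero_shot_prototypes))
```

The matching step is identical in both cases; what differs is the source of the prototype, which is why OSL must squeeze more features out of one sample while ZSL must transfer knowledge through semantics.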
Wider than the grassland: the application scenarios of ZSL
So what exactly can ZSL do?
As we said earlier, the biggest pain point of deep learning in industry is like falling in love with a wild horse (generalization ability) while having no grassland at home (high-quality datasets). No company will lease every grassland, regardless of cost, for the sake of a few wild horses.
And the room for imagination that ZSL offers is far wider than any grassland:
1. Automatic image tagging and processing. Manual tagging is expensive and slow. Once ZSL is applied, its semantic understanding and transfer abilities, aided by a knowledge graph (attributes, text descriptions, and so on), can combine different visual cues systematically. Data identification and labeling can then be completed automatically, with accuracy no lower than that of manual work.
2. Translation of unknown or obscure languages. In the movie "Arrival", a linguist manages to communicate with aliens through painstaking feature inference. In the future, machines may do this. For example, languages with few or even no available samples (such as Urbuk) could be translated automatically by a ZSL system, bringing love and peace to the universe.
3. Image synthesis for new categories. The learning goal of ZSL is to identify new things, so ZSL can also be used to synthesize images of new categories, such as restoring extinct species. The dinosaurs you see in a future Jurassic film may well be "painted" by machines.
4. Video recognition. More and more data now combines visual and text signals: a general video site, for instance, carries video, audio, subtitles, comments, and other multimodal information. Exploring the correlations among them depends on the macro-level prediction ability of ZSL.
All in all, enabling machines to reason and judge from "a few words", as humans do, is a very useful capability.
From getting started to giving up: ZSL's stubborn problems
If it is so impressive, why has ZSL stayed lukewarm all this time? At the very least, it has not become a crowd favorite like other deep learning algorithms. The main reason is a few stubborn, "psoriasis"-like problems:
First, the effectiveness of ZSL depends on information from similar modalities. If the categories in the training set and the test set differ too much, for example one is all animals and the other is all home furnishings, it becomes very hard for ZSL to work out the mapping between the two. It then easily suffers from strongly biased "attribute drift" (often called domain shift), fails to predict the correct results, and loses much of its performance.
The second is the lack of sufficiently professional definitions and descriptions. Although ZSL does not need large image datasets, it does need feature descriptions. In this respect, manual work still outperforms machines. However, there are currently not enough professionals to assist, and NLP itself is not yet developed enough to meet the needs of ZSL, which makes overall progress relatively slow.
If these shackles are not removed, then even though ZSL has the potential to go from zero to top of the class, it will enter the treasure mountain and return empty-handed, and its job opportunities will be taken by algorithms inferior to it.
Looking back on the industrialization of CV technology over the past year, the pace has been dizzying.
We can imagine that machine vision will be everywhere within the next year or two, from personal smart devices to the "eyes" of cities.
On the one hand, application scenarios are unusually hot; on the other hand, high-potential technologies like ZSL remain relatively stagnant, with no breakthrough on their core problems.
At this turn of the year, which links the past to the future, maybe it is time to give ZSL a future.