2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
Thanks to CTOnews.com user "I want Kangkang" for the tip! Unexpectedly, with the help of AI, I too got my day of running wild at the office!
The embarrassing old photo of a colleague that had been buried at the bottom of the pile was dug up by me in 3 seconds and turned into a pre-emptive-strike meme at lightning speed.
Just type "the laughing man" into the cloud drive's search box, and the relevant image is retrieved immediately.
Then pick the person you want to "attack", click Edit, and add text to the meme with one click.
The whole process is silky smooth. My meme is done while my colleagues are still hunting for a picture (doge).
The search also understands directly what a "meme" is, so old images are only a few clicks away.
In other words, with accurate search across a cloud drive full of old photos and materials, who but me deserves the title of meme-battle king?
Besides photos, it can even search directly for videos of colleagues or recognize the text inside memes, which makes it very flexible.
So where exactly do you turn this feature on, and how does it differ from traditional album search?
A search "artifact" for classic memes
First, enable the intelligent search feature in Baidu Netdisk.
Open the Netdisk search box and enter "Advanced Image search", and an entry point for the feature will appear. After entering, click "experience now" and wait for the data upgrade to complete. The system sends a notification when it is done, and you can start playing.
Once advanced image search is enabled, you can search for images directly from the search box on the Netdisk home page, with no need to open a separate tool.
How do you search? "Search what's on your mind."
Take single-word searches as an example. Vague expressions such as "flashing a V sign" and "winking" are understood in a second:
Internet slang such as "xiugou" (a cutesy word for "puppy") is also within its grasp.
It has even evolved "biases" of its own, such as "PPT" (doge), which it often associates with "press conferences":
Beyond single compound words or adjectives, you can also type a whole-sentence description, or even stack multiple qualifiers when searching for an image.
As more detail is entered, the search results adjust in real time.
For example, enter just "sleeping" and the first result the system returns is a meme of a cat lying down.
But once the keyword is refined to "sleeping person", the cat pictures are immediately filtered out.
Besides accurately grasping the essence of an image, the intelligent search can also recognize the text inside it, so the results are very comprehensive.
For example, a search for "can't hold back" returns not only memes whose image and text both match, but also other similar images:
As for scope, the feature searches videos as well as pictures.
In short, with the cloud drive's new intelligent search, finding photos, videos and other files is no longer a laborious exercise in "prompt engineering".
To find a picture of your own, just "chat" with the cloud drive, and it retrieves the photo you are looking for as accurately as a person would.
So how is this feature actually implemented? We dug into the technical principles behind it and found that it is far from simple.
Sure enough, a large model is behind it
Intelligent search is, in essence, a bit like a "privately customized" web search engine with intelligent image and video search.
To build it, the Baidu Netdisk team went as far as bringing in large models. The core goal is to solve the four major problems of traditional image search: images cannot be found, results are inaccurate, search is slow, and there is only one way to search.
The first problem to solve, and one of the hardest, is "cannot be found".
In traditional albums that search by tag, the built-in search engine does not truly associate the meaning of the picture with that of the text. In other words, "the text does not match the picture".
△ Tag search on an ordinary phone
To solve this, the team chose VIMER-ViLP, a multimodal large model from Baidu Wenxin, and trained it on massive image-text data to enable vector-based semantic search.
The core principle is to map the feature vectors of text and images into the same semantic vector space. The closer two vectors are, the higher their similarity. Because matching happens on meaning rather than on exact tags, this avoids semantic loss and reduces the probability of "cannot be found".
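The idea of a shared vector space can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-dimensional vectors are made up (a real model such as VIMER-ViLP would produce far larger ones), the file names are hypothetical, and cosine similarity is just one common way to measure how close two vectors are. The article does not say which metric the team uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vector, image_index, top_k=2):
    """Rank images by similarity to the query vector, best match first."""
    scored = [
        (cosine_similarity(query_vector, vec), name)
        for name, vec in image_index.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Hypothetical image embeddings (made-up values, not model output).
IMAGE_INDEX = {
    "cat_lying_down.jpg": [0.9, 0.1, 0.0],
    "man_laughing.jpg": [0.1, 0.9, 0.1],
    "press_conference.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical text embedding for the query "the laughing man".
QUERY_LAUGHING_MAN = [0.2, 0.8, 0.0]

results = search(QUERY_LAUGHING_MAN, IMAGE_INDEX)
print(results)
```

Because the text and the images live in the same space, no tag on the image has to contain the literal query words for it to be found.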
Compared with CLIP, VIMER-ViLP uses more Chinese data in training, so searches for distinctively Chinese nouns are more accurate. One example is the cultural relic "the first Dragon of China", photographed in a museum:
However, while a large model improves understanding of the picture itself, it is powerless when it comes to information about how the photo was taken, such as place, time and people's names.
So the next step is to bring in the photo's own information to solve the problem of "inaccurate search".
Traditional tag search requires exact shooting data, such as a specific date (year, month, day) and latitude and longitude, but the search terms users enter are often vague.
To this end, the team implemented composite queries based on semantic understanding: AI maps the input text onto the photo's shooting data, which amounts to a translation step. For example, enter "the year before last" and semantic understanding automatically returns all the photos taken in 2021.
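The "translation" from a vague phrase to exact shooting data can be sketched as follows. This is only an illustration: a fixed lookup table stands in for the team's semantic understanding model, the phrase table and photo list are hypothetical, and the reference date is assumed to fall in 2023, which is consistent with the 2021 example above.

```python
from datetime import date

# Hypothetical phrase table: offset in years relative to the current year.
# A real system would use a language model rather than a fixed table.
RELATIVE_YEARS = {
    "this year": 0,
    "last year": -1,
    "the year before last": -2,
}

def phrase_to_date_range(phrase, today):
    """Translate a vague time phrase into an inclusive (start, end) range."""
    offset = RELATIVE_YEARS.get(phrase.strip().lower())
    if offset is None:
        return None  # phrase not understood
    year = today.year + offset
    return date(year, 1, 1), date(year, 12, 31)

def filter_photos(photos, phrase, today):
    """Keep only the photos whose shooting date falls inside the range."""
    date_range = phrase_to_date_range(phrase, today)
    if date_range is None:
        return []
    start, end = date_range
    return [name for name, taken in photos if start <= taken <= end]

# Hypothetical photos as (file name, shooting date from metadata).
PHOTOS = [
    ("museum.jpg", date(2021, 5, 2)),
    ("xidan.jpg", date(2022, 10, 1)),
    ("office.jpg", date(2023, 3, 14)),
]

matches = filter_photos(PHOTOS, "the year before last", today=date(2023, 11, 24))
print(matches)
```

In a composite query, a date filter of this kind would be combined with the semantic match on image content, so that the vague time phrase narrows the candidates and the vector search ranks them.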
Even a more specific place name such as "Xidan" works, and the search can be narrowed to "photos" only, filtering out material you do not want:
With accuracy solved, the next problem is that this kind of intelligent search is "not fast" and costly.
After all, merely indexing the existing images can overwhelm a phone's computing power, not to mention the cost of indexing newly added images and of running large models at query time.
For indexing, the team therefore designed a semantic retrieval system based on device-cloud fusion. Vectors are first computed with cloud computing power, and the index is then deployed to and searched on the device locally, which both reduces on-device computation and keeps search fast.
To further cut power consumption on the device, the team also compressed and optimized the index format so that search runs over only the most "essential" data of each image.
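The split between cloud and device, together with a compressed index, might look roughly like the sketch below. The article does not describe the actual index format, so 8-bit scalar quantization is an assumption chosen purely for illustration, as are the function names, file names and vectors. Where the query vector is computed is also not stated in the source, so it is simply passed in.

```python
from array import array

SCALE = 127  # maps floats in [-1.0, 1.0] onto the signed 8-bit range

def cloud_build_index(embeddings):
    """'Cloud' side: compress each float vector to one signed byte per value."""
    return {
        name: array("b", [round(x * SCALE) for x in vec])
        for name, vec in embeddings.items()
    }

def device_search(index, query_vec, top_k=1):
    """'Device' side: rank entries by integer dot product with the query."""
    q = [round(x * SCALE) for x in query_vec]
    scored = sorted(
        ((sum(a * b for a, b in zip(q, vec)), name) for name, vec in index.items()),
        reverse=True,
    )
    return [name for _, name in scored[:top_k]]

# Hypothetical unit-length embeddings computed in the cloud.
EMBEDDINGS = {
    "beach.jpg": [0.8, 0.6, 0.0],
    "museum.jpg": [0.0, 0.6, 0.8],
    "street.jpg": [0.6, 0.0, 0.8],
}

index = cloud_build_index(EMBEDDINGS)      # built once, shipped to the phone
best = device_search(index, [0.0, 0.8, 0.6])
print(best, index["museum.jpg"].itemsize)
```

Each stored value takes one byte instead of the four or eight a float would need, and the device only performs cheap integer arithmetic, at the cost of a small loss in precision.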
On the computing side, the team also developed a scheduling system that manages CPUs, GPUs and other heterogeneous resources in a unified way, making full use of "idle" resources to process the data stored in the cloud drive.
As a result, even with 100,000 photos in your drive, a search takes milliseconds, well under a second, to find the image you want.
With these three problems solved, the final touch, the "icing on the cake", is to diversify the ways you can search.
For example, the Netdisk team has also introduced AI technologies such as search by image, OCR and video retrieval.
With search by image, you upload a picture directly, and by comparing photo content the system finds similar images in your drive or across the web:
It can even link to Baidu Baike:
OCR uses AI to recognize the information and knowledge in a picture, and even pictures full of wild punctuation are no problem:
As for video retrieval, an AI algorithm quickly selects the cover frame that best represents each video, which speeds up video search.
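The article does not say how the representative cover frame is chosen. One simple heuristic, assumed here only for illustration, is to pick the frame whose feature vector lies closest to the average of all frames, so that outliers such as a black intro frame are skipped. The two-dimensional vectors below are made up.

```python
def pick_cover_frame(frame_vectors):
    """Return the index of the frame closest to the mean of all frame vectors."""
    dims = len(frame_vectors[0])
    count = len(frame_vectors)
    mean = [sum(v[d] for v in frame_vectors) / count for d in range(dims)]

    def squared_distance(vec):
        return sum((x - m) ** 2 for x, m in zip(vec, mean))

    return min(range(count), key=lambda i: squared_distance(frame_vectors[i]))

# Hypothetical per-frame feature vectors for one short video.
FRAMES = [
    [0.0, 0.0],  # black intro frame, far from everything else
    [0.9, 0.8],
    [1.0, 1.0],
    [1.1, 0.9],
]

cover = pick_cover_frame(FRAMES)
print(cover)
```

Once a single cover frame stands in for the whole video, the video can be indexed and searched with the same image pipeline instead of processing every frame at query time.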
It is understood that the image search feature in Baidu Netdisk already covers tens of millions of users and serves more than 250 million image searches a year. Even at this scale of data, Baidu Netdisk always puts user data security and privacy protection first.
Take storage security as an example. Baidu Netdisk relies on the Baidu Cloud Computing (Yangquan) Center, with data reliability as high as 99.9999999999% (twelve 9s), which greatly improves the stability and reliability of user data. It also continues to pass the annual audits for three ISO security certifications to protect every user's data across the board.
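To get a feel for what twelve 9s means, here is a small worked calculation. It assumes the figure can be read as the probability that a single stored object survives over the rated period, and that losses are independent. The article does not define the figure this precisely, so treat the result as a rough illustration only.

```python
from fractions import Fraction

# 99.9999999999% written as an exact fraction (twelve 9s).
DURABILITY = Fraction(999_999_999_999, 10**12)

def expected_losses(num_objects, durability=DURABILITY):
    """Expected number of lost objects, assuming independent per-object loss."""
    return num_objects * (1 - durability)

loss = expected_losses(100_000)
print(loss)         # exact fraction
print(float(loss))  # same value as a float
```

Under these assumptions, a drive holding 100,000 photos would expect to lose one ten-millionth of a photo over the rated period.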
To sum up, it is precisely by adopting large models and other cutting-edge technologies that Baidu Netdisk has "evolved" its core features and come to stand out among similar apps.
But why are large models bringing change first to apps like Baidu Netdisk?
Large models are rewriting every application
In fact, it is not just Baidu Netdisk. Many applications on the market have already begun to adopt new technologies such as large models.
However, whether viewed from the angle of product and technology, of the industry, or of Baidu itself, the cloud drive was bound to be one of the first places where large models "land".
From the product's point of view, a cloud drive is an online data store for managing massive amounts of data, so like Excel and other data-processing software it is bound to face demand for more intelligent interaction.
Just as automatic chart drawing has become a must-have for Excel, "search images with one sentence" is bound to become a must-have for users of such a data store.
The arrival of large models builds a direct bridge between text and pictures, so that the cloud drive is no longer just a "hard disk" but truly becomes the user's "second brain".
From the point of view of industry trends, search itself is also set to be one of the first fields where large models land.
Online search engines at home and abroad, including Google's AI snapshot and Baidu's "AI partner", have rapidly introduced large-model capabilities.
But beyond searching external knowledge, there is also great demand for intelligent data search in internal stores such as cloud drives and in local search on phones. Whoever is first to bring intelligent search into their product will be first to improve the user experience and attract more users.
Finally, from Baidu's own point of view, when large models first took off, CEO Robin Li made a now well-known remark:
You need to redo all the applications with a large model.
The Netdisk app is one of Baidu's first and most competitive large-model products, and the changes to it go well beyond intelligent search.
In other words, intelligent image and video search is only the beginning of Baidu Netdisk's transformation. With the help of large models, the cloud drive's AI and data-processing capabilities have now been fully unlocked, turning it into an intelligent assistant for the user.
It takes the large model as its core brain and, by calling on knowledge, AI models and APIs, can already deliver personal knowledge management quickly, with multimodal creation and multi-device interconnection to follow soon.
Personal knowledge management: all-round intelligent management of cloud drive data, including search. For example, quickly summarizing an English financial report, answering questions based on information in a document, interacting with the user, and so on.
Multimodal creation: AI can rework the image, text and video content in the cloud drive, for example automatically turning pictures into video or video subtitles into text.
Multi-device interconnection: based on IoT, cloud drive content can be shared quickly across multiple smart devices, which makes transferring files very convenient.
This intelligent assistant is "Yun Yiduo", which entered internal testing not long ago. With it, finding pictures, summarizing, translating and other tasks take just one sentence.
From intelligent search to "Yun Yiduo", Baidu Netdisk, now "rewritten" by large models, has taken the lead in transforming the industry.
Interested readers can go and try it out.
Reference link:
Https://mp.weixin.qq.com/s/D1miYkH1C6MstJsqx6XwXQ