2025-02-02 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 11/24 Report --
The wave of large models reshaping everything is accelerating its rush into mobile applications.
Not long ago at MWC, Qualcomm pulled off a striking demo: running Stable Diffusion entirely on a mobile phone, generating an image in 15 seconds:
Three months later, at CVPR 2023, the parameter count grew to 1.5 billion as ControlNet made its mobile debut, generating an image in under 12 seconds:
Even more surprising is the pace. Ziad Asghar, Senior Vice President of Product Management and Head of AI at Qualcomm Technologies, revealed:
From a technical standpoint, it takes less than a month to move these 1-billion-plus-parameter models onto a phone.
And this is just the beginning.
In a conversation with QbitAI, Ziad argued:
Large models are rapidly reshaping the way people interact with machines. This will revolutionize mobile applications and how they are used.
"Large models change terminal interaction." Anyone who has seen Iron Man finds it hard not to envy Tony Stark's omnipotent assistant Jarvis.
Voice assistants are nothing new, but their current form still falls well short of the intelligent assistants of science-fiction films.
And in Ziad's view, the large model is the game changer.
Large models have the ability to truly reshape the way we interact with applications.
A concrete manifestation of this change is "all in one".
That is, with a digital assistant powered by a large model acting as the single application entry point, people can control everything on terminals such as mobile phones:
Through natural-language instructions, the digital assistant can automatically manage all the apps on your phone: handle banking, write emails, plan trips, book tickets, and so on.
More importantly, such a digital assistant can also be "privately customized":
Combining the personalized data on the phone with a large language model that understands multimodal input (text, voice, images, video) lets the digital assistant grasp the user's preferences far more accurately.
And such a personalized experience can be achieved without sacrificing privacy.
Technically, the key to moving Stable Diffusion and ControlNet onto phones is a hybrid AI architecture, supported by AI techniques such as quantization, compilation, and hardware-accelerated optimization.
Hybrid AI means that the terminal and the cloud work together, distributing AI compute workloads to the appropriate place and time in order to use computing resources more efficiently.
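As a rough sketch of how such a device/cloud split might be decided per request, here is a toy router in Python. The class, threshold, and rules are purely illustrative assumptions, not a Qualcomm API:

```python
from dataclasses import dataclass

@dataclass
class InferenceRequest:
    model_params: int        # parameter count of the model to run
    privacy_sensitive: bool  # involves personal data that should stay on-device

# Hypothetical budget: roughly the ~10-billion-parameter scale the article
# says terminal-side deployment is heading toward.
ON_DEVICE_PARAM_BUDGET = 10_000_000_000

def route(req: InferenceRequest) -> str:
    """Decide where a single inference request runs: 'device' or 'cloud'."""
    if req.privacy_sensitive:
        return "device"  # personal data never leaves the phone
    if req.model_params <= ON_DEVICE_PARAM_BUDGET:
        return "device"  # fits the terminal's compute budget
    return "cloud"       # spill oversized workloads to the cloud

route(InferenceRequest(1_000_000_000, False))   # a 1B model stays on the phone
route(InferenceRequest(70_000_000_000, False))  # a 70B model goes to the cloud
```

The point of the sketch is only the decision shape: privacy keeps work local, and the cloud serves as overflow capacity rather than the default.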
Quantization, compilation, and hardware acceleration are the key AI technologies for realizing hybrid AI, and terminal-side AI vendors such as Qualcomm have long focused on and invested in them.
Quantization converts a model's weights from floating-point numbers to integers at comparable precision, saving compute time, or compresses the model's size while preserving its performance, making it easier to deploy on a terminal.
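The float-to-integer conversion can be sketched with textbook symmetric 8-bit quantization; this is the generic scheme, not Qualcomm's particular implementation:

```python
def quantize_int8(weights):
    """Map a list of nonzero floats to (int8 values, scale factor)."""
    scale = max(abs(w) for w in weights) / 127.0  # largest weight maps to ±127
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the int8 values."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(w)   # q = [50, -127, 0, 100]
w_hat = dequantize(q, scale)  # ≈ [0.5, -1.27, 0.0, 1.0]
# Storage drops 4x versus float32, and the rounding error per weight
# is bounded by scale / 2 — which is why precision is largely preserved.
```

Real toolchains add per-channel scales, calibration data, and quantization-aware training, but the storage and accuracy trade-off is the same.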
The compiler is key to running an AI model efficiently, at the highest performance and lowest power. An AI compiler converts the input neural network into code that runs on the target hardware, while optimizing for latency, performance, and power consumption.
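One kind of optimization such a compiler performs can be illustrated with a toy graph pass that fuses consecutive elementwise operations, cutting kernel launches and memory passes. The op representation here is invented for the example; real AI compilers work on far richer graphs:

```python
def fuse_scales(graph):
    """Fold consecutive ('scale', k) ops into one (a toy fusion pass)."""
    fused = []
    for op, arg in graph:
        if op == "scale" and fused and fused[-1][0] == "scale":
            # Two back-to-back multiplies become one: (x * a) * b == x * (a * b)
            fused[-1] = ("scale", fused[-1][1] * arg)
        else:
            fused.append((op, arg))
    return fused

graph = [("matmul", "W1"), ("scale", 2.0), ("scale", 0.5), ("relu", None)]
fuse_scales(graph)  # -> [("matmul", "W1"), ("scale", 1.0), ("relu", None)]
```

Each fused op saved is one less trip through memory, which on a phone translates directly into lower latency and power.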
As for hardware acceleration, take Qualcomm as an example: the core Hexagon processor in its AI Engine uses a dedicated power system and supports micro-tile inferencing, INT4 precision, Transformer network acceleration, and more, delivering higher performance while reducing energy and memory consumption.
According to Qualcomm's data, Transformer acceleration greatly speeds up inference for the multi-head attention mechanism used heavily in generative AI, yielding a 4.35x AI performance improvement in specific use cases with MobileBERT.
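Per head, the multi-head attention being accelerated reduces to the scaled dot-product computation below, which is dominated by two matrix multiplies — precisely the workload that low-precision formats like INT4 and dedicated Transformer paths speed up. A bare-bones, single-head sketch in pure Python:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def attention(Q, K, V):
    """Single-head scaled dot-product attention on row-major lists.

    Q: n x d queries, K: m x d keys, V: m x dv values. The two matrix
    products here are what Transformer hardware acceleration targets.
    """
    d = len(K[0])
    # scores[i][j] = <Q[i], K[j]> / sqrt(d)   (the Q @ K^T matmul)
    scores = [[sum(q * k for q, k in zip(qr, kr)) / math.sqrt(d) for kr in K]
              for qr in Q]
    weights = [softmax(row) for row in scores]
    # out = weights @ V   (the second matmul)
    return [[sum(w * v for w, v in zip(wr, col)) for col in zip(*V)]
            for wr in weights]

# A query aligned with the first key attends almost entirely to V[0]:
out = attention([[10.0, 0.0]], [[10.0, 0.0], [0.0, 10.0]], [[1.0, 0.0], [0.0, 1.0]])
```

A production model runs this for many heads over long sequences, which is why accelerating the matmuls dominates end-to-end speed.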
Take Stable Diffusion as an example: through quantization, compilation, and hardware-acceleration optimizations, Qualcomm researchers can now run the model on phones powered by the second-generation Snapdragon 8 mobile platform, generating a 512 × 512 pixel image in 15 seconds over 20 inference steps.
△ Source: YouTube @TK Bay. The entire inference process runs on the phone itself, even in flight mode with no network connection.
Deploying such AI technology is not easy; Ziad says Qualcomm spent two to three years preparing the software, tools, and hardware.
But now, with hardware and software tools such as the Qualcomm AI Model Efficiency Toolkit, the Qualcomm AI Stack, and the Qualcomm AI Engine in place, Qualcomm got Stable Diffusion running fast on the Snapdragon platform in under a month, as mentioned earlier.
In other words, once the underlying technology is ready, deploying generative AI, large models included, becomes much easier, and the once-unimaginable "large model on the terminal as a digital assistant" is no longer out of reach.
Specifically, under this "dual" architecture of hybrid AI plus hardware and software AI technologies, large models deployed on terminals such as phones can continuously optimize and update a user profile on the device according to the user's habits, enhancing and customizing generative AI prompts. These prompts are processed on the terminal side, offloading tasks to the cloud only when necessary.
Ziad further explained to us:
The cloud doesn't know you, but the terminal device does. If the model can be fine-tuned on the device, it will be very powerful.
This is also one way to break through large models' hallucination and memory bottlenecks: through a series of technologies, Qualcomm can draw on terminal-side data to provide long-term "exclusive" service without a network connection, while protecting user privacy.
Notably, Ziad also revealed that beyond Stable Diffusion and ControlNet, researchers are migrating more generative AI models to phones on the strength of Qualcomm's full-stack hardware and software capabilities, with parameter counts heading toward the 10-billion level.
Soon you will see models like LLaMA 7B / 13B on the terminal. All the tools are in place; it's only a matter of time.
Moreover, although only "specific" large models can currently be deployed on the terminal side, as the technology matures and sees wider use, the number of deployable large models, their modality types, and their deployment forms will all evolve rapidly. As Ziad puts it:
As more and better AI algorithms are open-sourced, we can use this same hardware and software stack to deploy them on the terminal side even faster, including multimodal AI such as text-to-video.
Seen this way, it is not far-fetched that future users will migrate whichever large model they want onto mobile and make it the core of a super assistant.
Large models are reshaping the mobile internet. In fact, the change in interaction on phones is just the tip of the iceberg.
Long before the explosion of generative AI and large-model technology, demand for AI in the mobile-internet era was already shifting toward edge devices.
As Ziad puts it, "terminal-side AI is the future of AI." As the wave of generative AI, represented by large models, accelerates the change in human-computer interaction, more terminals, such as laptops, AR / VR headsets, cars, and IoT devices, will be reshaped by it, further accelerating AI's large-scale adoption.
In this process, new measurement standards will emerge on the hardware side, and on the software side super AI applications with large models at their core become more likely to appear.
First, hardware. Because terminal-side compute will become indispensable for scaling generative AI applications, AI processing capability will grow ever more prominent for mobile chips, even becoming one of the new design benchmarks.
As large models become more popular and more applications continue to access their capabilities, more potential users will realize the advantages of large models, leading to a rapid increase in the use of such technologies.
But cloud compute is finite, after all. Ziad believes:
As demand for AI compute keeps growing, cloud compute will be unable to carry such a massive workload, and the cost of a single query will rise sharply.
To solve this, more compute demand should spill over to the terminal, with terminal compute relieving the pressure.
Letting more large models be processed, or even run, on the terminal to cut invocation costs requires improving mobile chips' AI processing capability while preserving the user experience.
In the long run, AI processing capability will become a benchmark of hardware capability, just as phone chips once competed on general compute and ISP imaging, becoming a new "competitive focus" for mobile chips as a whole.
Whoever accounts for this when designing a mobile chip is more likely to have a say in this large-model contest.
It's not just hardware. On the software side, by changing human-computer interaction, large models will reshape all mobile applications, including entertainment, content creation, and productivity.
As this happens, more and more large models, that is, generative AI, will take part in reshaping different mobile AI applications, varying with each device's compute and usage scenarios:
On smartphones, as mentioned earlier, this reshaping will appear first in search and "smart assistants". For example, given the instruction "schedule a meeting for five people", a large model can turn what used to be a back-and-forth of confirmation emails into a single command sent automatically to everyone's calendar.
On laptops and PCs, the biggest impact may be productivity: using Office may no longer require typing everything out; instead, you chat your way to the report you want to write or the PPT you need to put together.
On the car side, digital assistants and autonomous-driving software may be affected first. For example, when using navigation you no longer need to tap in a destination; you simply say, "I want to go to XX; find a place to eat along the way, not too expensive", and the large model understands and plans the route automatically.
In XR, the more compelling reshaping may lie in 3D content creation and immersive experiences, while in the Internet of Things the change may come in operational efficiency and customer-support applications.
Of course, this does not mean small AI models will "disappear". Before large models emerged, imaging was the most prominent field for on-device AI, with many mature applications such as AI photo retouching and denoising algorithms for low-light video.
Ziad believes generative AI will not replace existing AI applications; on the contrary, spurred by it, upgrades to CPUs, GPUs, and AI processors will further strengthen traditional AI algorithms such as denoising.
At the same time, mobile applications are not "islands". Whether on smartphones, computers, cars, IoT devices, or XR, once large models give rise to a truly "killer" application, it is bound to be deployed across devices.
Therefore, in this wave of large models, quickly adapting an application to different mobile terminals, "develop once, run across interconnected terminals", is also an indispensable technical trend.
In short, from hardware chip design, to software applications, to the overall development model, large models are bringing change to mobile and to the entire mobile internet.
So what role will Qualcomm play in this wave of large-model change?
According to Ziad, Qualcomm will lead at the technology frontier and sit at the heart of this change:
On the terminal side, Qualcomm leads in both hardware and software, and not only in phones but also in computers, AR, VR, cars, IoT, and other fields.
The source of this confidence is Qualcomm's long accumulation of AI technology: "all the tools are in place".
Whether it's the Hexagon processor on the hardware side, the hybrid AI that lets generative AI move "seamlessly" between cloud and terminal, or software technologies such as quantization, compression, neural architecture search (NAS), and compilation, Qualcomm already has the technical reserves to bring large models to the terminal side at any time.
Once a large model has been deployed on one terminal, such as a smartphone, it can be quickly deployed to all other terminal-side devices through the Qualcomm AI Stack, further accelerating large models' path to scale.
The 1-billion-parameter Stable Diffusion model, for example, was deployed on phones and has since been made to run on laptops powered by the Snapdragon compute platform.
Facing the opportunities and challenges generative AI brings in this wave of large models, many technology companies are searching for ways to keep up with the technology.
At least on the terminal side, Qualcomm, as a technology player, has taken an early lead in the industry.
One More Thing: In this wave of generative AI enthusiasm, could large models give rise to a new "killer" application on the order of WeChat? What does Qualcomm make of that view?
Ziad replied that they might, and that such "killer" apps are more likely to emerge first in China:
Judging by current trends, such applications are indeed likely to emerge more quickly in China.
© 2024 shulou.com SLNews company. All rights reserved.