2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
After ChatGPT, music may become the frontier of AI content generation.
On January 27, local time, Google released a new AI model, MusicLM, which generates high-fidelity music directly from text.
Following the text generation model Wordcraft and the video generation tool Imagen Video, this is another generative AI tool from Google, this time aimed at music.
MusicLM is one more sign that generative AI has boomed over the past two years.
01. MusicLM takes on more complex scenarios
Google's latest AI model, MusicLM, automatically generates music from text descriptions, including descriptions of images such as paintings, and covers a wide variety of musical styles. Almost any music you want to hear can be generated automatically.
MusicLM is not the first AI model to generate music automatically. Riffusion, a tool that works on visual representations of sound, can also create music, as can Dance Diffusion. OpenAI, the developer of the popular chatbot ChatGPT, has released Jukebox.
It is worth noting, however, that these earlier systems were constrained by technology, data, and other factors, so the music they create is relatively simple.
Unlike its predecessors, MusicLM can create particularly complex, high-fidelity music, and it can also generate music from descriptions of images. This is a new breakthrough: the model can not only identify instruments and blend musical genres, but also generate music from more abstract concepts.
For example, if you want a soundtrack for an arcade game, just enter a description such as "the main soundtrack of an arcade game, fast-paced and upbeat," and MusicLM generates the music automatically. MusicLM can also generate music inspired by images, such as the world-famous paintings "The Scream", "Guernica", and "The Starry Night".
However, Google has so far published only the research results. Citing copyright and other concerns, it has not yet opened MusicLM to the public.
02. Why is it difficult for AI to generate music?
Last October, Google introduced AudioLM, a generative AI model that takes a short audio clip as a prompt and produces audio in a similar style. At that time AudioLM was a pure audio model. Its approach resembles that of a language model: it works out for itself how to continue the prompt with similar content, based on the sound it is given.
From this point of view, AudioLM can be regarded as the predecessor of MusicLM. AudioLM can imitate the timbre, loudness, and clarity of audio without transcription or labeling. The audio it generates is hard to tell apart from the original, but the model has not been released for public use.
Creating music with an AI model is not easy. Generated music spans several dimensions, including audio signals, ambient sound, and the human voice, and it emerges from the interaction of many signals. Even setting aside differences in loudness and timbre, every sound a person makes has its own syntax and rhythm. The whole is a very complex system.
For these reasons, early automatically generated audio showed obvious signs of synthesis: it sounded unnatural, and the pronunciation was off. To generate audio that is truly convincing, an AI model has to be trained on massive amounts of data, which is an essential first step.
To address these challenges, MusicLM, the "upgraded version" of AudioLM, uses an even larger training set. Google reportedly trained MusicLM on a dataset of 280,000 hours of music, giving it a basis for understanding deep and complex musical scenes.
In addition, because evaluation data for this task is scarce, Google introduced MusicCaps, a dataset for evaluating text-to-music generation.
03. Generative AI takes off
Google's launch of MusicLM can be seen as a footnote to the expansion of AI applications. Behind it is the boom in generative AI, which has been the hottest topic in the field over the past two years.
OpenAI released the epoch-making DALL-E in 2021 and DALL-E 2 in 2022, a leap forward in text-to-image generation. Last year, Meta released Make-A-Video, an AI model that generates short videos from text, and Google released its own short-video generation models, Imagen Video and Phenaki.
Generative AI applications are plentiful not only abroad but also in China. For example, ByteDance's video-editing app Jianying can automatically generate matching video footage from text. Early last year, NetEase launched NetEase Tianyin, a one-stop AI music creation platform that uses AI to turn user-edited content into songs.
As you can see, generative AI is being applied in more and more scenarios. Writing, painting, video editing, and more can all be done with AI. Seeing these broad prospects, Google, Microsoft, Meta, and other giants have stepped up research and development and integrated generative AI into their products, which has accelerated the boom.
In fact, generative AI did not start developing rapidly only in the past two years. Because the technical threshold is so high, its cutting-edge developments circulated mainly within a small part of the technology community. Only when AI painting, AI writing, and similar applications repeatedly broke into the mainstream did generative AI gain wider attention.
The boom in generative AI was bound to happen. Big data applications and algorithms are maturing, and model tooling keeps improving, which speeds up the iteration of generative AI applications. The boom is already under way, and there is still great potential for future development. According to Gartner, generative AI is expected to account for 10 percent of all generated data by 2025, compared with less than 1 percent today.
Of course, any technology is a "double-edged sword". Generative AI faces challenges such as copyright issues, as well as various losses caused by AI-generated "errors". For now, human intervention remains indispensable. In the long run, however, there is broad consensus that generative AI has huge development potential.
This article is from the WeChat official account New Research (ID: chuxinyanjiu); author: Feifei.