ModelScope community launches Live Portait, an online AI video generation tool that makes a photo speak with one click

2025-02-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)11/24 Report--

Aliyun has made new progress in generative AI. On August 16, Aliyun launched Live Portait, a digital human video generation tool that produces a video of a talking digital human from an uploaded photo plus a piece of text or audio. It can be used in live streaming, chatbots, corporate marketing, and other scenarios. The tool is now available to try as a public demo space in the ModelScope community.

Since large dialogue models and AI painting models became popular, industry research on generative AI has expanded to more modalities, and AI video generation is one of the hot technologies. It converts text or audio into facial motion information and then uses that motion to animate the person in a photo, which lowers the barrier to video shooting and production.

Live Portait consists of a motion module and a generation module. It uses a mouth-shape prediction algorithm developed by Aliyun, whose lip-sync accuracy is greatly improved compared with traditional methods. Explicit pose control is added in the training phase, so videos with arbitrary movements can be generated without a base template video, which makes the digital human's speech much more realistic. In addition, active eye control technology lets Live Portait add natural eye movements, bringing the results closer to a real person. According to reports, technology related to Live Portait has been accepted at CVPR, ICCV, and other international AI conferences.

According to information on the ModelScope community, after uploading a photo to Live Portait, users can choose between text-driven and audio-driven modes. In text-driven mode, the tool offers 28 voices, including Mandarin, English, Cantonese, and children's voices. Live Portait also offers a lightweight model option that generates videos more quickly.

Zhang Bang, the tool's algorithm lead, said: "Live Portait integrates a number of innovative technologies developed by the team, such as generating realistic facial animation from only a single picture, breaking through the limitations of traditional generative adversarial networks. As the technology iterates further, image-to-video generation has huge application potential and is expected to become a production tool that helps enterprises reduce costs and increase efficiency."

The team's research reportedly covers digital humans, AI generation of 3D models, highly photorealistic rendering, natural human-computer interaction, and other fields, and it has published more than 50 papers at top international conferences.
