Text-to-image is already old news: Meta launches a text-to-video AI system

2026-02-09 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 11/24 Report

CTOnews.com, Oct. 2: Meta recently released an artificial intelligence system that generates short videos from text prompts.

CTOnews.com learned that the system, called Make-A-Video, lets users type in a short description, such as "a dog flying in the sky in a superhero costume and a red cape," and generates a five-second video from it.

Although the results are still quite rough, the system is a clear step beyond text-to-image AI systems.

Last month, artificial intelligence lab OpenAI made its latest text-to-image AI system, DALL-E, available to everyone, while artificial intelligence startup Stability.AI launched Stable Diffusion, an open source text-to-image system.

But text-to-video AI systems come with bigger challenges. First, these models require a great deal of computing power. Large text-to-image models are already trained on millions of images, and text-to-video models are more computationally demanding still, because hundreds of images are needed to piece together even one short video. This means that only large technology companies will be able to build these systems for the foreseeable future. Training them is also tricky because there are no large-scale, high-quality data sets of videos paired with text.

To solve this problem, Meta combined data from three open source image and video data sets to train its model. Labeled still images from standard text-to-image data sets help the model learn the names of objects and what they look like. A video data set helps it learn how those objects should move in the world. Combining the two approaches helps Make-A-Video generate video from text at scale.

Meta says the technology can "open up new opportunities for creators and artists." However, as the technology develops, people worry that it could become a powerful tool for creating and spreading misinformation and deepfakes, and that it may make it harder to distinguish real content from fake content on the Internet.

The researchers who created Make-A-Video filtered out offensive images and text, but it is almost impossible to completely remove biased and harmful content from data sets containing many millions of words and pictures.

A spokesman for Meta said the model is not yet available to the public. "As part of this study, we will continue to explore ways to further improve and reduce potential risks," the spokesman said.
