
Wayve shows its GAIA-1 world model for autonomous driving, claiming it can "see the future" by predicting events

2025-01-19 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)11/24 Report--

CTOnews.com, October 9 (Xinhua)-- British AI startup Wayve has announced the latest progress on its GAIA-1 generative model. In June this year, Wayve established a proof of concept for using a generative model in autonomous driving. Over the past few months, Wayve has scaled GAIA-1 up to 9 billion parameters so that it can generate realistic driving scenes showing how an autonomous driving system "reacts in a variety of situations" and can better predict future events.

▲ Image source: Wayve

GAIA-1 is a world model that can use different types of input, including video, text, and actions, to create realistic driving scenes. Both the behavior of the self-driving vehicle and the characteristics of the scene can be finely controlled, and because GAIA-1 is multimodal, videos can be generated from a variety of prompt types and combinations of them.

▲ Image source: Wayve

Wayve said that GAIA-1 learns a structured understanding of the environment, which helps an autonomous driving system make informed decisions. "Predicting future events" is a key capability of the model: accurate prediction lets a self-driving vehicle anticipate upcoming events, plan its actions accordingly, and improve safety and efficiency on the road.

Reportedly, GAIA-1 first uses specialized encoders to encode each form of input, such as video or text, into a shared representation, and then achieves unified temporal alignment and context understanding within the model. This encoding approach lets the model better integrate and understand different types of input.
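The idea of a shared representation can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the 4-dimensional embedding, the (speed, curvature) action format, and the trivial encoders are hypothetical and are not Wayve's implementation. The only point shown is that each modality gets its own encoder, yet all of them emit vectors of the same size that can sit in one sequence.

```python
from typing import List, Tuple

EMBED_DIM = 4  # hypothetical shared embedding size
Vector = List[float]


def encode_text(words: List[str]) -> List[Vector]:
    # Toy text encoder: derive a fixed-size vector from each word's length.
    return [[float((len(w) + i) % 7) for i in range(EMBED_DIM)] for w in words]


def encode_video(frames: List[List[int]]) -> List[Vector]:
    # Toy video encoder: summarize each frame by its mean pixel intensity.
    return [[sum(f) / len(f)] * EMBED_DIM for f in frames]


def encode_actions(actions: List[Tuple[float, float]]) -> List[Vector]:
    # Toy action encoder: (speed, curvature) zero-padded to EMBED_DIM.
    return [[float(s), float(c)] + [0.0] * (EMBED_DIM - 2) for s, c in actions]


def to_shared_sequence(
    words: List[str],
    frames: List[List[int]],
    actions: List[Tuple[float, float]],
) -> List[Vector]:
    # Every modality ends up as vectors of the same size in a single sequence.
    sequence = encode_text(words) + encode_video(frames) + encode_actions(actions)
    assert all(len(v) == EMBED_DIM for v in sequence)
    return sequence


shared = to_shared_sequence(["turn", "left"], [[0, 10], [20, 30]], [(5.0, 0.1)])
print(len(shared))  # number of vectors in the shared sequence
```

A real system would learn these encoders rather than hand-code them; the sketch only shows the interface that a shared representation implies.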

▲ Image source: Wayve

The core of GAIA-1 is an autoregressive Transformer that predicts the next set of image tokens in the sequence. The world model considers not only the past image tokens but also the context provided by the text and action tokens. As a result, the generated image tokens are not only visually coherent but also consistent with the intended text and action guidance.
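Autoregressive prediction of this kind can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not GAIA-1's code: `toy_next_token` is a hypothetical deterministic stand-in for the Transformer, tokens are plain integers, and the way text, action, and image tokens are concatenated into one context is a simplification chosen for the example.

```python
from typing import Callable, List, Sequence

Token = int


def toy_next_token(context: Sequence[Token], vocab_size: int = 16) -> Token:
    # Hypothetical stand-in for the Transformer: any function of the full context.
    return (sum(context) + len(context)) % vocab_size


def generate_image_tokens(
    text_tokens: Sequence[Token],
    action_tokens: Sequence[Token],
    past_image_tokens: Sequence[Token],
    num_new_tokens: int,
    predict: Callable[[Sequence[Token]], Token] = toy_next_token,
) -> List[Token]:
    # The context holds the text and action conditioning plus past image tokens.
    context = list(text_tokens) + list(action_tokens) + list(past_image_tokens)
    generated: List[Token] = []
    for _ in range(num_new_tokens):
        next_token = predict(context)  # predict one token from everything so far
        generated.append(next_token)
        context.append(next_token)     # feed the prediction back in (autoregression)
    return generated


print(generate_image_tokens([1, 2], [3], [4, 5], 3))  # prints [4, 9, 3]
```

Because every prediction depends on the whole context, changing the text or action tokens changes the generated image tokens, which is the sense in which the output stays consistent with the guidance.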

The model then invokes the video decoder, whose main function is to convert these image tokens back into pixel space. The video decoder is a diffusion model, and its role is to ensure that the generated video is semantically meaningful, visually accurate, and temporally consistent.

▲ Image source: Wayve

CTOnews.com learned from the official website that GAIA-1's world model contains 6.5 billion parameters and was trained for 15 days on 64 Nvidia A100 GPUs, while the video decoder has 2.6 billion parameters and was trained for 15 days on 32 Nvidia A100 GPUs.
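The figures above can be checked with a little arithmetic. The sketch assumes that the "9 billion parameters" quoted earlier refers to the world model and video decoder combined, and that "GPU-days" (number of GPUs multiplied by training days) is a fair way to summarize the training cost; both are readings of the reported numbers, not statements from Wayve.

```python
WORLD_MODEL_PARAMS = 6_500_000_000    # reported world model size
VIDEO_DECODER_PARAMS = 2_600_000_000  # reported video decoder size


def gpu_days(num_gpus: int, days: int) -> int:
    # Training cost expressed as GPUs multiplied by days.
    return num_gpus * days


total_params = WORLD_MODEL_PARAMS + VIDEO_DECODER_PARAMS
world_model_gpu_days = gpu_days(64, 15)
video_decoder_gpu_days = gpu_days(32, 15)

print(total_params / 1e9)                             # prints 9.1
print(world_model_gpu_days, video_decoder_gpu_days)   # prints 960 480
```

The two components add up to 9.1 billion parameters, which is consistent with the rounded 9 billion figure, and the two training runs together amount to 1,440 A100 GPU-days.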

The main value of GAIA-1 is that it brings the concept of a generative world model to autonomous driving. By combining video, text, and action inputs, it demonstrates the potential of multimodal learning for creating diverse driving scenarios. By integrating the world model with the driving model, it lets the driving model better understand its own decisions and generalize to real-world situations, thereby improving the capability of the autonomous driving system.
