2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
CTOnews.com, May 30: A team at Meta recently developed an AI model called Megabyte to compete with the Transformer. Megabyte is said to solve problems faced by Transformer models and to be 40% faster.
▲ Image source: arXiv
At present, the Transformer is very popular in natural language processing and other fields. According to the report, however, it processes sequence data step by step and cannot be parallelized, so training is slow. It also has difficulty with long sequences, because gradients easily vanish or explode during backpropagation. In addition, because historical information must be retained at every step, memory consumption is high.
The Megabyte model divides the input and output sequences into patches rather than individual tokens. For most tasks this architecture makes byte-level prediction relatively easy, for example completing a word from its first few characters. As a result, the large network can work with fewer positions than there are characters, which improves efficiency, while predictions inside a patch can be made by a smaller model. In this way, Megabyte addresses the challenges of training speed, reliability and hardware utilization faced by today's AI models.
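The patch idea can be illustrated with a minimal Python sketch. This is not Meta's implementation: the function name `split_into_patches`, the fixed `patch_size`, and the zero-byte padding are all assumptions made for illustration. It only shows how a byte sequence shrinks to a shorter sequence of patches, which is the sequence a large model would see.

```python
def split_into_patches(data: bytes, patch_size: int, pad_byte: int = 0) -> list[bytes]:
    """Split a byte sequence into fixed-size patches.

    Hypothetical helper for illustration only. The last patch is padded
    with pad_byte so that every patch has exactly patch_size bytes.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    remainder = len(data) % patch_size
    if remainder:
        data = data + bytes([pad_byte]) * (patch_size - remainder)
    return [data[i:i + patch_size] for i in range(0, len(data), patch_size)]


text = "Megabyte predicts bytes".encode("utf-8")
patches = split_into_patches(text, patch_size=4)

# The large model would attend over len(patches) positions instead of
# len(text); a smaller model would predict the bytes inside each patch.
print(len(text), "bytes ->", len(patches), "patches")
```

With a patch size of 4, the sequence the large model handles is roughly a quarter of the original length; the right patch size in practice is a tuning choice that this sketch does not address.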
▲ Image source: arXiv
In addition, in terms of computational efficiency, the Megabyte model uses fewer tokens than a Transformer or Linear Transformer of equal size, for a fixed model size and range of sequence lengths. Therefore, compared with the Transformer, Megabyte can train a richer, larger and better-performing model at the same computing cost.
The Meta team has released a paper on the Megabyte model, which interested CTOnews.com readers can consult.