The whole world is racing to build large models, and Mark Zuckerberg is feeling the pressure. Meta is now making big bets on custom chips and supercomputing to advance its AI efforts.
Meta finally has a fully self-developed chip of its own!
On Thursday, Meta unveiled MTIA v1, its first-generation custom chip for AI inference, along with an update on its AI supercomputing plans.
MTIA could hardly come at a better time for Meta: with everyone racing to build large models, demand for AI compute keeps climbing.
Zuckerberg said recently that Meta sees "an opportunity to introduce AI agents to billions of people in ways that will be useful and meaningful."
Obviously, as Meta pours more investment into AI, the MTIA chip and its supercomputing plans will be key tools in its contest with the other tech giants, none of which is stinting on AI.
With custom silicon and supercomputing, Meta has placed a big bet on AI.
At a recent online event, Meta pulled back the curtain on the development of its own AI infrastructure.
The new chip's full name is Meta Training and Inference Accelerator, or MTIA for short.
MTIA is an ASIC (application-specific integrated circuit), a chip that integrates many circuits on a single die and can be programmed to perform one or more tasks in parallel.
Santosh Janardhan, vice president and head of infrastructure at Meta, wrote in a blog post that MTIA is Meta's "in-house, customized accelerator chip family for inference workloads" that provides "greater compute power and efficiency" than CPUs and is "customized for our internal workloads."
By combining MTIA chips with GPUs, Janardhan wrote, Meta believes it "will deliver better performance, decreased latency, and greater efficiency for each workload."
This is, admittedly, a show of strength from Meta. In fact, Meta had been slow to adopt AI-friendly hardware systems, which hampered its ability to keep pace with competitors such as Microsoft and Google.
Alexis Bjorlin, vice president of infrastructure at Meta, said in an interview that building its own hardware gives Meta control over every layer of the stack, from data-center design up to the training frameworks.
That level of vertical integration is essential for pushing the boundaries of AI research at scale.
Over the past decade, Meta has spent billions of dollars hiring top data scientists to build new AI models.
Meta has also been trying to move more of its ambitious AI research into production, especially generative AI.
Until 2022, Meta ran its AI workloads mainly on a combination of CPUs and custom chips designed to accelerate AI algorithms.
That combination is usually less efficient than GPUs at such tasks.
So Meta scrapped a custom chip it had planned to roll out at scale in 2022 and instead ordered billions of dollars' worth of Nvidia GPUs.
Bringing in those GPUs forced a disruptive redesign of several of Meta's data centers.
To turn the situation around, Meta plans an in-house chip, expected to launch in 2025, that can both train and run AI models.
Now the protagonist has finally arrived: the new chip is called MTIA, full name Meta Training and Inference Accelerator.
The chip accelerates both AI training and inference.
According to the research team, MTIA is an ASIC, a chip that integrates many circuits on a single die and can be programmed to carry out one or more tasks simultaneously.
It is, in short, an AI chip tailor-made for Meta's own AI workloads; and as everyone knows, chips are now the front line of competition among the tech giants.
For example, Google's TPUs were used to train PaLM 2 and Imagen, and Amazon likewise has its own chips for training AI models.
Microsoft is also reportedly developing a chip called Athena with AMD.
The arrival of MTIA signals that Meta has no intention of being left behind.
Meta says it created the first generation of MTIA, MTIA v1, in 2020, built on a 7nm process.
The chip's memory scales from 128MB of on-chip SRAM up to 128GB of off-chip DRAM. In benchmarks designed by Meta, MTIA handled low- and medium-complexity AI models more efficiently than a GPU.
There is still plenty of work to do on the chip's memory and networking. As AI models keep growing, MTIA will hit bottlenecks, and Meta will need to shard workloads across multiple chips.
In response, Meta says it will keep improving MTIA's performance per watt on recommendation workloads.
As early as 2020, Meta designed the first generation of MTIA ASIC for internal workloads.
This inference accelerator is part of a co-designed full-stack solution spanning the silicon, PyTorch, and the recommendation models.
The accelerator is fabricated on TSMC's 7nm process and runs at 800MHz. It delivers 102.4 TOPS at INT8 precision and 51.2 TFLOPS at FP16 precision, within a thermal design power (TDP) of 25W.
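A quick back-of-the-envelope check of those figures, in Python, shows what they imply for the performance-per-watt metric Meta says it is optimizing; the throughput and TDP numbers come from the article, while the per-watt values and the 2:1 ratio are derived here:

```python
# Derived efficiency figures from the quoted MTIA v1 specs.
int8_tops = 102.4      # peak INT8 throughput, TOPS (from the article)
fp16_tflops = 51.2     # peak FP16 throughput, TFLOPS (from the article)
tdp_watts = 25         # thermal design power, W (from the article)

print(f"INT8 efficiency: {int8_tops / tdp_watts:.2f} TOPS/W")      # ~4.10 TOPS/W
print(f"FP16 efficiency: {fp16_tflops / tdp_watts:.2f} TFLOPS/W")  # ~2.05 TFLOPS/W
# INT8 throughput is exactly 2x FP16, the usual ratio when the same
# MAC units process two 8-bit operands in place of one 16-bit operand.
print(f"INT8:FP16 ratio: {int8_tops / fp16_tflops:.0f}:1")
```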
At a high level, the accelerator consists of processing elements (PEs), on-chip and off-chip memory resources, and interconnect. It also carries a dedicated control subsystem that runs the system firmware, which manages the available compute and memory resources, communicates with the host over a dedicated host interface, and coordinates job execution on the accelerator.
The memory subsystem uses LPDDR5 for off-chip DRAM, which can scale up to 128GB.
The chip also has 128MB of on-chip SRAM, shared among all the PEs, which provides higher bandwidth and lower latency for frequently accessed data and instructions.
The grid consists of 64 PEs arranged in an 8x8 configuration, connected to one another and to the memory blocks over a mesh network. The grid can run a whole job by itself, or be divided into subgrids that run independent jobs. Each MTIA accelerator sits on a compact dual M.2 board, making it easy to aggregate into servers. The boards connect to the host CPU over PCIe Gen4 x8 links and draw as little as 35W.
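To make the partitioning idea concrete, here is a minimal illustrative sketch. Meta has not published the actual partitioning scheme, so the rectangular-subgrid split below is an assumption purely for illustration; it only shows how 64 PEs could be carved into independent regions:

```python
import numpy as np

GRID = 8
pe_ids = np.arange(GRID * GRID).reshape(GRID, GRID)  # 64 PEs, ids 0..63

def subgrids(rows, cols):
    """Split the 8x8 grid into rectangular (rows x cols) subgrids."""
    assert GRID % rows == 0 and GRID % cols == 0
    return [pe_ids[r:r + rows, c:c + cols]
            for r in range(0, GRID, rows)
            for c in range(0, GRID, cols)]

# One job on the whole grid, or four independent jobs on 4x4 subgrids.
whole = subgrids(8, 8)   # 1 partition of 64 PEs
quads = subgrids(4, 4)   # 4 partitions of 16 PEs each
print(len(whole), len(quads))  # -> 1 4
```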
The MTIA software (SW) stack is designed to give developers efficiency and high performance. It is fully integrated with PyTorch; using PyTorch with MTIA is as easy as using PyTorch on a CPU or GPU.
The PyTorch runtime for MTIA manages execution on the device and provides features such as MTIA tensors, memory management, and the APIs for scheduling operators onto the accelerator.
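A hypothetical sketch of what "as easy as CPU or GPU" could look like in practice. The "mtia" device string and the availability check are assumptions for illustration; the article does not document the actual public API, and the code falls back to CPU so it runs anywhere:

```python
import torch

# Assumed device string "mtia"; fall back to CPU if the backend is absent.
use_mtia = hasattr(torch, "mtia") and torch.mtia.is_available()
device = torch.device("mtia" if use_mtia else "cpu")

model = torch.nn.Linear(256, 64).to(device)  # move the model, as with CUDA
x = torch.randn(32, 256, device=device)      # allocate tensors on-device
y = model(x)                                 # the runtime schedules the ops
print(y.shape)  # torch.Size([32, 64])
```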
The MTIA software stack offers several ways to author compute kernels for the accelerator, including PyTorch, C/C++ (for hand-tuned, highly optimized kernels), and a new domain-specific language called KNYFE.
Meta evaluated MTIA on five DLRMs (deep learning recommendation models), ranging from low to high complexity, that represent its production workloads.
The evaluation found that MTIA handles low-complexity (LC1 and LC2) and medium-complexity (MC1 and MC2) models more efficiently than the NNPI and GPU alternatives. The researchers also acknowledge that MTIA has not yet been optimized for high-complexity (HC) models. Even so, the chip seems to have a long road ahead: according to media reports, it will not ship until 2025.
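For readers unfamiliar with the workload class, here is a minimal sketch of the DLRM pattern: embedding lookups for sparse categorical features, an MLP for dense features, then pairwise feature interaction feeding a top MLP. All sizes are made up for illustration and are not Meta's models:

```python
import torch
import torch.nn as nn

class TinyDLRM(nn.Module):
    """Toy DLRM: sparse embeddings + dense MLP + dot-product interaction."""
    def __init__(self, num_categories=(1000, 1000, 1000), dim=16, dense_in=13):
        super().__init__()
        self.embs = nn.ModuleList(nn.Embedding(n, dim) for n in num_categories)
        self.bottom = nn.Sequential(nn.Linear(dense_in, dim), nn.ReLU())
        n_feats = len(num_categories) + 1        # sparse vectors + dense vector
        n_pairs = n_feats * (n_feats - 1) // 2   # pairwise dot products
        self.top = nn.Sequential(nn.Linear(n_pairs + dim, 1), nn.Sigmoid())

    def forward(self, dense, sparse):
        feats = [self.bottom(dense)] + [e(s) for e, s in zip(self.embs, sparse.T)]
        stack = torch.stack(feats, dim=1)                # (B, n_feats, dim)
        inter = torch.bmm(stack, stack.transpose(1, 2))  # pairwise dot products
        i, j = torch.triu_indices(stack.size(1), stack.size(1), offset=1)
        return self.top(torch.cat([feats[0], inter[:, i, j]], dim=1))

model = TinyDLRM()
ctr = model(torch.randn(4, 13), torch.randint(0, 1000, (4, 3)))
print(ctr.shape)  # torch.Size([4, 1]) -- predicted click probabilities
```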
RSC
Maybe one day, Meta will be able to hand most of the work of training and running AI over to MTIA.
For now, though, it still has to lean on its own supercomputer: the Research SuperCluster, RSC for short.
RSC made its debut in January 2022 and has completed the second phase of construction in partnership with Penguin Computing, Nvidia and Pure Storage.
Today, RSC comprises 2,000 Nvidia DGX A100 systems totaling 16,000 Nvidia A100 GPUs.
Running flat out, it delivers nearly 5 exaflops of compute (an exaflop is one quintillion, or 10^18, floating-point operations per second).
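The "nearly 5 exaflops" figure checks out against the GPU count, assuming the commonly quoted A100 dense FP16/BF16 peak of 312 TFLOPS per GPU (an assumption; the article does not state which precision the figure uses):

```python
# Sanity check of the "nearly 5 exaflops" claim from the quoted GPU count.
dgx_systems = 2_000
gpus_per_dgx = 8                    # a DGX A100 holds 8 GPUs
gpus = dgx_systems * gpus_per_dgx   # 16,000, matching the article
tflops_per_gpu = 312                # assumed A100 dense FP16/BF16 peak

total_eflops = gpus * tflops_per_gpu / 1e6  # TFLOPS -> EFLOPS
print(f"{gpus} GPUs -> {total_eflops:.2f} exaflops")  # ~4.99 exaflops
```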
Allocating more GPUs cuts training time dramatically, and over the past year Meta has used this enormous scale to train a number of high-impact projects. Meta has also been working on a "next-generation data center design" that is "optimized for AI" and "faster and more cost-effective to build."
Janardhan said Meta has great confidence in the RSC supercomputer: "We believe it is one of the fastest AI supercomputers in the world."
So the question is: why did Meta build an in-house supercomputer like this?
First, the pressure from the other tech giants is intense. A few years ago Microsoft partnered with OpenAI to build an AI supercomputer, and it recently announced it would work with AMD to build a new AI supercomputer in the Azure cloud.
Google, meanwhile, has been touting its own AI-focused supercomputer, which packs 26,000 Nvidia H100 GPUs, handily outgunning Meta.
Beyond competitive pressure, Meta says RSC lets its researchers train models on real examples from the company's production systems.
That is a departure from its previous AI infrastructure, which used only open-source and publicly available datasets.
The RSC supercomputer is used to push the boundaries of AI research in many areas, including generative AI. Meta's aim is to give its AI researchers state-of-the-art infrastructure and a training platform on which to advance the field.
At peak, RSC reaches nearly 5 exaflops of compute, which the company claims makes it one of the fastest AI supercomputers in the world.
Meta said it has used RSC to train LLaMA.
The largest LLaMA model was trained on 2,048 A100 GPUs and took 21 days, according to Meta.
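Those two figures imply roughly a million GPU-hours of compute for the run, a simple product of the numbers quoted above:

```python
# GPU-hours implied by the LLaMA training figures in the article.
gpus = 2_048
days = 21
gpu_hours = gpus * days * 24
print(f"{gpu_hours:,} GPU-hours")  # 1,032,192 GPU-hours, roughly one million
```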
As Meta tries to stand out amid rivals' increasingly aggressive AI programs, it clearly needs an AI hardware strategy of its own.
In addition to MTIA, Meta is developing another chip to handle specific types of computing workloads.
The chip, the Meta Scalable Video Processor (MSVP), is Meta's first internally developed ASIC solution and is designed to meet the processing demands of video-on-demand and live streaming.
Meta began conceiving of custom server-side video chips years ago, and announced an ASIC for video transcoding and inference in 2019.
Meta's custom chip is designed to accelerate video workloads such as streaming and transcoding.
The Meta researchers said: "Going forward, MSVP will let us support even more of Meta's most important use cases and needs, including short-form video, enabling efficient delivery of generative AI, AR/VR, and other metaverse content."
Catching up
If there is one thing these products have in common, it is that Meta is racing to accelerate its AI work, especially generative AI.
In February this year, Zuckerberg announced a new top-level generative AI team.
In his words, the goal is to give the company's R&D a shot of nitro.
Chief scientist Yann LeCun says Meta plans to deploy generative AI tools in service of its ambitions in virtual reality.
Currently, Meta is exploring chat experiences in WhatsApp and Messenger, visual creation tools in Facebook, Instagram, and ads, as well as video and multimodal experiences.
At the same time, however, Meta is feeling mounting pressure from investors who worry that it is not moving fast enough to capture the generative AI market.
Against chatbots like Bard, Bing Chat, and ChatGPT, Meta has struggled to mount an answer, and it has made little progress in image generation, another key area of explosive growth.
If the experts' predictions are correct, the total addressable market for generative AI software could reach $150 billion.
Goldman Sachs forecasts that generative AI will lift GDP by 7 percent.
Capturing even a small slice of that could offset the billions of dollars Meta has lost on metaverse bets such as AR/VR headsets and conferencing software.
Reality Labs, Meta's augmented reality division, reported a net loss of $4 billion last quarter.
References:
https://ai.facebook.com/blog/meta-training-inference-accelerator-AI-MTIA/
https://ai.facebook.com/blog/supercomputer-meta-research-supercluster-2023/
https://ai.facebook.com/blog/meta-ai-infrastructure-overview/
This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era).