2025-03-28 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
Https://www.toutiao.com/a6667673351273579012/
In the IT world, heterogeneous computing is not a new term.
Over the past decade, the computing industry has moved from 32-bit to x86-64, then to multi-core, to general-purpose GPU computing (GPGPU) and, around 2010, to "CPU-GPU" heterogeneous computing. In recent years, with the rise of compute-intensive fields such as artificial intelligence, high-performance data analysis and financial analysis, heterogeneous computing has surged in popularity.
Because traditional general-purpose computing can no longer meet the demand for computing power, heterogeneous computing is regarded as the key technology for carrying the bulk of computing workloads at this stage. Aliyun's heterogeneous computing product line was born in this environment, and Zhang Xiantao leads the team behind it.
Zhang Xiantao (Alibaba nickname: Xuqing) holds a PhD in Information Security from Wuhan University. Before joining Alibaba he worked at the Intel Asia-Pacific Research and Development Center, where he was a major contributor to Xen, KVM and other open source virtualization projects, served as maintainer of the Xen/IOMMU and KVM/IA64 projects, and was the main author and contributor of the Intel HAXM accelerator, for which he won Intel's highest achievement award.
Zhang Xiantao officially joined Alibaba as a senior expert in 2014. He is currently responsible for the technology and R&D teams for virtualization, high-performance computing products, heterogeneous computing products and a number of innovative products.
In this interview, Zhang Xiantao described the pain points enterprises face when using heterogeneous computing solutions and gave an in-depth introduction to Aliyun's work in balancing heterogeneous computing resources.
Opportunities and Challenges of Heterogeneous Computing
Heterogeneous computing refers to computing on a system composed of computing units with different instruction sets and architectures. At present, "CPU+GPU" and "CPU+FPGA" are the heterogeneous computing platforms that attract the most attention in the industry. Their biggest advantage is higher efficiency and lower latency than traditional CPU parallel computing, which matters more and more as the industry's demand for computing performance grows. The whole computing ecosystem is moving in this direction: chip companies have invested heavily, development standards for heterogeneous programming are gradually maturing, and mainstream cloud service providers are actively building out their offerings. For a time, heterogeneous computing has seemed poised to replace traditional homogeneous computing.
Zhang Xiantao also said that heterogeneous computing is well suited to compute-intensive fields such as artificial intelligence, high-performance data analysis and financial analysis, and that it will gradually take over the workloads that general-purpose computing handles poorly.
However, beneath the shiny surface, the threshold for procuring, deploying and using heterogeneous computing is very high for most enterprises. In this regard, Zhang Xiantao highlighted the following pain points:
1. High procurement cost: users have essentially no bargaining power when buying in small quantities. FPGA cards in particular are very expensive in small volumes.
2. Long delivery cycle: it usually takes several months to get from the start of a purchase through model selection, hardware architecture design, supplier selection, machine room selection, financial approval and so on.
3. Inflexibility: the number of GPUs/FPGAs is fixed after purchase. With fewer tasks, GPUs/FPGAs sit idle; with more tasks, there are not enough of them.
4. No hardware dividend: the model is fixed after purchase. When a new GPU/FPGA architecture comes out, it can only be bought with additional budget, while the old GPUs/FPGAs can no longer keep up with the application.
5. Data silos: offline GPU/FPGA resources cannot be connected with online services.
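Pain point 3 can be made concrete with a small calculation. The sketch below is illustrative only: the hourly demand figures, the pool size and the function names are hypothetical and do not come from the interview. It compares a fixed pool of purchased cards with an elastic pool that always matches demand.

```python
def fixed_pool_stats(demand, pool_size):
    """Return (idle_card_hours, unmet_card_hours) for a fixed pool of cards.

    demand is a list of cards needed in each hour; pool_size is the number
    of cards that were purchased up front (both values are hypothetical).
    """
    idle = sum(max(pool_size - d, 0) for d in demand)
    unmet = sum(max(d - pool_size, 0) for d in demand)
    return idle, unmet


def elastic_card_hours(demand):
    """Card-hours consumed when capacity follows demand exactly."""
    return sum(demand)


# Hypothetical demand over six hours, with a purchased pool of 8 cards.
demand = [2, 2, 8, 16, 4, 0]
idle, unmet = fixed_pool_stats(demand, pool_size=8)
print("fixed pool: idle card-hours =", idle, "unmet card-hours =", unmet)
print("elastic pool: card-hours used =", elastic_card_hours(demand))
```

In this made-up workload the fixed pool is both wasteful in quiet hours and too small at the peak, which is the situation the inflexibility item describes.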
He added that the biggest challenge in building FPGA products is that the FPGA ecosystem as a whole is very weak: very few customers are able to develop for FPGAs, especially FPGAs used for compute acceleration. "To this end, we will establish an IP development market on the cloud, introduce a series of FPGA IP partners, and promote the establishment of cloud FPGA development standards. This will enrich the whole FPGA development environment and attract more IP developers and partners to put their IP on the market to serve their end users, further enriching the entire FPGA ecosystem."
Aliyun launched elastic GPU and FPGA heterogeneous computing solutions in quick succession, aiming to lower the threshold for using heterogeneous computing resources so that enterprises that need high-performance computing can readily buy and use them.
Aliyun's elastic GPU products mainly target artificial intelligence, data analysis, scientific computing, movie rendering, video and image processing, video transcoding and similar scenarios. Current application cases include behavioral data analysis, personalized recommendation, face recognition, video recognition, image recognition and object classification. Aliyun's elastic FPGA products mainly target artificial intelligence, semiconductor design, genomic computing, video and image processing, and data analysis and decision-making. Current application cases include deep learning inference, deep learning model pruning, irregular data computing, video and image processing, and hardware semiconductor design.
Aliyun's Exploration in the Field of Heterogeneous Computing
As is well known, GPUs and FPGAs each have clear advantages over CPUs. GPUs offer higher parallelism, higher single-machine peak compute and higher computing efficiency, while FPGAs stand out for higher performance per watt, better performance on irregular data, higher hardware acceleration performance and lower device interconnect latency.
Moving to the cloud magnifies these advantages further. According to Zhang Xiantao, Aliyun's GPU and FPGA heterogeneous computing solutions have the following main characteristics:
1. GPU/FPGA resources are available on demand and scale elastically.
2. A large-scale resource pool meets the demand for GPUs/FPGAs at business peaks.
3. Users enjoy the faster-than-Moore's-Law hardware dividend of heterogeneous computing and get more powerful GPU/FPGA instances at the same price.
4. The most comprehensive heterogeneous product line covers artificial intelligence training, inference, image and video processing and other needs.
5. Product integration: deep integration with the entire Aliyun product system, with data accessible across products.
These features squarely address the pain points users face with heterogeneous computing solutions. Zhang Xiantao also revealed that most customers now train models on a single machine, which usually takes several weeks to a month, so Aliyun is planning to launch an ultra-high-performance heterogeneous cluster product.
"The GPUs/FPGAs in this product can be connected directly to one another via the RDMA protocol over 25/100Gb RoCE, across multiple machines and multiple cards. A very large cluster of GPU/FPGA devices can train one model together, cutting the user's training time from several weeks or a month down to a day or even a few hours."
It is worth mentioning that Aliyun's heterogeneous computing solutions also offer developers a friendlier experience:
For GPU programming, Aliyun will launch a distributed multi-machine, multi-card training framework and other GPU performance optimization services. These can greatly lower the threshold for customers to use multiple machines and cards, and so reduce the time customers spend on deep learning training in the cloud.
For FPGA, Aliyun will establish an IP development market, introduce a series of FPGA IP partners and launch its own self-developed IP series, so that a thriving IP market lets more end users benefit from FPGA acceleration.
In addition, Aliyun has launched IaaS+ services. These include the E-HPC product for resource scheduling, account management and auto scaling of heterogeneous clusters; one-click deployment, distributed training and auto scaling through the container service; behavioral data analysis through XDL; and Aliyun's self-developed GPU assembler for optimizing application performance. Together they are meant to improve the utilization of heterogeneous computing devices and reduce the cost of procuring resources.
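As a back-of-the-envelope illustration of why multi-machine, multi-card training can turn weeks into hours, the sketch below estimates training time under a simple data-parallel model with a constant scaling efficiency. The function name, the efficiency value and the card counts are assumptions chosen for illustration; they are not figures from Aliyun or from the interview.

```python
def estimated_training_hours(baseline_hours, scale_out, efficiency=0.9):
    """Estimate training time after scaling out (a rough, assumed model).

    baseline_hours: measured training time on the original setup.
    scale_out: how many times more cards the cluster has than the baseline.
    efficiency: fraction of ideal linear speedup actually achieved; the
        shortfall stands in for communication and synchronization overhead.
    """
    if scale_out < 1:
        raise ValueError("scale_out must be at least 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return baseline_hours / (scale_out * efficiency)


# Hypothetical case: three weeks (504 hours) on the baseline setup,
# scaled out 64 times at an assumed 90% scaling efficiency.
hours = estimated_training_hours(504, scale_out=64, efficiency=0.9)
print("estimated training time: %.2f hours" % hours)
```

Under these assumed numbers the estimate comes to roughly 8.75 hours. Real scaling efficiency depends heavily on the model, the framework and the interconnect, which is why a low-latency RDMA network matters for this kind of cluster.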
The Future: GPU, FPGA and ASIC
The demand for computation in artificial intelligence and other emerging applications is outpacing the Moore's Law growth of general-purpose CPUs, while the performance growth of heterogeneous computing can keep up with these emerging directions and trends. It is foreseeable that heterogeneous computing will take a growing share of data centers in the future.
From a macro point of view, the development of heterogeneous computing also benefits from national strategy. For example, the country recently issued a development plan for artificial intelligence, making artificial intelligence a national strategy, which is bound to stimulate demand for heterogeneous computing. Zhang Xiantao also said frankly that although demand for heterogeneous computing is increasing, demand for general-purpose computing will always exist, and the two will coexist for a long time.
There is no doubt that GPUs now hold the mainstream position in heterogeneous computing. As for the future trend, Zhang Xiantao said: "With the establishment and improvement of the FPGA ecosystem and the gradual maturing of ASIC chips, the heterogeneous computing field will in future be shared among GPU, FPGA and ASIC chips. Each will have its own strengths and application fields, and its own customer base."
This is also the focus of Zhang Xiantao's team. Next, the team will release 8-card/16-card GPU products, GPU products based on the next-generation Volta architecture, and a new generation of FPGA products. ASIC chip products are also under development.
At present, his team has two main goals. On the one hand, it is committed to turning heterogeneous computing into ready-to-use computing resources and providing the most comprehensive heterogeneous computing product solutions. On the other hand, it is committed to helping users make good use of heterogeneous resources, give full play to their processing power and make their own services more competitive. In short, the aim is to make heterogeneous computing a widely available form of computing power.
Yunqi Conference Highlights Revealed
This year's Hangzhou Yunqi Conference will include a special session on heterogeneous computing, high-performance computing and virtualization technology, at which Zhang Xiantao will give a keynote speech. Before the official opening of the conference, he also shared an important piece of news with the Yunqi community: Aliyun will release several heavyweight products in its computing family, covering heterogeneous computing, general-purpose computing, high-performance computing and other fields. He said that these products are all designed to solve pain points users encounter when using Aliyun, including cluster management and scheduling, licensing for the flexible use of paid software in the cloud, the need for instances that combine the flexibility of virtual machines with the performance of physical machines, and multi-machine, multi-card distributed training to reduce training time.