2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
In early 2023, a little pig demon from Langlang Mountain went viral. In the first episode of the animated anthology Yao-Chinese Folktales, the pig demon is full of ambition and wants to make something of himself, but his work is rejected. He then delivers the now-classic line: "I want to leave Langlang Mountain."
The pig demon's experience touches a sore spot for today's workers, and it closely resembles the little-known but nagging dilemmas faced in data centers.
In recent years, forums, summits and press conferences have been full of far-sighted, pithy slogans such as "computing power is productivity", "infrastructure of the digital economy" and "go to the cloud, use data, empower with intelligence". These macro trends and roadmaps underpin the rapid development of computing clusters such as cloud data centers and intelligent computing centers, which we have analyzed at length in previous articles.
In actual construction, however, there are all kinds of specific challenges that may be hard to imagine for people who sit in an office or research institute and point at slides.
For example, a female staff member at the computing center of a university in western China once told me that their servers are mainly air-cooled, and keeping them cool means increasing the airflow, so female employees cannot wear skirts into the machine room. The room is also very noisy, and colleagues who have done operations and maintenance there for years have suffered hearing damage.
These detailed, real problems form the Langlang Mountain that data centers must climb; otherwise they will end up as exhausted and thwarted as the pig demon. Such problems can only be learned from the ground underfoot and from talking with front-line staff. Today, drawing on some field experience, we will talk about which mountains data centers still have to climb.
The first mountain: power
What comes to mind when you think of the differences between Chinese and US data centers? Chips, architecture, software, the industry chain? There is one easily overlooked but important factor: power supply.
Since 2018, Yi Enterprise Research Institute has made field visits to several domestic cloud data centers and found that 2U is the mainstream form factor in the domestic server market. IDC's server market tracking report also confirms that 2U accounted for about 70% of rack servers from 2018 to 2021. In the US market, however, 1U is more popular.
What exactly are 1U and 2U? Why does this difference exist, and what does it mean?
As IT equipment has evolved, servers in modern data centers are generally 1U or 2U high. "U" refers to the thickness of a rack server, and 1U is 4.45 cm. Rack servers in early data centers were generally 3-5U high.
The fewer the U, the lower the server and the higher the computing density; the computing density of 1U servers can be twice that of 2U servers. Moreover, the data center cluster requirements of the "East Data, West Computing" project for the hub nodes of Beijing-Tianjin-Hebei, the Yangtze River Delta, the Guangdong-Hong Kong-Macao Greater Bay Area and Chengdu-Chongqing all emphasize "high density", because only higher density can deliver more computing power on a limited land area and use land more efficiently.
From this point of view, 1U should be the better choice, yet the field visits show that 2U accounts for the larger share in China's cloud data centers. Why? The decisive factor is power supply capacity.
For the same rack space, 1U servers draw more power than 2U servers: a single cabinet holding about 18 2U servers needs a 6 kW power supply, while deploying 36 1U servers pushes that to 12 kW. If the cabinet cannot supply that much power, the density advantage of 1U cannot be fully realized.
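The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than a sizing tool: the figure of about 333 W per server is simply derived from the article's 6 kW for 18 servers, and the 36U of usable space, the `ServerType` class and the `servers_per_cabinet` function are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerType:
    name: str
    height_u: int  # rack units occupied by one server
    power_w: int   # assumed average draw per server, in watts


def servers_per_cabinet(server: ServerType, usable_u: int, cabinet_power_w: int) -> int:
    """Return how many servers fit, limited by both rack space and power."""
    by_space = usable_u // server.height_u
    by_power = cabinet_power_w // server.power_w
    return min(by_space, by_power)


# Assumption: about 333 W per server, from 6 kW / 18 servers in the text.
ONE_U = ServerType("1U", height_u=1, power_w=333)
TWO_U = ServerType("2U", height_u=2, power_w=333)
USABLE_U = 36  # hypothetical usable rack space: 18 x 2U or 36 x 1U

for budget_w in (6000, 12000):
    for server in (TWO_U, ONE_U):
        count = servers_per_cabinet(server, USABLE_U, budget_w)
        print(f"{budget_w / 1000:.0f} kW cabinet, {server.name}: {count} servers")
```

Under these assumptions a 6 kW cabinet holds 18 servers whether they are 1U or 2U, so the thinner form factor only pays off once the cabinet can deliver about 12 kW.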
At present, cabinet power in China's data centers is generally on the low side, with 4-6 kW the mainstream. Project announcements even list configurations such as a "2.5 kW standard rack", and cabinets above 6 kW account for only 32%.
The data center power supply system has both old and new problems. The old problem is that in traditional data centers the electromechanical systems run separately, data collection is not accurate enough and the range of regulation is limited, so power supply capacity cannot be matched precisely to IT demand. Once single-cabinet power density rises, the reliability of continuous power supply may suffer and the risk of outages increases. For cloud service providers, a power outage in a cloud data center leads directly to customer business interruption and economic losses, which is unacceptable.
The new problem is that since China put forward its "dual carbon" strategy, building green, energy-saving data centers has become a consensus, yet higher per-server power density directly raises cooling requirements and therefore the power consumed by air conditioning and air cooling. Take the cloud data centers inspected by Digital China Wanli Travel in 2021 as an example: Tencent Cloud's Huailai Ruibei data center uses 52U cabinets, and UCloud's Ulanqab Cloud Base uses 47U and 54U cabinets. Switching to 1U servers there would not really increase density, but would make server cooling design more challenging.
Data centers must increase computing density, which means increasing single-cabinet density, and higher-power cabinets in turn need more reliable, highly available power supply. It follows that power supply capacity is a big mountain China's data centers must climb next.
The second mountain: cooling
As mentioned earlier, higher cabinet power density increases the power consumed by cooling. Some quick-witted readers may ask: wouldn't more efficient, energy-saving cooling solve this problem and allow a smooth evolution to high density?
Indeed, the data center industry has put a great deal of effort into more energy-efficient cooling systems. On the one hand, it is speeding up "West Computing": making full use of the climate advantages of Ulanqab and other western regions by building new data centers that use outdoor natural cold sources. Digital China Wanli Travel inspected seven data center clusters and found that data centers in the Zhangjiakou and Helinger clusters can use natural cold sources for more than 10 months a year, with an average annual PUE of 1.2.
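To make the PUE figure concrete: PUE (power usage effectiveness) is commonly defined as total facility energy divided by the energy consumed by IT equipment, so a PUE of 1.2 means roughly 0.2 kWh of cooling, power distribution and other overhead for every 1 kWh delivered to servers. The sketch below assumes that standard definition; the function names and the 1,000 kWh example load are illustrative and not taken from the report.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_kwh(it_equipment_kwh: float, pue_value: float) -> float:
    """Energy spent on everything except IT equipment (cooling, power losses, ...)."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_equipment_kwh * (pue_value - 1.0)


# Illustrative numbers only: 1,000 kWh of IT load at two different PUE levels.
it_load_kwh = 1000.0
for value in (1.2, 1.5):
    extra = overhead_kwh(it_load_kwh, value)
    print(f"PUE {value}: {extra:.0f} kWh of overhead per {it_load_kwh:.0f} kWh of IT load")
```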
On the other hand, the industry is exploiting liquid cooling's advantage in reducing energy consumption and gradually replacing air-cooled servers with liquid-cooled ones. For example, in 2018 Alibaba deployed an immersion cooling machine room in Zhangbei County, Zhangjiakou, Hebei Province, with horizontal 54U cabinets holding 32 1U dual-socket servers and four 4U JBODs. The small inconvenience that air-cooled machine rooms cause for female employees' dress, mentioned at the beginning, is something liquid cooling solves very well.
Does this mean liquid cooling will soon become widespread in the data center industry? After Digital China Wanli Travel concluded in 2021, the 2021 China Cloud Data Center Investigation Report released by Yi Enterprise Research Institute gave its answer: "cautious wait-and-see".
We believe that there are three reasons:
1. Ecosystem issues in the mature period.
Although liquid cooling is far more efficient than air cooling, air-cooled machine rooms have long been the mainstream in data center construction, and decades of air-cooled servers have produced a mature ecosystem with advantages in construction and operating costs. In regions with a favorable climate, air-cooled solutions can therefore already meet PUE-reduction targets; Huawei's Ulanqab cloud data center, for example, is dominated by 8 kW air-cooled cabinets. In some eastern and central regions there is demand and willingness to introduce liquid cooling, but cost must also be considered. If optimizing the UPS architecture and adopting intelligent energy-efficiency management can deliver significant energy savings, operators will stay with air cooling wherever they can.
2. Technical problems during the transition period.
Of course, for HPC, AI and similar workloads, liquid cooling has great advantages. Some companies therefore want to try liquid cooling without rebuilding their air-cooled machine rooms, so during the transition from air cooling to liquid cooling there is market demand for "mixed air and liquid" deployment.
Air-cooled servers are loosely coupled to the cooling equipment and are highly adaptable and flexible. Immersion liquid cooling, by contrast, requires completely immersing the server's heat-generating components, such as the motherboard, CPU and memory, in coolant, and spray liquid cooling requires modifying the chassis or cabinet; both bring high costs. During the transition period, combining cold-plate liquid cooling with air cooling is a more suitable approach. However, the cold plate must be fixed onto the server's main heat-generating components and rely on liquid flowing through it to carry heat away, which places high demands on sealing and leak prevention and makes design and manufacturing very difficult.
(The Atlas 900 cluster deployed at Huawei Cloud's Songshan Lake data center in Dongguan uses mixed air and liquid cooling.)
3. Industry chain collaboration problems.
Liquid-cooled data centers need collaborative innovation across the upstream and downstream of the industry chain, including manufacturing, design, materials, construction, and operations and maintenance. Because air cooling is loosely coupled, the cooling industry and the data center industry have developed separately. Moving data centers to liquid cooling requires building a new ecosystem, strengthening the ties between the various players, and reducing the up-front manufacturing cost and subsequent maintenance cost of liquid-cooled servers. This takes a process of multi-party adjustment and cooperation and cannot be achieved overnight.
From these perspectives, although liquid-cooled data centers are the trend, there is still a long way to go, and the whole industry needs to keep watching how things change.
The third mountain: chips
If power supply efficiency and air versus liquid cooling are the important changes in the machine-room infrastructure of cloud data centers, then chips may be the focus of the IT infrastructure.
In 2021, Digital China Wanli Travel, with Amou Science and Technology as exclusive title sponsor, found a new phenomenon while inspecting sites in Guizhou and in Ulanqab and Helinger in Inner Mongolia: the rising power of China's "core" (chips). Domestic technology is maturing, being adopted and catching up with the mainstream. Alibaba Cloud's Yitian 710, AWS's Graviton, Ampere's Altra and others have all made great progress in development and deployment.
There are many reasons for this. Full-stack autonomy in the cloud provides market support for China's "core"; the accelerating digitalization of government, finance, transportation, power, manufacturing and other industries provides real application scenarios; and the coexistence of x86 and Arm gives China's "core" an R&D basis for customizing and optimizing on a new architecture.
But the moon has a dark side. Behind the rise of China's "core", we should also see that China's semiconductor sector still faces a hard road.
First, there are the shackles of process technology. Moore's Law has continued thanks to advances in the manufacturing process, but process improvement has long since hit a ceiling and cannot keep up with rising chip specifications. Cloud data centers therefore began "stacking CPUs" to increase cabinet density, but the performance gained by stacking has limits, and things cannot stop there.
So in the post-Moore era, many domestic chip makers have turned to chiplets. This chip design approach packages multiple dies together into an interconnected whole, and both the x86 and Arm ecosystems are using it. Note, however, that while current IP reuse methods have relatively mature ways to test and verify IP, how to test after multiple chiplets are packaged together, and how to guarantee yield, remain problems China's "core" must solve.
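One way to see why post-packaging yield is a concern is a simplified compound-yield model: if dies are packaged without being fully screened, a package works only when every die and the assembly step are all good, so the individual yields multiply. The sketch below uses that textbook approximation, which is not taken from the article; the `package_yield` function and the yield values are purely illustrative.

```python
from math import prod
from typing import Sequence


def package_yield(die_yields: Sequence[float], assembly_yield: float = 1.0) -> float:
    """Simplified model: the package is good only if every die and the assembly are good."""
    for value in (*die_yields, assembly_yield):
        if not 0.0 <= value <= 1.0:
            raise ValueError("yields must be between 0 and 1")
    return assembly_yield * prod(die_yields)


# Illustrative values only: four chiplets per package, each 90% likely to be good.
unscreened = package_yield([0.9, 0.9, 0.9, 0.9])
# Same package if testing before assembly raises each die's effective yield to 99%,
# with a hypothetical 98% assembly yield.
screened = package_yield([0.99, 0.99, 0.99, 0.99], assembly_yield=0.98)

print(f"unscreened dies: {unscreened:.1%} of packages good")
print(f"screened dies:   {screened:.1%} of packages good")
```

Under this model, small per-die losses compound quickly as more chiplets share a package, which is why testing dies before and after packaging matters so much.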
More importantly, chiplet packaging depends on advanced packaging technology, in which a chip's I/O interfaces can be co-designed and optimized with the package, and this matters greatly for chip performance. It requires tight interaction between advanced packaging design and chip design and places demands on design tools. EDA tools have long been one of the unresolved weak points in China's semiconductor sector. As chiplets grow in importance, China's "core" can hardly rest easy.
At present, data center clusters, an important part of digital infrastructure, are undergoing a series of changes. How well they are doing and what problems remain unsolved are questions that must be answered but are not easy to answer.
We cannot see the true face of Mount Lu because we are standing in the middle of it. Only by getting close to the front line and then stepping back to view the whole picture can we see the many "Langlang Mountains" that hinder data center progress.
In 2023, data centers still have many mountains to cross. The road is long and hard, but as long as they keep walking, the day will come when they can take flight.
This article comes from the WeChat official account brain polar body (ID: unity007); author: Tibetan fox