
The Architecture and Challenges of Alibaba's Service Mesh Landing


This article is excerpted from the book "Different Double 11 Technologies: Cloud Native Practice in the Alibaba Economy".

Author | Fang Keming (Xi Weng), technical expert, Aliyun Middleware Technology Department

Guide: Cloud native has become the future-oriented technology infrastructure for the entire Alibaba economy. As one of the key cloud native technologies, Service Mesh completed its landing verification in the stringent and complex scenarios of Double 11 core applications. This article shares the challenges we faced and overcame in reaching that goal.

Deployment Architecture

Before getting to the main topic, we need to explain the deployment architecture used for the Double 11 core applications, shown in the figure below. In this article we focus mainly on Meshing the RPC protocol between Service A and Service B.

The example in the figure illustrates the three major planes of a Service Mesh: the data plane (Data Plane), the control plane (Control Plane), and the operation and maintenance plane (Operation Plane). The data plane uses the open source Envoy (labeled Sidecar in the figure above; the two terms are used interchangeably in this article), the control plane uses the open source Istio (currently only the Pilot component is used), and the operation and maintenance plane is entirely self-developed.

Unlike the launch half a year ago, for the Double 11 core applications we adopted a clustered Pilot deployment: Pilot is no longer deployed together with Envoy in the business container, but runs as a separate cluster. This change moves the control plane deployment toward the final state a Service Mesh should have.

Challenges

The Double 11 core applications selected for landing are all implemented in Java. During the landing we faced the following challenges.

1. How to Mesh the application when the SDK cannot be upgraded

When the decision was made to launch the Mesh on the Double 11 core applications, the RPC SDK version these Java applications depend on had already been finalized, and there was no time to develop and roll out a Mesh-oriented RPC SDK upgrade. The technical question facing the team was: how do we Mesh the RPC protocol without upgrading the SDK?

Readers familiar with Istio will know that Istio intercepts traffic transparently through the iptables NAT table. With transparent interception, traffic can be hijacked into Envoy without the application noticing, thus achieving the Mesh. Unfortunately, the nf_conntrack kernel module that the NAT table depends on had been removed from Alibaba's production machines because of its inefficiency, so the community solution could not be used directly. Fortunately, shortly after the beginning of the year we started a cooperation with the Alibaba OS team, which undertook to build the two basic capabilities Service Mesh requires: transparent traffic interception and network acceleration. After close collaboration between the two teams, the OS team worked out a transparent interception scheme that identifies traffic by UID and mark, and implemented a new transparent interception component based on the iptables mangle table.
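To make the mechanism concrete, the following is a minimal sketch of how a mark/UID-based transparent interception rule set can be expressed with the iptables mangle table and TPROXY. It is not Alibaba's actual component; the port, mark value, and sidecar UID are illustrative assumptions.

```python
# Sketch: mark/UID-based transparent interception via the iptables mangle table.
# Values below (port 15001, mark 0x1, UID 1337) are illustrative assumptions.
import subprocess

SIDECAR_UID = "1337"    # UID the sidecar runs as, so its own traffic is not re-intercepted
TPROXY_PORT = "15001"   # local port the sidecar listens on for hijacked connections
MARK = "0x1"

RULES = [
    # Let traffic originated by the sidecar itself bypass interception.
    ["iptables", "-t", "mangle", "-A", "OUTPUT", "-p", "tcp",
     "-m", "owner", "--uid-owner", SIDECAR_UID, "-j", "RETURN"],
    # Mark other outbound TCP traffic so policy routing loops it back locally.
    ["iptables", "-t", "mangle", "-A", "OUTPUT", "-p", "tcp",
     "-j", "MARK", "--set-mark", MARK],
    # Redirect TCP traffic to the sidecar without NAT, using TPROXY in PREROUTING.
    ["iptables", "-t", "mangle", "-A", "PREROUTING", "-p", "tcp",
     "-j", "TPROXY", "--on-port", TPROXY_PORT, "--tproxy-mark", MARK],
    # Policy routing: deliver marked packets to the loopback interface.
    ["ip", "rule", "add", "fwmark", MARK, "lookup", "100"],
    ["ip", "route", "add", "local", "0.0.0.0/0", "dev", "lo", "table", "100"],
]

if __name__ == "__main__":
    for rule in RULES:
        subprocess.run(rule, check=True)
```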

The following figure illustrates the flow of an RPC service invocation in the presence of the transparent interception component. Inbound refers to incoming traffic (the receiver plays the Provider role), while Outbound refers to outgoing traffic (the sender plays the Consumer role). An application usually plays both roles at once, so it carries both Inbound and Outbound traffic.

With the transparent interception component, Meshing an application is completely transparent to it, which greatly improves the convenience of landing the Mesh. Of course, because the RPC SDK still contains its original service discovery and routing logic, and the traffic hijacked into Envoy goes through service discovery and routing once more, Outbound traffic pays an RT cost for doing both twice; this also shows up in the data section later. Clearly, when landing the final-state Service Mesh, the service discovery and routing logic in the RPC SDK should be removed, saving the corresponding CPU and memory overhead.

2. Supporting the complex service governance (routing) needed by e-commerce business in a short time

In Alibaba's e-commerce business scenarios, routing requirements are rich and varied. In addition to routing strategies such as unitization and environment isolation, service routing must also be decided from the method name, call arguments, and application name of the RPC request. Alibaba's internal Java RPC framework supports these routing policies through embedded Groovy scripts: the business side configures a Groovy routing template on the operations console, and when the SDK initiates a call it executes the script to apply the routing policy.

The future Service Mesh is not intended to provide routing-policy customization as flexible as Groovy scripts, to avoid excessive flexibility constraining the evolution of the Service Mesh itself. We therefore decided to take the opportunity of Meshing to remove the Groovy scripts. After analyzing the scenarios in which the landed applications used Groovy scripts, we abstracted a cloud-native solution: extend Istio's native VirtualService and DestinationRule CRDs, adding the routing configuration sections the RPC protocol requires, to express routing policies. The shape of such an extended resource is sketched below.
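As an illustration only, an extended VirtualService could look like the following, expressed here as a Python dictionary mirroring the Kubernetes resource. The RPC-level match fields are hypothetical; they are not part of upstream Istio or Alibaba's published CRDs.

```python
# Hypothetical shape of a VirtualService extended with RPC-level match conditions.
# The "rpc" match fields are illustrative; they are not upstream Istio fields.
import json

extended_virtual_service = {
    "apiVersion": "networking.istio.io/v1alpha3",
    "kind": "VirtualService",
    "metadata": {"name": "service-b-rpc-routing"},
    "spec": {
        "hosts": ["service-b"],
        "http": [
            {
                "match": [
                    {
                        # Hypothetical RPC conditions replacing the old Groovy logic:
                        # route by method name, a call argument, and the caller's app name.
                        "rpc": {
                            "method": "queryOrder",
                            "argument": {"index": 0, "value": "vip-user"},
                            "sourceApp": "trade-center",
                        }
                    }
                ],
                "route": [
                    {"destination": {"host": "service-b", "subset": "unit-center"}}
                ],
            }
        ],
    },
}

if __name__ == "__main__":
    print(json.dumps(extended_virtual_service, indent=2))
```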

At present, the unitization and environment-isolation strategies used in Alibaba's environment are customizations inside the standard routing module of Istio/Envoy, which inevitably involves some hack logic. In the future we plan to design a set of Wasm-based routing plug-ins alongside Istio/Envoy's standard routing strategies, so that simple routing strategies can live as plug-ins. This both reduces intrusion into the standard routing module and, to a certain extent, meets the business need for customized service routing. The envisaged architecture is shown in the following figure:

Rate limiting

For performance reasons, Alibaba's internal Service Mesh solution does not use Istio's Mixer component. Rate limiting is implemented with the Sentinel component that is already widely used inside Alibaba; this both joins forces with the already open-sourced Sentinel and reduces the migration cost for Alibaba's internal users (a business's existing rate-limiting configuration remains directly compatible). To facilitate integration into the Mesh, several internal teams jointly developed a C++ version of Sentinel. The whole rate-limiting function is realized through Envoy's Filter mechanism: we built a corresponding Filter (an Envoy term for an independent functional module that processes requests) on top of the Dubbo protocol, and every request is processed by the Sentinel Filter. The configuration needed for rate limiting is fetched from Nacos through Pilot and pushed to Envoy via the xDS protocol. A conceptual sketch of the per-request decision appears below.
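The C++ Sentinel filter itself is not shown in this article; the following only sketches the kind of per-request decision such a rate-limit filter makes, using a simple one-second sliding-window QPS check. Resource names and thresholds are placeholder assumptions.

```python
# Conceptual sketch of the per-request decision a rate-limit filter makes.
# This is not the actual C++ Sentinel/Envoy filter; names and thresholds are
# placeholder assumptions.
import time
from collections import defaultdict, deque


class QpsLimiter:
    """Sliding one-second-window QPS limiter keyed by resource (e.g. Dubbo service.method)."""

    def __init__(self, qps_thresholds):
        self._thresholds = qps_thresholds    # resource -> allowed QPS (pushed via xDS in the real system)
        self._windows = defaultdict(deque)   # resource -> timestamps of recently admitted requests

    def allow(self, resource: str) -> bool:
        limit = self._thresholds.get(resource)
        if limit is None:                    # no rule configured: pass through
            return True
        now = time.monotonic()
        window = self._windows[resource]
        while window and now - window[0] >= 1.0:   # drop entries older than one second
            window.popleft()
        if len(window) < limit:
            window.append(now)
            return True
        return False                         # over the limit: the filter would reject the request


# Usage: thresholds would arrive from Nacos via Pilot over xDS in the real deployment.
limiter = QpsLimiter({"com.example.OrderService.createOrder": 100})
if not limiter.allow("com.example.OrderService.createOrder"):
    print("request rejected by rate-limit filter")
```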

3. Envoy's resource overhead is too high

One of the core problems Envoy set out to solve from the beginning is service observability, so Envoy builds a large number of stats (statistics) in order to observe services better.

Envoy's stats are very fine-grained, even down to the IP level across the whole cluster. In Alibaba's environment, the Consumer and Provider services of some e-commerce applications add up to hundreds of thousands of IPs (each IP carries different meta-information under different services, so the same IP is counted independently per service). As a result, Envoy's memory overhead in this area is huge. We therefore added a stats switch to Envoy to turn IP-level stats on or off; turning them off directly saves 30% of memory. Next we will follow the community's stats symbol-table solution to eliminate duplicated stat-name strings, which will further reduce the memory overhead.
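For reference, upstream Envoy exposes a `stats_config` with inclusion/exclusion matchers in its bootstrap, which is one generic way to prune stats; the internal IP-level switch described above is a separate, custom mechanism. The fragment below is a sketch, with the matcher patterns as illustrative assumptions.

```python
# Sketch of an Envoy bootstrap fragment that suppresses some stats via
# stats_matcher exclusion rules. The patterns below are illustrative only;
# the IP-level switch described in the article is a custom internal addition.
bootstrap_stats_config = {
    "stats_config": {
        "stats_matcher": {
            "exclusion_list": {
                "patterns": [
                    # Drop fine-grained per-cluster counters that dominate memory
                    # when upstream clusters contain very large numbers of endpoints.
                    {"prefix": "cluster.outbound"},
                    {"suffix": "upstream_cx_total"},
                ]
            }
        }
    }
}
```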

4. Decoupling business from infrastructure, so that infrastructure upgrades do not affect the business

One of the core values of landing Service Mesh is to completely decouple infrastructure from business logic so that the two can evolve independently. To realize this value, the Sidecar needs hot-upgrade capability so that upgrades do not interrupt business traffic, which is quite a challenge for both solution design and technical implementation.

Our hot upgrade uses a two-process solution: first pull up the new Sidecar container and migrate run-time data from the old Sidecar to it. Once the new Sidecar is ready to take over the traffic, the old Sidecar waits for a period of time before exiting, so business traffic is not lost. The core techniques are Unix Domain Sockets and the graceful offlining of RPC nodes. The following figure gives a rough view of the key process.
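The internal details of the run-time data transfer are not covered here, but the underlying primitive, handing live listening sockets from the old Sidecar to the new one over a Unix Domain Socket with SCM_RIGHTS, can be sketched as follows (Python 3.9+; the socket path and handover message are illustrative assumptions).

```python
# Sketch of the hot-upgrade primitive: the old sidecar hands its listening
# socket FDs to the new sidecar over a Unix Domain Socket (SCM_RIGHTS),
# so listeners are never dropped. Paths and messages are illustrative.
import os
import socket

UDS_PATH = "/var/run/sidecar_upgrade.sock"   # assumed rendezvous path


def old_sidecar_send(listen_socks):
    """Old process: pass its listening sockets to the new process, then drain and exit."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as uds:
        uds.connect(UDS_PATH)
        fds = [s.fileno() for s in listen_socks]
        # send_fds (Python 3.9+) wraps sendmsg with SCM_RIGHTS ancillary data.
        socket.send_fds(uds, [b"takeover"], fds)


def new_sidecar_receive(max_fds=16):
    """New process: accept the handover and rebuild socket objects from the FDs."""
    if os.path.exists(UDS_PATH):
        os.unlink(UDS_PATH)
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(UDS_PATH)
        server.listen(1)
        conn, _ = server.accept()
        with conn:
            msg, fds, _, _ = socket.recv_fds(conn, 1024, max_fds)
            # Re-wrap each inherited FD as a socket object and keep serving on it.
            return [socket.socket(fileno=fd) for fd in fds]
```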

Performance data

Publishing performance data carelessly can lead to controversy and misunderstanding, because performance depends on many variables: concurrency, QPS, payload size and so on all have a decisive impact on the final numbers. For this reason Envoy has never officially provided data of the kind listed in this article; its author Matt Klein worries about causing misunderstanding. It is also worth emphasizing that, with time extremely tight, the Service Mesh we landed is not in its optimal state, let alone the final solution (for example, the Consumer side still routes twice). We chose to share the data anyway so that more colleagues know our progress and status.

This article lists data only for one of the core applications launched for Double 11. Looking at single-machine RT samples, on a machine with Service Mesh deployed the average RT is 5.6 ms on the Provider side and 10.36 ms on the Consumer side. The RT of this machine around Double 11 midnight is shown in the following figure:

On a machine without Service Mesh deployed, the average RT is 5.34 ms on the Provider side and 9.31 ms on the Consumer side. The following figure shows the RT of this machine around Double 11 midnight.

By comparison, after Meshing the RT increased by 0.26 ms on the Provider side and by 1.05 ms on the Consumer side. Note that this RT difference includes the communication between the business application and the Sidecar as well as the Sidecar's own processing. The following figure illustrates where the extra latency is introduced on the link.

From the overall perspective of this core application, comparing the average data of all machines with and without Service Mesh over the same period, RT increased by 0.52 ms on the Provider side and 1.63 ms on the Consumer side after Meshing.

In terms of CPU and memory overhead, after Meshing the CPU consumed by Envoy stays at around 0.1 core across all core applications, with spikes when Pilot pushes data. In the future we need incremental push between Pilot and Envoy to smooth out these spikes. The memory overhead varies greatly with the application's services and cluster size; at present Envoy still has plenty of room for optimization in its memory usage.

Across the data of all core applications launched for Double 11, the impact of introducing Service Mesh on RT and the CPU overhead are basically consistent, while the memory overhead varies greatly depending on the application's services and cluster size.

Outlook

Riding the cloud native wave, Alibaba is using this technology trend to build a future-oriented technology infrastructure. Along the way, we will follow the approach of "leveraging open source and giving back to open source", making technology broadly accessible through open source and contributing to the future popularization of cloud native technology.

Next, our overall technical focus is as follows:

Work with the Istio open source community to improve Pilot's data-push capability. In Alibaba's super-large-scale Double 11 scenario, we have extreme requirements for Pilot's data-push capability, and we believe that pursuing this extreme together with the open source community will accelerate the building of a global de facto standard. Internally, we have also started co-construction with the Nacos team: Nacos will be connected through the community's MCP protocol, so that the technology components Alibaba has open-sourced can work together systematically.

Treat Istio and Envoy as a whole, further optimizing the protocol between them and the data structures each manages, and reduce memory overhead with more refined and reasonable data structures.

Focus on building operation and maintenance capabilities for Sidecars at scale, making Sidecar upgrades gray-releasable, monitorable, and rollback-able.

Realize the value of Service Mesh, so that business and technology infrastructure can evolve independently of each other with higher efficiency.

Highlights of this book

A detailed account of the problems encountered and solutions adopted in running super-large-scale Kubernetes clusters for Double 11.

The best combination for cloud native transformation, Kubernetes + containers + Shenlong, and the technical details of moving 100% of the core systems onto the cloud.

The super-large-scale Service Mesh landing solution for Double 11.

"Alibaba Cloud's native Wechat official account (ID:Alicloudnative) focuses on micro-services, Serverless, containers, Service Mesh and other technology areas, focuses on cloud native popular technology trends, and large-scale cloud native landing practices, and is the technical official account that best understands cloud native developers."
