Updated: 2025-04-05. Source: SLTechnology News&Howtos (Shulou.com), Servers section, 06/02 report.
Author | Sun Jianbo (Alibaba technical expert), Zhao Yuying
Introduction: In the cloud-native era, Kubernetes has become increasingly important. However, most Internet companies' adoption of Kubernetes has not gone as smoothly as expected, and the complexity of Kubernetes itself is enough to deter many developers. In this interview, Sun Jianbo, a technical expert at Alibaba, shares experience and suggestions drawn from Alibaba's practice of Kubernetes application management, in the hope that they will be useful to developers.
In the Internet era, developers mostly relied on top-level architecture design, such as multi-cluster deployment and distributed architecture, to switch over quickly when resource-related problems occurred. They did a great deal of work to make elasticity easier and improved resource utilization by co-locating mixed computing tasks. The emergence of cloud computing then enabled the shift from CAPEX to OPEX.
The cloud computing era lets developers focus on the value of the application itself. In the past, developers had to invest a lot of energy in storage, networking, and other infrastructure in addition to their business modules; today these are as easy to consume as water, electricity, and gas. Cloud infrastructure provides stability, high availability, and elastic scaling, and it also covers a set of application development "best practices" such as monitoring, auditing, log analysis, and canary (grayscale) release. Building a highly reliable application used to require a very well-rounded engineer; now, with enough knowledge of infrastructure products, these best practices are readily available. Faced with the inherent complexity of Kubernetes, however, many developers feel powerless.
Nick Young, chief engineer of the Kubernetes team at Atlassian, the company behind Jira and the code hosting service Bitbucket, said in an interview:
Although the strategy chosen by Kubernetes is correct (at least, no better option has been found so far) and many of the problems encountered at this stage have been solved, the deployment process is extremely difficult.
So, is there a good solution?
Kubernetes is too complicated
"If you ask me what the problem with Kubernetes is, of course it is that it is 'too complicated'," Sun Jianbo said in the interview. "However, this is actually a consequence of how Kubernetes positions itself."
Sun Jianbo added that Kubernetes is positioned as a "platform for platforms". Its direct users are neither application developers nor application operators but "platform builders", that is, infrastructure or platform-level engineers. For a long time, however, Kubernetes has often been misused: large numbers of application operators, and even application developers, collaborate directly around the very low-level Kubernetes API. This is one of the fundamental reasons so many people complain that "Kubernetes is really too complicated."
It is as if a Java web engineer had to deploy and manage business code directly through Linux kernel system calls; Linux would naturally seem too complex as well. The Kubernetes project therefore lacks a higher-level encapsulation that would make it friendlier to the software developers and operators working above it.
Once this positioning is understood, it makes sense that Kubernetes designs its API objects as all-in-one, just like the Linux kernel API, without distinguishing who the user is. But when developers actually want to manage applications on K8s and work with R&D and operations engineers, they have to address this problem, and they need a standard, unified way to do so, much like another layer on top of the Linux kernel API. This is why Alibaba Cloud and Microsoft jointly open-sourced the cloud-native application model, the Open Application Model (OAM).
Stateful application support
Beyond its inherent complexity, Kubernetes's support for stateful applications has long been a problem that developers spend a lot of time studying and working around. Supporting them is not impossible, but there is no clearly best solution. The mainstream approach to stateful applications in the industry today is the Operator, but writing an Operator is actually very difficult.
In the interview, Sun Jianbo said this is because an Operator is essentially an "advanced" K8s client, while the K8s API Server is designed around a "heavy client" model. That design keeps the API Server itself simple, but it makes both the K8s client library and the Operators built on it extremely complex and hard to understand: they are full of K8s implementation details such as the reflector, cache store, and informer. These should not be the concern of Operator authors. Operator authors should be domain experts in the stateful application itself (TiDB engineers, for example), not K8s experts. This is currently the biggest pain point in K8s stateful application management, and solving it may require a new Operator framework.
On the other hand, supporting complex applications is not just a matter of writing an Operator. It also requires technical support for delivering stateful applications, which the various continuous delivery projects in the community have ignored, intentionally or not. Continuously delivering an Operator-based stateful application is a technical challenge of a completely different magnitude from delivering a stateless K8s Deployment. This is an important reason why Sun Jianbo's team advocates the "application delivery layered model" in the CNCF application delivery special interest group (CNCF SIG App Delivery). As shown in the following figure, its four layers are "application definition", "application delivery", "application operation and automation", and "platform layer". Only when the capabilities of these four layers work together can stateful applications be delivered with high quality and efficiency.
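To make the split between client plumbing and domain logic concrete, here is a minimal sketch in plain Python. It does not use any Kubernetes client library: `ToyInformer`, `reconcile`, and `run` are hypothetical names invented for illustration, and the event tuples are an assumed simplification of watch events. The point is only that the cache-maintenance part is generic, while the comparison of desired and actual state is where a domain expert's knowledge belongs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed event shape: (kind, name, desired_replicas),
# where kind is "ADDED", "MODIFIED" or "DELETED".
Event = Tuple[str, str, int]


@dataclass
class ToyInformer:
    """Hypothetical stand-in for the client-side cache an informer maintains."""
    cache: Dict[str, int] = field(default_factory=dict)

    def handle(self, event: Event) -> str:
        kind, name, replicas = event
        if kind == "DELETED":
            self.cache.pop(name, None)
        else:
            self.cache[name] = replicas
        return name  # key to hand to the reconciler


def reconcile(name: str, desired: Dict[str, int], actual: Dict[str, int]) -> List[str]:
    """Domain logic: compare desired and actual state and return actions to take."""
    want = desired.get(name)
    have = actual.get(name, 0)
    if want is None:
        return [f"delete {name}"] if name in actual else []
    if want > have:
        return [f"scale up {name}: {have} -> {want}"]
    if want < have:
        return [f"scale down {name}: {have} -> {want}"]
    return []


def run(events: List[Event], actual: Dict[str, int]) -> List[str]:
    """Feed events through the cache, then reconcile the affected key each time."""
    informer = ToyInformer()
    actions: List[str] = []
    for event in events:
        key = informer.handle(event)
        actions.extend(reconcile(key, informer.cache, actual))
    return actions


events = [("ADDED", "db", 3), ("MODIFIED", "db", 5), ("DELETED", "db", 0)]
actions = run(events, actual={"db": 3})
```

In this toy version the actual state is never updated, so each event is reconciled against the same snapshot. A real Operator also has to deal with resyncs, retries, and work queues, which is exactly the machinery the paragraph above argues should be hidden from domain experts.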
[Figure: the four-layer application delivery model]
For example, the Kubernetes API object is designed to be "all-in-one": every participant in the application management process must collaborate on the same API object. As a result, in API objects such as a K8s Deployment, some fields end up being controlled by multiple parties. Application developers, application operators, and K8s automation such as the HPA may all need to control the same field of one API object. The most typical case is the replicas parameter. Who owns this field is a very thorny question.
In summary, since K8s is positioned as the Linux kernel of the cloud era, Kubernetes must keep making breakthroughs in Operator support, in its API layer, and in the definition of its various interfaces, so that more ecosystem participants can build their capabilities and value on top of K8s.
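The contention over a shared field can be illustrated with a small sketch. This is a toy model, not how the Kubernetes API server tracks field ownership: `SharedObject` and the actor names are hypothetical, and the rule "last writer wins, but record the clash" is an assumption chosen to keep the example short.

```python
from typing import Any, Dict, List, Tuple


class SharedObject:
    """Toy all-in-one API object that remembers which actor last set each field."""

    def __init__(self) -> None:
        self.fields: Dict[str, Any] = {}
        self.owners: Dict[str, str] = {}
        # Each entry is (field name, previous owner, new writer).
        self.conflicts: List[Tuple[str, str, str]] = []

    def write(self, actor: str, name: str, value: Any) -> None:
        previous = self.owners.get(name)
        if previous is not None and previous != actor:
            # Another role already controls this field: record the contention.
            self.conflicts.append((name, previous, actor))
        self.fields[name] = value
        self.owners[name] = actor


deployment = SharedObject()
deployment.write("developer", "image", "shop:v2")
deployment.write("developer", "replicas", 3)  # developer's initial guess
deployment.write("operator", "replicas", 5)   # operator's capacity plan
deployment.write("hpa", "replicas", 8)        # autoscaler reacting to load
```

After these writes, the image field has a single owner, while replicas has been claimed by three different roles in turn. Nothing in the object itself says whose value should stand, which is the ownership question raised above.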
Alibaba's large-scale Kubernetes practice
Today, Kubernetes is used across every part of the Alibaba economy's business, including e-commerce, logistics, and offline computing, and it is one of the main forces supporting Internet-scale promotions such as 618 and Double 11. Alibaba Group and Ant Financial run dozens of very large K8s clusters. The largest has about 10,000 nodes, which is not actually the upper limit of its capacity, and each cluster serves tens of thousands of applications. On Alibaba Cloud Container Service for Kubernetes (ACK), we also maintain K8s clusters for tens of thousands of users, at a scale and level of technical challenge that is second to none in the world.
Sun Jianbo revealed that Alibaba began containerizing applications as early as 2011, building containers on LXC at first and later moving to a self-developed container technology and orchestration and scheduling system. There was nothing wrong with that system as such. But as an infrastructure team, the goal was for Alibaba's basic technology stack to support a broader upper-layer ecosystem and to keep evolving, so the team spent more than a year gradually closing the gaps in K8s scale and performance. Overall, moving to K8s was a very natural process, and the practice itself is quite simple:
First, solve application containerization, which requires making sensible use of the K8s container design patterns. Second, solve application definition and description, which requires sensible use of application definition tools and models such as OAM and Helm, and the ability to integrate with existing application management capabilities. Third, build a complete application delivery chain, where the use and integration of various continuous delivery capabilities can be considered.
Once these three steps are complete, the team is able to integrate with R&D, operations, and the upper-layer PaaS, and can clearly explain the value of its platform. It can then start with pilot projects and replace the underlying infrastructure step by step without affecting the existing application management system.
Kubernetes itself does not provide a complete application management system. That system is built on K8s by the cloud-native ecosystem as a whole, and can be broken down into layers as follows:
Helm is one of the most successful examples. It sits at the top of the application management system, at layer 1. YAML management tools such as Kustomize and packaging tools such as CNAB correspond to layer 1.5. Application delivery projects such as Tekton, Flagger, and Keptn correspond to layer 2. Operators and the K8s workload components, such as Deployment and StatefulSet, correspond to layer 3. At the bottom is the core functionality of K8s, which manages workload containers, encapsulates infrastructure capabilities, and provides APIs through which different workloads connect to the underlying infrastructure.
In the early days, the team's biggest challenges were scale and performance bottlenecks, but those were also the most straightforward to solve. Sun Jianbo said that as scale grew, it became clear that the biggest challenge in rolling out K8s at scale is how to manage applications and integrate with the upper-layer ecosystem on top of K8s. For example, we need unified control over hundreds of controllers written by dozens of teams for different purposes. We need to deliver production applications from different teams nearly 10,000 times a day, and their release and scaling strategies may be completely different. We also need to integrate with dozens of more complex upper-layer platforms and to schedule and deploy different kinds of jobs together to achieve the highest resource utilization. These are the problems that Alibaba's Kubernetes practice has to solve; scale and performance are only one part of them.
Beyond the native capabilities of Kubernetes, Alibaba develops a large amount of infrastructure internally and connects it as K8s plug-ins. As scale grows, discovering and managing these capabilities in a unified way has become a key issue.
In addition, Alibaba has many existing PaaS platforms, built to serve users' move to the cloud in different business scenarios. For example, some users want to upload a Java WAR package and run it, while others want to upload an image and run it. Behind these requirements, Alibaba's teams do a lot of application management work on the users' behalf, which is why these existing PaaS platforms came about, and integrating them with Kubernetes can cause various problems. Alibaba is currently using OAM, a unified standard application management model, to help these PaaS platforms integrate with and converge on the K8s foundation, achieving standardization and becoming cloud native.
Decoupling operations and R&D
Through decoupling, the Kubernetes project and the corresponding cloud service providers can expose declarative APIs along different dimensions, each better matched to the needs of a particular role. For example, an application developer only needs to declare in a YAML file that "application A needs 5 GB of read/write space", while an application operator only needs to declare in the corresponding YAML file that "Pod A mounts a 5 GB read/write data volume". The focus that comes from "letting users care only about what they care about" is the key to lowering the learning curve for Kubernetes users.
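The two role-specific declarations in this example can be sketched as two small data types with a translation between them. Everything here is an illustrative assumption rather than a Kubernetes or OAM schema: the class names, the default mount path "/data", and the "-pod" naming convention are invented for the sketch, and the size is read as gigabytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRequest:
    """What the developer declares: the application needs some read/write space."""
    app: str
    size_gb: int


@dataclass(frozen=True)
class VolumeMount:
    """What the operator declares: a Pod mounts a read/write data volume."""
    pod: str
    size_gb: int
    mount_path: str


def to_volume_mount(request: StorageRequest, mount_path: str = "/data") -> VolumeMount:
    """Translate the developer-facing request into the operator-facing declaration.

    The mount path and Pod naming are platform decisions the developer never sees.
    """
    if request.size_gb <= 0:
        raise ValueError("requested size must be positive")
    return VolumeMount(pod=f"{request.app}-pod", size_gb=request.size_gb, mount_path=mount_path)


request = StorageRequest(app="a", size_gb=5)
mount = to_volume_mount(request)
```

The developer-facing type carries only what the developer cares about (which application, how much space), while the operator-facing type carries the placement details. Keeping them as separate types is one way to express "letting users care only about what they care about".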
Sun Jianbo says that most current solutions are actually "pessimistic". For example, to reduce the burden on R&D, Alibaba's internal PaaS platform has long exposed only 5 Deployment fields for R&D to set. This is because the "all-in-one" design of K8s YAML makes the complete YAML too complex for R&D, but it also means that in most cases R&D gets no real feel for the capabilities of K8s itself. The operators of the PaaS platform, for their part, find K8s YAML too simple to describe the platform's operational capabilities, so they have to add a large number of annotations to the YAML files.
The core problem is that, for operators, the result of this "pessimistic" approach is that they become too "dictatorial", taking on a great deal of detailed and thankless work. For example, the scaling strategy is currently decided entirely by operations. Yet R&D, who actually write the code, have the most say in how an application should scale, and they very much want to communicate their views to operations so that K8s can be more flexible and truly meet scaling needs. In the current system, this cannot be done.
Therefore, the purpose of "decoupling R&D from operations" is not to separate the two, but to give R&D a standard, efficient way to communicate with operations, which is also a problem the OAM application management model sets out to solve. Sun Jianbo said that one of the main functions of OAM is to provide a set of standards and conventions that let R&D express their requirements from their own point of view. Once this set of standards is something "you know, I know, and the system knows", the problems above can be readily solved.
Specifically, OAM is a standard specification that focuses on describing applications. With this specification, the description of an application can be completely separated from the details of how the infrastructure deploys and manages it. This design has clear benefits. For example, in real production environments, seemingly uniform operational concepts such as Ingress, CNI, and Service Mesh vary greatly from one Kubernetes cluster to another. By separating the definition of the application from the operational capabilities of the cluster, application developers can focus on the value of the application itself rather than on operational details such as "where the application is deployed".
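A rough sketch of this separation, under stated assumptions: the `Component`, `Trait`, and `Cluster` classes below are simplified illustrations and not the actual OAM schema, and the implementation names "nginx-ingress" and "service-mesh-gateway" are made-up placeholders. The sketch shows one application description being rendered against two clusters whose ingress capability is implemented differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Component:
    """Developer-owned description of what to run."""
    name: str
    image: str


@dataclass
class Trait:
    """Operational capability attached to a component, such as routing."""
    kind: str
    params: Dict[str, str] = field(default_factory=dict)


@dataclass
class Cluster:
    """A cluster and the (assumed) implementation it offers for each trait kind."""
    name: str
    implementations: Dict[str, str]


def render(component: Component, traits: List[Trait], cluster: Cluster) -> List[str]:
    """Bind one application description to one cluster's capabilities."""
    plan = [f"{cluster.name}: run {component.name} from {component.image}"]
    for trait in traits:
        impl = cluster.implementations.get(trait.kind)
        if impl is None:
            raise LookupError(f"{cluster.name} does not provide trait {trait.kind!r}")
        plan.append(f"{cluster.name}: apply {trait.kind} via {impl}")
    return plan


app = Component(name="web", image="shop:v2")
traits = [Trait(kind="ingress", params={"host": "shop.example.com"})]
cluster_a = Cluster(name="cluster-a", implementations={"ingress": "nginx-ingress"})
cluster_b = Cluster(name="cluster-b", implementations={"ingress": "service-mesh-gateway"})

plan_a = render(app, traits, cluster_a)
plan_b = render(app, traits, cluster_b)
```

The application description (`app` and `traits`) is identical in both calls. Only the cluster-side mapping differs, so the developer never has to know which ingress implementation a given cluster uses.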
In addition, platform architects can easily encapsulate the platform's operational capabilities into reusable components, so that application developers can focus on combining these operational components with their code and build reliable applications quickly and easily. The goal of OAM is to make simple application management easier and complex application delivery more controllable. Sun Jianbo said that the team will focus on gradually promoting this system to cloud ISVs and software distributors, so that the K8s-based application management system truly becomes the mainstream of the cloud era.
About the guest: Sun Jianbo is a technical expert at Alibaba and a member of the Kubernetes project community. He currently works at Alibaba on large-scale cloud-native application delivery and management, and in 2015 he took part in writing the technical book "Docker Container and Container Cloud". He previously worked at Qiniu, where he took part in moving applications such as a time series database, stream computing, and a log platform to the cloud.
At the ArchSummit Global Architect Summit in Beijing on December 6-7 this year, Sun Jianbo will share more on "Experiences and lessons learned from Alibaba's Kubernetes application management practice". He will cover Alibaba's existing practice in decoupling R&D from operations, the problems in that practice, the standardized and unified solutions being implemented, and further thoughts on the community.