Analysis of the Current State of Elastic CI/CD Systems Built on Kubernetes

2025-09-21 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article analyzes the current state of CI/CD systems and shows how Kubernetes can be used to make a CI/CD system elastic.

First, a brief explanation of what Kubernetes is. Kubernetes is a production-grade container orchestration system. It pools the resources of all Nodes in the cluster, and the unit it schedules is the Pod, which can contain multiple containers. It is like a person holding compute resources (for example ECS instances) in the left hand and containers in the right hand and matching the two, which is what makes it a container orchestration system.

The term cloud native comes up often these days, and many people ask what cloud native has to do with Kubernetes and how to tell whether an application is cloud native. I think there are three criteria:

First, it can pool resources.

Second, the application can quickly join the pool's network. Kubernetes has its own independent network layer, so I only need to know the name of the service I want to reach. This is what the various service discovery functions provide, and services can also be reached quickly through a service mesh.

Third, it supports failover. If the whole application becomes unavailable because one host in the pool goes down or one node is lost, it is definitely not a cloud native application.

Measured against these three points, Kubernetes does very well. Take the resource pool first: a large Kubernetes cluster is a resource pool, and we no longer have to care which host an application runs on. We just submit the deployment YAML file to Kubernetes, which schedules the application automatically, connects it to the network of the whole application, and handles failover automatically. Next, I will share how to build an elastic CI/CD system on Kubernetes.
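The pooling and failover ideas above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `Node` class, the CPU units, and the "most free CPU first" rule are hypothetical and are not the actual Kubernetes scheduling algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu: int                                   # free CPU units (hypothetical)
    pods: list = field(default_factory=list)   # (pod name, cpu request) pairs


def schedule(pod_name, cpu_request, nodes):
    """Place a pod on the node with the most free CPU, or return None."""
    candidates = [n for n in nodes if n.cpu >= cpu_request]
    if not candidates:
        return None                            # the pod would stay Pending
    target = max(candidates, key=lambda n: n.cpu)
    target.cpu -= cpu_request
    target.pods.append((pod_name, cpu_request))
    return target


def fail_node(failed, nodes):
    """Simulate failover: drop a node and reschedule its pods elsewhere."""
    nodes.remove(failed)
    return {name: schedule(name, cpu, nodes) for name, cpu in failed.pods}


pool = [Node("node-a", 4), Node("node-b", 4)]
first = schedule("web-1", 2, pool)    # the caller never picks a host
second = schedule("web-2", 2, pool)
moved = fail_node(first, pool)        # web-1 is moved to the surviving node
too_big = schedule("big", 8, pool)    # no node has room for this request
```

The caller only states what it needs; which host is used, and where a pod lands after a node is lost, is decided by the pool.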

The current state of CI/CD

First, let's look at the current state of CI/CD. The concept was proposed many years ago, but as technology has evolved and new tools have appeared, its process and implementation have gradually grown richer. A CI/CD run usually starts with a code commit, which triggers an event, and the CI/CD system then performs an automatic build.

The following figure reflects the current status of CI/CD:

The pipeline contains many stages. A code commit triggers an event, which starts the build (for example with Maven), followed by unit tests, a coding-standard scan, and a deployment. Next comes an end-to-end test at the UI level, using an automated UI testing tool.

Then a stress test (performance test) is run. All of this happens in the development and test environment alone. The pipeline can then be extended to the QA environment and eventually to the UAT environment, so the whole pipeline is very long. CI/CD has a wide scope: taking the commit to the code repository as the starting point, every subsequent step can be brought under CI/CD. But as its scope grows, the whole process becomes more complex and consumes more and more resources.

If you know C++, you will know that it is notorious for long build times. The designers of Go, one of whom also worked on the early development of C and Unix, have said that long C++ build times were part of the motivation for the language. So one of Go's great advantages when it first appeared was its very short compilation time, and Go really does compile quickly.

I once did a complete build of the Kubernetes source code on an i5 laptop, and it took about 45 minutes, which is very long. So even if the compilation process has been heavily optimized, once a project is large the build phase alone is already long, not to mention the automated tests that follow. With everything included, the CI/CD process consumes a lot of resources and a lot of time.

Selection of CI/CD tools

Next, let's look at the choice and evolution of CI/CD tools. The oldest is Jenkins. Before the rise of container technology, CI/CD was almost synonymous with Jenkins. After containers arrived, many new CI/CD tools emerged, such as Drone, a CI/CD tool built entirely on containers. It integrates very well with containers, and its build process runs entirely inside containers.

There is also GitLab CI, whose main strength is its tight integration with the GitLab code management tool. Jenkins 2.0 introduced the pipeline as code feature, which lets us describe a pipeline in a Jenkinsfile.

In Jenkins 1.0, to configure a pipeline we had to log in to Jenkins, create a project, and write some shell script in it. This achieves the same effect, but its biggest drawback is poor reproducibility and portability. It also naturally works against DevOps: the Jenkins system is generally managed by operations staff while developers write the code, so developers have no idea how the code is built or where it is released. The result is a split between development and operations. With pipeline as code, the Jenkinsfile can live in the same repository as the source code.

One big advantage is that the release process is now under version control as well, so an error can be traced back. This is a major change, yet in talking with customers we found that although many of them have upgraded to the Jenkins 2.0 series, they still use it entirely in the 1.0 style, and many users do not use a Jenkinsfile at all. The other issue is container support. Around 2016, Jenkins' support for containers was very weak, and running Jenkins in a container while also building Docker images was very troublesome.

Drone, by contrast, supports containers very well. It runs entirely in Docker: the build environment is itself a container, and building and pushing a Docker image also happen inside a container, which requires privileged mode. This approach has several advantages. First, it leaves no residue on the host: once the container is destroyed, the intermediate files generated during the build are destroyed with it. Jenkins, on the other hand, accumulates temporary files over time and takes up more and more space. You have to clean up regularly, and you cannot clear everything with one click, so it is a troublesome process.

Another particular headache with Jenkins is plug-in management, specifically plug-in upgrades. You have to log in to Jenkins and then upgrade the plug-ins. If I want to temporarily test or debug Jenkins in a new environment, I may need to upgrade these plug-ins every time I set one up. All the Jenkins configuration mentioned above also has to be redone, which is very tedious.

Drone has a particularly nice property here: every plug-in is a Docker container. To use a plug-in in a pipeline you simply declare it. You do not have to manage where to download it or how to install it; everything is automatic as long as your network can reach the plug-in's container image, which is very convenient.

As for the ecosystem, the biggest advantage of Jenkins is its very large number of plug-ins: whatever you want to use, there is a plug-in for it. Its foundation is also solid, and plug-ins can implement a great deal. Pipeline is an example: although it arrived with the move from 1.0 to 2.0, it is implemented entirely through plug-ins. Jenkins development now seems to be enjoying a second spring, and it has begun to significantly increase its support for Kubernetes. Starting with Jenkins X, it integrates Kubernetes ecosystem tools such as Harbor and Helm, which makes it very convenient to run builds on a Kubernetes cluster and to store the settled orchestration files of services in Helm.

In addition, Jenkins now has a new subproject called configuration as code, which lets all the configuration in Jenkins be exported as code. This makes migrating or replicating an entire Jenkins instance much more convenient.

After all this, our final choice is Jenkins, mainly because its ecosystem is already very good. Today's topic is elasticity. Jenkins already has a plug-in for this; some people in the Drone community have raised the idea, but we have not yet seen an implementation.

Business scenarios of the CI/CD system

Next, let's look at the business scenario of a CI/CD system, which has some typical characteristics. First, its users are developers, which is relatively unusual, and developers tend to be demanding. If the system is not robust enough or responds slowly, you will hear complaints often.

Second, there are timeliness requirements. Once code is written and committed, we do not want the build to sit in a queue; we want it to start immediately with ample resources. Third, resource usage has very pronounced peaks and troughs, because developers do not commit code all the time: some commit only a few times a day, others many times.

I once saw a talk in which the speaker showed a curve of his company's build tasks. Code commits peaked at three or four o'clock every afternoon and were relatively flat the rest of the time, which suggests that the programmers at that company commit their code at three or four o'clock and then slack off. As the demand for CI/CD resources grows, building a cluster becomes a must in order to raise load capacity and shorten the time tasks spend in the queue. One drawback of a Jenkins cluster is that it has only one Master, though this too can be improved through plug-ins.

But that is not today's topic, because in many cases it is acceptable for the Master to be temporarily unavailable as long as it recovers quickly. In addition, the system has to serve a variety of build scenarios. A company may use many development environments: even if everyone uses Java, there may be differences among versions 1.5, 1.6, and 1.7. Beyond Java, you will probably need build environments for other languages such as Python, Go, and NodeJS. Without containers, these build environments cannot be shared, and a host may end up usable only for PHP builds.

Containers give CI/CD systems a new capability: environment isolation. Kubernetes can add further capabilities, and this is where a conflict arises. Developers always want the CI/CD system to respond quickly to a code commit, but no company has unlimited resources. As mentioned above, if commits peak at 3 or 4 p.m. every day, it may take 30 or 40 machines to handle the build tasks. But I cannot keep 30 or 40 machines running all day just for an hour or two of builds at three or four in the afternoon.
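To make the trade-off concrete, here is a rough back-of-the-envelope calculation. The peak of 40 machines for 2 hours comes from the scenario above; the baseline of 5 machines outside the peak is a purely hypothetical figure chosen for illustration.

```python
def machine_hours(peak_machines, base_machines, peak_hours, hours_per_day=24):
    """Compare always-on capacity with capacity that scales up only at peak."""
    # Static: size the cluster for the peak and keep it running all day.
    static = peak_machines * hours_per_day
    # Elastic: keep the baseline all day, add the difference only during the peak.
    elastic = (base_machines * hours_per_day
               + (peak_machines - base_machines) * peak_hours)
    return static, elastic


static, elastic = machine_hours(peak_machines=40, base_machines=5, peak_hours=2)
saved = static - elastic
print(static, elastic, saved)   # 960 190 770
```

Under these assumed numbers the elastic setup uses 190 machine-hours a day instead of 960. The real figure depends entirely on how sharp the peak is and how large the baseline has to be.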

Kubernetes can give Jenkins new capabilities and make the CI/CD system elastic. What is the goal? When there are build tasks, new machines or computing power are added to our resources automatically, and when I no longer need them, those resources are released automatically.

That is what we expect, and Kubernetes can do this for Jenkins. As a container orchestration system, Kubernetes can quickly start new instances, schedule them automatically onto idle machines in the resource pool, and reclaim them once they have finished their tasks. If the Jenkins Master is also deployed on Kubernetes, it gets failover too: if our system can tolerate a brief outage, then even when the Master dies it can be moved quickly to another machine, and the response time will not be long.

Kubernetes-plugin

This is implemented by a plug-in, simply called Kubernetes-plugin, which directly manages a Kubernetes cluster. Once installed in Jenkins, it listens for Jenkins build tasks. When a build task is waiting for resources, the plug-in requests a new Pod from Kubernetes to run the build, and the Pod is cleaned up automatically when the build is done.
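The lifecycle described here (a build waits, a Pod is created for it, and the Pod is removed afterwards) can be sketched as a small simulation. This is not the plug-in's code or API; the `ElasticAgents` class, the `max_pods` cap, and the `agent-N` names are hypothetical.

```python
class ElasticAgents:
    """Toy model of on-demand build agents: one short-lived pod per build."""

    def __init__(self, max_pods):
        self.max_pods = max_pods   # hypothetical cap on concurrent agent pods
        self.active = {}           # build id -> pod name
        self.queue = []            # builds waiting for a pod
        self.counter = 0

    def submit(self, build_id):
        """A new build arrives and waits for an agent."""
        self.queue.append(build_id)
        self._dispatch()

    def _dispatch(self):
        # Start a fresh pod for each waiting build while capacity remains.
        while self.queue and len(self.active) < self.max_pods:
            build_id = self.queue.pop(0)
            self.counter += 1
            self.active[build_id] = f"agent-{self.counter}"

    def finish(self, build_id):
        """The build is done: destroy its pod, then serve the queue."""
        pod = self.active.pop(build_id)
        self._dispatch()
        return pod


agents = ElasticAgents(max_pods=2)
for build in ("build-a", "build-b", "build-c"):
    agents.submit(build)
waiting = list(agents.queue)         # build-c waits: both pods are busy
released = agents.finish("build-a")  # its pod goes away, build-c gets a new one
```

No agent exists while nothing is building, and a finished build never leaves a pod behind for the next one to reuse.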

That is a brief introduction to its capabilities. After installation the plug-in also extends the pipeline syntax; we will look at an example later. But this alone is still not enough, because planning the Kubernetes cluster remains a problem. Say I have a cluster of 30 nodes with the Jenkins master deployed on it. After I install the plug-in and connect it to the cluster, whenever a new task arrives it starts a new Pod to run the build.

After execution, the Pod is destroyed automatically and no longer occupies cluster resources, so the cluster can normally run other tasks as well. But a problem remains: how large should the cluster be? When the cluster is not running builds, we can run other tasks on it, but if build tasks suddenly arrive while those are running, the two may compete for resources.

Kubernetes Autoscaler

Overall, there are still some gaps, and we can use some of the less common features of Kubernetes to close them. There are two: one is called Autoscaler, the other Virtual node. Let's look at Autoscaler first. Autoscaler is an official Kubernetes component, maintained under the Kubernetes organization, and it supports three capabilities:

Cluster Autoscaler, which can automatically scale cluster nodes.

Vertical Pod Autoscaler, which scales the resources of the cluster's Pods vertically. Kubernetes itself already has HPA for horizontal scaling of the number of Pods; the vertical feature is not yet ready for production.

Addon Resizer, which adjusts the resource allocation of add-ons on Kubernetes, such as the Ingress Controller and DNS, according to the number of Nodes.

Cluster autoscaler

What I want to focus on is Cluster Autoscaler, which scales the number of nodes in the cluster. This is how Autoscaler is implemented on Aliyun's container service. Let's look at this diagram, which shows a scenario in which HPA and Autoscaler are used together.

When HPA observes from monitoring events that resource utilization has risen past a threshold, it automatically tells the workload to start a new Pod. At that point cluster resources may be insufficient, so the Pod may stay pending. This triggers an Autoscaler event, and Autoscaler starts a new Node from the ESS template we configured earlier and adds it to the cluster automatically. It relies on the custom template feature of ESS and supports a variety of Node instance types: ordinary instances, GPU instances, and preemptible instances.
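The interaction between the two scalers can be sketched numerically. The replica formula below is the one given in the Kubernetes HPA documentation (ceiling of current replicas times the ratio of current to target metric); the sketch omits the tolerance and stabilization behaviour of the real controller. The node calculation is a deliberate simplification: `pods_per_node` is a hypothetical fixed capacity, whereas the real Cluster Autoscaler reacts to Pods it cannot schedule.

```python
import math


def desired_replicas(current_replicas, current_util, target_util):
    """HPA rule of thumb: scale replicas in proportion to metric / target."""
    return math.ceil(current_replicas * current_util / target_util)


def nodes_to_add(replicas, pods_per_node, current_nodes):
    """Simplified node scale-out: add nodes until every replica has a slot."""
    needed = math.ceil(replicas / pods_per_node)
    return max(0, needed - current_nodes)


# 4 replicas at 90% utilization with a 50% target -> ceil(7.2) = 8 replicas.
replicas = desired_replicas(current_replicas=4, current_util=90, target_util=50)

# 8 replicas at 3 pods per node need 3 nodes; with 2 present, 1 is added.
extra_nodes = nodes_to_add(replicas, pods_per_node=3, current_nodes=2)
```

HPA decides how many Pods there should be, and the node scaler only acts once those Pods no longer fit on the existing nodes.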

Virtual node

The second is Virtual node. Its implementation is based on Microsoft's open source Virtual Kubelet project, which creates a virtual Kubelet and registers it with the Kubernetes cluster. If that is hard to picture, think of MySQL proxy: it presents itself as a MySQL server, while behind it there may be many MySQL servers, and it can route or combine SQL queries for you automatically.

Virtual Kubelet does similar work. It registers itself with Kubernetes as a node, but behind it may be a large pool of resources across the whole public cloud. It may connect to the public cloud's ECI or to a VPC. This is a general diagram of it.

On Aliyun it connects to Aliyun's ECI (elastic container instances). Its response is very fast because no Node has to be added to the cluster, and it can start about 100 Pods per minute. We can also declare this resource usage on the Pod.

As just described, these two mechanisms let us rework our elastic CI/CD system. We no longer have to plan the cluster size far in advance; the cluster can scale automatically when needed. But once we have done this and moved Jenkins into containers, we run into some new hurdles: docker-outside-of-docker and docker in docker.

When we run Jenkins in Docker, there are usually two approaches. One is to mount the host's docker.sock into the container so that Jenkins talks to the local Docker daemon through this file and runs docker build to build images or push them to a remote repository. All images produced this way pile up on the local machine, which is a problem.

It also has limitations in some serverless scenarios, because serverless platforms do not allow socket files to be mounted. The other approach is docker in docker, which starts a new Docker daemon inside the container, so all intermediate and build artifacts are destroyed together with the container. Its problem is that it requires privileged mode.

Most of the time we try to avoid that. Another difference is that when docker build runs on the host, an image that already exists locally is used directly, whereas with docker in docker the image is pulled again every time, which takes a while. Which to use depends on the scenario.
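The difference between the two approaches shows up directly in the Pod spec. The sketch below builds both variants as plain Python dictionaries shaped like a Kubernetes Pod spec. The container name `builder` and the image tags are assumptions for illustration; the socket path is the conventional `/var/run/docker.sock`.

```python
def dood_pod():
    """docker-outside-of-docker: reuse the host daemon through its socket."""
    return {
        "containers": [{
            "name": "builder",                      # hypothetical name
            "image": "docker:latest",               # assumed client image
            "volumeMounts": [{"name": "docker-sock",
                              "mountPath": "/var/run/docker.sock"}],
        }],
        "volumes": [{"name": "docker-sock",
                     "hostPath": {"path": "/var/run/docker.sock"}}],
    }


def dind_pod():
    """docker in docker: run a private daemon, which needs privileged mode."""
    return {
        "containers": [{
            "name": "builder",                      # hypothetical name
            "image": "docker:dind",                 # assumed daemon image
            "securityContext": {"privileged": True},
        }],
        "volumes": [],
    }


def needs_privileged(pod_spec):
    """True if any container in the spec asks for privileged mode."""
    return any(c.get("securityContext", {}).get("privileged", False)
               for c in pod_spec["containers"])


def uses_host_socket(pod_spec):
    """True if the spec mounts something from the host file system."""
    return any("hostPath" in v for v in pod_spec["volumes"])


dood, dind = dood_pod(), dind_pod()
```

Each variant trades one constraint for the other: the first depends on a host path, which serverless platforms refuse, and the second depends on privileged mode.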

New build tool: Kaniko

This is where Kaniko, a new open source tool from Google, comes in. It does the same job as docker in docker, but its big advantage is that it does not depend on Docker, so it needs no privileged mode: it executes the commands of the Dockerfile in user space inside the container and builds the Docker image completely that way.
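As an illustration, a Kaniko build can be expressed as an ordinary Pod with no privileged flag and no host socket. The sketch below generates such a manifest as JSON. The `--dockerfile`, `--context`, and `--destination` arguments are Kaniko executor flags; the Pod name, the registry `registry.example.com`, and the use of the demo repository as a git build context are assumptions, and the registry credentials a real push would need are left out.

```python
import json


def kaniko_pod(name, context, destination, dockerfile="Dockerfile"):
    """Build a Pod manifest that runs the Kaniko executor once."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "kaniko",
                "image": "gcr.io/kaniko-project/executor:latest",
                "args": [
                    f"--dockerfile={dockerfile}",
                    f"--context={context}",
                    f"--destination={destination}",
                ],
                # Note: no securityContext and no docker.sock mount here.
            }],
        },
    }


manifest = kaniko_pod(
    name="webdemo-image-build",                            # hypothetical
    context="git://github.com/hymian/webdemo.git",         # assumed context
    destination="registry.example.com/webdemo:latest",     # hypothetical
)
print(json.dumps(manifest, indent=2))
```

Because the manifest asks for nothing special from the host, the same Pod could in principle run on an ordinary node or on a virtual node.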

This is closer to the elastic CI/CD system we want. At this point everything from the Jenkins nodes down to the underlying computing resources scales elastically, delivery requirements are met, and we can manage our resources at a very fine granularity.

Demo

Now let's take a look at the demo:

https://github.com/hymian/webdemo is an example I have prepared. Focus on its Jenkinsfile, which defines the agent's pod template. The template contains two containers, one for the golang build and one for the image build.

Now we build it. The build has just started. Because this environment has only one master, there is no build node yet. As you can see, a new Pod has been launched and added as a node, but because the Pod template specifies a label that no existing node carries, the Pod's state is pending. So the build log shows that the agent node is offline.

But we have configured auto scaling for this cluster. When no suitable node exists, a new node is allocated and joins automatically. You can see a node joining now. We have to wait a moment, perhaps a minute or two.

When there are not enough nodes, the cluster expands automatically and the new node joins, which takes a little while. It takes longer partly because the scale-out is triggered by polling, so there is a delay. The whole process may take about three minutes. Let me look at my servers. There were three just now, and this one has just been added; it is a preemptible instance.

It is shown as abnormal because the node is still being added to the cluster. Here is the view from the command line. OK, there are now four nodes, so one has been added. Looking at the Pods, the agent is being created. Notice one small detail: the Pod has three containers, although I defined only two in the template, as noted on the slide earlier.

The third is the JNLP container, which the plug-in injects automatically. Through it the build's intermediate status is reported to the master in real time, and the log is sent out. The agent node is initializing at this point, and now the slave node is running. The output has finished, and the build is complete.
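Why the Pod shows three containers when the template defines two can be sketched as follows. This is not the plug-in's implementation; the container names `golang` and `kaniko`, their images, and the `jenkins/inbound-agent` image are assumptions, with only the `jnlp` container name taken from the description above.

```python
def inject_jnlp(pod_template, agent_image="jenkins/inbound-agent"):
    """Return a copy of the template with a jnlp agent container added."""
    containers = list(pod_template["containers"])
    # Add the agent container only if the template did not define one.
    if not any(c["name"] == "jnlp" for c in containers):
        containers.append({"name": "jnlp", "image": agent_image})
    return {**pod_template, "containers": containers}


# Two containers, as in the demo's pod template (names are hypothetical).
template = {
    "containers": [
        {"name": "golang", "image": "golang:latest"},
        {"name": "kaniko", "image": "gcr.io/kaniko-project/executor:latest"},
    ]
}

pod = inject_jnlp(template)
names = [c["name"] for c in pod["containers"]]
```

The build steps run in the two containers the template defines, while the injected container is the one that keeps the connection to the master.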
