Sample Analysis of Cloud-Native Jenkins on AWS
We use Jenkins to build our continuous delivery pipeline and, like many other teams, we have created a lot of workflow and automation around it over the years. Jenkins has been key to our team's success, allowing us to deploy to production 677 times last quarter with an average build-and-deploy time of 12 minutes.
Most of our applications and infrastructure can be considered cloud native, but at the time the Jenkins service did not quite fit that description: it ran on a single server, many jobs ran directly on the master, and its manual configuration included secrets, plug-ins, scheduled tasks, and a general sprawl of startup hacks that had accumulated since it was first built in 2014.
Jenkins had not only become a single service and a single point of failure; tearing it down and rebuilding it had also become a significant business risk.
We decided something had to change. This post explains how we used tools such as Terraform, Packer, Docker, and Vault, together with AWS services such as ELB/ALB, Auto Scaling groups, and EFS, to make Jenkins cloud native, and what we learned along the way.
Jenkins state
The key question we had to face was: if we run the Jenkins service in a container or on an auto-scaled instance, what state do we need to be able to restore?
The answer is not straightforward. It is worth mentioning that a Jenkins special interest group (SIG) has catalogued all of the storage components that make up this Jenkins state. That was a great starting point, because we had to make sure that, at a minimum, every storage type listed there was accounted for.
A shortcut
This is not a new problem. Many teams run Jenkins in Docker containers, and the official Jenkins Docker image is well maintained. As the Jenkins Docker image documentation explains:
docker run -p 8080:8080 -p 50000:50000 -v jenkins_home:/var/jenkins_home jenkins/jenkins:lts
This will create the workspace in /var/jenkins_home. All Jenkins data, including plug-ins and configuration, is stored in this directory. Creating an explicit volume makes it easy to manage and to attach to another container for upgrades.
"the above example mounts the jenkins_home on the host, including all Jenkins states." The directory can then be stored on an external disk, such as an Kubernetes persistent storage volume. Or, if Jenkins is running on EC2, the directory can exist on an external EBS or EFS volume.
This is a valid approach, but we didn't feel it met our standards, because jenkins_home contains not only state but also configuration. Block storage has plenty of use cases, but needing a snapshot restore for a small configuration change did not seem like a good solution. It also merely shifts the problem: external storage does not avoid issues such as manual configuration and credentials sitting on the file system.
SCM to the rescue
In the past we used a Jenkins backup plug-in that commits configuration changes to source control so they can be recovered later. The idea behind the plug-in is great, but we decided against it because we couldn't easily control which data gets backed up, and the plug-in hasn't been updated since 2011.
In that case, what if we made jenkins_home a private Git repo and automatically committed changes made in Jenkins? The key is to exclude any binaries, secrets, or large files that are stored separately (described in more detail later). Our .gitignore file looks like this:
/saml-idp-metadata.xml
/saml-jenkins-keystore.jks
/saml-jenkins-keystore.xml
/saml-sp-metadata.xml
/scm-sync-configuration.success.log
/secret.key
/secret.key.not-so-secret
/secrets/
/updates/
/workspaces/
Almost all plain-text configuration is now persisted in Git. To provide this configuration to Jenkins, all we have to do is check out that repo at startup; things are starting to take shape.
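As a rough illustration, the startup step could be a small shell script along these lines; the repository URL, paths, and commit step are placeholders rather than our exact setup:

# Hypothetical sketch: restore plain-text Jenkins configuration from Git at boot.
JENKINS_HOME="/var/lib/jenkins"                                  # placeholder path
CONFIG_REPO="git@github.example.com:ops/jenkins-config.git"      # placeholder repo

if [ ! -d "$JENKINS_HOME/.git" ]; then
  # Fresh instance: check out the configuration repo into jenkins_home
  git clone "$CONFIG_REPO" "$JENKINS_HOME"
else
  # Existing checkout: just pick up the latest configuration
  git -C "$JENKINS_HOME" pull --ff-only
fi

# A scheduled job can push local configuration changes back, e.g.:
#   git -C "$JENKINS_HOME" add -A && git -C "$JENKINS_HOME" commit -m "config change" && git -C "$JENKINS_HOME" push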
Secrets
Jenkins has to access a lot of systems, which means we need secure secret storage. Since we are heavy users of HashiCorp Vault, it was the natural choice, but unfortunately Vault does not cover every scenario. For example, the scm-branch-source pipeline plug-in requires SCM authentication credentials and defaults to the Jenkins credentials plug-in; retrieving these dynamically from Vault on every repository sync would be error-prone and would take extra effort to maintain.
That's why we use a mix of Vault and Jenkins credentials (a rough sketch of the boot steps follows below):
1. At instance startup, Jenkins authenticates to Vault using the IAM authentication method.
2. A boot script retrieves master.key and the other encryption keys used by the Jenkins credentials plug-in. Please refer to this article for more details.
3. The credentials stored in jenkins_home/credentials.xml can now be decrypted and accessed by Jenkins.
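A minimal sketch of steps 1 and 2, assuming the Vault CLI is available on the instance; the Vault address, role name, KV paths, and field names are placeholders:

# Hypothetical boot script: authenticate to Vault via the instance's IAM role,
# then restore the encryption keys Jenkins needs before it starts.
export VAULT_ADDR="https://vault.internal.example.com"    # placeholder address

# Step 1: log in with the AWS IAM auth method
vault login -method=aws role=jenkins-master

# Step 2: restore master.key and the credentials plug-in key material (placeholder paths/fields)
mkdir -p "$JENKINS_HOME/secrets"
vault kv get -field=master_key secret/jenkins/keys > "$JENKINS_HOME/secrets/master.key"
vault kv get -field=hudson_util_secret_b64 secret/jenkins/keys | base64 -d > "$JENKINS_HOME/secrets/hudson.util.Secret"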
Completely replacing the credential plug-in with Vault is something we may explore in the future, but we are happy that this approach meets the security requirements and can be easily integrated with the rest of Jenkins.
Job and workspace data
This is where things get tricky: jenkins_home/jobs and jenkins_home/workspaces contain a mixture of unstructured data, artifacts, and plain text. This information is valuable and helps us audit and understand previous pipeline builds. The builds are large and not suitable for SCM synchronization, so both directories are excluded via .gitignore.
So where do we store this data? We believe shared file storage suits it best. As heavy users of AWS, EFS made perfect sense: it is scalable, highly available, and accessible over the network, which makes it easy to use. We provisioned the AWS EFS resources with Terraform and set up a regular backup plan with the AWS Backup service.
At startup, we mount the EFS volume, symlink jenkins_home/jobs and jenkins_home/workspaces to directories on EFS, and then start the Jenkins service.
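As an illustration, the mount-and-symlink step could look roughly like this, assuming the amazon-efs-utils mount helper is installed; the file system ID and mount point are placeholders:

# Hypothetical startup sketch: mount EFS and point job/workspace data at it.
EFS_ID="fs-0123456789abcdef0"        # placeholder file system ID
MOUNT_POINT="/mnt/jenkins-data"

mkdir -p "$MOUNT_POINT"
mount -t efs "$EFS_ID":/ "$MOUNT_POINT"

for dir in jobs workspaces; do
  mkdir -p "$MOUNT_POINT/$dir"
  rm -rf "${JENKINS_HOME:?}/$dir"                 # replace the local directory...
  ln -s "$MOUNT_POINT/$dir" "$JENKINS_HOME/$dir"  # ...with a symlink onto EFS
done

systemctl start jenkins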
From then on, the Jenkins service is the only interface that reads and writes job and workspace data. It is worth mentioning that we have a Jenkins job that periodically deletes job and workspace data older than a few weeks so that it does not grow indefinitely.
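The cleanup job itself can be little more than a scheduled shell step along these lines; the retention window and paths are illustrative:

# Hypothetical cleanup sketch: remove build records and workspaces untouched for ~30 days
find /mnt/jenkins-data/jobs -mindepth 3 -maxdepth 3 -type d -path '*/builds/*' -mtime +30 -exec rm -rf {} +
find /mnt/jenkins-data/workspaces -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +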
Jenkins as code with Packer and Terraform
You might be wondering how all of this fits together; I haven't even said where we run Jenkins! We use Kubernetes extensively and spent some time considering running Jenkins as a container, but in the end we decided to use Packer and EC2 for the Jenkins master, with ephemeral EC2 instances running the jobs.
While the idea of running both the master and the workers as containers is appealing, we couldn't find a good home for Jenkins in our existing Kubernetes clusters, and building a new cluster just for Jenkins felt like overkill. We also wanted this critical piece of infrastructure to stay decoupled from the rest of our services: if a Kubernetes upgrade ever broke our applications, we would at least still have Jenkins available to roll back. Finally, there is the "Docker in Docker" problem; it has workarounds, but they add complexity, and we rely heavily on Docker commands in our builds.
Its architecture is as follows:
Using EC2 instances also made the transition smoother: we were already running pipeline work on ephemeral worker nodes through the Jenkins EC2 plug-in, and that logic is invoked from declarative pipeline code, so moving to Docker agent nodes later without refactoring remains an option. The rest of the work was Packer and Terraform code, which we were already familiar with.
Plug-ins
Because plug-ins are state too! One of the goals of this project was to audit and manage plug-ins better. When managed manually, plug-ins can easily get out of control, making it hard to know when and why a given plug-in was installed.
Most Jenkins-level plug-in configuration lives in the usual Jenkins XML configuration files, but installing a plug-in also drops JAR artifacts, metadata, images, and other files into the jenkins_home/plugins directory.
One option is to store plug-ins on EFS, but we want to keep EFS usage to a minimum, and it wouldn't solve the problem anyway, it would only move it. That is why we chose to bake plug-in installation into the AMI with Packer.
Basically, in our AMI definition, there is a plug-in file that lists plug-ins and versions, roughly as follows:
# Datadog Plugin required to send build metrics to Datadog
datadog:0.7.1
# Slack Plugin required to send build notifications to Slack
slack:2.27
Then our AMI provisioning script parses the file and installs each plug-in at the selected version with the Jenkins CLI:
# Wrapper function for jenkins_cli
jenkins_cli() {
  java -jar "$JENKINS_CLI_JAR" -http -auth "${user}:${pw}" "$@"
}

for plugin in "${plugins[@]}"; do
  echo "Installing $plugin"
  jenkins_cli install-plugin "$plugin" -deploy
done
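The plugins array used in that loop can be built by stripping comments and blank lines from the plug-in file, for example (the file name plugins.txt is an assumption):

# Read plugin:version entries, ignoring comment and blank lines
mapfile -t plugins < <(grep -Ev '^[[:space:]]*(#|$)' plugins.txt)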
From then on, installing a new plug-in or upgrading the version of one that is already installed requires a GitHub pull request, which triggers the build of a new AMI. Perfect!
Installing other software
By definition, Jenkins needs a lot of software installed in order to build, test, and deploy. First of all, we don't want the master node to run any jobs, so we avoid installing any job-related software on it. The master's main role is to serve the UI and orchestrate builds on short-lived worker nodes.
That means the required tools could be installed on the worker nodes, but we decided to rely on docker run as much as possible, because we are a polyglot organization working in Scala, Java, Node, Golang, Python, and more. Maintaining build tools for all of these stacks would make the worker node setup complicated.
Taking JavaScript as an example, we want Jenkins to run yarn commands such as install and test against the app. We simply mount the checked-out repo directory into a Docker container as a volume and run the commands inside that container. Here is an example in Groovy pipeline code:
def node(command, image) {
    def nodeCmd = [
        'docker run -i --rm',
        '-u 1000:1000',   // Run as non-root user
        '-v ~/.npmrc:/home/node/.npmrc:ro',
        '-v ~/.yarn:/home/node/.yarn',
        '-e YARN_CACHE_FOLDER=/home/node/.yarn/cache',
        "-v ${env.WORKSPACE}:/app",
        '--workdir /app',
        "${image}"
    ].join(' ')
    sh "${nodeCmd} ${command}"
}
Then, after checking out the repository, we can call this function:
checkout scm
node('yarn install --frozen-lockfile', 'node:12.6.0-alpine')
A nice result: we don't have to install and maintain multiple versions of our build tools on the worker machines, beyond the Docker daemon and kubectl. We also get build commands that behave the same locally and in CI, because the same Docker image is used in both.
Keep dependency caching in mind when building on ephemeral nodes. For example, after a worker node was rebuilt we lost the sbt cache, which slowed builds down because the cache had to be repopulated, and if external dependencies are unavailable this can even cause failures. We decided to cache these dependencies on another external EFS volume to get faster, more reliable builds.
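In practice this just means mounting the cache locations from the EFS volume into the build container, roughly like the following; the paths and image name are placeholders:

# Hypothetical example: share sbt/ivy caches kept on an EFS mount with the build container
docker run -i --rm \
  -v /mnt/build-cache/.ivy2:/root/.ivy2 \
  -v /mnt/build-cache/.sbt:/root/.sbt \
  -v "$WORKSPACE":/app \
  --workdir /app \
  my-sbt-image:latest sbt test    # image name is a placeholder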
Conclusion
Jenkins is a great tool, but it falls short when it comes to managing external state, which makes it hard to run Jenkins in a cloud-native way. Our approach is not perfect, but we believe it combines the best of both worlds while keeping things secure, simple, and flexible. Satisfyingly, after completing the project and migrating all production builds to the new Jenkins service, we were able to terminate the master server and let auto scaling rebuild it within a few minutes without affecting any previously stored state.