This article introduces the details that are easy to overlook when deploying Kubernetes applications, and walks through simple, practical ways of handling each of them.
1. Configure Pod requests and limits
Let's start by configuring a simple environment that can run Pods. Kubernetes does a good job of handling Pod scheduling and failure states, but we also learned that deployment can become challenging when the scheduler cannot accurately measure how many resources a Pod needs in order to run successfully. That challenge is the reason the resource request and limit mechanism exists. There is still plenty of debate about best practices for setting application requests and limits; in practice the work is more art than science. Here is how we think about the problem internally at GumGum:
Pod requests: the main metric the scheduler uses to decide where a Pod is best placed.
Let's take a look at the description in the Kubernetes documentation:
The filtering step finds the set of nodes where it is feasible to schedule the Pod. For example, the PodFitsResources filter checks whether a candidate node has enough available resources to satisfy the Pod's specific resource requests.
Internally, we use requests to express our estimate of the resources the application needs when serving its real workload, so that the scheduler can place Pods on nodes more sensibly. Initially we set requests high to make sure every Pod had plenty of resources, but we quickly found that this greatly increased scheduling time and left some Pods unable to be scheduled at all. The result was much like what we saw when we specified no resource requests at all: in that case, because the control plane does not know how many resources the application needs, the scheduler often "evicts" Pods and does not reschedule them. Requests are such a key part of the scheduling algorithm that getting them wrong made it impossible for us to get the scheduling behavior we wanted.
Pod limits: a hard cap on each Pod, representing the maximum amount of resources the cluster allows each container to consume.
Also look at the description in the official documentation:
If you set a memory limit of 4GiB for a container, kubelet (and the container runtime) enforces that limit. The runtime prevents the container from using more than the configured resource capacity. For example, when a process in the container tries to consume more than the allowed amount of memory, the system kernel terminates the process that attempted the allocation with an out of memory (OOM) error.
A container may actually use more resources than it requests, but never more than its configured limit. Clearly, setting limits correctly is hard, but it is also very important. Ideally, we want a Pod's resource requirements to be able to change over the life of the process without interfering with other processes on the system, which is exactly what limits are for. Unfortunately, we cannot give you a single most appropriate value; we simply follow this process to tune them:
Using load-testing tools, we simulate baseline traffic levels and observe the Pod's resource usage (both memory and CPU).
We set the Pod request very low, keep the Pod resource limit at roughly 5 times the request, and then observe the behavior. When the request is too low, the process often fails to start and throws cryptic Go runtime errors.
One point worth emphasizing is that the tighter the resource limits, the harder the Pod is to schedule, because scheduling requires a target node with enough free resources. For example, if your resources are very limited (only 4GB of memory), even running a lightweight web server process is likely to be difficult. In that case you need to scale out, and every new container must run on a node that also has at least 4GB of memory available; if no such node exists, a new node has to be added to the cluster to host the Pod, which inevitably increases startup time. In short, be sure to find the smallest workable gap between resource requests and limits so that scaling stays fast and balanced; a minimal example of the resulting configuration is sketched below.
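As a minimal sketch of where that process usually lands (the container name, image, and values are illustrative, not GumGum's actual settings), a container spec with a low request and a limit of roughly five times the request could look like this:

containers:
- name: web
  image: example/web:1.0          # placeholder image
  resources:
    requests:
      cpu: 250m                   # what the scheduler uses to place the Pod
      memory: 256Mi
    limits:
      cpu: 1250m                  # roughly 5x the request
      memory: 1280Mi              # exceeding this memory limit ends in an OOM kill

If the process fails to start or throws runtime errors at these values, raise the request and run the load test again.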
2. Configure Liveness and Readiness probes
Another interesting topic that comes up often in the Kubernetes community is how to configure Liveness and Readiness probes. Used well, these two probes give us a mechanism for running fault-tolerant software and minimizing downtime. Configured badly, however, they can seriously hurt application performance. Let's look at what the two probes do and how to decide how to use them:
Liveness probe: "Indicates whether the container is running. If the Liveness probe fails, kubelet kills the container, and the container is subjected to its restart policy. If a container does not provide a Liveness probe, the default state is considered Success." - Kubernetes documentation
Liveness probes must be cheap to run, because they run frequently and need to tell Kubernetes the application is alive. Note that if you set one to run every second, the application has to absorb one extra request per second, so think carefully about how those extra requests, and the resources to serve them, will be handled. At GumGum, we set the Liveness probe to respond as soon as the main components of the application are running, regardless of whether the data is fully available yet (for example data from a remote database or cache). In practice we expose a dedicated "health" endpoint in the application that simply returns a 200 response code; as long as it keeps responding, the process has started and can handle requests (even though it is not yet formally receiving traffic).
Readiness probe: "Indicates whether the container is ready to serve requests. If the Readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod." - Kubernetes documentation
Readiness probes are much more expensive to run, because their purpose is to keep telling the backend that the entire application is up and ready to receive requests. There is plenty of debate in the community about whether this probe should touch the database. Given the overhead of the Readiness probe (it runs frequently, though the frequency can be tuned), we decided that in some applications we would only start "serving traffic" once records were actually being returned from the database. By designing the Readiness probe carefully, we have been able to reach higher levels of availability and zero-downtime deployments.
If your application's Readiness probe really does need to check that database requests are being served, keep the resources used by the query as small as possible, for example:
SELECT small_item FROM table LIMIT 1
Here are the configuration values we specified for these two probes in Kubernetes:
livenessProbe:
  httpGet:
    path: /api/liveness
    port: http
readinessProbe:
  httpGet:
    path: /api/readiness
    port: http
  periodSeconds: 2
You can also add some other configuration options:
initialDelaySeconds - how many seconds after the container starts before the probe actually begins to run
periodSeconds - the waiting interval between two consecutive probes
timeoutSeconds - how many seconds before a probe attempt times out and is counted as a failure; equivalent to a traditional timeout setting
failureThreshold - how many times the probe must fail before a restart signal is sent to the Pod
successThreshold - how many times the probe must succeed before the Pod is considered ready again (typically after Pod startup or failure recovery)
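As a sketch of how these options fit together (the values here are illustrative, not recommendations), a Readiness probe combining them could look like this:

readinessProbe:
  httpGet:
    path: /api/readiness
    port: http
  initialDelaySeconds: 10   # start probing 10 seconds after the container starts
  periodSeconds: 2          # probe every 2 seconds
  timeoutSeconds: 1         # a probe taking longer than 1 second counts as a failure
  failureThreshold: 3       # 3 consecutive failures take the Pod out of service
  successThreshold: 1       # 1 success marks the Pod as ready again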
3. Set the default Pod network policy
Kubernetes uses a "flat" network topology: by default, all Pods can communicate with each other directly. In many real use cases this ability is unnecessary or even unacceptable. A potential security risk is that if one vulnerable application is exploited, the attacker gains full reach and can send traffic to every Pod on the network. It therefore makes sense to apply the principle of least privilege to the Pod network as well, ideally using network policies to specify exactly which containers are allowed to connect to each other.
Taking the following simple policy as an example, you can see that it will deny all ingress traffic in a particular namespace:
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress
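With that default deny in place, traffic can then be opened up selectively. As a hedged sketch (the app labels below are invented for illustration), the following policy allows ingress to backend Pods only from frontend Pods in the same namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend            # the Pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend       # only these Pods may connect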
4. Perform custom behavior through Hooks and Init containers
One of the core goals we hope to achieve with Kubernetes is to give our existing developers something close to zero-downtime deployments. Different applications have different shutdown procedures and cleanup processes, however, so the overall goal of zero downtime is hard to reach. The first difficulty we faced was with Nginx: we noticed that when a rolling deployment of the Pods started, active connections were dropped before they could terminate cleanly. After extensive research online, it turned out that Kubernetes does not wait for Nginx to drain its connections before terminating the Pod. With a pre-stop hook, we were able to inject this behavior ourselves and achieve zero downtime:
lifecycle:
  preStop:
    exec:
      command: ["/usr/local/bin/nginx-killer.sh"]
And nginx-killer.sh:
#!/bin/bash
sleep 3
PID=$(cat /run/nginx.pid)
nginx -s quit
while [ -d /proc/$PID ]; do
    echo "Waiting while shutting down nginx..."
    sleep 10
done
Another practical example is handling application-specific startup tasks with an Init container. Some popular Kubernetes projects, such as Istio, also use init containers, for example to inject Envoy handling code into Pods. Init containers are especially useful when a heavyweight database migration has to finish before the application starts, and you can give that step a higher resource limit so it is not constrained by the limits set for the main application, as sketched below.
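A hedged sketch of that migration pattern (the image and command are placeholders, not a specific project's setup): the migration runs in an init container with its own, more generous limit, and the main container only starts once it completes.

initContainers:
- name: db-migrate
  image: example/app:1.0                        # placeholder image
  command: ["python", "manage.py", "migrate"]   # placeholder migration command
  resources:
    limits:
      cpu: "2"                # a higher limit, used only for this one-off step
      memory: 2Gi
containers:
- name: app
  image: example/app:1.0
  resources:
    limits:
      cpu: 500m
      memory: 512Mi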
Another common pattern is to give the init container access to secrets: the init container publishes the resulting credentials to the main Pod, so the Secret itself never needs to be exposed to the main application container. Again, from the documentation:
Init containers can safely run utilities or custom code that would otherwise reduce the security of the application container image. By keeping these unnecessary tools separate, you can limit the attack surface of the application container image.
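One way to sketch that credentials pattern (the Secret name, paths, and images below are assumptions for illustration, not a prescribed layout): only the init container mounts the Secret, and it publishes just the credentials the application needs through a shared in-memory volume.

volumes:
- name: raw-secret
  secret:
    secretName: app-credentials     # placeholder Secret name
- name: shared-creds
  emptyDir:
    medium: Memory                  # credentials never touch disk
initContainers:
- name: fetch-creds
  image: alpine:3.10
  command: ["sh", "-c", "cp /secrets/token /creds/token"]   # publish only what the app needs
  volumeMounts:
  - name: raw-secret
    mountPath: /secrets
  - name: shared-creds
    mountPath: /creds
containers:
- name: app
  image: example/app:1.0            # placeholder image
  volumeMounts:
  - name: shared-creds              # the main container never mounts the Secret itself
    mountPath: /creds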
5. Kernel tuning
Finally, let's talk about a more advanced technique. Kubernetes itself is a highly flexible platform that helps you run workloads in the way that suits them best. At GumGum we have a number of high-performance applications with extremely demanding resource requirements. After extensive load testing, we found that one application struggled to handle the necessary traffic load with the Kubernetes default settings. Kubernetes, however, allows us to run a privileged container that modifies kernel runtime parameters for a specific Pod. Using the following sample code, we raised the maximum number of open connections for the Pod:
initContainers:
- name: sysctl
  image: alpine:3.10
  securityContext:
    privileged: true
  command: ['sh', '-c', "sysctl -w net.core.somaxconn=32768"]
This is an advanced technique that is rarely needed. If your application struggles to stay healthy under heavy load, though, you may need to adjust some of these parameters; the official documentation covers the tunable parameters and their possible values in detail.
That concludes this look at the details that are easy to overlook when deploying Kubernetes applications. Pairing the theory above with hands-on practice is the best way to make it stick, so go and try it.