
How to Achieve Kubernetes Observability Monitoring

2025-07-11 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article addresses how to achieve Kubernetes observability monitoring, with a detailed analysis and walkthrough intended to help readers facing this problem find a simpler, more practical approach.

We'll show you how to accomplish a basic Kubernetes observability task: getting golden metrics (also called golden signals) from an application running on a Kubernetes cluster. We don't need to modify any code or write any configuration; installing Linkerd, an open source, ultralight service mesh, is enough. Along the way we describe what a service mesh is, what the term observability means, and how the two relate to each other in the context of Kubernetes.

Monitoring Kubernetes applications with a service mesh

If you have just gotten comfortable with Kubernetes, congratulations! But what do you need to do next? One of the first observability tasks for any Kubernetes user is monitoring: you need to know when problems occur so that you can fix them quickly.

Kubernetes observability is a very broad topic, and there is plenty of discussion online about the nuances that separate observability from monitoring, distributed tracing, and logging. In this article, we focus on one basic problem: getting golden metrics (golden signals) from an application running on a cluster without changing any code. We will install Linkerd, an open source, ultralight service mesh. Unlike most service meshes, Linkerd takes only a few minutes to install on a cluster and requires no configuration.

Simple as it is, Linkerd contains a very powerful metrics pipeline. Once installed, it automatically instruments and reports success rates, traffic levels, and response latencies by observing the HTTP (or gRPC) and TCP communication between all components running on the cluster.

Linkerd automatically reports the metrics that are often referred to as the golden metrics of a service.

What are golden metrics, and why are they important?

If you already know what the golden metrics are, feel free to skip this section!

The golden metrics, or golden signals, are the first metrics you need in order to know whether an application is up and running as expected. They give you a rough signal about the health of a service without requiring any knowledge of what the service actually does.

Cindy Sridharan wrote in her blog post on monitoring and observability that monitoring data, when not used directly to drive alerts, should be optimized to provide a bird's-eye view of the overall health of the system.

The golden metrics defined by the Google SRE book are:

Latency: a measure of how fast a service is. It is the time taken to serve a request, usually measured in percentiles. A 99th percentile latency of 5 ms means that 99% of requests are served in 5 ms or less.

Traffic: an indication of how busy a service is, that is, how much demand is placed on it. It is usually measured in requests per second.

Errors: the number of failed requests. It is usually combined with total traffic to produce a success rate, the ratio of successful requests to total requests.

Saturation: a measure of how loaded your system is.
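To make the latency and error definitions concrete, here is a minimal Python sketch that computes a nearest-rank percentile and a success rate from a handful of request records. The sample data and the helper names (percentile, success_rate) are made up for illustration; they are not part of Linkerd or of the SRE book.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

def success_rate(successes, total):
    """Ratio of successful requests to all requests, as a fraction."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successes / total

# Hypothetical sample: (latency in ms, succeeded?) for ten requests.
requests = [(2, True), (3, True), (3, True), (4, True), (4, True),
            (5, True), (5, True), (5, False), (6, True), (40, False)]

latencies = [ms for ms, _ in requests]
ok = sum(1 for _, succeeded in requests if succeeded)

p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
rate = success_rate(ok, len(requests))
print(f"p50={p50}ms p99={p99}ms success_rate={rate:.0%}")
```

Note how a single slow request dominates the 99th percentile while leaving the median untouched, which is why latency is usually reported at several percentiles rather than as one average.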

By observing the traffic to a service, Linkerd can readily provide measurements of latency, traffic, and errors, with the nice property that errors are reported in the form of a success rate. (The fourth metric, saturation, is often skipped in monitoring discussions because it requires knowledge of the internals of the service and usually tracks other metrics, such as traffic and latency.)

These metrics are sometimes referred to as the RED metrics for a service:

Rate: the number of requests your service is handling per second.

Errors: the number of failed requests per second.

Duration: the distribution of the time each request takes.
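As a rough illustration of the RED definitions, the following Python sketch summarizes rate, errors, and duration for the requests seen in one time window. The Request type, the red_metrics helper, and the sample values are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from statistics import median

@dataclass
class Request:
    duration_ms: float  # time taken to serve the request
    failed: bool        # True if the request ended in an error

def red_metrics(requests, window_seconds):
    """Summarize Rate, Errors, and Duration for one time window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    durations = sorted(r.duration_ms for r in requests)
    failures = sum(1 for r in requests if r.failed)
    return {
        "rate_rps": len(requests) / window_seconds,
        "errors_rps": failures / window_seconds,
        "duration_ms": {
            "min": durations[0] if durations else None,
            "median": median(durations) if durations else None,
            "max": durations[-1] if durations else None,
        },
    }

# Hypothetical sample: six requests observed during a 2-second window.
window = [Request(10, False), Request(12, False), Request(15, False),
          Request(20, False), Request(30, True), Request(90, False)]
summary = red_metrics(window, window_seconds=2)
print(summary)
```

Duration is kept as a small distribution summary (minimum, median, maximum) rather than a single number, since the RED definition asks for the distribution of request times.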

No matter what you call them, the beauty of Linkerd is that it not only records these metrics from traffic, but also aggregates and reports them so that we can consume them easily. (We will see how below.) This enables us to monitor our applications. Once we can monitor an application, we can receive an alert when something goes wrong, study its long-term performance, and test and improve its reliability and performance.
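The alerting idea can be sketched in a few lines of Python. This is only an illustration of a threshold rule over periodically sampled success rates; the should_alert function, the 0.95 threshold, and the sample series are assumptions, not anything Linkerd ships.

```python
def should_alert(success_rates, threshold=0.95, consecutive=3):
    """Return True if the most recent `consecutive` samples are all
    below `threshold`. Requiring several samples in a row avoids
    alerting on a single brief dip."""
    if consecutive <= 0:
        raise ValueError("consecutive must be positive")
    if len(success_rates) < consecutive:
        return False
    return all(rate < threshold for rate in success_rates[-consecutive:])

# Hypothetical success-rate samples, oldest first.
healthy = [0.99, 0.98, 0.94, 0.99]   # one brief dip, then recovery
degraded = [0.99, 0.93, 0.90, 0.85]  # sustained drop

print(should_alert(healthy))   # False
print(should_alert(degraded))  # True
```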

Golden metrics the easy way: get access to a Kubernetes cluster and install the Linkerd CLI

Let's assume that you have a functioning Kubernetes cluster and a kubectl command pointing to it. In this section, we'll take you through an abbreviated version of the Linkerd getting started guide to install Linkerd and a demo application (from which we'll get golden metrics) on this cluster.

First, install the Linkerd command-line interface (or download it directly from the Linkerd releases page):

curl -sL https://run.linkerd.io/install | sh

export PATH=$PATH:$HOME/.linkerd2/bin

Verify that your Kubernetes cluster can run Linkerd, install Linkerd, and verify the installation:

linkerd check --pre

linkerd install | kubectl apply -f -

linkerd check

Finally, install the Emojivoto demo application, which is the application we want golden metrics for. If you look closely at the following command, you will see that we are actually adding Linkerd to the application (which we call injecting) and then deploying the application to Kubernetes. (If you want to know how this works, see the documentation at https://linkerd.io/2/tasks/adding-your-service/.)

curl -sL https://run.linkerd.io/emojivoto.yml \
  | linkerd inject - \
  | kubectl apply -f -

Yes, that's it. That is all it takes to instrument your application and get access to your golden metrics! Now let's take a look at them.

View metrics in Grafana

Want to see all these useful charts and dashboards? No problem! Run linkerd dashboard --show grafana and open the link in the command output. You will see Linkerd's top-level dashboard, which contains an overview of the metrics it collects and a breakdown by namespace. Scroll down to our application's namespace (ns/emojivoto) and observe the charts for it.

View metrics through the Linkerd CLI

We can also use the linkerd stat command to view the metrics of the application.

All of this data can also be found in Linkerd's dashboard, which you can access by running linkerd dashboard.

Looking at the Grafana charts (or the Linkerd dashboard), you can immediately see that the voting service is not doing very well: its success rate is quite low! Adding golden metrics to our application immediately lets us see problems as they arise in the application.

Is it really that simple? Yes! All we need to do is install Linkerd and inject it into our application. Under the hood, when Linkerd is added to a service, it automatically instruments any HTTP and gRPC calls to and from the service's pods. Because it can parse these protocols, it can record the response class and latency of these calls and aggregate them; in this case, they are aggregated into a small internal instance of a time series database called Prometheus. When you view golden metrics through Linkerd's dashboard and CLI, Linkerd fetches them from this internal Prometheus instance, giving you all of these metrics without modifying the application code.
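The record-and-aggregate step described above can be modeled with a small Python sketch. It is a simplified illustration, not Linkerd's implementation: the CallAggregator class, the bucket bounds, and the rule that HTTP 5xx responses count as failures are assumptions made for this example. Latencies are counted in histogram buckets, in the spirit of how a time series database stores latency distributions, so a percentile can only be estimated as a bucket bound.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical bucket upper bounds in milliseconds; values above the
# last bound fall into an extra overflow bucket.
BUCKETS_MS = [1, 5, 10, 50, 100, 500, 1000]

class CallAggregator:
    def __init__(self):
        self.by_class = Counter()
        self.bucket_counts = [0] * (len(BUCKETS_MS) + 1)

    def record(self, status_code, latency_ms):
        # Assumption for this sketch: HTTP 5xx responses are failures.
        response_class = "failure" if status_code >= 500 else "success"
        self.by_class[response_class] += 1
        # Index of the first bucket whose bound is >= latency_ms.
        self.bucket_counts[bisect_left(BUCKETS_MS, latency_ms)] += 1

    def success_rate(self):
        total = sum(self.by_class.values())
        return self.by_class["success"] / total if total else None

    def latency_upper_bound(self, quantile):
        """Upper bound of the bucket that contains the given quantile."""
        total = sum(self.bucket_counts)
        if total == 0:
            return None
        running = 0
        for i, count in enumerate(self.bucket_counts):
            running += count
            if running >= quantile * total:
                return BUCKETS_MS[i] if i < len(BUCKETS_MS) else float("inf")

# Hypothetical observed calls: (HTTP status, latency in ms).
agg = CallAggregator()
for status, ms in [(200, 3), (200, 4), (200, 8), (500, 40), (200, 2),
                   (503, 120), (200, 7), (200, 9), (200, 4), (200, 6)]:
    agg.record(status, ms)

print(agg.success_rate(), agg.latency_upper_bound(0.5), agg.latency_upper_bound(0.99))
```

Because only counters and bucket counts are kept, the aggregator's memory use stays constant no matter how many calls are observed, at the cost of percentile precision.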

What else can Linkerd do?

We've seen how to use Linkerd to get golden metrics, which is the first step toward system observability, that is, getting a high-level view of what is going on in a complex application. But metrics are just the beginning. As you continue your journey into monitoring and observability, you are sure to encounter two other common tools: logging and distributed tracing.

Distributed tracing involves instrumenting the application to measure how long a request spends in each service. When an application is made up of many microservices that communicate with each other, tracing is a good tool for debugging slow requests and finding out which services are the bottleneck. Linkerd can help with distributed tracing, although there is less that a service mesh can do for distributed tracing than you might expect.

Similar to distributed tracing, Linkerd also provides a powerful dynamic request tracing tool, tap. The tap command is like tcpdump for microservices: it lets you view a sample of the live requests sent to or from a particular service. Tap is a powerful tool for debugging Kubernetes services in production.

Finally, application logs are, of course, one of the first things developers turn to when they suspect that a particular process is not working properly. When running a service mesh, it is sometimes useful to see what is going on inside the mesh. Although Linkerd can't provide you with application logs, the linkerd logs command provides an easy way to at least see what's going on inside Linkerd.

