2025-02-24 Update From: SLTechnology News&Howtos
Construction of EFK (Elasticsearch, Fluentd, Kibana)
Elasticsearch is a real-time, distributed, and scalable search engine that supports full-text and structured search. It is commonly used to index and search large volumes of log data, as well as many other types of documents.
Elasticsearch is usually deployed together with Kibana, a powerful data-visualization dashboard for Elasticsearch that lets you browse Elasticsearch log data through a web interface.
Fluentd is a popular open-source data collector. We will install Fluentd on the Kubernetes cluster nodes, where it will read the container log files, filter and transform the log data, and then deliver the data to the Elasticsearch cluster, where it is indexed and stored.
We will start by configuring and launching a scalable Elasticsearch cluster, then create a Kibana application in the Kubernetes cluster, and finally deploy Fluentd as a DaemonSet so that one Pod runs on every Kubernetes worker node.
1. Create an Elasticsearch cluster
Before creating an Elasticsearch cluster, let's create a namespace.
Create a new kube-efk.yaml
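A minimal sketch of kube-efk.yaml: it only declares the namespace, named logging to match the elasticsearch.logging.svc.cluster.local domain used later in this article.

```yaml
# kube-efk.yaml — namespace that will hold all EFK components
apiVersion: v1
kind: Namespace
metadata:
  name: logging
```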
kubectl apply -f kube-efk.yaml
Then run kubectl get ns to confirm that the new namespace exists.
Here we use three Elasticsearch Pods to avoid the "split brain" problem in a highly available multi-node cluster. A split brain occurs when one or more nodes cannot communicate with the others and several master nodes end up being elected.
One key point is the parameter discovery.zen.minimum_master_nodes=N/2+1, where N is the number of master-eligible nodes in the Elasticsearch cluster. We have three nodes here, so this parameter should be set to 2. That way, if one node is temporarily disconnected from the cluster, the other two nodes can elect a new master, and the cluster can keep running while the lost node tries to rejoin. Keep this parameter in mind when scaling the Elasticsearch cluster.
First, create a headless service named elasticsearch, and create a new file, elasticsearch-svc.yaml, with the following contents:
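A sketch of elasticsearch-svc.yaml consistent with the description that follows:

```yaml
# elasticsearch-svc.yaml — headless Service for the Elasticsearch StatefulSet
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch   # selects the Elasticsearch Pods
  clusterIP: None        # headless: DNS returns the Pod A records directly
  ports:
  - port: 9200
    name: rest           # REST API
  - port: 9300
    name: inter-node     # node-to-node communication
```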
This defines a Service named elasticsearch with the selector app=elasticsearch. When we associate the Elasticsearch StatefulSet with this Service, the Service will return DNS A records pointing at the Elasticsearch Pods labeled app=elasticsearch. Setting clusterIP: None makes the Service headless. Finally, we define ports 9200 and 9300, used for the REST API and for inter-node communication, respectively.
And then we create this headless service.
kubectl apply -f elasticsearch-svc.yaml
Now that we have a headless Service and a stable domain name, elasticsearch.logging.svc.cluster.local, for the Pods, let's create the Elasticsearch Pods themselves through a StatefulSet.
A Kubernetes StatefulSet gives each Pod a stable identity and persistent storage. Elasticsearch needs stable storage so that its data survives rescheduling and restarts, which is why we manage the Pods with a StatefulSet.
We use a StorageClass object called es-data-db, so we need to create this object in advance. We use NFS here as the storage backend, so we need to install a corresponding provisioner driver.
Let's create elasticsearch-storageclass.yaml first.
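A sketch of elasticsearch-storageclass.yaml; the provisioner string is an assumption and must match whatever name your NFS provisioner driver registers:

```yaml
# elasticsearch-storageclass.yaml — StorageClass backed by NFS
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: es-data-db
provisioner: example.com/nfs   # assumption: replace with your NFS provisioner's name
```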
Then we create a PVC that uses this StorageClass:
elasticsearch-pvc.yaml
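A sketch of elasticsearch-pvc.yaml, a claim against the es-data-db class. The claim name and size are assumptions; note that a StatefulSet can also generate one claim per replica automatically through volumeClaimTemplates.

```yaml
# elasticsearch-pvc.yaml — claim against the es-data-db StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-pvc              # assumption: illustrative name
  namespace: logging
spec:
  accessModes:
  - ReadWriteMany           # NFS supports shared read-write access
  storageClassName: es-data-db
  resources:
    requests:
      storage: 10Gi         # assumption: size as needed
```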
Finally, we create the StatefulSet itself:
elasticsearch-statefulset.yaml
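A sketch of elasticsearch-statefulset.yaml with one common layout: three replicas attached to the elasticsearch headless Service, minimum_master_nodes set to 2 as discussed above, and per-replica storage drawn from the es-data-db class. The image version, JVM heap size, and storage size are assumptions.

```yaml
# elasticsearch-statefulset.yaml — three-node Elasticsearch cluster
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: logging
spec:
  serviceName: elasticsearch        # attaches Pods to the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      initContainers:
      - name: increase-vm-max-map   # Elasticsearch requires a raised mmap count
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.3  # assumption
        ports:
        - containerPort: 9200
          name: rest
        - containerPort: 9300
          name: inter-node
        env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.zen.ping.unicast.hosts
          value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
        - name: discovery.zen.minimum_master_nodes
          value: "2"                # N/2+1 with N=3 master-eligible nodes
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"  # assumption: heap size
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:             # one claim per replica from es-data-db
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: es-data-db
      resources:
        requests:
          storage: 10Gi             # assumption: size as needed
```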
Then apply all three with kubectl:
kubectl apply -f elasticsearch-storageclass.yaml
kubectl apply -f elasticsearch-pvc.yaml
kubectl apply -f elasticsearch-statefulset.yaml
Then check that the Pods are running with kubectl get pods -n logging.
After the Pods deployment is complete, we can check that the Elasticsearch cluster is functioning properly by requesting a REST API. Use the following command to forward local port 9200 to the port corresponding to the Elasticsearch node, such as es-cluster-0:
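For example, to forward local port 9200 to es-cluster-0 in the logging namespace:

```shell
kubectl port-forward es-cluster-0 9200:9200 --namespace=logging
```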
And then we open another window.
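In the second window, query the cluster state through the forwarded port:

```shell
curl "http://localhost:9200/_cluster/state?pretty"
```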
Normally the response reports the cluster state. It should show that our Elasticsearch cluster, named k8s-logs, has successfully created three nodes, es-cluster-0, es-cluster-1, and es-cluster-2, and that the current master node is es-cluster-0.
2. Create a Kibana service
After the Elasticsearch cluster starts successfully, we can deploy the Kibana service and create a new file named kibana.yaml. The corresponding file contents are as follows:
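A sketch of kibana.yaml consistent with the description that follows: a NodePort Service (the node port matches the 30245 used later in this article) and a Deployment whose ELASTISEARCH endpoint is set through the ELASTICSEARCH_URL environment variable. The image version is an assumption.

```yaml
# kibana.yaml — NodePort Service plus Deployment for Kibana
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  type: NodePort
  ports:
  - port: 5601
    nodePort: 30245          # matches the port used to reach Kibana below
  selector:
    app: kibana
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana-oss:6.4.3   # assumption
        env:
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch:9200   # Kubernetes DNS name of the headless Service
        ports:
        - containerPort: 5601
```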
Above we define two resource objects, a Service and a Deployment. For convenience in testing, we expose the Service as a NodePort. The only thing to note is the ELASTICSEARCH_URL environment variable, which sets the endpoint and port of the Elasticsearch cluster; we can use the Kubernetes DNS name directly, which here resolves to the elasticsearch Service. Since it is a headless service, the domain resolves to the list of IP addresses of the three Elasticsearch Pods.
And then we create this service.
kubectl apply -f kibana.yaml
After a while, our kibana service is up.
If the Pod is in Running status, the application has been deployed successfully. You can then access Kibana through the NodePort by opening http://<node-ip>:30245 in a browser. If you see the Kibana welcome screen, Kibana has been successfully deployed to the Kubernetes cluster.
3. Deploy Fluentd
Fluentd is an efficient log aggregator. It is written in Ruby and scales well. For most enterprises, Fluentd is efficient enough and consumes relatively few resources. Another tool, Fluent Bit, is lighter and uses even fewer resources, but its plugin ecosystem is not as rich as Fluentd's. Overall, Fluentd is more mature and more widely used, so we use Fluentd as the log collection tool here.
Working principle
Fluentd collects log data from a configured set of sources, processes it (converting it to a structured data format), and then forwards it to other services, such as Elasticsearch or object storage. Fluentd supports more than 300 log storage and analysis services, so it is very flexible in this respect. The main steps are as follows:
First, Fluentd acquires data from multiple log sources
Structure and mark up the data
Then send the data to multiple target services according to the matching tags
Log source configuration
For example, in order to collect all container logs on the Kubernetes node, we need to configure the log source as follows:
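A representative source section, assuming Docker's default json-file logging driver, which writes container logs under /var/log/containers on each node:

```
<source>
  @id fluentd-containers.log
  @type tail                                # tail the container log files
  path /var/log/containers/*.log            # log location on the node
  pos_file /var/log/es-containers.log.pos   # remembers the read position across restarts
  tag raw.kubernetes.*                      # tag used for routing below
  read_from_head true
  <parse>
    @type json                              # docker json-file log format
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>
```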
Routing configuration
The above is the configuration of the log source. Let's take a look at how to send log data to Elasticsearch:
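A representative match section; host and port point at the elasticsearch headless Service, and the buffer settings are one reasonable choice rather than required values:

```
<match **>
  @id elasticsearch
  @type elasticsearch
  @log_level info
  host elasticsearch
  port 9200
  logstash_format true
  <buffer>
    @type file                # buffer to disk while Elasticsearch is unreachable
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    flush_interval 5s
    retry_forever true
  </buffer>
</match>
```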
match: identifies a target; it is followed by a pattern matched against the tags of the log events. Here we want to capture all logs and send them to Elasticsearch, so we match everything with **.
@id: a unique identifier for this output.
@type: the identifier of the output plugin to use. We want to output to Elasticsearch, so it is set to elasticsearch, provided by the fluent-plugin-elasticsearch gem that is bundled in the common Fluentd images for Kubernetes.
@log_level: the minimum log level to capture; set to info here, meaning any log at that level or above (INFO, WARNING, ERROR) is routed to Elasticsearch.
host/port: the address of Elasticsearch. Authentication information can also be configured here, but our Elasticsearch does not require authentication, so only host and port are specified.
logstash_format: Elasticsearch searches log data by building inverted indexes. With logstash_format set to true, Fluentd forwards structured log data using the logstash index-name format (logstash-YYYY.MM.DD).
buffer: allows Fluentd to buffer events when the target is unavailable, for example when the network fails or Elasticsearch is down. Buffering also helps reduce disk I/O.
4. Installation
To collect logs from the Kubernetes cluster, we deploy the Fluentd application with a DaemonSet controller, which ensures that one Fluentd container is always running on every node in the cluster, so it can collect logs from all Kubernetes nodes. You could install it with a single Helm command, but to understand the implementation details we install it manually here.
First, we specify the Fluentd configuration file through the ConfigMap object, and create a new fluentd-configmap.yaml file with the following contents:
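A compact sketch of fluentd-configmap.yaml, wrapping the source and match sections shown earlier; the kubernetes_metadata filter (from the fluent-plugin-kubernetes_metadata_filter gem bundled in common Fluentd Kubernetes images) is included to tag each record with its Pod and namespace:

```yaml
# fluentd-configmap.yaml — Fluentd pipeline configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: logging
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type json
      </parse>
    </source>
    <filter kubernetes.**>
      @type kubernetes_metadata     # enrich records with pod/namespace metadata
    </filter>
    <match **>
      @type elasticsearch
      host elasticsearch
      port 9200
      logstash_format true
    </match>
```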
In the configuration above, we configure collection from the Docker container log directory as well as the docker and kubelet application logs. The collected data is processed and sent to the elasticsearch:9200 service.
Then create a new fluentd-daemonset.yaml file.
We mount the fluentd-config ConfigMap created above into the Fluentd container through volumes. In addition, to control flexibly which nodes' logs are collected, we add a nodeSelector attribute, so only nodes carrying the chosen label run the collector.
Also, since our cluster was built with kubeadm and master nodes are tainted by default, we add a toleration so that logs from master nodes are collected as well.
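A sketch of fluentd-daemonset.yaml incorporating these points. The node label, image tag, mount path for the ConfigMap, and service account are assumptions tied to the fluent/fluentd-kubernetes-daemonset image family:

```yaml
# fluentd-daemonset.yaml — one Fluentd Pod per selected node
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-es
  namespace: logging
  labels:
    app: fluentd-es
spec:
  selector:
    matchLabels:
      app: fluentd-es
  template:
    metadata:
      labels:
        app: fluentd-es
    spec:
      nodeSelector:
        fluentd: "true"              # assumption: label the nodes whose logs you want
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule           # allow scheduling onto tainted master nodes
      serviceAccountName: fluentd-es # assumption: RBAC for reading pod metadata
      containers:
      - name: fluentd-es
        image: fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch  # assumption
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: config
          mountPath: /fluentd/etc    # assumption: config path of this image family
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: config
        configMap:
          name: fluentd-config
```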
Then create the ConfigMap object and the DaemonSet:
kubectl apply -f fluentd-configmap.yaml
kubectl apply -f fluentd-daemonset.yaml
We can check that the Pods are running normally.
Then open the Kibana page and click Discover.
Here you can configure the Elasticsearch index to use. The logs collected by our earlier Fluentd configuration are in logstash format, so entering logstash-* in the text box matches all of the log data in the Elasticsearch cluster. Then click Next step to go to the following page:
On this page, configure which field is used to filter log data by time: select the @timestamp field in the drop-down list, then click Create index pattern. After it is created, click Discover in the left navigation menu, and you will see histograms and the recently collected log data.