This article shows how to use ELK to collect and process logs in a unified way under a microservice architecture. Building the individual components of a microservice system involves a number of tools; this post introduces one of the most useful ones in day-to-day microservice development, which helps us build a more robust system and track down problems and performance bottlenecks.
We will focus on ELK, the log collection solution for microservice architectures (ELK is short for Elasticsearch, Logstash and Kibana), or more precisely ELKB, that is, ELK + Filebeat, where Filebeat is a lightweight shipper for forwarding and centralizing log data.
Why do you need a distributed log system
In earlier projects, locating a bug or a performance problem in a business service from production logs meant asking operations staff to search the log files one by one, which made troubleshooting very slow.

Under a microservice architecture, multiple service instances are deployed on different physical machines, so the logs of each microservice are scattered across those machines. Once the cluster grows large enough, consulting logs in the traditional way is no longer practical. Logs in a distributed system therefore need to be managed centrally; open-source components such as syslog can be used to collect and aggregate the logs from all servers.

However, after the log files are centralized, we still face the problem of searching and analyzing them: which services raised alarms or threw exceptions needs detailed statistics. In the past, when an online failure occurred, it was common to see developers and operations staff download the service logs and search and count them with Linux commands such as grep, awk and wc. This approach is slow and labor-intensive, and for more demanding query, sorting and statistics requirements across a large number of machines it quickly becomes inadequate.
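For instance, the manual approach typically boils down to commands like the following (a sketch only; the log path and error keyword are invented for illustration):

# count ERROR lines in one service's log on a single machine
$ grep -c "ERROR" /var/log/order-service/app.log
# list the ten most frequent error messages
$ grep "ERROR" /var/log/order-service/app.log | awk -F'ERROR' '{print $2}' | sort | uniq -c | sort -rn | head -10
# count errors across all rotated log files
$ cat /var/log/order-service/app.log* | grep "ERROR" | wc -l

Repeating this on every machine that hosts an instance of the service is exactly the workload a centralized log system removes.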
ELKB distributed log system
ELKB is a complete distributed log collection system that solves the problems of log collection, retrieval and analysis described above. ELKB stands for Elasticsearch, Logstash, Kibana and Filebeat. The set of components provided by Elastic can be viewed roughly as an MVC model: Logstash plays the controller layer, Elasticsearch the model (data) layer, and Kibana the view layer. Logstash and Elasticsearch run on the JVM (Elasticsearch is written in Java, Logstash largely in JRuby), while Kibana is built on Node.js.
The functions of these components and their roles in the log collection system are described below.
Installation and use of Elasticsearch
Elasticsearch is a real-time full-text search and analytics engine that provides the three core capabilities of collecting, analyzing and storing data. It exposes open REST and Java APIs on top of an efficient, scalable distributed search system, and is built on the Apache Lucene search library.
Elasticsearch can be used to search all kinds of documents. It offers scalable, near real-time search, supports multi-tenancy, can scale out to hundreds of nodes, and handles petabytes of structured or unstructured data.
Elasticsearch is distributed, which means an index can be divided into shards, and each shard can have zero or more replicas. Each node hosts one or more shards and acts as a coordinator, delegating operations to the correct shard; rebalancing and routing are done automatically. Related data is usually stored in the same index, which consists of one or more primary shards and zero or more replica shards. Once an index is created, the number of primary shards cannot be changed.
Elasticsearch is a real-time distributed search and analytics engine used for full-text search, structured search, analytics, or any combination of the three. It is document-oriented, meaning it stores whole objects or documents, and it not only stores them but also indexes the content of every document so that it can be searched. In Elasticsearch you index, retrieve, sort and filter documents rather than rows of columnar data.
For convenience, we install Elasticsearch directly using docker:
$ docker run -d --name elasticsearch docker.elastic.co/elasticsearch/elasticsearch:5.4.0
Note that after Elasticsearch starts, some simple configuration is needed. xpack.security.enabled is turned on by default; for convenience we disable login authentication. Log in to the container and edit the configuration as follows:
# enter the running container
$ docker exec -it elasticsearch bash
# edit the configuration file
$ vim config/elasticsearch.yml

cluster.name: "docker-cluster"
network.host: 0.0.0.0

http.cors.enabled: true
http.cors.allow-origin: "*"
xpack.security.enabled: false

# minimum_master_nodes need to be explicitly set when bound on a public IP
# set to 1 to allow single node clusters
# Details: https://github.com/elastic/elasticsearch/pull/17288
discovery.zen.minimum_master_nodes: 1
After modifying the configuration file, exit and restart the container. To preserve this configuration for later use, we create a new image from the container: first obtain the container's ContainerId, then commit a new image based on it.
$ docker commit -a "add config" -m "dev" a404c6c174a2 es:latest
sha256:5cb8c995ca819765323e76cccea8f55b423a6fa2eecd9c1048b2787818c1a994
This gives us a new image, es:latest. Now run it:
$ docker run -d --name es -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" es:latest
By visiting the built-in endpoints provided by Elasticsearch, we check to see if the installation is successful.
[root@VM_1_14_centos ~]# curl 'http://localhost:9200/_nodes/http?pretty'
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "docker-cluster",
  "nodes" : {
    "8iH5v9C-Q9GA3aSupm4caw" : {
      "name" : "8iH5v9C",
      "transport_address" : "10.0.1.14:9300",
      "host" : "10.0.1.14",
      "ip" : "10.0.1.14",
      "version" : "5.4.0",
      "build_hash" : "780f8c4",
      "roles" : [ "master", "data", "ingest" ],
      "attributes" : {
        "ml.enabled" : "true"
      },
      "http" : {
        "bound_address" : [ "[::]:9200" ],
        "publish_address" : "10.0.1.14:9200",
        "max_content_length_in_bytes" : 104857600
      }
    }
  }
}
As you can see, Elasticsearch has been installed successfully. It will act as the storage backend for our log data and provide efficient search over it.
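To get a feel for the document-oriented REST API mentioned above, the sketch below indexes a sample document and then searches for it (the index name, type and fields are invented for illustration and are not part of the ELKB setup):

# index a sample document into an "app-logs" index
$ curl -XPUT 'http://localhost:9200/app-logs/log/1?pretty' -H 'Content-Type: application/json' -d '
{
  "service": "order-service",
  "level": "ERROR",
  "message": "failed to connect to database"
}'
# full-text search for documents whose message mentions "database"
$ curl 'http://localhost:9200/app-logs/_search?q=message:database&pretty'

Later on, Logstash will perform this indexing for us automatically.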
We also install elasticsearch-head, a visualization tool for Elasticsearch. The installation is very simple:
$ docker run -p 9100:9100 mobz/elasticsearch-head:5
elasticsearch-head is a client plug-in for monitoring the status of Elasticsearch, with support for data visualization and basic create, read, update, delete and search operations.
The interface after installation is as follows:
Installation and use of Logstash
Logstash is a data collection and processing engine whose main purpose here is to parse and analyze logs. It works as follows.

The data source first passes data to Logstash; in our setup Filebeat ships the log data. A Logstash pipeline consists of three parts: Input (data input), Filter (data filtering) and Output (data output).

Logstash filters and formats the data (converting it to JSON), sends it to Elasticsearch for storage, and Elasticsearch builds the search index. Kibana then provides the front-end view, where the data can be searched and turned into chart visualizations.
Let's install Logstash. First download and extract it:
# download logstash
$ wget https://artifacts.elastic.co/downloads/logstash/logstash-5.4.3.tar.gz
# extract logstash
$ tar -zxvf logstash-5.4.3.tar.gz
The download may be slow, so you may want to pick a local mirror. After extracting the archive, we need to configure Logstash, mainly the input, filter and output just mentioned.
[root@VM_1_14_centos elk]# cat logstash-5.4.3/client.conf
input {
  beats {
    port => 5044
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "logstash-app-error-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
The input supports files, syslog and beats, and only one of them is chosen in this configuration; here we use the beats (Filebeat) input.
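For comparison, reading from files directly instead of from beats would look roughly like this (a sketch; the path is invented for illustration and not used in our setup):

input {
  file {
    path => "/var/log/app/*.log"
    start_position => "beginning"
  }
}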
The filter stage handles specific processing of events that match certain rules. Common filters include grok for parsing unstructured text into a structured format, geoip for adding geographic information, drop for discarding events, and mutate for modifying fields. The following is an example of filter usage:
filter {
  # define which field holds the client's IP
  geoip {
    source => "clientIp"
  }
}
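Alongside geoip, grok is the filter most commonly used to turn raw log lines into structured fields. A minimal sketch (the log line format and field names are assumptions for illustration, not the format used later in this article):

filter {
  # parse lines such as "2020-10-30 14:12:26 ERROR order-service connection refused"
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{DATA:service} %{GREEDYDATA:msg}" }
  }
}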
The output supports Elasticsearch, file, graphite, statsd and others. By default the filtered data is sent to Elasticsearch; when we do not want to output to Elasticsearch, we need to declare explicitly which output to use, and multiple output targets can be configured at the same time.
An event can pass through multiple outputs during processing, but once all outputs have executed, the event's life cycle is complete.
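As an illustration of multiple output targets, the sketch below sends every event to Elasticsearch and also writes a local copy to disk (the file path is an assumption for illustration):

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "logstash-app-error-%{+YYYY.MM.dd}"
  }
  # keep a local copy of every processed event
  file {
    path => "/tmp/logstash-%{+YYYY.MM.dd}.log"
  }
}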
In our configuration we output the log information to Elasticsearch. With the configuration file in place, we start Logstash:
$ bin/logstash -f client.conf
Sending Logstash's logs to /elk/logstash-5.4.3/logs which is now configured via log4j2.properties
[2020-10-30T14:12:26,056][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://127.0.0.1:9200/]}}
[2020-10-30T14:12:26,062][INFO ][logstash.outputs.elasticsearch] Running healthcheck to see if an Elasticsearch connection is working {:healthcheck_url=>http://127.0.0.1:9200/, :path=>"/"}
log4j:WARN No appenders could be found for logger (org.apache.http.client.protocol.RequestAuthCache).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[2020-10-30T14:12:26,209][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#}
[2020-10-30T14:12:26,225][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2020-10-30T14:12:26,288][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2020-10-30T14:12:26,304][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#]}
[2020-10-30T14:12:26,312][INFO ][logstash.pipeline] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2020-10-30T14:12:27,226][INFO ][logstash.inputs.beats] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
[2020-10-30T14:12:27,319][INFO ][logstash.pipeline] Pipeline main started
[2020-10-30T14:12:27,422][INFO ][logstash.agent] Successfully started Logstash API endpoint {:port=>9600}
According to the log output from the console, we know that logstash has started normally.
Installation and use of Kibana
Kibana is a web-based graphical interface for searching, analyzing and visualizing the log data and metrics stored in Elasticsearch. It uses Elasticsearch's REST API to retrieve data and renders the results, letting users not only build custom dashboards over their own data but also query and filter it in ad-hoc ways.
The installation of Kibana is relatively simple. We can install it based on docker:
$ docker run --name kibana -e ELASTICSEARCH_URL=http://127.0.0.1:9200 -p 5601:5601 -d kibana:5.6.9
In the startup command we set the ELASTICSEARCH_URL environment variable to point at the local Elasticsearch, 127.0.0.1:9200.
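To confirm that Kibana is up, one option (assuming the port mapping above) is to query its status endpoint before opening the UI:

# Kibana reports its own health at /api/status
$ curl -s 'http://localhost:5601/api/status'
# or simply open http://localhost:5601 in a browser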
Installation and use of Filebeat
Filebeat is a lightweight transport tool used to forward and centralize log data. Filebeat monitors specified log files or locations, collects log events, and forwards them to Logstash, Kafka, Redis, and so on, or directly to Elasticsearch for indexing.
Let's start with the installation and configuration of Filebeat:
# download filebeat
$ wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.4.3-linux-x86_64.tar.gz
$ tar -zxvf filebeat-5.4.3-linux-x86_64.tar.gz
$ mv filebeat-5.4.3-linux-x86_64 filebeat
# enter the directory
$ cd filebeat
# configure filebeat
$ vi filebeat/client.yml

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/*.log

output.logstash:
  hosts: ["localhost:5044"]
In the Filebeat configuration, input_type supports input from Log, Syslog, Stdin, Redis, UDP, Docker, TCP, NetFlow and so on. The configuration above reads from log files and is restricted to the files under the /var/log/ directory. The output section configures Filebeat to ship to Logstash, which performs additional processing on the data Filebeat collects.
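If you later need to tell different services apart in Kibana, one option is to attach a custom field per prospector, roughly as sketched below (the path and field values are assumptions for illustration, not part of this article's setup):

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/order-service/*.log
  # attach a field so this service can be filtered on later
  fields:
    service: order-service
  fields_under_root: true

output.logstash:
  hosts: ["localhost:5044"]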
Once configured, we start Filebeat:
$ ./filebeat -e -c client.yml
2020/10/30 06:46:31.764391 beat.go:285: INFO Home path: [/elk/filebeat] Config path: [/elk/filebeat] Data path: [/elk/filebeat/data] Logs path: [/elk/filebeat/logs]
2020/10/30 06:46:31.764426 beat.go:186: INFO Setup Beat: filebeat; Version: 5.4.3
2020/10/30 06:46:31.764522 logstash.go:90: INFO Max Retries set to: 3
2020/10/30 06:46:31.764588 outputs.go:108: INFO Activated logstash as output plugin.
2020/10/30 06:46:31.764586 metrics.go:23: INFO Metrics logging every 30s
2020/10/30 06:46:31.764664 publish.go:295: INFO Publisher name: VM_1_14_centos
2020/10/30 06:46:31.765299 async.go:63: INFO Flush Interval set to: 1s
2020/10/30 06:46:31.765315 async.go:64: INFO Max Bulk Size set to: 2048
2020/10/30 06:46:31.765563 beat.go:221: INFO filebeat start running.
2020/10/30 06:46:31.765592 registrar.go:85: INFO Registry file set to: /elk/filebeat/data/registry
2020/10/30 06:46:31.765630 registrar.go:106: INFO Loading registrar data from /elk/filebeat/data/registry
2020/10/30 06:46:31.766100 registrar.go:123: INFO States Loaded from registrar: 6
2020/10/30 06:46:31.766136 crawler.go:38: INFO Loading Prospectors: 1
2020/10/30 06:46:31.766209 registrar.go:236: INFO Starting Registrar
2020/10/30 06:46:31.766256 sync.go:41: INFO Start sending events to output
2020/10/30 06:46:31.766291 prospector_log.go:65: INFO Prospector with previous states loaded: 0
2020/10/30 06:46:31.766390 prospector.go:124: INFO Starting prospector of type: log; id: 2536729917787673381
2020/10/30 06:46:31.766422 crawler.go:58: INFO Loading and starting Prospectors completed. Enabled prospectors: 1
2020/10/30 06:46:31.766430 spooler.go:63: INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2020/10/30 06:47:01.764888 metrics.go:34: INFO No non-zero metrics in the last 30s
2020/10/30 06:47:31.764929 metrics.go:34: INFO No non-zero metrics in the last 30s
2020/10/30 06:48:01.765134 metrics.go:34: INFO No non-zero metrics in the last 30s
When Filebeat starts, it launches one or more inputs that look in the locations specified for log data. For each log file it finds, Filebeat starts a collector (harvester). Each collector reads a single log file for new content and sends the new log data to libbeat, which aggregates the events and sends the aggregated data to the output configured for Filebeat.
The practice of using ELKB
After installing the ELKB components, we can integrate them. First, let's look at the process by which ELKB collects logs.
Filebeat watches the application's log files and ships the data to Logstash; Logstash filters and formats the data (for example into JSON); Logstash then sends the processed log data to Elasticsearch, which stores it and builds the search index; finally, Kibana provides the visual view pages.
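A quick way to verify the whole chain (a sketch; the file name and message are invented, and the file must match the paths Filebeat watches) is to append a line to a watched log file and then search for it:

# append a test entry to a file under the directory Filebeat watches
$ echo "2020-10-30 14:20:00 ERROR test-service something went wrong" >> /var/log/test.log
# a few seconds later the event should be searchable in Elasticsearch
$ curl 'http://localhost:9200/_search?q=test-service&pretty'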
After running all the components, first take a look at the index changes in elasticsearch-head:
You can see that a new index, filebeat-2020.10.12, has appeared, which shows that the ELKB distributed log collection pipeline has been set up successfully. Visit http://localhost:9100 and take a look at the data in the index:
As the two screenshots above show, new log data from the mysqld.log file in the /var/log/ directory has been indexed. There is a lot of log data; in a production environment you would filter it according to the actual business and handle the corresponding log formats.
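The same data can be inspected over Elasticsearch's REST API (a sketch; adjust the index pattern to whatever your setup actually produced):

# list all indices and their document counts
$ curl 'http://localhost:9200/_cat/indices?v'
# search the Filebeat index for entries mentioning mysqld
$ curl 'http://localhost:9200/filebeat-*/_search?q=mysqld&pretty'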
elasticsearch-head is only a simple Elasticsearch client; for more complete statistics and search requirements we need Kibana. Kibana improves on Elasticsearch's analysis capabilities: it can analyze data more intelligently, perform mathematical transformations, and slice the data into buckets as required.
Visiting http://localhost:5601 shows the log information in the figure above: Filebeat watches the mysql log and it is displayed in Kibana. Kibana is well suited to large volumes of data and can build bar charts, line charts, scatter plots, histograms, pie charts and maps, which are not shown here.
That is how to use ELK for log collection and unified processing under a microservice architecture. Hopefully this walkthrough has given you something you can apply to your own systems.