This article explains in detail the differences between the open source log management solutions ELK and EFK; interested readers may find it a useful reference.
Preface
The mainstream ELK stack (Elasticsearch, Logstash, Kibana) has largely evolved into EFK (Elasticsearch, Filebeat or Fluentd, Kibana), and Fluentd is generally the recommendation for container cloud log solutions. Let's look at what changed on the way from ELK to EFK. Grafana Loki is also worth learning about.
Overview of ELK and EFK
As software systems grow more complex, especially once they are deployed to the cloud, logging in to each node to inspect each module's logs is basically infeasible. Not only is it inefficient, but for security reasons engineers often cannot access the physical nodes directly at all. Moreover, large-scale systems are now typically deployed as clusters, meaning each service runs as multiple identical Pods, with every container producing its own logs. From a log line alone you cannot tell which Pod produced it, which makes viewing distributed logs even harder.
So in the cloud era, you need a solution that collects and analyzes logs. First, the logs scattered across every corner of the system must be gathered into a central place for easy viewing. Once collected, they can feed all kinds of statistical analysis, even popular big data or machine learning methods. Traditional software deployments need such a log solution too, of course, but this article approaches it mainly from a cloud perspective.
ELK is such a solution, and it is essentially the de facto standard. ELK is an acronym for three open source projects:
E: Elasticsearch
L: Logstash
K: Kibana
Logstash's main role is to collect and process distributed logs. Elasticsearch is the central log store and, more importantly, an engine for full-text retrieval and analysis that lets users view and analyze massive data in near real time. Kibana is the front-end GUI developed for Elasticsearch; it makes it easy to query the data stored in Elasticsearch through a graphical interface and provides various analysis modules, such as dashboard building.
Personally, I think it is more accurate to read the L in ELK as Logging Agent. Elasticsearch and Kibana are essentially the standard pairing for storing, retrieving, and analyzing logs, while Logstash is not the only option for collecting them: Fluentd and Filebeat can also do the job. Hence abbreviations such as ELK and EFK now circulate online.
The general architecture is shown below. A small cluster typically has three nodes, each possibly running dozens or even hundreds of containers, and we only need to start one logging-agent instance per node (the DaemonSet concept in Kubernetes).
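As a concrete illustration of the one-agent-per-node pattern, here is a minimal DaemonSet sketch. It is not part of this article's setup; the resource name, namespace, image, and mount path are illustrative assumptions:

cat <<'EOF' > logging-agent-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logging-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: logging-agent
  template:
    metadata:
      labels:
        app: logging-agent
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:latest
        volumeMounts:
        # Mount the host's container log directory read-only so the agent can tail it
        - name: container-logs
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: container-logs
        hostPath:
          path: /var/lib/docker/containers
EOF
# Kubernetes schedules exactly one Pod of a DaemonSet on every node
kubectl apply -f logging-agent-daemonset.yaml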
The differences and relationship among Filebeat, Logstash, and Fluentd
A brief explanation of how Filebeat, Logstash, and Fluentd relate and differ is in order. Filebeat is a lightweight solution for collecting local log data; its official description is quoted below. As you can see, Filebeat's function is relatively simple: it can only collect local logs and cannot process what it collects, so Filebeat usually sends the collected logs to Logstash for further processing.
Filebeat is a log data shipper for local files. Installed as an agent on your servers, Filebeat monitors the log directories or specific log files, tails the files, and forwards them either to Elasticsearch or Logstash for indexing
Both Logstash and Fluentd can collect and process logs, and the Internet offers many comparisons between them; a link to a well-written article is given below. The two are roughly equal in functionality, but Logstash consumes more memory. The usual remedy is to use Filebeat to collect logs on each leaf node; Fluentd likewise has a lightweight counterpart, Fluent Bit.
https://logz.io/blog/fluentd-logstash/
Another important difference is that Fluentd is more abstract, shielding users from the tedious underlying details. The author's original words are as follows:
Fluentd's approach is more declarative whereas Logstash's method is procedural. For programmers trained in procedural programming, Logstash's configuration can be easier to get started. On the other hand, Fluentd's tag-based routing allows complex routing to be expressed cleanly.
Although the author claims to compare the two (Logstash and Fluentd) neutrally, the bias is actually fairly obvious :). This article is also mainly based on Fluentd, but the overall ideas are the same.
In addition, Filebeat, Logstash, Elasticsearch, and Kibana are open source projects from the same company, Elastic. The official documentation is here:
https://www.elastic.co/guide/index.html
Fluentd is an open source project from a different company (originally Treasure Data). The official documentation is here:
https://docs.fluentd.org
About ELK
Introduction to ELK
ELK is a complete log collection and presentation solution provided by Elastic, which is the acronym of three products, namely Elasticsearch, Logstash and Kibana.
Elasticsearch is a real-time full-text search and analytics engine; it provides the storage, search, and analysis of the data.
Logstash is a tool for collecting, analyzing, and filtering logs.
Kibana is a web-based graphical interface for searching, analyzing, and visualizing the log data stored in Elasticsearch indices.
ELK log processing flow
A typical ELK log collection flow in a Docker environment looks like this:
Logstash extracts log information from each Docker container
Logstash forwards the log to Elasticsearch for indexing and saving
Kibana is responsible for analyzing and visualizing log information
Logstash is not strong as a data collector: running it as an agent on every node is too heavy, and its performance falls short. For this reason, Elastic released the Beats series of lightweight collection components.
The Beat we practice with here is Filebeat. Built on the Beats framework and aimed at log collection scenarios, Filebeat is the next-generation collector that replaces Logstash Forwarder, designed to collect logs faster and more stably while staying lightweight and low in resource consumption. It can easily feed Logstash or connect directly to Elasticsearch.
In this experiment we use Filebeat directly as the agent. It collects the record changes in the json-file logs introduced in the first article, "Docker logs & logging driver", and sends the logs straight to Elasticsearch for indexing and storage. The processing flow then becomes the figure below, and you could equally call this setup EFK.
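For reference, Docker's default json-file driver stores each container's output as one JSON object per line under /var/lib/docker/containers; the sample line in the comment below is illustrative:

# List the json-file logs that Filebeat will tail
ls /var/lib/docker/containers/*/*-json.log
# Each line has this shape (example values):
# {"log":"This is a log message\n","stream":"stdout","time":"2020-03-20T08:00:00.000000000Z"}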
Installation of ELK Suite
In this experiment we use Docker to deploy a minimal ELK environment; in a real environment you would of course also need to consider high availability and load balancing.
First, pull the integrated image sebp/elk. The tag version selected here is latest:
docker pull sebp/elk:latest
Note: since the image bundles the entire ELK stack, the download takes a while, so be patient.
Use the integrated image sebp/elk to start running ELK with the following command:
docker run -it -d --name elk \
  -p 5601:5601 \
  -p 9200:9200 \
  sebp/elk:latest
After running, you can visit http://192.168.4.31:5601 to see the effect of Kibana:
Of course, ES has no indices or data to display yet, so visit http://192.168.4.31:9200 to check whether the Elasticsearch REST API is reachable:
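You can also check from the command line; these are standard Elasticsearch endpoints (the IP is the host used throughout this article):

# Basic node info confirms the API is up
curl http://192.168.4.31:9200
# Cluster health in tabular form
curl "http://192.168.4.31:9200/_cat/health?v"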
If errors during startup prevent the ELK container from starting, refer to the article "Common errors in ElasticSearch startup". If your host has less than 4 GB of memory, it is recommended to cap ES memory usage explicitly, otherwise the container may fail to start. For example, the following extra configuration limits ES memory usage to at most 1 GB:
docker run -it -d --name elk \
  -p 5601:5601 \
  -p 9200:9200 \
  -p 5044:5044 \
  -e ES_MIN_MEM=512m \
  -e ES_MAX_MEM=1024m \
  sebp/elk:latest
If starting the container fails with the message "max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]", raise the kernel setting:
# Edit /etc/sysctl.conf
vi /etc/sysctl.conf
# add the following configuration
vm.max_map_count=655360
# then apply it
sysctl -p
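Alternatively, for a one-off change that does not survive a reboot, you can set the value directly:

sysctl -w vm.max_map_count=262144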
Filebeat configuration
Install Filebeat
Download Filebeat
Here we install Filebeat from an RPM package. Note that the Filebeat version must match the ELK version (our ELK is 7.6.1, so we download Filebeat 7.6.1 to avoid compatibility errors):
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.6.1-x86_64.rpm
rpm -ivh filebeat-7.6.1-x86_64.rpm
Configure Filebeat
Here we need to tell Filebeat which log files to monitor and where to send the logs, so we need to modify the configuration of Filebeat:
nano /etc/filebeat/filebeat.yml
The content to be modified is:
(1) Which logs to monitor
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/lib/docker/containers/*/*.log
The path /var/lib/docker/containers/*/*.log is specified here; note that enabled must be set to true.
(2) Where to send the logs?
#-------------------------- Elasticsearch output --------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["192.168.4.31:9200"]
  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"
Here we send the logs directly to Elasticsearch, configuring the address of the ES endpoint.
Note: if you want to send it to Logstash, use the following configuration, uncomment it and configure it:
#---------------------------- Logstash output -----------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]
  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"
  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"
Start Filebeat
Since Filebeat is registered as a systemd service during installation, you only need to start it:
systemctl start filebeat
Enable it to start at boot:
systemctl enable filebeat
Check the Filebeat startup status:
systemctl status filebeat
The above actions are summarized as a script:
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.6.1-x86_64.rpm
rpm -ivh filebeat-7.6.1-x86_64.rpm
echo "please input elk host_ip"
read host_ip
sed -i "s/enabled: false/enabled: true/" /etc/filebeat/filebeat.yml
sed -i "s/\/var\/log\/\*\.log/\/var\/lib\/docker\/containers\/\*\/\*\.log/g" /etc/filebeat/filebeat.yml
sed -i "s/localhost:9200/${host_ip}:9200/g" /etc/filebeat/filebeat.yml
systemctl start filebeat
systemctl enable filebeat
systemctl status filebeat
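Before moving on, it is worth sanity-checking the result; Filebeat ships with test subcommands for exactly this:

# Validate the modified configuration file
filebeat test config
# Verify that Filebeat can reach the configured Elasticsearch output
filebeat test output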
Kibana configuration
Next we tell Kibana which logs in Elasticsearch to query and analyze, which means configuring an Index Pattern. Filebeat writes to indices named with a filebeat- prefix plus version and date, so here we define the Index Pattern as filebeat-*.
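If you want to confirm which indices actually exist before creating the pattern, the standard _cat API will list them (same host IP as before):

curl "http://192.168.4.31:9200/_cat/indices/filebeat-*?v"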
Click Next Step, and choose @timestamp as the Time Filter field name:
Click the Create index pattern button to complete the configuration.
At this point, we click the Discover menu on the left side of Kibana to see the log information of the container:
Taking a closer look at the details, let's examine the message field:
As you can see, what we care about is the message field, so we can filter the view to show only this field:
This is just a simple display of the log information imported into ELK; in fact, ELK offers much more, such as aggregation analysis and polished dashboards. This is only a first pass, so the introduction ends here.
Fluentd introduction
About Fluentd
Previously we collected Docker log information with Filebeat, relying on Docker's default json-file logging driver. Here we instead use the open source project Fluentd to collect container logs.
Fluentd is an open source data collector designed for processing data streams, using JSON as its data format. It has a plug-in architecture with high scalability and availability and highly reliable message forwarding. Fluentd is a member project of the Cloud Native Computing Foundation (CNCF), is licensed under Apache 2.0, and its GitHub address is https://github.com/fluent/fluentd/. Fluentd uses less memory than Logstash, and its community is more active; see the article "Fluentd vs Logstash" for a comparison.
The entire log collection and processing flow therefore becomes the figure below: Fluentd collects the container logs, and Filebeat forwards them to Elasticsearch.
Of course, we could also use the Fluentd plug-in fluent-plugin-elasticsearch to send the logs directly to Elasticsearch, replacing Filebeat as needed and forming a Fluentd => Elasticsearch => Kibana architecture, also known as EFK.
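A minimal sketch of that direct route, assuming an image with the fluent-plugin-elasticsearch gem installed (the stock fluent/fluentd image does not include it) and the ES host used in this article:

cat <<'EOF' > fluent.conf
# Accept logs from Docker's fluentd log driver
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

# Ship everything straight to Elasticsearch, one index per day
<match **>
  @type elasticsearch
  host 192.168.4.31
  port 9200
  logstash_format true
  logstash_prefix fluentd
</match>
EOF

With logstash_format true, the plugin writes daily indices named fluentd-YYYY.MM.DD, which you could then match with a fluentd-* index pattern in Kibana.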
Run Fluentd
Here we run a Fluentd collector as a container:
docker run -it -d --name fluentd \
  -p 24224:24224 \
  -p 24224:24224/udp \
  -v /etc/fluentd/log:/fluentd/log \
  fluent/fluentd:latest
Fluentd listens on port 24224 by default, and the logs it collects land under the host path we mapped.
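To confirm the collector is up before wiring anything else to it, a couple of standard Docker commands suffice:

# Check that the Fluentd container is running and the ports are published
docker ps --filter name=fluentd
# Watch Fluentd's own startup output for errors
docker logs --tail 20 fluentd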
In addition, we need to modify the Filebeat configuration file to add /etc/fluentd/log to the monitored directories:
#=========================== Filebeat inputs =============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /etc/fluentd/log/*.log
After adding the monitoring configuration, restart Filebeat:
systemctl restart filebeat
Run test containers
To verify the effect, we run two containers and set their log-driver to fluentd:
docker run -d \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag="test-docker-A" \
  busybox sh -c 'while true; do echo "This is a log message from container A"; sleep 10; done'

docker run -d \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag="test-docker-B" \
  busybox sh -c 'while true; do echo "This is a log message from container B"; sleep 10; done'
By specifying each container's log-driver and giving each container its own tag, we make it easy to identify and verify the logs later.
Verify the effect of EFK
Now go back to Kibana and view the log information; you can filter for the logs of the containers just added using the tags set above:
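As a cross-check outside Kibana, you can query Elasticsearch's standard _search endpoint with a free-text Lucene query. Whether the tag itself is indexed depends on how Fluentd formats the files Filebeat tails, so searching for the message text is the safer bet:

curl -s "http://192.168.4.31:9200/filebeat-*/_search?q=%22log%20message%20from%20container%20A%22&size=1&pretty"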
Tools for generating simulated logs and stress testing:
https://github.com/elastic/rally
https://pypi.org/project/log-generator/
https://github.com/mingrammer/flog
This article started from the basic composition of ELK, introduced the basic ELK processing flow, built an ELK environment from scratch, and demonstrated collecting container log information with Filebeat. Then, by introducing the open source data collector Fluentd, it demonstrated how to collect logs with EFK.
That is all on understanding the differences between the open source log management solutions ELK and EFK; I hope the content above has been helpful and taught you something new.