
Building an ELK Log Analysis System with CentOS 7


I. The composition of ELK

ELK consists of three open source tools: Elasticsearch, Logstash, and Kibana. The official website is https://www.elastic.co/cn

Elasticsearch: an open source, distributed, real-time analysis and search engine built on the full-text search library Apache Lucene, while hiding Lucene's complexity. Elasticsearch packages all functions into a single service and exposes a simple RESTful API. It is distributed, requires zero configuration, and features automatic discovery, automatic index sharding, an index replica mechanism, a RESTful-style interface, multiple data sources, automatic search load balancing, and so on.

Logstash: a fully open source tool used mainly for log collection and data processing, with output to Elasticsearch

Kibana: also an open source and free tool. Kibana provides a graphical log analysis Web interface for Logstash and Elasticsearch, which can summarize, analyze, and search important log data.

1. How ELK works

The workflow is as follows:

Logstash collects the logs generated by the application server (AppServer) and stores them in the Elasticsearch cluster, while Kibana queries the data from the ES cluster, generates charts, and returns them to the browser.

To put it simply, log processing and analysis generally requires the following steps:

Centralized management of logs

Format the logs (Logstash) and output them to Elasticsearch

Index and store formatted data (Elasticsearch)

Front-end data display (Kibana)

2. Brief introduction to Elasticsearch

Elasticsearch is a search server based on Lucene that is stable, reliable, fast, and horizontally scalable. It is designed for distributed environments and is widely used in cloud computing. Elasticsearch provides a distributed, multi-user full-text search engine behind a RESTful Web interface; through this interface, users can communicate with Elasticsearch from a browser. Elasticsearch is developed in Java and released as open source under the Apache license. Wikipedia, Stack Overflow, GitHub, and others build their search on Elasticsearch, drawn by its real-time search, stability, reliability, speed, and ease of installation and use.
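As a small hedged illustration of that RESTful interface (assuming a node is reachable at 192.168.100.10:9200, as in the deployment built later in this article), cluster health can be queried with a plain HTTP request:

curl "http://192.168.100.10:9200/_cluster/health?pretty"

The response is a JSON document describing the cluster name, status (green/yellow/red), and node and shard counts.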

3. Basic core concepts of Elasticsearch

Near real-time (NRT): Elasticsearch is a near-real-time search platform; there is only a slight delay (typically about one second) from the time a document is indexed until it becomes searchable.

Cluster: a cluster is organized from one or more nodes that together store the data and provide indexing and search functions; a master node is elected, and indexing and search are federated across the nodes. Each cluster has an identifying name, which defaults to elasticsearch; each node joins its cluster by specifying that cluster name. A cluster can consist of just one node, but multiple nodes are usually configured for better fault tolerance, and running in cluster mode is recommended.

Node (node): a single server. Multiple nodes are organized into a cluster, and each node stores data and participates in the cluster's indexing and search functions. Like clusters, nodes are identified by name; by default a random name is assigned when the node starts, and it can also be customized. A node joins a cluster by specifying the cluster name. By default, each node joins a cluster named elasticsearch, so if several nodes start on the network with default settings they automatically form a cluster called elasticsearch.

Index: similar to a "database" in a relational database. After a document is indexed, you can use Elasticsearch to search for it; an index can also simply be understood as a place where data is stored and conveniently full-text searched. An index contains types (Type); a Type is similar to a "table" in a relational database and stores a specific kind of data. A Type in turn contains documents (Document), each equivalent to a "record" in a relational database. A document is the basic unit of information that can be indexed.
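A minimal hedged sketch of this hierarchy against the Elasticsearch 5.x API used in this article (the index name library, type books, document ID 1, and field values are all hypothetical):

curl -X PUT "http://192.168.100.10:9200/library/books/1?pretty" -H 'Content-Type: application/json' -d '{"title": "ELK in Action", "pages": 300}'

Elasticsearch creates the library index on the fly, stores the JSON document as type books with ID 1, and the document becomes searchable after the next refresh (about one second, per the NRT behavior above).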

Shards and replicas: Elasticsearch divides an index into several parts, each called a shard, and each shard is a fully functional, independent index. The number of shards is generally specified before the index is created and cannot be changed afterwards. There are two main reasons for sharding:

Split horizontally to increase storage capacity

Distribute operations across shards in parallel to improve performance and throughput

A good data storage scheme requires that data remain available no matter what failure occurs (such as a node becoming unavailable) and that storage stay efficient. To this end, Elasticsearch makes one or more copies of each index shard, called replicas. A replica is an additional backup of the index, used for data redundancy and load sharing; by default, Elasticsearch automatically spreads the load of index requests across replicas.

In short, an index can be divided into several shards, and those shards can be replicated zero times (no replication) or more. When replicas exist, the shard acting as the replication source is called the primary shard, and the shard acting as the replication target is called the replica shard. The number of shards and replicas can be specified when the index is created; after creation, the number of replicas can still be changed, but the number of shards cannot. By default, each index in Elasticsearch is split into five primary shards with one replica each, so in a two-node scenario each index has 5 primary shards and 5 replica shards, 10 shards per index in total.
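A hedged sketch of specifying these values explicitly at index creation time (the index name my_index is hypothetical; the values shown simply restate the 5.x defaults described above):

curl -X PUT "http://192.168.100.10:9200/my_index?pretty" -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}'

number_of_replicas can later be changed through the index _settings endpoint, but number_of_shards is fixed once the index exists.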

4. Logstash introduction

Logstash is written in JRuby and runs on the Java Virtual Machine (JVM). It is a powerful data processing tool that handles data transport, format processing, and formatted output. Logstash has a powerful plug-in system and is often used for log processing. A single Logstash agent can be combined with other open source software to implement different functions.

The idea behind Logstash is simple: it does only three things, data input, data processing (such as filtering and rewriting), and data output. By combining inputs and outputs, a wide range of requirements can be met. A typical deployment architecture for log processing with Logstash is shown below:

The main components of Logstash:

Shipper: log collector. Monitors changes to local log files and collects the latest log content in time. Usually, the remote agent side only needs to run this component

Indexer: log storage. Responsible for receiving logs and writing them to local files

Broker: log hub. Responsible for connecting multiple Shippers and Indexers

Search and Storage: allows events to be searched and stored

Web Interface: Web-based display interface

It is precisely because the above components can be deployed independently that the Logstash architecture provides good cluster scalability.

Logstash uses a pipeline to collect, process, and output logs, much like a Linux pipeline command: the output of one stage is passed to the next stage for further processing. A Logstash pipeline has three stages, input, filter (optional), and output, whose relationship is shown in the figure:

As the figure shows, Input collects the data, Filter processes it, and Output emits it. Each stage can also be configured in several ways; for example, output can be sent to Elasticsearch or directed to stdout to print on the console. This plug-in organization makes Logstash very convenient to extend and customize.
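A hedged sketch of such a three-stage pipeline in Logstash's configuration language (the log path, grok pattern, and index name are illustrative assumptions, not part of the case below):

input {
  file {
    path => "/var/log/messages"            # source log file (assumed)
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{SYSLOGLINE}" }   # parse syslog-style lines
  }
}
output {
  elasticsearch {
    hosts => ["192.168.100.10:9200"]       # ES node, as in the case environment below
    index => "syslog-%{+YYYY.MM.dd}"       # hypothetical daily index
  }
  stdout { codec => rubydebug }            # also print events to the console for debugging
}

Because the filter stage is optional, the filter block can be removed entirely and events will pass from input to output unchanged.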

5. Kibana introduction

Kibana is an open source analysis and visualization platform for Elasticsearch, designed mainly to work with Elasticsearch: it searches and views data stored in Elasticsearch indexes and analyzes and displays it through a variety of charts. Kibana makes data clear at a glance. It is very simple to operate, and its browser-based user interface lets users browse in real time from any location. Kibana can quickly create dashboards that display query results in real time. It is very easy to use; you can explore Elasticsearch index data simply by adding an index.

1) Main functions of Kibana

Seamless integration with Elasticsearch: the Kibana architecture is customized for Elasticsearch and allows any data (structured and unstructured) to be added to an Elasticsearch index. Kibana also takes full advantage of Elasticsearch's powerful search and analysis functions.

Integrate data: Kibana makes massive quantities of data easier to understand. Based on the data content, it can create vivid bar charts, line charts, scatter charts, histograms, pie charts, and maps for users to view.

Complex data analysis: Kibana enhances the analytical ability of Elasticsearch to analyze data more intelligently, perform data conversion, and cut data into blocks according to requirements.

Benefit more team members: the powerful data visualization interface lets every business role benefit from the collected data

Flexible interface and easier sharing: Kibana makes it easy to create, save, and share data, and to communicate visualizations quickly

Simple configuration: Kibana is very simple to configure and enable, and the user experience is very friendly. Kibana has its own Web server, which can be started and run quickly.

Visualize multiple data sources: Kibana can easily bring data from Logstash, ES-Hadoop, Beats, and other third-party technologies into Elasticsearch, and supports third-party technologies such as Apache Flume, Fluentd, and more.

Simple data export: Kibana can easily export the data of interest, merge with other data sets, quickly model and analyze, and find new results.

II. Build the ELK platform

The case environment is as follows:

Hostname    IP address        Role                                 Memory
centos01    192.168.100.10    Elasticsearch / Logstash / Kibana    4GB
centos02    192.168.100.20    Elasticsearch / Logstash / Kibana    4GB
centos03    192.168.100.30    Apache web server + Logstash         1GB

Prepare the installation environment:

Prepare three servers and configure their network parameters according to the environment table above (192.168.100.10, 192.168.100.20, and 192.168.100.30 here). Then turn off the firewall and SELinux. Allocate 4GB of memory (at least 2GB) to the centos01 and centos02 nodes and 1GB to the Apache node. Configure Internet access for this case on your own.

This case accomplishes the following:

Configure an ELK log analysis cluster, use Logstash to collect logs, and use Kibana to view and analyze the logs.

For all packages used in this case, please visit: https://pan.baidu.com/s/1OK49eAIwbvwIV5swe0-8-w

Extraction code: yiad

1. Centos01 basic environment configuration

[root@localhost ~]# hostnamectl set-hostname centos01
[root@localhost ~]# bash
[root@centos01 ~]# vim /etc/hosts
192.168.100.10 centos01
192.168.100.20 centos02
[root@centos01 ~]# java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
[root@centos01 ~]# vim /etc/sysconfig/selinux
SELINUX=disabled
[root@centos01 ~]# systemctl stop firewalld
[root@centos01 ~]# reboot

2. Centos02 basic environment configuration

[root@localhost ~]# hostnamectl set-hostname centos02
[root@localhost ~]# bash
[root@centos02 ~]# vim /etc/hosts
192.168.100.10 centos01
192.168.100.20 centos02
[root@centos02 ~]# java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
[root@centos02 ~]# vim /etc/sysconfig/selinux
SELINUX=disabled
[root@centos02 ~]# systemctl stop firewalld
[root@centos02 ~]# reboot

3. Centos01: install the Elasticsearch software

[root@centos01 ~]# ls
anaconda-ks.cfg  elasticsearch-5.6.16.rpm  initial-setup-ks.cfg
[root@centos01 ~]# yum -y install elasticsearch-5.6.16.rpm
[root@centos01 ~]# vim /etc/elasticsearch/elasticsearch.yml
cluster.name: ELK
node.name: centos01
network.host: 192.168.100.10
http.port: 9200
discovery.zen.ping.unicast.hosts: ["centos01", "centos02"]
[root@centos01 ~]# systemctl daemon-reload
[root@centos01 ~]# /etc/init.d/elasticsearch start
[root@centos01 ~]# systemctl start elasticsearch
[root@centos01 ~]# systemctl enable elasticsearch
[root@centos01 ~]# netstat -anptu | grep 9200
tcp6       0      0 192.168.100.10:9200     :::*       LISTEN      1557/java

4. Centos02: install the Elasticsearch software

[root@centos02 ~]# ls
anaconda-ks.cfg  elasticsearch-5.6.16.rpm  initial-setup-ks.cfg
[root@centos02 ~]# yum -y install elasticsearch-5.6.16.rpm
[root@centos02 ~]# vim /etc/elasticsearch/elasticsearch.yml
cluster.name: ELK
node.name: centos02
network.host: 192.168.100.20
http.port: 9200
discovery.zen.ping.unicast.hosts: ["centos01", "centos02"]
[root@centos02 ~]# systemctl daemon-reload
[root@centos02 ~]# /etc/init.d/elasticsearch start
[root@centos02 ~]# systemctl start elasticsearch
[root@centos02 ~]# systemctl enable elasticsearch
[root@centos02 ~]# netstat -anptu | grep 9200
tcp6       0      0 192.168.100.20:9200     :::*       LISTEN      1557/java

5. Access the two nodes from a client

Configure the client's IP address, then access centos01 through a browser.

Configure the client's IP address, then access centos02 through a browser.
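Equivalently (a hedged command-line check from any host that can reach the nodes), the HTTP interface can be queried directly:

curl http://192.168.100.10:9200
curl http://192.168.100.20:9200

Each node answers with a JSON document containing its node name, cluster name (ELK), and version.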

6. Install the elasticsearch-head graphical ELK management tool

1) centos01 node

[root@centos01 ~]# ls
anaconda-ks.cfg  elasticsearch-5.6.16.rpm  initial-setup-ks.cfg  node-v4.2.2-linux-x64.tar.gz
[root@centos01 ~]# tar zxvf node-v4.2.2-linux-x64.tar.gz -C /usr/local/
[root@centos01 local]# mv node-v4.2.2-linux-x64/ node
[root@centos01 local]# ln -s /usr/local/node/bin/npm /usr/local/bin/npm
[root@centos01 local]# ln -s /usr/local/node/bin/node /usr/local/bin/node
[root@centos01 ~]# vim /etc/profile
export NODE_HOME=/usr/local/node
export PATH=$PATH:$NODE_HOME/bin
export NODE_PATH=$NODE_HOME/lib/node_modules/
[root@centos01 ~]# source /etc/profile
[root@centos01 ~]# vim /etc/elasticsearch/elasticsearch.yml
http.cors.enabled: true
http.cors.allow-origin: "*"
[root@centos01 ~]# git clone git://github.com/mobz/elasticsearch-head.git
[root@centos01 ~]# ls
anaconda-ks.cfg  elasticsearch-5.6.16.rpm  elasticsearch-head  initial-setup-ks.cfg  node-v4.2.2-linux-x64.tar.gz
[root@centos01 ~]# mv elasticsearch-head /usr/local/
[root@centos01 ~]# cd /usr/local/elasticsearch-head/
[root@centos01 elasticsearch-head]# npm install -g grunt-cli
[root@centos01 elasticsearch-head]# grunt --version
grunt-cli v1.3.2
[root@centos01 ~]# vim /usr/local/elasticsearch-head/Gruntfile.js
(line 99)    keepalive: true,
             hostname: "*"
[root@centos01 ~]# vim /usr/local/elasticsearch-head/_site/app.js
(line 4374)  this.base_uri = this.config.base_uri || this.prefs.get("app-base_uri") || "http://192.168.100.10:9200";
[root@centos01 ~]# cd /usr/local/elasticsearch-head/
[root@centos01 elasticsearch-head]# npm install
[root@centos01 elasticsearch-head]# grunt server &
[root@centos01 ~]# /etc/init.d/elasticsearch restart
[root@centos01 ~]# netstat -anptu | grep 9200
tcp6       0      0 192.168.100.10:9200     :::*       LISTEN      1557/java
[root@centos01 ~]# netstat -anptu | grep 9100
tcp6       0      0 :::9100                 :::*       LISTEN      3400/grunt

The configuration of the centos02 node is the same as that of centos01, except for the IP address; configure it yourself by referring to the centos01 steps.

You can now visit http://192.168.100.10:9100 in a browser to view the cluster information.
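As an additional hedged check from the command line, the cluster health API should now report both nodes:

[root@centos01 ~]# curl "http://192.168.100.10:9200/_cluster/health?pretty"

A "status" of "green" means all primary and replica shards are allocated; "number_of_nodes" should be 2.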

7. Install Logstash

[root@centos01 ~]# ls
anaconda-ks.cfg  elasticsearch-5.6.16.rpm  initial-setup-ks.cfg  logstash-5.5.1.rpm  node-v4.2.2-linux-x64.tar.gz
[root@centos01 ~]# rpm -ivh logstash-5.5.1.rpm
[root@centos01 ~]# ln -s /usr/share/logstash/bin/logstash /usr/local/bin/
[root@centos01 ~]# mkdir -p /usr/share/logstash/config
[root@centos01 ~]# ln -s /etc/logstash/* /usr/share/logstash/config/
[root@centos01 ~]# systemctl start logstash
[root@centos01 ~]# systemctl enable logstash
[root@centos01 ~]# logstash -e 'input { stdin { } } output { stdout { } }'
The stdin plugin is now waiting for input:
www.baidu.com
2019-12-19T07:44:26.487Z centos01 www.baidu.com
[root@centos01 ~]# logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'
The stdin plugin is now waiting for input:
www.baidu.com
{
    "@timestamp" => 2019-12-19T07:48:34.006Z,
      "@version" => "1",
          "host" => "centos01",
       "message" => "www.baidu.com"
}
[root@centos01 ~]# logstash -e 'input { stdin { } } output { elasticsearch { hosts => ["192.168.100.10:9200"] } }'
The stdin plugin is now waiting for input:
www.baidu.com
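To confirm that the test message written by the last command actually reached Elasticsearch (a hedged check; the elasticsearch output plugin writes to a logstash-YYYY.MM.dd index by default), list the indices:

[root@centos01 ~]# curl "http://192.168.100.10:9200/_cat/indices?v"

A logstash-* index with a non-zero document count should appear.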

Centos02 uses the same configuration as centos01; configure it yourself.

Next, visit port 9100 in the client browser to view the logs.

8. Install Kibana

[root@centos01 ~]# ls
anaconda-ks.cfg           initial-setup-ks.cfg     logstash-5.5.1.rpm
elasticsearch-5.6.16.rpm  kibana-5.5.1-x86_64.rpm  node-v4.2.2-linux-x64.tar.gz
[root@centos01 ~]# rpm -ivh kibana-5.5.1-x86_64.rpm
[root@centos01 ~]# vim /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://192.168.100.10:9200"
[root@centos01 ~]# systemctl start kibana
[root@centos01 ~]# systemctl enable kibana
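A quick hedged check that Kibana is up, using its status endpoint (available in Kibana 5.x):

[root@centos01 ~]# netstat -anptu | grep 5601
[root@centos01 ~]# curl http://192.168.100.10:5601/api/status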

Centos02 uses the same configuration as above; change the elasticsearch.url IP address to that of centos02 and configure it yourself.

Port 5601 can now be accessed through a client browser

9. Configure monitoring of the web server

[root@centos03 ~]# yum -y install httpd
[root@centos03 ~]# systemctl start httpd
[root@centos03 ~]# systemctl enable httpd
[root@centos03 ~]# ls
anaconda-ks.cfg  initial-setup-ks.cfg  logstash-5.5.1.rpm
[root@centos03 ~]# rpm -ivh logstash-5.5.1.rpm
[root@centos03 ~]# vim /etc/logstash/conf.d/apache_error.conf
input {
  file {
    path => "/var/log/httpd/error_log"
    type => "error"
    start_position => "beginning"
  }
}
output {
  if [type] == "error" {
    elasticsearch {
      hosts => ["192.168.100.10:9200"]
      index => "apache_error-%{+YYYY.MM.dd}"
    }
  }
}
[root@centos03 ~]# systemctl start logstash.service
[root@centos03 ~]# systemctl enable logstash.service
[root@centos03 ~]# /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/apache_error.conf

(An access-log variant of this configuration is sketched after the verification steps below.)

1) You can now access the web server with a browser to verify that monitoring succeeds

2) View the logs of the monitored web server

3) Restart the web server and the monitoring service, then view the monitored web server's logs in the browser again.

Add the index pattern on the Kibana server yourself.
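Finally, the apache_error.conf above monitors only Apache's error log. The same pattern extends naturally to the access log; the following is a hedged sketch (the file path assumes the default httpd layout, and the index name is hypothetical), not part of the original steps:

input {
  file {
    path => "/var/log/httpd/access_log"    # Apache access log (assumed default path)
    type => "access"
    start_position => "beginning"
  }
}
output {
  if [type] == "access" {
    elasticsearch {
      hosts => ["192.168.100.10:9200"]
      index => "apache_access-%{+YYYY.MM.dd}"   # hypothetical index name
    }
  }
}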
