2025-01-18 Update · From: SLTechnology News & Howtos (shulou.com), Servers · Shulou (Shulou.com) 06/03 Report
A brief introduction:
ELK consists of three open source tools, briefly described as follows:
Elasticsearch is an open source distributed search engine. Its features include distributed operation, zero configuration, automatic discovery, automatic index sharding, an index replication mechanism, a RESTful interface, multiple data sources, and automatic search load balancing.
Logstash is a fully open source tool that collects, filters, and stores your logs for later use (e.g., searching).
Kibana is also an open source and free tool; it provides a friendly web interface for log analysis on top of Logstash and Elasticsearch, helping you aggregate, analyze, and search important log data.
Scenario analysis:
Logs mainly include system logs, application logs, and security logs. Through logs, operations staff and developers can learn about a server's software and hardware state and track down configuration errors and their causes. Analyzing logs regularly also reveals the server's load, performance, and security posture, so problems can be corrected in time.
Usually, logs are scattered across different devices. If you manage dozens or hundreds of servers and still check logs the traditional way, logging in to each machine in turn, the work is tedious and inefficient. The first step is therefore centralized log management, for example using open source syslog to collect and aggregate the logs of all servers.
Once logs are centralized, statistics and retrieval become the next problem. Linux commands such as grep, awk, and wc can handle simple searches and counts, but for more demanding querying, sorting, and statistics across a large number of machines, this approach quickly falls short.
The open source real-time log analysis platform ELK solves the problems above. Other platforms and tools exist, but only ELK is discussed here; the official site is https://www.elastic.co
As of this writing, the latest stable version on the ELK official site is 5.4.0.
Effects to be achieved:
1. The system messages log is shipped to elasticsearch by a local beat (without any processing of the data) and can then be queried through kibana.
2. The Apache access log is shipped to elasticsearch by a remote beat (with the data processed). Through kibana, any field in the log can be searched and displayed, and fuzzy queries can be combined; that is, the apache log is stored in elasticsearch in JSON format.
3. The nginx access logs, Apache access logs, and system logs of different clients are processed by different grok matching rules and imported into elasticsearch. Simple regular expressions need to be written for the nginx and syslog formats.
Main points for attention:
1. Keep the version numbers of all ELK components the same.
2. It is best to keep the operating system version of all nodes consistent; use the current stable CentOS 7.3 if possible. The three ELK nodes need slightly higher specs than the others: 2C4G for the ELK nodes and 2C2G for the rest. Too little memory is one of the pitfalls I ran into. All nodes need Internet access in order to install packages.
3. Turn off the firewall and SELinux.
4. For uniformity, install all ELK software from tar packages. Installing with yum, especially logstash, leads to many pitfalls.
5. The build itself is not difficult; what is difficult is debugging the components against each other, and the advanced use of ELK.
Description:
The purpose of this article is to get you started. For more advanced applications and usage of ELK, refer to the official site or other technical documentation. All applications are deployed separately here so they can later be moved into docker containers; of course, you can also deploy them all on one server.
Details:
IP address     | hostname      | usage                   | installed software
192.168.2.25   | apache        | client                  | httpd, filebeat
192.168.2.26   | nginx         | client                  | nginx, filebeat
192.168.2.27   | logstash      | log analysis/processing | logstash, filebeat
192.168.2.28   | elasticsearch | data storage            | elasticsearch
192.168.2.30   | kibana        | data query              | kibana
Installation steps:
1. Install the JDK on the three ELK nodes. You can download it from the official Oracle website; the version number need not match mine exactly.
rpm -ivh jdk-8u102-linux-x64.rpm
2. Install the elasticsearch node
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.4.0.tar.gz
tar zxvf elasticsearch-5.4.0.tar.gz
mv elasticsearch-5.4.0 /usr/local/elasticsearch
cd /usr/local/elasticsearch/config
Back up the default elasticsearch configuration file to guard against editing mistakes:
cp elasticsearch.yml elasticsearch.yml.default
After editing it is as follows:
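The edited elasticsearch.yml was originally shown as a screenshot that has not survived; a minimal sketch of the settings that matter here, assuming the elasticsearch node address 192.168.2.28 from the table above:

```yaml
# /usr/local/elasticsearch/config/elasticsearch.yml (minimal sketch)
network.host: 192.168.2.28   # bind to the node's LAN address so other nodes can reach it
http.port: 9200              # default REST port
```

Any other settings can stay at their defaults for this walkthrough.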
Add an elasticsearch user; when started from the tar package, elasticsearch must run as a non-root user.
useradd elasticsearch
chown -R elasticsearch:elasticsearch /usr/local/elasticsearch
Open the /etc/sysctl.conf file, add the following line, and reload:
vm.max_map_count = 655360
sysctl -p /etc/sysctl.conf
Open the /etc/security/limits.conf file and raise the open file handle and process limits:
* soft nofile 65536
* hard nofile 65536
* soft nproc 65536
* hard nproc 65536
su - elasticsearch
cd /usr/local/elasticsearch
bin/elasticsearch
The first start takes some time because some initialization has to be done. If startup fails, check the relevant elasticsearch logs. Note that the above runs the service in the foreground for debugging; to run it in the background, append & and restart it.
Check that the port is open and do a simple curl test, for example with `netstat -tnlp | grep 9200` and `curl http://192.168.2.28:9200`.
3. Install the logstash node
wget https://artifacts.elastic.co/downloads/logstash/logstash-5.4.0.tar.gz
tar zxvf logstash-5.4.0.tar.gz
mv logstash-5.4.0 /usr/local/logstash
Also download filebeat on the logstash node and start it to watch for additions to the data source file; the additions are processed by logstash and uploaded to elasticsearch.
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.4.0-linux-x86_64.tar.gz
tar zxvf filebeat-5.4.0-linux-x86_64.tar.gz
mv filebeat-5.4.0-linux-x86_64 /usr/local/filebeat
cd /usr/local/filebeat
cp filebeat.yml filebeat.yml.default
Edit the filebeat.yml file as follows:
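The screenshot of the edited filebeat.yml has not survived; a minimal sketch for filebeat 5.4, assuming the test file messages-log (created below) lives in /tmp and that filebeat ships to the logstash instance on the same node:

```yaml
# /usr/local/filebeat/filebeat.yml (sketch)
filebeat.prospectors:
- input_type: log
  paths:
    - /tmp/messages-log        # assumed location of the test file created below
output.logstash:
  hosts: ["127.0.0.1:5044"]    # logstash beats input on the same node
```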
Start the filebeat service
cd /usr/local/filebeat
./filebeat &
Note that filebeat does not listen on a port; check on it mainly through its logs and process.
Create a new local file messages-log; you can copy a few entries from the local system's messages file into it, as follows:
Note that the record filebeat keeps of monitored file offsets is in /usr/local/filebeat/data/registry.
Finally, create a new test.conf configuration file to start logstash with, as follows:
A logstash pipeline has three sections: input, filter, and output. Generally, at least input and output need to be configured.
Logstash's own default logstash.yml configuration file is deliberately left unmodified here.
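The test.conf contents were shown as an image; a minimal sketch that matches effect 1 (no processing of the data), assuming the elasticsearch node 192.168.2.28 and a hypothetical index name:

```conf
# test.conf: receive from beats, ship to elasticsearch unmodified
input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => ["192.168.2.28:9200"]
    index => "messages-%{+YYYY.MM.dd}"   # hypothetical index name
  }
  stdout { codec => rubydebug }          # also print each event to the screen
}
```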
cd /usr/local/logstash
First, as a simple test, start logstash without specifying a configuration file:
bin/logstash -e 'input { stdin {} } output { stdout {} }'
If we type hello world manually, logstash echoes hello world back.
Then specify the configuration file test.conf to start. Note that this is started in the foreground to facilitate debugging.
Check if ports 5044 and 9600 are open
After waiting a while, the following message output should appear; this is the stdout output defined in the last section of test.conf.
The configuration file also writes into elasticsearch, so let's verify that:
Note that only one record is shown below; to see the complete data, we will use kibana.
4. Install the kibana node
wget https://artifacts.elastic.co/downloads/kibana/kibana-5.4.0-linux-x86_64.tar.gz
tar zxvf kibana-5.4.0-linux-x86_64.tar.gz
mv kibana-5.4.0-linux-x86_64 /usr/local/kibana
cd /usr/local/kibana/config
cp kibana.yml kibana.yml.default
Edit the kibana.yml configuration file:
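The edited kibana.yml was a screenshot; the three settings that matter, assuming the addresses from the table above:

```yaml
# /usr/local/kibana/config/kibana.yml (sketch)
server.port: 5601
server.host: "192.168.2.30"
elasticsearch.url: "http://192.168.2.28:9200"
```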
Start the kibana service
bin/kibana
Check to see if the port is open
Open a browser and enter http://192.168.2.30:5601
Click the create button, then click the discover button at the top. Note that if no data appears, check the imported @timestamp against the current time: kibana only displays the last 15 minutes of data by default, so if the data is older, select a suitable time range. In kibana you can see that the four records in messages-log have been imported normally. This completes the first of our effects, but it only proves the pipeline works; there is more to do next. Note that you can only create an index in kibana after data has been imported into elasticsearch.
Now for the second effect. First we clean out the data in elasticsearch; deleting it is not strictly necessary, it just demonstrates where elasticsearch stores its data.
rm -rf /usr/local/elasticsearch/data/nodes
Stop the elasticsearch service and restart it; the nodes directory just deleted will be initialized and created again. Refresh the discover page in kibana, widen the timeline to the last 5 years, and confirm the data really cannot be found.
5. Install the apache node. For simple testing, I install it directly with yum.
yum install httpd -y
systemctl start httpd
Use a browser to access http://192.168.2.25; the apache welcome page appears. The log looks as follows. To keep the demonstration simple, I take only 6 records here: 4 with status code 200, one 403, and one 404. We now need to import these records into elasticsearch through logstash and then query them through kibana.
Install filebeat on the apache node as a client; the installation steps are described above.
The configuration file is as follows:
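The configuration screenshot has not survived; a minimal filebeat 5.4 sketch, assuming the default yum-installed apache log path and the logstash node from the table above:

```yaml
# /usr/local/filebeat/filebeat.yml on the apache node (sketch)
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/httpd/access_log   # default access log of a yum-installed httpd
output.logstash:
  hosts: ["192.168.2.27:5044"]    # the logstash node
```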
Start the filebeat service
./filebeat &
Stop the logstash service, then restart it with a new test02.conf configuration file that adds a filter section. Here the apache log is matched with a grok rule so that each field of the log is imported in JSON format, as follows:
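The test02.conf contents were an image; a sketch consistent with the description, assuming the elasticsearch node 192.168.2.28 and a hypothetical index name:

```conf
# test02.conf: parse apache combined logs into fields before indexing
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["192.168.2.28:9200"]
    index => "apache-%{+YYYY.MM.dd}"   # hypothetical index name
  }
  stdout { codec => rubydebug }
}
```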
The %{COMBINEDAPACHELOG} pattern above ships with logstash by default; its location is:
/usr/local/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.0.2/patterns/grok-patterns
The grok-patterns file contains two related apache patterns. COMMONAPACHELOG matches the log format apache uses when it runs as a back-end server behind nginx, while COMBINEDAPACHELOG references COMMONAPACHELOG and adds two more fields; it matches the format apache uses as a front-line web server. Here apache is the web server, so COMBINEDAPACHELOG is the one to use; if apache ran as a back end behind nginx, COMMONAPACHELOG would be the one. Within each pattern, a colon separates the pattern name from the field name: the name before the colon must already be defined in grok-patterns, while the field name after it can be customized. Each %{...} matches one field.
Before starting logstash, check test02.conf for syntax errors with `bin/logstash -t -f test02.conf`.
Now start logstash for real. Because there is a lot of data, only one record is captured here.
As the output shows, every field of the apache log has been imported into elasticsearch in JSON format, and some extra fields have been added. The most easily confused pair is timestamp and @timestamp: the former is the apache access time, while the latter can be understood as the logstash processing time, which is 8 hours off from our Beijing time; I think the latter is rarely used. Checking the record count from kibana, "6 hits" means there are 6 records, exactly matching the count in access_log above.
Click the arrow on any record and then click JSON: all fields of the apache log have been stored in JSON format, such as the request, the status code, the request size, and so on.
Try a combined query: search for status code 404 within a certain access time window.
Search for status codes greater than 400 and less than 499.
As the search conditions get stricter, it becomes clear that the only workable approach is to split the data into elasticsearch by field; then a search returns exactly what we need. This basically completes the second effect.
Next, apache, nginx, and system logs need to be stored in elasticsearch according to their different log formats. Each machine first collects its own system log and then the log of its service: the apache node collects apache and system logs, and the nginx node collects nginx and system logs.
6. Install the nginx node; nginx serves as the front-end reverse proxy server.
yum install epel-release -y
yum install nginx -y
First, take a look at nginx's default log format.
Usually, we add three forwarding-related parameters to the log: the address of the back-end server handling the request, the status code returned by the back end, and the back end's response time.
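The modified log_format was shown as an image; a sketch of the nginx combined format extended with the three upstream parameters just described:

```conf
# nginx.conf, http block (sketch): combined format plus upstream info
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for" '
                '$upstream_addr $upstream_status $upstream_response_time';
access_log /var/log/nginx/access.log main;
```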
Note that logstash's grok has no ready-made pattern for the nginx log format, but since nginx, like apache, is used as a web server, many field patterns can be reused. Here a COMMONNGINXLOG pattern is added directly to the grok-patterns file:
COMMONNGINXLOG %{COMBINEDAPACHELOG} %{QS:x_forwarded_for} (?:%{HOSTPORT1:upstream_addr}|-) (%{STATUS:upstream_status}|-) (%{BASE16FLOAT:upstream_response_time}|-)
Up to $http_x_forwarded_for the pattern reuses the apache definitions directly; the last four fields are our own. Note that the pattern name before each colon must already be defined: HOSTPORT1 and STATUS are not among logstash's default patterns, so we add matching definitions to the grok-patterns file:
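The added definitions were shown as an image; one plausible version, assuming upstream_addr has the form ip:port and that the status code and response time are plain numbers:

```conf
# appended to the grok-patterns file (hypothetical definitions)
HOSTPORT1 %{IPORHOST}:%{POSINT}
STATUS %{NUMBER}
```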
Save and exit; the COMMONNGINXLOG pattern can now be called directly.
Now define the system log pattern. A default one exists, but it does not quite meet our needs, so we hand-write a pattern and add it to the grok-patterns file:
SYSLOG %{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\]): %{GREEDYDATA:syslog_message}
Of course, you can also test patterns with the Grok Debugger or Grok Constructor tools; when using custom patterns in Grok Debugger, check "Add custom patterns".
Now debug nginx's forwarding of requests to the apache server: nginx is the front-end reverse proxy and apache is the back-end server.
Edit the nginx main configuration file nginx.conf and modify the location / block as follows:
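The modified location block was a screenshot; a minimal sketch, assuming the apache back end at 192.168.2.25:80 from the table above:

```conf
# nginx.conf, server block (sketch)
location / {
    proxy_pass http://192.168.2.25:80;                            # apache back end
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;  # fills $http_x_forwarded_for
}
```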
Start the nginx service
systemctl start nginx
Install filebeat on nginx (refer to the steps above)
The filebeat configuration file for nginx is as follows:
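The screenshot has not survived; a filebeat 5.4 sketch for the nginx node, assuming the system-log test file lives in /tmp and using hypothetical document_type tags (nginx-log, system-log) that logstash can branch on:

```yaml
# /usr/local/filebeat/filebeat.yml on the nginx node (sketch)
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/nginx/access.log
  document_type: nginx-log       # hypothetical tag, becomes the event's type field
- input_type: log
  paths:
    - /tmp/messages_log          # assumed location of the new system-log test file
  document_type: system-log
output.logstash:
  hosts: ["192.168.2.27:5044"]
```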
The new messages_log file for nginx is as follows:
Modify the apache main configuration file httpd.conf and change the log format: apache now runs as a back-end web server, so there is no need to record agent information and the like.
Uncomment this line:
CustomLog "logs/access_log" common
Comment out this line:
CustomLog "logs/access_log" combined
Create a test.html file under apache's default document root /var/www/html; write whatever you like in it:
Restart the apache service
systemctl restart httpd
Access the nginx service
The following entries appear in the nginx log, indicating that it is working normally.
192.168.9.106 - - [10/May/2017:09:14:28 +0800] "GET /test.html HTTP/1.1" 200 14 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36" "-" 192.168.2.25:80 200 0.002
192.168.9.106 - - [10/May/2017:09:14:28 +0800] "GET /favicon.ico HTTP/1.1" 404 209 "http://192.168.2.26/test.html" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36" "-" 192.168.2.25:80 404 0.001
The following interface appears in the apache log, indicating that it is normal.
192.168.2.26 - - [10/May/2017:09:14:31 +0800] "GET /test.html HTTP/1.0" 200 14
192.168.2.26 - - [10/May/2017:09:14:31 +0800] "GET /favicon.ico HTTP/1.0" 404 209
The filebeat configuration file for apache is as follows:
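The screenshot has not survived; a filebeat 5.4 sketch for the apache node, mirroring the nginx one, with hypothetical document_type tags (apache-log, system-log):

```yaml
# /usr/local/filebeat/filebeat.yml on the apache node (sketch)
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/httpd/access_log
  document_type: apache-log      # hypothetical tag, becomes the event's type field
- input_type: log
  paths:
    - /tmp/messages_log          # assumed location of the new system-log test file
  document_type: system-log
output.logstash:
  hosts: ["192.168.2.27:5044"]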
The new messages_log file for apache is as follows:
At this point all configuration and test files are ready: 2 log records each from the nginx and apache servers, plus 2 system-log records from each node, 8 records in total.
Finally, it's our highlight. The test03.conf configuration file for logstash is as follows:
Note that the apache grok match has been changed this time, because apache now runs as a back-end server. To verify that the import is correct, clear the data in elasticsearch and the registry checkpoints recorded by filebeat on the nginx and apache clients. Note that to clear the checkpoints, stop the filebeat service first, delete the registry file, then start filebeat again. For clearing the elasticsearch data, refer to the steps above.
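The test03.conf contents were shown as an image; a sketch consistent with the description, assuming the hypothetical document_type tags apache-log, nginx-log, and system-log set in the filebeat configurations, and the custom COMMONNGINXLOG and SYSLOG patterns added to grok-patterns above:

```conf
# test03.conf: route each log type to its own grok pattern
input {
  beats {
    port => 5044
  }
}
filter {
  if [type] == "apache-log" {
    grok { match => { "message" => "%{COMMONAPACHELOG}" } }   # back-end apache format
  } else if [type] == "nginx-log" {
    grok { match => { "message" => "%{COMMONNGINXLOG}" } }
  } else {
    grok { match => { "message" => "%{SYSLOG}" } }
  }
}
output {
  elasticsearch {
    hosts => ["192.168.2.28:9200"]
    index => "logstash-%{+YYYY.MM.dd}"   # hypothetical index name
  }
  stdout { codec => rubydebug }
}
```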
Start logstash. Only the import of a single system log record is captured here; you can see that the system log is likewise stored in elasticsearch split into fields according to its log format.
From kibana, you can see there are exactly 8 records.
A casual look at the imported JSON of one system log record shows it split into four main fields according to the SYSLOG pattern: syslog_timestamp, syslog_hostname, syslog_program, and syslog_message.
Next, look at the imported JSON of an nginx log record; the nginx fields will not be explained one by one here.
This achieves the three effects set out at the beginning. Of course, this is only a very basic build and configuration; for more advanced use of ELK, refer to the official documentation.
Summary of frequently asked questions:
1. New content in a watched file is imported repeatedly.
This is usually caused by editing the new content into the file directly; the correct way is to append with echo "xxx" >> filename.
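The difference matters because filebeat tracks a byte offset per file: appending keeps the offset valid, while an editor that rewrites the file can make old content look new. A quick illustration of the append style on a throwaway file:

```shell
# Start clean so the demo is repeatable.
rm -f /tmp/demo-watched.log
# >> appends without truncating or rewriting the file,
# so a watcher's recorded offset stays valid.
echo "first line" >> /tmp/demo-watched.log
echo "second line" >> /tmp/demo-watched.log
cat /tmp/demo-watched.log
```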
2. There is no data in kibana, but there is in elasticsearch.
The time range selected for the query is probably wrong.
3. Logstash starts very slowly.
Install the epel source, then install and start haveged, and restart logstash.
4. A logstash installed with yum can start but cannot import data into elasticsearch.
In general, elasticsearch and kibana installed with yum cause no big problems, but logstash does: specifying configuration files does not work well, and there are many pitfalls.
5. After import, the data is not presented as JSON according to the defined rules.
Generally, the data format and the grok pattern do not match.
Original: Wugui Yunwei (wuguiyunwei.com)