Today I will talk about how to solve practical problems with ELK. Many people may not know much about this, so to help you understand better I have summarized the following; I hope you get something out of this article.
First, a summary of practical ELK knowledge points
1. Character-encoding (transcoding) problems
This is mainly about garbled Chinese characters.
Transcoding with codec => plain in input:
codec => plain { charset => "GB2312" }
This converts GB2312-encoded text into UTF-8.
You can also do the encoding conversion in filebeat (recommended):
filebeat.prospectors:
- input_type: log
  paths:
    - c:\Users\Administrator\Desktop\performanceTrace.txt
  encoding: GB2312
2. Dropping redundant lines from the log
filter {
  if ([message] =~ "^20.*-\ task\ request,.*,start\ time.*") {   # regex-match the redundant lines to delete
    drop {}
  }
}
Log example:
2018-03-20 10:44:01 [33] DEBUG Debug - task request,...,start time:2018-03-20 10:43:59
-- Request String : {"UserName":"15046699023","Pwd":"ZYjyh727","DeviceType":2,"DeviceId":"PC-20170525SADY","EquipmentNo":null,"SSID":"pc","RegisterPhones":null,"AppKey":"ab09d78e3b2c40b789ddfc81674bc24deac","Version":"2.0.5.3"} -- End
-- Response String : {"ErrorCode":0,"Success":true,"ErrorMsg":null,"Result":null,"WaitInterval":30} -- End
3. Handling a log made up of several different kinds of lines with grok
Log example:
2018-03-20 10:44:01 [33] DEBUG Debug - task request,...,start time:2018-03-20 10:43:59
-- Request String : {"UserName":"15046699023","Pwd":"ZYjyh727","DeviceType":2,"DeviceId":"PC-20170525SADY","EquipmentNo":null,"SSID":"pc","RegisterPhones":null,"AppKey":"ab09d78e3b2c40b789ddfc81674bc24deac","Version":"2.0.5.3"} -- End
-- Response String : {"ErrorCode":0,"Success":true,"ErrorMsg":null,"Result":null,"WaitInterval":30} -- End
In the logstash filter, grok matches the three kinds of lines with three patterns:
Match = > {"message" = > "^ 20.9 -\ task\ request,.*,start\ time\:% {TIMESTAMP_ISO8601:RequestTime}" match = > {"message" = > "^ -\ Request\ String\:\ {\" UserName\ ":\"% {NUMBER:UserName:int}\ ",\" Pwd\ ":\" (?. *)\ ",\" DeviceType\ ":% {NUMBER:DeviceType:int},\" DeviceId\ ":\" (?. *)\ " \ "EquipmentNo\": (?. *),\ "SSID\": (?. *),\ "RegisterPhones\": (?. *),\ "AppKey\":\ "(?. *)\",\ "Version\":\ "(?. *)\" (?. *)} match = > {"message" = > "^--\ Response\ String\:\ {\" ErrorCode\ ":% {NUMBER:ErrorCode:int} \ "Success\": (? [a murz] *),\ "ErrorMsg\": (?. *),\ "Result\": (?. *),\ "WaitInterval\":% {NUMBER:WaitInterval:int}\--\\ End.* "}. Wait for multiple lines
4. Merging multi-line logs: the multiline plugin (important)
Example:
① Log example
2018-03-20 10:44:01 [33] DEBUG Debug - task request,task Id:1cbb72f1-a5ea-4e73-957c-6d20e9e12a7a,start time:2018-03-20 10:43:59
-- Request String : {"UserName":"15046699903","Pwd":"ZYjyh727","DeviceType":2,"DeviceId":"PC-20170525SADY","EquipmentNo":null,"SSID":"pc","RegisterPhones":null,"AppKey":"ab09d78e3b2c40b789ddfc81674bc24deac","Version":"2.0.5.3"} -- End
-- Response String : {"ErrorCode":0,"Success":true,"ErrorMsg":null,"Result":null,"WaitInterval":30} -- End
② How logstash grok processes the log once the lines have been merged (how to do the merging is covered next):
filter {
  grok {
    match => { "message" => "^%{TIMESTAMP_ISO8601:InsertTime}\ .*-\ task\ request,.*,start\ time:%{TIMESTAMP_ISO8601:RequestTime}\n--\ Request\ String\:\ \{\"UserName\":\"%{NUMBER:UserName:int}\",\"Pwd\":\"(?<Pwd>.*)\",\"DeviceType\":%{NUMBER:DeviceType:int},\"DeviceId\":\"(?<DeviceId>.*)\",\"EquipmentNo\":(?<EquipmentNo>.*),\"SSID\":(?<SSID>.*),\"RegisterPhones\":(?<RegisterPhones>.*),\"AppKey\":\"(?<AppKey>.*)\",\"Version\":\"(?<Version>.*)\"\}\ --\ End\n--\ Response\ String\:\ \{\"ErrorCode\":%{NUMBER:ErrorCode:int},\"Success\":(?<Success>[a-z]*),\"ErrorMsg\":(?<ErrorMsg>.*),\"Result\":(?<Result>.*),\"WaitInterval\":%{NUMBER:WaitInterval:int}\}\ --\ End" }
  }
}
Using the multiline plugin in filebeat (recommended):
① Introduction to the multiline options
pattern: the regular expression that decides where merging starts.
negate: true/false; whether lines that match the pattern are merged, or lines that do not match it.
match: after/before (take time to understand the difference).
after: lines are appended after the line that matches the pattern. Note: with this setting the last log event may not be matched.
before: lines are prepended before the line that matches the pattern (recommended).
② Version 5.5 and later (using before as the example)
filebeat.prospectors:
- input_type: log
  paths:
    - /root/performanceTrace*
  fields:
    type: zidonghualog
  multiline.pattern: '.*\"WaitInterval\":.*--\ End'
  multiline.negate: true
  multiline.match: before
③ Versions before 5.5 (using after as the example)
filebeat.prospectors:
- input_type: log
  paths:
    - /root/performanceTrace*
  multiline:
    pattern: '^20.*'
    negate: true
    match: after
Using the multiline plugin in logstash input (recommended when filebeat is not used):
① Introduction to the multiline options
pattern: the regular expression that decides where merging starts.
negate: true/false; whether lines that match the pattern are merged, or lines that do not match it.
what: previous/next (take time to understand the difference).
previous: equivalent to filebeat's after.
next: equivalent to filebeat's before.
② Usage
input {
  file {
    path => ["/root/logs/log2"]
    start_position => "beginning"
    codec => multiline {
      pattern => "^20.*"
      negate => true
      what => "previous"
    }
  }
}
Using the multiline plugin in logstash filter (not recommended):
Reasons it is not recommended:
When multiline is set in filter, the number of pipeline workers is automatically reduced to 1.
Version 5.5 removed the multiline filter from the default distribution; to use it you must install it first, with the following command:
/usr/share/logstash/bin/logstash-plugin install logstash-filter-multiline
Example:
filter {
  multiline {
    pattern => "^20.*"
    negate => true
    what => "previous"
  }
}
5. Using date in the logstash filter
Log example:
2018-03-20 10:44:01 [33] DEBUG Debug - task request,task Id:1cbb72f1-a5ea-4e73-957c-6d20e9e12a7a,start time:2018-03-20 10:43:59
date usage:
date {
  match => ["InsertTime", "YYYY-MM-dd HH:mm:ss"]
  remove_field => "InsertTime"
}
Note: match => ["timestamp", "dd/MMM/YYYY H:m:s Z"] matches a field whose format is day/month/year hour:minute:second timezone; alternatively, match => ["timestamp", "ISO8601"] can be used (recommended).
What date does:
It replaces @timestamp with the time parsed from the matched field, because by default @timestamp is the time the log arrived at logstash, not the actual time recorded in the log.
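To make the effect concrete, here is a minimal sketch (not from the original article) that combines grok and date so that @timestamp takes the log's own InsertTime; the field name follows the examples above, and the grok pattern is shortened for illustration:
filter {
  grok {
    # capture the log's own timestamp into a temporary InsertTime field
    match => { "message" => "^%{TIMESTAMP_ISO8601:InsertTime}\s" }
  }
  date {
    # parse InsertTime, write it into @timestamp, then drop the temporary field
    match        => ["InsertTime", "YYYY-MM-dd HH:mm:ss"]
    target       => "@timestamp"
    remove_field => "InsertTime"
  }
}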
6. Classifying and handling multiple types of logs (important)
Add a type classification to each prospector in the filebeat configuration:
filebeat:
  prospectors:
    - paths:
        # - /mnt/data/WebApiDebugLog.txt*
        - /mnt/data_total/WebApiDebugLog.txt*
      fields:
        type: WebApiDebugLog_total
    - paths:
        - /mnt/data_request/WebApiDebugLog.txt*
        # - /mnt/data/WebApiDebugLog.txt*
      fields:
        type: WebApiDebugLog_request
    - paths:
        - /mnt/data_report/WebApiDebugLog.txt*
        # - /mnt/data/WebApiDebugLog.txt*
      fields:
        type: WebApiDebugLog_report
With if in the logstash filter, each type can be handled differently:
filter {
  if [fields][type] == "WebApiDebugLog_request" {   # for request logs
    if ([message] =~ "^20.*-\ task\ report,.*,start\ time.*") {   # drop report lines
      drop {}
    }
    grok {
      match => { "..." }
    }
  }
}
Use if in logstash output:
output {
  if [fields][type] == "WebApiDebugLog_total" {
    elasticsearch {
      hosts => ["6.6.6.6:9200"]
      index => "logstash-WebApiDebugLog_total-%{+YYYY.MM.dd}"
      document_type => "WebApiDebugLog_total_logs"
    }
  }
}
Second, optimizing overall ELK performance
1. Performance analysis
Server hardware: Linux, 1 CPU, 4 GB RAM
Assume each log entry is 250 bytes.
Analysis:
① logstash (Linux, 1 CPU, 4 GB RAM)
processes about 500 log entries per second
about 660 entries per second with the ruby plugin removed
about 1000 entries per second with grok removed as well
② filebeat (Linux, 1 CPU, 4 GB RAM)
processes about 2500-3500 entries per second
per day each machine can handle roughly: 24 hours * 60 minutes * 60 seconds * 3000 * 250 bytes = 64,800,000,000 bytes, about 64 GB
③ The bottleneck is logstash pulling data from Redis and writing it to ES: one logstash instance handles about 6000 entries per second, and two instances handle about 10000 per second (CPU is basically saturated).
④ The logstash startup process consumes a lot of system resources, because the startup script checks java, ruby and other environment variables; resource consumption returns to normal once it is running.
2. Choosing a log shipper: logstash or filebeat
There is no rule that forces you to use filebeat or logstash; as shippers they serve the same purpose.
The difference is:
logstash is heavyweight compared with beats because it bundles many plugins, such as grok and ruby
logstash therefore uses more resources once started, but if hardware resources are sufficient the difference between them can be ignored
logstash runs on the JVM and is cross-platform, while beats are written in Go and are not supported on AIX
on the AIX 64-bit platform only a 32-bit JDK (JRE) 1.7 is available; a 64-bit JRE is not supported
filebeat can output directly to ES, but in our system some logstash instances also write directly to ES, and the resulting mix of index types makes searching more complex, so it is recommended to unify the data flow into ES through a single path
Summary:
logstash and filebeat each have their advantages, but my recommended setup is: deploy filebeat on every server whose logs need to be collected, because it is lightweight and only used to collect logs; have it send the logs to logstash for processing; and then let logstash output the result to ES.
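As a rough sketch of that recommended chain (the hosts, ports, log path and index name below are illustrative assumptions, not values from this article), the shipper and indexer sides would look roughly like this:
# filebeat.yml on each log server (shipper side) - illustrative
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/app/*.log            # assumed log path
output.logstash:
  hosts: ["logstash-host:5044"]     # assumed logstash address

# logstash pipeline on the central server (indexer side) - illustrative
input  { beats { port => 5044 } }
filter { }                          # grok/date/multiline processing goes here
output { elasticsearch { hosts => ["es-host:9200"] index => "app-%{+YYYY.MM.dd}" } }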
3. Optimizing the logstash configuration
Parameters that can be tuned according to your own hardware:
① Number of pipeline threads; officially recommended to equal the number of CPU cores
Default configuration --> pipeline.workers: 2
Can be optimized to --> pipeline.workers: the number of CPU cores (or a few times the number of cores)
② Number of actual output worker threads
Default configuration --> pipeline.output.workers: 1
Can be optimized to --> pipeline.output.workers: no more than the number of pipeline threads
③ Number of events sent per batch
Default configuration --> pipeline.batch.size: 125
Can be optimized to --> pipeline.batch.size: 1000
④ Batch (transmission) delay
Default configuration --> pipeline.batch.delay: 5
Can be optimized to --> pipeline.batch.delay: 10
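Putting the four settings above together, a hedged example of the corresponding logstash.yml fragment for a 4-core machine (the values are only illustrative, not tuned for any particular workload):
# logstash.yml - illustrative values for a 4-core host
pipeline.workers: 4          # ① one worker per CPU core
pipeline.output.workers: 4   # ② not more than pipeline.workers
pipeline.batch.size: 1000    # ③ events per batch
pipeline.batch.delay: 10     # ④ batch delay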
Summary:
Specify the number of pipeline workers with the -w parameter, or modify logstash.yml directly. This increases the number of threads used by filter and output; if those threads spend time idle, it is safe to set it to several times the number of CPU cores.
By default each output runs on a single pipeline worker thread; you can set the workers option inside an output block, but do not set it higher than the number of pipeline workers.
You can also tune the batch size of an output, for example keeping the ES output's flush size consistent with the pipeline batch size.
After multiline is set in filter, the number of pipeline workers is automatically set to 1. If filebeat is used, it is recommended to do multiline in beat; if logstash is used as the shipper, it is recommended to set multiline in input rather than in filter.
The JVM configuration file in Logstash:
Logstash is a Java-based program that runs in the JVM, and the JVM can be configured through jvm.options, for example heap size and garbage collection. The JVM memory allocation should be neither too large nor too small: too large will slow down the operating system, too small and Logstash may not even start. The defaults are as follows:
-Xms256m    # lower bound of memory usage
-Xmx1g      # upper bound of memory usage
4. Introducing Redis and related issues
filebeat can send data directly to logstash (the indexer), but logstash has no storage capability: if it needs a restart you must first stop all connected beats and then stop logstash, which is troublesome for operations, and if logstash stops abnormally data is lost. Redis is therefore introduced as a data buffer pool; when logstash stops abnormally, you can see the data queued up in Redis from a Redis client.
Redis can be used in list mode (up to 4,294,967,295 entries) or in publish/subscribe mode.
Optimizing Redis as the ELK buffer queue:
bind 0.0.0.0            # do not listen only on the local interface
requirepass ilinux.io   # set a password so it can be operated safely
Since Redis is used only as a queue, persistent storage is unnecessary; turn off all persistence,
i.e. both snapshots (RDB files) and append-only files (AOF files), for better performance:
save ""                 # disables snapshots (RDB)
appendonly no           # disables AOF
Disable the memory eviction policy and make the memory space as large as possible:
maxmemory 0             # when maxmemory is 0, there is no limit on Redis memory usage
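For reference, a minimal sketch of the indexer-side logstash configuration that pulls from this Redis buffer and writes to ES; the host, list key and index name are assumptions, and only the requirepass value above is reused:
input {
  redis {
    host      => "6.6.6.6"       # assumed Redis host
    port      => 6379
    password  => "ilinux.io"     # matches the requirepass set above
    data_type => "list"          # use the list storage mode
    key       => "filebeat"      # assumed list key written by the shipper
  }
}
output {
  elasticsearch {
    hosts => ["6.6.6.6:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}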
5. Optimizing Elasticsearch node configuration
Server hardware configuration, OS parameters:
1) /etc/sysctl.conf configuration
vim /etc/sysctl.conf
① vm.swappiness = 1          # ES recommends setting this to 1, which greatly reduces use of the swap partition and forces the use of memory; note, do not set it to 0 here, which is very likely to cause OOM
② net.core.somaxconn = 65535 # defines the maximum length of the listen queue per port
③ vm.max_map_count = 262144  # limits the number of VMAs (virtual memory areas) a process may have; a VMA is a contiguous region of virtual address space, and OOM can occur when a process exceeds this limit
④ fs.file-max = 518144       # sets the maximum number of file handles the Linux kernel will allocate
[root@elasticsearch]# sysctl -p    # make the settings take effect
2) limits.conf configuration
vim /etc/security/limits.conf
elasticsearch soft nofile 65535
elasticsearch hard nofile 65535
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
3) For the above parameters to remain in effect, two more files need to be edited:
vim /etc/pam.d/common-session-noninteractive
vim /etc/pam.d/common-session
Add the following line to both:
session required pam_limits.so
A reboot may be required before this takes effect.
The JVM configuration file in Elasticsearch:
-Xms2g
-Xmx2g
Set Xms and Xmx to be equal to each other.
The more heap available to Elasticsearch, the more memory it can use for caching. Note, however, that an overly large heap can lead to long garbage-collection pauses.
Set Xmx to no more than 50% of physical RAM, to ensure there is enough physical memory left for the kernel's file system cache.
Do not set Xmx above the threshold at which the JVM stops using compressed object pointers; the exact cutoff varies but is close to 32 GB, so do not exceed 32 GB. If there is plenty of memory, run a few more instances rather than giving a single instance too much memory.
Elasticsearch configuration file optimization parameters:
1) vim elasticsearch.yml
bootstrap.memory_lock: true    # lock memory, do not use swap
# cache and thread-pool optimizations:
bootstrap.mlockall: true
transport.tcp.compress: true
indices.fielddata.cache.size: 40%
indices.cache.filter.size: 30%
indices.cache.filter.terms.size: 1024mb
threadpool:
  search:
    type: cached
    size: 100
    queue_size: 2000
2) set environment variables
vim /etc/profile.d/elasticsearch.sh
export ES_HEAP_SIZE=2g    # heap size should not exceed half of physical memory and should stay below 32 GB
Cluster optimization (I did not use a cluster):
ES is a distributed store; nodes configured with the same cluster.name automatically discover each other and join the cluster.
The cluster automatically elects a master and re-elects one when the master goes down.
To prevent split-brain, an odd number of nodes in the cluster is recommended.
To manage nodes effectively, you can disable broadcast discovery with discovery.zen.ping.multicast.enabled: false and configure unicast node discovery with discovery.zen.ping.unicast.hosts: ["ip1", "ip2", "ip3"], as sketched below.
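A hedged sketch of how those cluster settings might look in elasticsearch.yml for a three-node cluster; the cluster name, node name and the minimum_master_nodes line are illustrative additions, not taken from the article:
# elasticsearch.yml - illustrative cluster settings for a 3-node cluster
cluster.name: my-elk-cluster                       # identical on every node so they join the same cluster
node.name: es-node-1
discovery.zen.ping.multicast.enabled: false        # turn off multicast/broadcast discovery
discovery.zen.ping.unicast.hosts: ["ip1", "ip2", "ip3"]
discovery.zen.minimum_master_nodes: 2              # quorum = nodes/2 + 1, helps prevent split-brain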
6. Performance check
Check the performance of inputs and outputs:
Logstash can only run as fast as the services it connects to; its throughput is limited by the speed of its inputs and outputs.
Check the system parameters:
1) CPU
Check whether the CPU is overloaded. On Linux/Unix systems you can use top -H to view per-thread usage and the totals.
If CPU usage is too high, jump to the section on checking the JVM heap, and also review the Logstash worker settings.
2) Memory
Note that Logstash runs in the Java virtual machine, so it only uses the memory you allocate to it.
Check whether other applications are using large amounts of memory; if applications together consume more than the physical memory, Logstash will be pushed into disk swap.
3) Monitor disk I/O and check for disk saturation
Logstash plugins that write heavily to disk (for example the file output) can saturate the disk.
Disk saturation can also occur when a large number of errors force Logstash to generate large volumes of error logs.
On Linux you can use iostat, dstat or other commands to monitor disk I/O.
4) Monitor network I/O
Inputs and outputs that perform a lot of network operations can saturate the network.
On Linux you can use dstat or iftop to monitor the network; see the example commands below.
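For reference, some common commands for the four checks above (the 2-second interval and the eth0 interface name are just examples):
top -H            # per-thread CPU usage, to spot overloaded Logstash worker threads
free -m           # memory usage and swap activity
iostat -x 2       # extended disk I/O statistics every 2 seconds
dstat -cdnm 2     # combined CPU / disk / network / memory view every 2 seconds
iftop -i eth0     # per-connection network throughput on interface eth0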
Check JVM heap:
A heap that is set too small can cause high CPU usage, because the JVM garbage collector has to run constantly.
A quick way to check this is to double the heap size and see whether performance improves. Do not set the heap larger than the physical memory size, and reserve at least 1 GB of memory for the operating system and other processes.
You can use tools such as the jmap command line or VisualVM to measure the JVM heap more accurately, for example as shown below.
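For example, a quick heap snapshot of the running Logstash process might look like this (assumes a single Logstash process and an older JDK where jmap -heap is available; the pgrep expression is illustrative):
jmap -heap $(pgrep -f logstash | head -n 1)    # print heap configuration and current usage for the Logstash JVM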
After reading the above, do you have a better understanding of how to solve practical problems with ELK? I hope you have gotten something out of this article; thank you for your support.