
How to Build a Distributed Log System with Elasticsearch + Fluentd + Kafka


This article shows how to build a distributed log system with Elasticsearch + Fluentd + Kafka, analyzed and described from a practical perspective. I hope you gain something from reading it.

Preface

ELK is increasingly being replaced by EFK because Logstash has a large memory footprint and is relatively inflexible. The EFK in this article stands for Elasticsearch + Fluentd + Kafka; strictly speaking, the K in EFK usually refers to Kibana, which handles log display, but that part is not demonstrated here. This article only covers the data collection pipeline.

Prerequisites

docker

docker-compose

a running Apache Kafka service (a quick verification sketch follows this list)
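Before deploying, it may help to verify these prerequisites. This is a minimal sketch, assuming a Kafka 2.2+ installation whose kafka-topics.sh supports --bootstrap-server; the broker address 192.168.1.60:9092 matches the compose file below:

    docker --version
    docker-compose --version
    # On the Kafka host: create the kafeidou topic ahead of time if topic auto-creation is disabled
    kafka-topics.sh --bootstrap-server 192.168.1.60:9092 --create \
      --topic kafeidou --partitions 1 --replication-factor 1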

Architecture: the data collection process

Data generation: container monitoring data is collected by cAdvisor and sent to Kafka.

The data transmission chain is: cAdvisor -> Kafka -> Fluentd -> Elasticsearch

Each service can be scaled out horizontally, and more services can be added to the logging system.

Configuration files

docker-compose.yml

version: "3.7"

services:
  elasticsearch:
    image: elasticsearch:7.5.1
    environment:
      - discovery.type=single-node   # start in single-node mode
    ports:
      - 9200:9200
  cadvisor:
    image: google/cadvisor
    # broker list is the Kafka service IP:PORT
    command: -storage_driver=kafka -storage_driver_kafka_broker_list=192.168.1.60:9092 -storage_driver_kafka_topic=kafeidou
    depends_on:
      - elasticsearch
  fluentd:
    image: lypgcs/fluentd-es-kafka:v1.3.2
    volumes:
      - ./:/etc/fluent
      - /var/log/fluentd:/var/log/fluentd

Notes:

The data generated by cAdvisor is sent to the Kafka service at 192.168.1.60 on this machine; the topic is kafeidou.

Elasticsearch is started in single-node mode (via the discovery.type=single-node environment variable) to keep the experiment simple.

fluent.conf

# <source>
#   @type http
#   port 8888
# </source>

<source>
  @type kafka
  brokers 192.168.1.60:9092
  format json
  topic kafeidou
</source>

<match **>
  @type copy

  # <store>
  #   @type stdout
  # </store>

  <store>
    @type elasticsearch
    host 192.168.1.60
    port 9200
    logstash_format true
    # target_index_key machine_name
    logstash_prefix kafeidou
    logstash_dateformat %Y.%m.%d
    flush_interval 10s
  </store>
</match>

Notes:

The copy plugin duplicates the data Fluentd receives so that, for debugging, it can also be printed to the console or written to a file. That debug store is commented out by default in this configuration file, which enables only the required Elasticsearch output store.

If necessary, you can uncomment the @type stdout store to check whether data is being received.

The input side also includes an http source, commented out by default and likewise intended for debugging: it lets you push data into Fluentd by hand.

After uncommenting it, you can run the following command on Linux:

curl -i -X POST -d 'json={"action":"write","user":"kafeidou"}' http://localhost:8888/mytag

The target_index_key parameter takes the value of a field in each record and uses it as the Elasticsearch index. For example, with the commented line enabled, this configuration file would use the value of the machine_name field as the es index.
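As an illustrative sketch (not part of the original file), the Elasticsearch store with that line uncommented would look like this; in fluent-plugin-elasticsearch, when a record contains the target_index_key field, its value is used as the index name instead of the logstash_prefix-based name:

    <store>
      @type elasticsearch
      host 192.168.1.60
      port 9200
      logstash_format true
      target_index_key machine_name  # use each record's machine_name value as its index
      logstash_prefix kafeidou       # fallback for records without a machine_name field
      logstash_dateformat %Y.%m.%d
      flush_interval 10s
    </store>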

Deployment

In the directory containing the docker-compose.yml and fluent.conf files, execute:

docker-compose up -d
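To check that everything came up, a quick sketch using standard docker-compose subcommands:

    docker-compose ps               # all three services should show State "Up"
    docker-compose logs fluentd     # look for successful connections to Kafka and Elasticsearch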

Once all containers are running correctly, verify that Elasticsearch has received the expected data. Here we check whether the es index was created and how many documents it contains:


[root@master kafka]# curl http://192.168.1.60:9200/_cat/indices?v

health status index uuid pri rep docs.count docs.deleted store.size pri.store.size

yellow open 55a4a25feff6 Fz_5v3suRSasX_Olsp-4tA 1 1 1 0 4kb 4kb

You can also open http://192.168.1.60:9200/_cat/indices?v directly in a browser to see the results, which is more convenient.

You can see that the machine_name field was used as the index value here: the query shows an index named 55a4a25feff6 containing 1 document (docs.count).
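To inspect the stored documents themselves, you can query the index directly (the index name 55a4a25feff6 comes from the output above and will differ on your machine):

    curl "http://192.168.1.60:9200/55a4a25feff6/_search?pretty&size=1"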

At this point, the kafka -> fluentd -> es log collection pipeline is complete.

Of course, the architecture is not fixed: data can also be collected the other way round, as fluentd -> kafka -> es. That is not demonstrated here; it amounts to editing the fluent.conf configuration file and swapping the positions of the es- and kafka-related sections, as in the sketch below.
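A rough sketch of that reversed flow, assuming the fluent-plugin-kafka output plugin (@type kafka2) is available in the Fluentd image and leaving buffering details at their defaults. Fluentd receives data, for example over http, and publishes it to Kafka:

    <source>
      @type http
      port 8888
    </source>

    <match **>
      @type kafka2             # output plugin from fluent-plugin-kafka
      brokers 192.168.1.60:9092
      default_topic kafeidou
      <format>
        @type json
      </format>
    </match>

A second consumer (such as the configuration shown earlier) then moves the data from Kafka into Elasticsearch.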

You are encouraged to read the official documentation: the fluentd-elasticsearch and fluentd-kafka plugins can be found on GitHub or on Fluentd's official website.

The above is how Elasticsearch + Fluentd + Kafka can be used to build a distributed log system. If you run into similar questions, refer to the analysis above.
