Flume+Kafka integration
I. Preparatory work
Prepare 5 intranet servers to build the Zookeeper and Kafka clusters.
Server addresses:
192.168.2.240
192.168.2.241
192.168.2.242
192.168.2.243
192.168.2.244
Server OS: CentOS 6.5, 64-bit
Download the installation packages:
Zookeeper: http://apache.fayea.com/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
Flume: http://apache.fayea.com/flume/1.7.0/apache-flume-1.7.0-bin.tar.gz
Kafka: http://apache.fayea.com/kafka/0.10.0.0/kafka_2.10-0.10.0.0.tgz
Zookeeper, Flume, and Kafka all require a Java environment, so install the JDK first:
yum install java-1.7.0-openjdk-devel
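A quick sanity check that the JDK landed correctly (the exact version string will differ by build):
java -version
# should report a 1.7.0 OpenJDK runtime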
II. Install and configure Zookeeper
Select 3 of the servers for the Zookeeper cluster; their IPs are:
192.168.2.240
192.168.2.241
192.168.2.242
Note: first perform steps (1)-(3) on the first server 192.168.2.240.
(1) Decompress: put zookeeper-3.4.6.tar.gz in the /opt directory, extract it, and rename the result to /opt/zookeeper (the path used throughout this article):
tar zxf zookeeper-3.4.6.tar.gz
mv zookeeper-3.4.6 zookeeper
(2) Create the configuration file: copy conf/zoo_sample.cfg to zoo.cfg in the same conf directory, then set the following values:
tickTime=2000
dataDir=/opt/zookeeper/Data
initLimit=5
syncLimit=2
clientPort=2181
server.1=192.168.2.240:2888:3888
server.2=192.168.2.241:2888:3888
server.3=192.168.2.242:2888:3888
The meaning of each parameter:
tickTime: interval between heartbeats, in milliseconds. Default: 2000.
clientPort: the port on which client applications (such as Solr) connect to ZooKeeper. Default: 2181.
initLimit: time allowed, in ticks, for the initial synchronization phase (when followers first connect to the leader). Default: 10.
syncLimit: time allowed, in ticks, for followers to synchronize with the leader. Default: 5.
dataDir: the path where data (such as snapshots and managed configuration) is stored.
server.X: X is the id of a server in the cluster and must match the id in that server's myid file. Two ports follow the address: the first is used for data synchronization and other communication between followers and the leader, the second for voting during leader election.
(3) Create the /opt/zookeeper/Data snapshot directory and create the myid file, containing 1:
mkdir /opt/zookeeper/Data
vi /opt/zookeeper/Data/myid    # write a single line containing: 1
(4) Copy the configured /opt/zookeeper/ directory from 192.168.2.240 to 192.168.2.241 and 192.168.2.242, then change the contents of the corresponding myid files to 2 and 3, as in the sketch below.
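A minimal sketch of step (4), assuming root SSH access between the nodes (adjust the user and key setup to your environment):
scp -r /opt/zookeeper/ root@192.168.2.241:/opt/
scp -r /opt/zookeeper/ root@192.168.2.242:/opt/
# overwrite the copied myid with each node's own id
ssh root@192.168.2.241 'echo 2 > /opt/zookeeper/Data/myid'
ssh root@192.168.2.242 'echo 3 > /opt/zookeeper/Data/myid'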
(5) Start the Zookeeper cluster
Execute the startup command on each of the three servers:
/opt/zookeeper/bin/zkServer.sh start
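Once all three nodes are started, you can check each node's role; one should report leader and the other two follower (the exact output wording varies slightly across Zookeeper versions):
/opt/zookeeper/bin/zkServer.sh status
# Mode: leader    (or: Mode: follower)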
III. Install and configure the Kafka cluster
A total of 5 servers; the server IP addresses are:
192.168.2.240 node1
192.168.2.241 node2
192.168.2.242 node3
192.168.2.243 node4
192.168.2.244 node5
1. Extract the installation files to the /opt/ directory
cd /opt
tar -zxvf kafka_2.10-0.10.0.0.tgz
mv kafka_2.10-0.10.0.0 kafka
2. Modify the server.properties file
# node1 configuration
broker.id=0
port=9092
advertised.listeners=PLAINTEXT://58.246.xx.xx:9092
advertised.host.name=58.246.xx.xx
# A pitfall I hit: because the online nginx logs are pulled back to the company's local servers, these advertised options must be set to the router's external IP address; otherwise the online Flume reports that it cannot connect to the Kafka nodes or send log messages (see the connectivity check after the node configurations).
advertised.port=9092
num.network.threads=3
num.io.threads=8
num.partitions=5
zookeeper.connect=192.168.2.240:2181,192.168.2.241:2181,192.168.2.242:2181
# node2 configuration
broker.id=1
port=9093
advertised.listeners=PLAINTEXT://58.246.xx.xx:9093
advertised.host.name=58.246.xx.xx
advertised.port=9093
num.network.threads=3
num.io.threads=8
num.partitions=5
zookeeper.connect=192.168.2.240:2181,192.168.2.241:2181,192.168.2.242:2181
# node3 configuration
broker.id=2
port=9094
advertised.listeners=PLAINTEXT://58.246.xx.xx:9094
advertised.host.name=58.246.xx.xx
advertised.port=9094
num.network.threads=3
num.io.threads=8
num.partitions=5
zookeeper.connect=192.168.2.240:2181,192.168.2.241:2181,192.168.2.242:2181
# node4 configuration
broker.id=3
port=9095
advertised.listeners=PLAINTEXT://58.246.xx.xx:9095
advertised.host.name=58.246.xx.xx
advertised.port=9095
num.network.threads=3
num.io.threads=8
num.partitions=5
zookeeper.connect=192.168.2.240:2181,192.168.2.241:2181,192.168.2.242:2181
# node5 configuration
broker.id=4
port=9096
advertised.listeners=PLAINTEXT://58.246.xx.xx:9096
advertised.host.name=58.246.xx.xx
advertised.port=9096
num.network.threads=3
num.io.threads=8
num.partitions=5
zookeeper.connect=192.168.2.240:2181,192.168.2.241:2181,192.168.2.242:2181
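A quick way to reproduce or rule out the advertised-address pitfall noted in the node1 configuration is to test, from the online server, that each broker's external address and port is reachable. This is a plain TCP reachability check, assuming telnet is available on the online host:
telnet 58.246.xx.xx 9092
telnet 58.246.xx.xx 9093
# repeat for ports 9094-9096; each should connect rather than time out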
3. Start the Kafka cluster
Execute the following command on all 5 nodes to start the service:
/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties &
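Before pointing Flume at the cluster, it can help to create the topic explicitly and confirm it is registered; kafka-topics.sh ships with Kafka 0.10, and the replication factor of 3 here is an assumption, so pick whatever fits your durability needs:
/opt/kafka/bin/kafka-topics.sh --create --zookeeper 192.168.2.240:2181 --replication-factor 3 --partitions 5 --topic unilife_nginx_production
/opt/kafka/bin/kafka-topics.sh --list --zookeeper 192.168.2.240:2181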
IV. Install and configure Flume
Install two Flume agents: one on the online server, which ships the online logs back to the local Kafka cluster, and one locally, which transfers the log data from the Kafka cluster to HDFS.
4.1. Install Flume on the online server
This agent collects the nginx logs and sends them to the company's internal Kafka.
1. Extract the installation package
cd /opt
tar -zxvf apache-flume-1.7.0-bin.tar.gz
mv apache-flume-1.7.0-bin flume
2. Create a configuration file
vi /opt/flume/conf/flume-conf.properties and add the following:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /unilifeData/logs/nginx/access.log
a1.sources.r1.channels = c1
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 100000
# sinks
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = unilife_nginx_production
a1.sinks.k1.kafka.bootstrap.servers = 58.246.xx.xx:9092,58.246.xx.xx:9093,58.246.xx.xx:9094
a1.sinks.k1.brokerList = 58.246.xx.xx:9092,58.246.xx.xx:9093,58.246.xx.xx:9094
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.flumeBatchSize = 2000
a1.sinks.k1.channel = c1
3. Start the flume service
/opt/flume/bin/flume-ng agent --conf /opt/flume/conf/ --conf-file /opt/flume/conf/flume-conf.properties --name a1 -Dflume.root.logger=INFO,LOGFILE &
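To confirm the online agent is actually delivering events, a console consumer on one of the local nodes can tail the topic (kafka-console-consumer.sh is part of the Kafka 0.10 distribution):
/opt/kafka/bin/kafka-console-consumer.sh --zookeeper 192.168.2.240:2181 --topic unilife_nginx_production
# nginx access-log lines should scroll by as requests hit the online server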
4.2. Install Flume locally
This agent dumps the logs to HDFS.
1. Extract the installation package
cd /opt
tar -zxvf apache-flume-1.7.0-bin.tar.gz
mv apache-flume-1.7.0-bin flume
2. Create a configuration file
vi /opt/flume/conf/flume-nginx-log.properties and add the following:
nginx.sources = source1
nginx.channels = channel1
nginx.sinks = sink1
nginx.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
nginx.sources.source1.zookeeperConnect = master:2181,slave1:2181,slave2:2181
nginx.sources.source1.topic = unilife_nginx_production
nginx.sources.source1.groupId = flume_unilife_nginx_production
nginx.sources.source1.channels = channel1
nginx.sources.source1.interceptors = i1
nginx.sources.source1.interceptors.i1.type = timestamp
nginx.sources.source1.kafka.consumer.timeout.ms = 100
nginx.channels.channel1.type = memory
nginx.channels.channel1.capacity = 10000000
nginx.channels.channel1.transactionCapacity = 1000
nginx.sinks.sink1.type = hdfs
nginx.sinks.sink1.hdfs.path = hdfs://192.168.2.240:8020/user/hive/warehouse/nginx_log
nginx.sinks.sink1.hdfs.writeFormat = Text
nginx.sinks.sink1.hdfs.inUsePrefix = _
nginx.sinks.sink1.hdfs.rollInterval = 3600
nginx.sinks.sink1.hdfs.rollSize = 0
nginx.sinks.sink1.hdfs.rollCount = 0
nginx.sinks.sink1.hdfs.fileType = DataStream
nginx.sinks.sink1.hdfs.minBlockReplicas = 1
nginx.sinks.sink1.channel = channel1
3. Start the service
/opt/flume/bin/flume-ng agent --conf /opt/flume/conf/ --conf-file /opt/flume/conf/flume-nginx-log.properties --name nginx -Dflume.root.logger=INFO,LOGFILE &
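To verify that the sink is writing, list the target directory with the standard Hadoop CLI; files still being written carry the _ in-use prefix configured above:
hdfs dfs -ls hdfs://192.168.2.240:8020/user/hive/warehouse/nginx_log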