Shulou (Shulou.com), 06/03 report
I. Overview
1. Data is collected and stored on HDFS through a highly available Flume deployment. The architecture (original diagram omitted) is: several agents (a1, a2, a3) receive syslog events and forward them over Avro, through a load-balanced sink group, to two collectors (collector1 on hadoop1 and collector2 on hadoop2), each of which writes to HDFS.
II. Configure Agent
1. cat flume-client.properties
# Name the components on this agent: declare the source, channel, and sink names
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Describe/configure the source: listen for syslog on local TCP port 5140
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Define the sink group: k1 and k2 share a load-balancing policy
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin

# Define sink 1: send events via Avro to collector machine hadoop1
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop1
a1.sinks.k1.port = 5150

# Define sink 2: send events via Avro to collector machine hadoop2
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop2
a1.sinks.k2.port = 5150

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sinks to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
# a2 and a3 have the same configuration as a1
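Since a2 and a3 differ from a1 only in the agent-name prefix, the extra configs can be generated rather than copied by hand. A minimal sketch, assuming the agents really differ only in that prefix (the helper name and paths are hypothetical):

```shell
# Hypothetical helper: clone an agent config under a new agent name.
# Assumes only the agent-name prefix (e.g. a1 -> a2) differs between agents.
clone_agent_conf() {
  local src=$1 from=$2 to=$3
  # Rewrite every "a1." property prefix to the new agent name
  sed "s/^${from}\./${to}./" "$src" > "${src%.properties}-${to}.properties"
}

# Usage (path is an assumption):
# clone_agent_conf flume-client.properties a1 a2
```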
III. Configure Collector
1. cat flume-server.properties
# Name the components on this agent: declare the source, channel, and sink names
collector1.sources = r1
collector1.channels = c1
collector1.sinks = k1

# Describe the source: receive events via Avro on port 5150
collector1.sources.r1.type = avro
collector1.sources.r1.port = 5150
collector1.sources.r1.bind = 0.0.0.0
collector1.sources.r1.channels = c1

# Describe channel c1: buffer events in memory
collector1.channels.c1.type = memory
collector1.channels.c1.capacity = 1000
collector1.channels.c1.transactionCapacity = 100

# Describe sink k1: write the event stream to HDFS
collector1.sinks.k1.type = hdfs
collector1.sinks.k1.channel = c1
collector1.sinks.k1.hdfs.path = hdfs://master/user/flume/log
collector1.sinks.k1.hdfs.fileType = DataStream
collector1.sinks.k1.hdfs.writeFormat = TEXT
collector1.sinks.k1.hdfs.rollInterval = 300
collector1.sinks.k1.hdfs.filePrefix = %Y-%m-%d
collector1.sinks.k1.hdfs.round = true
collector1.sinks.k1.hdfs.roundValue = 5
collector1.sinks.k1.hdfs.roundUnit = minute
collector1.sinks.k1.hdfs.useLocalTimeStamp = true
# collector2 configuration is the same as collector1
IV. Start
1. Start flume-ng on the Collector
flume-ng agent -n collector1 -c conf -f /usr/local/flume/conf/flume-server.properties -Dflume.root.logger=INFO,console
# -n is followed by the agent name used in the configuration file
2. Start flume-ng on the Agent
flume-ng agent -n a1 -c conf -f /usr/local/flume/conf/flume-client.properties -Dflume.root.logger=INFO,console
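Once both sides are running, a quick way to confirm the listeners are bound is to probe the ports (5140 on the agent, 5150 on the collectors, per the configs above). A sketch using bash's /dev/tcp pseudo-device:

```shell
# Sketch: report whether a TCP port accepts connections.
# Relies on bash's /dev/tcp redirection, so it must run under bash.
check_port() {
  local host=$1 port=$2
  # Opening the pseudo-device fails if nothing is listening on host:port
  if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Usage (hosts/ports taken from the configs above):
# check_port localhost 5140   # agent syslog source
# check_port hadoop1 5150     # collector1 avro source
```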
V. Testing
[root@hadoop5 ~]# echo "hello" | nc localhost 5140    # requires nc to be installed

Collector log output:

17/09/03 22:56:58 INFO source.AvroSource: Avro source r1 started.
17/09/03 22:59:09 INFO ipc.NettyServer: [id: 0x60551752, /192.168.100.15:34310 => /192.168.100.11:5150] OPEN
17/09/03 22:59:09 INFO ipc.NettyServer: [id: 0x60551752, /192.168.100.15:34310 => /192.168.100.11:5150] BOUND: /192.168.100.11:5150
17/09/03 22:59:09 INFO ipc.NettyServer: [id: 0x60551752, /192.168.100.15:34310 => /192.168.100.11:5150] CONNECTED: /192.168.100.15:34310
17/09/03 23:03:54 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
17/09/03 23:03:54 INFO hdfs.BucketWriter: Creating hdfs://master/user/flume/log/2017-09-03.1504494234038.tmp
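The file prefix above comes from filePrefix = %Y-%m-%d, and the round/roundValue/roundUnit settings make Flume round event timestamps down to 5-minute boundaries before expanding time escapes. The rounding itself is simple epoch arithmetic, sketched here for illustration (Flume does this internally; the function name is hypothetical):

```shell
# Sketch of Flume's timestamp rounding: floor an epoch-seconds value
# down to the nearest step_min-minute boundary (roundValue=5, roundUnit=minute
# in the collector config above means step_min=5).
round_minutes() {
  local epoch=$1 step_min=$2
  echo $(( epoch - epoch % (step_min * 60) ))
}

# Usage:
# round_minutes "$(date +%s)" 5
```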
VI. Summary
A highly available flume-ng sink group generally runs in one of two modes: load_balance or failover. The load_balance configuration was shown in the agent configuration above; failover is configured as follows:
# set failover
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 1
a1.sinkgroups.g1.processor.maxpenalty = 10000
Some commonly used types (the table in the original was lost) are: sources — avro, exec, spooldir, syslogtcp, netcat, http; channels — memory, file; sinks — hdfs, avro, logger, file_roll, hbase.
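As a minimal, self-contained illustration of combining these types, a netcat source feeding a memory channel and a logger sink (the agent name a1 and port 44444 are assumptions, not from the original):

```properties
# Minimal single-agent pipeline: netcat source -> memory channel -> logger sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

a1.channels.c1.type = memory

a1.sinks.k1.type = logger

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Lines typed into `nc localhost 44444` then show up as INFO events in the agent's console log.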