Analysis of a Real-World Flume Production Scenario

Requirements: two log servers, A and B, produce logs in real time; the main log types are access.log, nginx.log, and web.log. The requirement is:

Collect access.log, nginx.log, and web.log from machines A and B onto machine C, and from C sink them into HDFS. The required HDFS directory layout is:

   /source/logs/access/<date>/**

   /source/logs/nginx/<date>/**

   /source/logs/web/<date>/**
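
Note that this layout can come from a single HDFS sink, because Flume resolves escape sequences in hdfs.path per event: %{type} expands to an event header, and %Y%m%d to the event's date. The configuration below relies on exactly that:

a1.sinks.k1.hdfs.path = hdfs://myha01/source/logs/%{type}/%Y%m%d

where the type header (access, nginx, or web) is stamped onto each event by the upstream agents.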

Scenario analysis:

Planning:

Hadoop01 (web01):

    source: access.log, nginx.log, web.log

    channel: memory

    sink: avro

Hadoop02 (web02):

    source: access.log, nginx.log, web.log

    channel: memory

    sink: avro

Hadoop03 (data collection):

    source: avro

    channel: memory

    sink: hdfs
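
Putting the plan together, this is a classic fan-in topology: the two web servers run identical first-tier agents that tail the three logs and forward them over Avro RPC, and hadoop03 runs a single second-tier agent that merges the streams and writes them to HDFS. Roughly:

hadoop01: exec (tail -F) -> memory channel -> avro sink \
                                                         -> hadoop03: avro source -> memory channel -> hdfs sink
hadoop02: exec (tail -F) -> memory channel -> avro sink /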

Configuration files:

# exec_source_avro_sink.properties
# Name the core components
a1.sources = r1 r2 r3
a1.sinks = k1
a1.channels = c1

# r1: tail access.log and tag events with type=access
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/flume_data/access.log
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = static
a1.sources.r1.interceptors.i1.key = type
a1.sources.r1.interceptors.i1.value = access

# r2: tail nginx.log and tag events with type=nginx
a1.sources.r2.type = exec
a1.sources.r2.command = tail -F /home/hadoop/flume_data/nginx.log
a1.sources.r2.interceptors = i2
a1.sources.r2.interceptors.i2.type = static
a1.sources.r2.interceptors.i2.key = type
a1.sources.r2.interceptors.i2.value = nginx

# r3: tail web.log and tag events with type=web
a1.sources.r3.type = exec
a1.sources.r3.command = tail -F /home/hadoop/flume_data/web.log
a1.sources.r3.interceptors = i3
a1.sources.r3.interceptors.i3.type = static
a1.sources.r3.interceptors.i3.key = type
a1.sources.r3.interceptors.i3.value = web

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop03
a1.sinks.k1.port = 41414

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 20000
a1.channels.c1.transactionCapacity = 10000

# Bind the sources and sink to the channel
a1.sources.r1.channels = c1
a1.sources.r2.channels = c1
a1.sources.r3.channels = c1
a1.sinks.k1.channel = c1

# avro_source_hdfs_sink.properties
# Name the agent's source, channel, and sink
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Define the source
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414

# Add a timestamp interceptor
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

# Define the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 20000
a1.channels.c1.transactionCapacity = 10000

# Define the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://myha01/source/logs/%{type}/%Y%m%d
a1.sinks.k1.hdfs.filePrefix = events
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
# Use local time when resolving the time escapes in the path
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# Do not roll files by event count
a1.sinks.k1.hdfs.rollCount = 0
# Roll files by time (seconds)
a1.sinks.k1.hdfs.rollInterval = 3
# Roll files by size (bytes)
a1.sinks.k1.hdfs.rollSize = 1048576
# Number of events written to HDFS per batch
a1.sinks.k1.hdfs.batchSize = 20
# Number of threads flume uses for HDFS operations (create, write, etc.)
a1.sinks.k1.hdfs.threadsPoolSize = 10
# Timeout for HDFS operations (ms)
a1.sinks.k1.hdfs.callTimeout = 30000

# Wire source, channel, and sink together
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
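
Note how the two files cooperate: the static interceptors on hadoop01/hadoop02 stamp every event with a type header (access, nginx, or web), and the HDFS sink on hadoop03 expands %{type} and %Y%m%d per event, so one sink fans the merged stream out into the required per-type, per-date directories. If the machines have no live logs yet, a loop like the following keeps the three files growing so the exec sources have something to tail (a minimal sketch; the paths are the ones assumed by the configs above):

mkdir -p /home/hadoop/flume_data
while true; do
    echo "$(date '+%F %T') access test line" >> /home/hadoop/flume_data/access.log
    echo "$(date '+%F %T') nginx test line"  >> /home/hadoop/flume_data/nginx.log
    echo "$(date '+%F %T') web test line"    >> /home/hadoop/flume_data/web.log
    sleep 1
done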

Test:

# /home/hadoop/flume_data on hadoop01 and hadoop02 contains the data files access.log, nginx.log, and web.log.
# Start flume on hadoop03 first (the collector), so the avro sinks have something to connect to:
flume-ng agent -c conf -f avro_source_hdfs_sink.properties -n a1 -Dflume.root.logger=DEBUG,console
# Then start flume on hadoop01 and hadoop02 (the producers):
flume-ng agent -c conf -f exec_source_avro_sink.properties -n a1 -Dflume.root.logger=DEBUG,console
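
Once both tiers are running, the result can be verified from any HDFS client. A quick check (the date directory is whatever local day the events arrived on, since useLocalTimeStamp is true):

hdfs dfs -ls /source/logs/
hdfs dfs -ls /source/logs/access/$(date +%Y%m%d)
hdfs dfs -cat /source/logs/access/$(date +%Y%m%d)/events.* | head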
