Example Analysis of Flume Framework

2025-04-05 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article walks through a worked example of the Flume framework. It is practical enough to be worth keeping for reference, so let's go through it step by step.

Flume is a distributed framework for collecting and moving large volumes of log data.

(Figure: Flume framework flowchart: Source → Channel → Sink)

The Channel buffers the data. If the Sink successfully transfers an event (for example, to HDFS), the buffered copy is deleted from the Channel. If the transfer fails, the Channel acts as a backup: the event stays buffered, and the Sink repeatedly fetches it from the Channel until delivery succeeds.
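This buffer-until-committed behaviour can be sketched in a few lines of Python. This is a toy model of the semantics only, not Flume's actual API:

```python
from collections import deque

class MemoryChannel:
    """Toy model of a Flume channel: events stay buffered until
    the sink confirms delivery (NOT Flume's real API)."""
    def __init__(self):
        self.buffer = deque()

    def put(self, event):
        # Source side: append an incoming event
        self.buffer.append(event)

    def take(self):
        # Sink side: look at the next event without removing it
        return self.buffer[0] if self.buffer else None

    def commit(self):
        # Called only after the sink delivered the event successfully
        self.buffer.popleft()

channel = MemoryChannel()
channel.put("log line 1")

# First delivery attempt fails: no commit, so the event is still buffered
assert channel.take() == "log line 1"

# Retry succeeds, the sink commits, and the buffered copy is deleted
channel.commit()
assert channel.take() is None
```

The key point is that `commit` only runs after a confirmed transfer, so a failed HDFS write never loses the event.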

Deploy a Flume agent on hadoop0

1. On hadoop0, decompress apache-flume-1.4.0-bin.tar.gz and apache-flume-1.4.0-src.tar.gz.

2. Copy all the contents of the decompressed apache-flume-1.4.0-src folder into the apache-flume-1.4.0-bin folder.

3. Rename the two template files in the conf directory: flume-env.sh.template becomes flume-env.sh, and flume-conf.properties.template becomes flume-conf.properties.

Then set the JAVA_HOME value in flume-env.sh.
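As a sketch, step 3 looks like this in a shell session. The installation path and JDK path are assumptions; adjust both to your machine:

```shell
# Assumed install location; adjust to where you unpacked the tarball
cd /usr/local/apache-flume-1.4.0-bin/conf

# Rename the shipped templates to the names Flume expects
mv flume-env.sh.template flume-env.sh
mv flume-conf.properties.template flume-conf.properties

# Point Flume at a JDK (hypothetical path)
echo 'export JAVA_HOME=/usr/local/jdk1.7.0_45' >> flume-env.sh
```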

4. Example: Upload files in disk folder to HDFS through flume.

4.1 Create a file named test.conf in the conf directory with the following contents. Note that in a properties file, comments must sit on their own lines: a `#` comment appended after a value would be read as part of the value.

# Configure the agent:
# a1 is the agent name; s1, sink1 and c1 name its source, sink and channel
a1.sources = s1
a1.sinks = sink1
a1.channels = c1

# Configure a source that reads files dropped into a directory
a1.sources.s1.type = spooldir
# Directory watched for new data files
a1.sources.s1.spoolDir = /apache_logs
# Suffix appended to a data file's name once it has been fully processed
a1.sources.s1.fileSuffix = .abc
# Channel the source delivers events to
a1.sources.s1.channels = c1

# Configure a sink that writes the input to HDFS
a1.sinks.sink1.type = hdfs
# Destination path in HDFS
a1.sinks.sink1.hdfs.path = hdfs://hadoop0:9000/apache_logs
# DataStream means the output file is uncompressed
a1.sinks.sink1.hdfs.fileType = DataStream
# Text means the original content of each event is written
a1.sinks.sink1.hdfs.writeFormat = Text
# Channel the sink takes its data from (note: "channel", singular)
a1.sinks.sink1.channel = c1

# Configure a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
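Not part of the original configuration, but commonly set alongside it: the hdfs sink's roll settings control when Flume closes the current HDFS file and starts a new one. The values below are illustrative, not recommendations:

```properties
# Optional: roll to a new HDFS file every 60 seconds or after
# 128 MB, whichever comes first; 0 disables a criterion
a1.sinks.sink1.hdfs.rollInterval = 60
a1.sinks.sink1.hdfs.rollSize = 134217728
a1.sinks.sink1.hdfs.rollCount = 0
```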

Run it from the Flume installation directory:

[root@hadoop0 apache-flume-1.4.0-bin]# bin/flume-ng agent --conf conf --conf-file conf/test.conf --name a1 -Dflume.root.logger=DEBUG,console
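To check the pipeline end to end, you can drop a file into the spooling directory and look for the data in HDFS. The session below is illustrative; the file name is hypothetical, while the directories and hostname follow the configuration above:

```shell
# Create a data file for the spooldir source to pick up
echo "hello flume" > /apache_logs/sample.log

# Once processed, the source renames it with the configured suffix,
# so sample.log.abc should appear here
ls /apache_logs

# And the events should have landed in HDFS
hadoop fs -ls hdfs://hadoop0:9000/apache_logs
```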

Thank you for reading! That concludes this example analysis of the Flume framework. I hope the content above is helpful, and if you found the article useful, please share it so more people can see it.
