This article shares how to integrate Flume, Kafka, and Spark Streaming. The editor thinks it is very practical, so it is shared here for everyone to learn from; I hope you can get something out of it after reading.
1. Architecture
The first step is to connect Flume and Kafka: Flume tails the log and writes it to Kafka.
The second step is for Spark Streaming to read the data from Kafka for real-time analysis.
First, use the console consumer script that ships with Kafka to read messages and verify that the Flume-to-Kafka link works.
2. Install Flume and Kafka
Flume install: http://my.oschina.net/u/192561/blog/692225
Kafka install: http://my.oschina.net/u/192561/blog/692357
3. Integration of Flume and Kafka
3.1 Advantages of integrating the two
Flume is geared toward data transfer itself, while Kafka is a typical message middleware used to decouple producers and consumers.
Architecturally, the Agent does not send data directly to Kafka; there is a forward layer made up of Flume in front of Kafka. There are two reasons for this:
Kafka's API is not friendly to non-JVM languages, whereas the forward layer can expose a more general HTTP interface. The forward layer can also handle routing and the logic for choosing the Kafka topic and Kafka partition key, further reducing the logic on the agent side.
Once data flows from the source through Flume into Kafka, it can be synchronized to HDFS for offline computation on one hand, and consumed for real-time computation on the other. This article tests the real-time path with Spark Streaming.
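The original article does not show the Spark Streaming side, so here is a minimal sketch of what such a consumer could look like, assuming Spark 1.6.x with the spark-streaming-kafka_2.11 artifact on the classpath; the object name KafkaWordCount, the local[2] master, the 5-second batch interval, and the word count logic are illustrative assumptions, while the broker address and topic come from the Flume configuration shown later in this article.

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical example class; not part of the original article.
object KafkaWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FlumeKafkaSparkStreaming").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Broker list and topic match the Flume producer.conf shown later in this article.
    val kafkaParams = Map[String, String]("metadata.broker.list" -> "localhost:9092")
    val topics = Set("HappyBirthDayToAnYuan")

    // Direct (receiver-less) stream: each Kafka partition maps to one RDD partition.
    val lines = KafkaUtils
      .createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)
      .map(_._2)

    // Simple per-batch word count as a stand-in for "real-time analysis".
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Such a job would be packaged and submitted with spark-submit once Flume is writing into the HappyBirthDayToAnYuan topic described below.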
3.2 Integrated installation of Flume and Kafka
1. Download the Flume-Kafka integration plug-in from:
https://github.com/beyondj2ee/flumeng-kafka-plugin
Copy the flumeng-kafka-plugin.jar from the package directory to the lib directory of the Flume installation directory
2. Copy the following jars from the libs directory of the Kafka installation directory to the lib directory of the Flume installation directory:
kafka_2.11-0.10.0.0.jar
scala-library-2.11.8.jar
metrics-core-2.2.0.jar
3. Extract the flume-conf.properties file from the plug-in and modify it as follows (the Flume source uses exec):
producer.sources.s.type = exec
producer.sources.s.command = tail -F -n +1 /home/eric/bigdata/kafka-logs/a.log
producer.sources.s.channels = c1
Change the topic of the producer agent to HappyBirthDayToAnYuan
Put the configuration into apache-flume-1.6.0-bin/conf/producer.conf
Full producer.conf:
# agent section
producer.sources = s1
producer.channels = c1
producer.sinks = k1

# configure the data source
producer.sources.s1.type = exec
# configure the log file or directory to be monitored
producer.sources.s1.command = tail -F -n +1 /home/eric/bigdata/kafka-logs/a.log

# configure the data channel
producer.channels.c1.type = memory
producer.channels.c1.capacity = 10000
producer.channels.c1.transactionCapacity = 100

# configure the data sink
# set the Kafka sink; this is the trickiest part, pay attention to the version. This is the sink type for Flume 1.6.0
producer.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# set the broker address and port of Kafka
producer.sinks.k1.brokerList = localhost:9092
# set the Kafka topic
producer.sinks.k1.topic = HappyBirthDayToAnYuan
# set the serialization mode
producer.sinks.k1.serializer.class = kafka.serializer.StringEncoder

# wire the source, channel, and sink together
producer.sources.s1.channels = c1
producer.sinks.k1.channel = c1
3.3 Starting the Kafka and Flume related services
Start ZooKeeper: bin/zookeeper-server-start.sh config/zookeeper.properties
Start the Kafka service: bin/kafka-server-start.sh config/server.properties
Create a topic:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic HappyBirthDayToAnYuan
List topics:
bin/kafka-topics.sh --list --zookeeper localhost:2181
View topic details:
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic HappyBirthDayToAnYuan
Delete a topic:
bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic test
Start a console consumer:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Start Flume:
bin/flume-ng agent -n producer -c conf -f conf/producer.conf -Dflume.root.logger=INFO,console
Send data to flume:
echo "yuhai" >> a.log
Kafka consumes the data.
Note: if the log file contents are deleted and the server is restarted, the topic needs to be recreated; however, the consumed messages have already been written to disk, so the previously consumed content does not disappear. The above is what the integration of Flume+Kafka+SparkStreaming looks like. The editor believes there are some knowledge points here that we may see or use in our daily work; I hope you can learn more from this article.