2025-01-16 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/03 Report
1. Introduction
Kafka is a distributed, partitioned, replicated commit log service. It provides features similar to JMS, but is completely different in design and implementation; it is not an implementation of the JMS specification. Kafka organizes messages by Topic: the message sender is called the Producer and the message receiver is called the Consumer. A Kafka cluster consists of multiple Kafka instances, each of which is called a broker. The Kafka cluster, producers, and consumers all rely on ZooKeeper, which holds cluster metadata, to ensure system availability.
Kafka is a distributed, publish / subscribe-based messaging system whose architecture includes the following components:
i. The publisher of a message is called the producer, the subscriber is called the consumer, and the intermediate storage array is called the broker.
ii. Multiple brokers work together; producers, consumers, and brokers coordinate requests and forwarding through ZooKeeper.
iii. Producers generate data and push it to brokers; consumers pull data from brokers and process it.
iv. The broker side does not maintain the consumption state of the data, which improves performance. Published messages are stored in a set of servers called a Kafka cluster. Each server in the cluster is a broker. Consumers can subscribe to one or more topics and pull data from brokers to consume these published messages.
v. Kafka uses the disk directly, with linear reads and writes, which is fast: it avoids copying data between JVM memory and system memory and reduces the performance cost of object creation and garbage collection.
vi. Kafka is written in Scala and runs on the JVM.
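Point iv above, brokers keeping no per-consumer state, is what makes the pull model cheap: each consumer tracks its own read position (offset). The following is a minimal, illustrative Python sketch of that idea, not the real Kafka API; the `Broker` and `Consumer` classes here are stand-ins.

```python
# Minimal sketch of Kafka's pull model: the broker only appends to a log;
# each consumer remembers its own read position (offset).

class Broker:
    """Append-only message log; keeps no per-consumer state."""
    def __init__(self):
        self.log = []

    def append(self, message):                   # producer pushes
        self.log.append(message)

    def fetch(self, offset, max_messages=10):    # consumer pulls
        return self.log[offset:offset + max_messages]

class Consumer:
    """Tracks its own offset, so the broker stays stateless."""
    def __init__(self, broker):
        self.broker = broker
        self.offset = 0

    def poll(self):
        batch = self.broker.fetch(self.offset)
        self.offset += len(batch)                # advance position locally
        return batch

broker = Broker()
for i in range(3):
    broker.append(f"msg-{i}")

c1, c2 = Consumer(broker), Consumer(broker)
print(c1.poll())   # ['msg-0', 'msg-1', 'msg-2']
print(c1.poll())   # [] - c1 is caught up
print(c2.poll())   # ['msg-0', 'msg-1', 'msg-2'] - c2 keeps its own offset
```

Because the broker never records who has read what, adding or removing consumers costs the broker nothing, which is one reason the design scales well.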
A typical Kafka cluster contains:
Several producers (these can be page views generated by the web front end, server logs, system CPU and memory metrics, etc.) and several brokers (Kafka supports horizontal scaling; in general, the more brokers, the higher the cluster throughput).
Several consumer groups and a ZooKeeper cluster. Kafka uses ZooKeeper to manage cluster configuration, elect a leader, and rebalance when consumer group membership changes. Producers publish messages to brokers using the push pattern; consumers subscribe to and consume messages from brokers using the pull pattern.
Topic & Partition
A Topic can be thought of logically as a queue, and every message must specify its Topic, which can be understood simply as indicating which queue the message goes into. To scale Kafka's throughput linearly, a Topic is physically divided into one or more Partitions, and each Partition corresponds on disk to a folder that stores all of that Partition's messages and index files. If you create two topics, topic1 and topic2, with 13 and 19 partitions respectively, a total of 32 folders will be created across the cluster.
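How a message lands in a particular partition can be sketched as hashing the message key modulo the partition count. This is a simplification: real Kafka clients use a murmur2 hash (and round-robin for keyless messages), but the CRC32 stand-in below illustrates the property that matters, namely that the same key always maps to the same partition.

```python
# Simplified sketch of key -> partition assignment.
# Real Kafka clients hash with murmur2; crc32 here is only a stable,
# deterministic substitute to show the idea.

import zlib

def choose_partition(key: str, num_partitions: int) -> int:
    return zlib.crc32(key.encode()) % num_partitions

topic1_partitions = 13
topic2_partitions = 19

# The same key always lands in the same partition of a topic,
# so per-key ordering is preserved within that partition.
assert choose_partition("user-42", topic1_partitions) == \
       choose_partition("user-42", topic1_partitions)

# Different topics can have different partition counts
# (13 + 19 = 32 partition folders on disk, as described above).
p = choose_partition("user-42", topic2_partitions)
print(f"user-42 -> partition {p} of topic2")
```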
Start installing the Kafka cluster:
1. Create a user
Add the user on all hosts:
groupadd kafka
useradd -g kafka kafka
2. Name the hosts hadoop1, hadoop2, and hadoop3 respectively.
3. Bind the hostnames in /etc/hosts:
172.16.1.250 hadoop1
172.16.1.252 hadoop2
172.16.1.253 hadoop3
4. Download and decompress
https://kafka.apache.org/
tar -xzf kafka_2.9.2-0.8.1.1.tgz
cd kafka_2.9.2-0.8.1.1
ln -s /usr/local/hadoop/kafka_2.9.2-0.8.1.1 /usr/local/hadoop/kafka
chown -R kafka:kafka /usr/local/hadoop
Install it on the hadoop3 machine first.
5. Modify the configuration file
cd /usr/local/hadoop/kafka/config
vim server.properties
# The broker.id of each of the three machines must be unique
broker.id=3
port=9092
num.network.threads=2
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=2
log.retention.hours=168
log.segment.bytes=536870912
log.retention.check.interval.ms=60000
log.cleaner.enable=false
# The ZooKeeper cluster
zookeeper.connect=hadoop1:2181,hadoop2:2181,hadoop3:2181/kafka
zookeeper.connection.timeout.ms=1000000
Start:
bin/kafka-server-start.sh /usr/local/hadoop/kafka/config/server.properties &
6. Configure the Java environment
# java
export JAVA_HOME=/soft/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
7. Deploy the Kafka cluster
Since the Kafka cluster depends on ZooKeeper, install ZooKeeper first.
See:
https://taoistwar.gitbooks.io/spark-operationand-maintenance-management/content/spark_relate_software/kafka_install.html
Then synchronize the configuration files to the three machines, setting broker.id=1, broker.id=2, and broker.id=3 respectively.
cd /usr/local/hadoop/
Kafka is already installed on the hadoop3 machine, so copy it to the other two:
scp -r kafka/ hadoop1:/usr/local/hadoop/
scp -r kafka/ hadoop2:/usr/local/hadoop/
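After syncing, it is easy to forget to change broker.id on one machine, and duplicate ids will prevent brokers from registering correctly in ZooKeeper. A hypothetical sanity check (the `broker_id` helper and the inlined config texts below are illustrative, not part of Kafka) could parse each host's server.properties and assert uniqueness:

```python
# Hypothetical check: every broker.id across the cluster must be unique.

def broker_id(properties_text: str) -> int:
    """Extract broker.id from the text of a server.properties file."""
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("broker.id="):
            return int(line.split("=", 1)[1])
    raise ValueError("broker.id not set")

# In practice these would be read from each host's config file.
configs = {
    "hadoop1": "broker.id=1\nport=9092",
    "hadoop2": "broker.id=2\nport=9092",
    "hadoop3": "broker.id=3\nport=9092",
}

ids = [broker_id(text) for text in configs.values()]
assert len(ids) == len(set(ids)), "duplicate broker.id found"
print("broker.ids are unique:", sorted(ids))
```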
On the hadoop1 machine, modify the configuration file and start:
vim config/server.properties
# The broker.id of each of the three machines must be unique
broker.id=1
port=9092
num.network.threads=2
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=2
log.retention.hours=168
log.segment.bytes=536870912
log.retention.check.interval.ms=60000
log.cleaner.enable=false
# The ZooKeeper cluster
zookeeper.connect=hadoop1:2181,hadoop2:2181,hadoop3:2181/kafka
zookeeper.connection.timeout.ms=1000000
Start:
bin/kafka-server-start.sh /usr/local/hadoop/kafka/config/server.properties &
On the hadoop2 machine, modify the configuration file and start:
vim config/server.properties
# The broker.id of each of the three machines must be unique
broker.id=2
port=9092
num.network.threads=2
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=2
log.retention.hours=168
log.segment.bytes=536870912
log.retention.check.interval.ms=60000
log.cleaner.enable=false
# The ZooKeeper cluster
zookeeper.connect=hadoop1:2181,hadoop2:2181,hadoop3:2181/kafka
zookeeper.connection.timeout.ms=1000000
Start:
bin/kafka-server-start.sh /usr/local/hadoop/kafka/config/server.properties &
8. Verification
Start a console-based producer and consumer using the scripts that ship with Kafka.
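A typical verification for a Kafka 0.8.1 cluster looks like the commands below, run from /usr/local/hadoop/kafka. They assume the cluster from this guide (ZooKeeper chroot /kafka, hosts hadoop1-3); the topic name `test` is arbitrary. Type a few lines into the producer and confirm the consumer in another terminal echoes them back.

```shell
# Create a topic replicated across all three brokers
bin/kafka-topics.sh --create --zookeeper hadoop1:2181/kafka \
    --replication-factor 3 --partitions 2 --topic test

# Terminal 1: console producer - type messages, one per line
bin/kafka-console-producer.sh \
    --broker-list hadoop1:9092,hadoop2:9092,hadoop3:9092 --topic test

# Terminal 2: console consumer - should print every message produced
bin/kafka-console-consumer.sh --zookeeper hadoop1:2181/kafka \
    --topic test --from-beginning
```

If messages typed into the producer appear in the consumer, the cluster, replication, and ZooKeeper coordination are all working.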
9. Error summary:
http://blog.csdn.net/wenxuechaozhe/article/details/52664774
http://472053211.blog.51cto.com/3692116/1655844
10. For actual operation, see:
https://taoistwar.gitbooks.io/spark-operationand-maintenance-management/content/spark_relate_software/kafka_install.html