2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains in detail how to install and use Kafka. I hope that after reading it you will have a solid understanding of the relevant concepts.
1. Kafka introduction
1.1. Main functions
According to the official website, Apache Kafka® is a distributed streaming platform with three main capabilities:
1: It lets you publish and subscribe to streams of records. This is similar to a message queue, which is why Kafka is often classified as a message queuing framework.
2: It lets you store streams of records in a fault-tolerant way. Kafka persists message streams to files on disk.
3: It lets you process streams of records as they occur. Messages can be processed as soon as they are published.
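The publish/subscribe model behind these capabilities can be sketched in a few lines of plain Python. This is an in-memory toy for intuition only, not the Kafka API; all names here are made up.

```python
from collections import defaultdict

class ToyBroker:
    """In-memory stand-in for a broker: each topic is an append-only list."""
    def __init__(self):
        self.topics = defaultdict(list)    # topic name -> list of records

    def publish(self, topic, record):
        self.topics[topic].append(record)  # records are stored, not just relayed

    def subscribe(self, topic, offset=0):
        # a subscriber can read from any offset; the log itself is unchanged
        return self.topics[topic][offset:]

broker = ToyBroker()
broker.publish("topic1", "hello")
broker.publish("topic1", "world")
print(broker.subscribe("topic1"))     # ['hello', 'world']
print(broker.subscribe("topic1", 1))  # ['world']
```

Because records are stored rather than merely forwarded, a late subscriber can still replay the whole stream, which is the fault-tolerant storage property described above.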
1.2. Use cases
1: Building real-time streaming data pipelines that reliably get data between systems or applications; that is, the message queue function.
2: Building real-time streaming applications that transform or react to streams of data; that is, the stream processing function.
1.3. Detailed introduction
At present, Kafka is mainly used as a distributed publish-subscribe messaging system. Below is a brief introduction to Kafka's basic mechanics.
1.3.1 message transmission process
A Producer sends messages to the Kafka cluster. Before sending, it assigns each message to a category, namely a Topic. In the figure, two producers send messages of topic1, and another sends messages of topic2.
A Topic is a message category. Messages are classified by specifying a topic, and consumers only need to pay attention to the topics they care about.
A Consumer establishes a long-lived connection to the Kafka cluster, continuously pulls messages from it, and then processes them.
From the figure above, we can see that the numbers of consumers and producers under the same Topic do not need to match.
1.3.2 kafka server message storage policy
Kafka's storage cannot be discussed without partitions. When creating a topic, you can specify the number of partitions. More partitions means greater throughput, but also more resources and potentially lower availability. After receiving messages from producers, Kafka stores them in different partitions according to its balancing policy.
Within each partition, messages are stored sequentially, and the most recently received messages are consumed last.
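The storage model described above, sequential writes within each partition, can be illustrated with a small sketch. This is a simulation with made-up names, not Kafka code:

```python
# a topic with 3 partitions, each an append-only list
partitions = [[] for _ in range(3)]

def store(partition_id, message):
    # within a partition, messages are appended in arrival order
    partitions[partition_id].append(message)

# a simple balancing policy for illustration: round-robin
for i, msg in enumerate(["m0", "m1", "m2", "m3"]):
    store(i % 3, msg)

# ordering holds per partition, not across the whole topic
print(partitions[0])  # ['m0', 'm3']
print(partitions[1])  # ['m1']
```

Note that "m0" and "m3" are ordered relative to each other only because they landed in the same partition; Kafka makes no ordering guarantee across partitions.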
1.3.3 interaction with producers
When a producer sends a message to the Kafka cluster, it can target a specific partition by specifying it explicitly.
Messages can also be routed to different partitions by specifying a balancing strategy.
If neither is specified, the default random balancing strategy stores messages randomly across partitions.
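The three routing options above (explicit partition, keyed balancing, random fallback) can be written out roughly like this. This mirrors the idea only; the helper and the byte-sum hash are stand-ins, not Kafka's actual partitioner:

```python
import random

def choose_partition(num_partitions, partition=None, key=None):
    if partition is not None:
        # 1) the producer picked a partition explicitly
        return partition
    if key is not None:
        # 2) balancing strategy: the same key always maps to the same partition
        #    (byte sum is a toy stand-in for Kafka's real murmur2 hash)
        return sum(key.encode()) % num_partitions
    # 3) no partition and no key: spread messages randomly
    return random.randrange(num_partitions)

print(choose_partition(4, partition=2))     # 2
print(choose_partition(4, key="order-42"))  # deterministic for this key
```

The keyed case is what gives per-key ordering: all messages with the same key land in the same partition, where storage is sequential.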
1.3.4 interaction with consumers
When consumers consume messages, Kafka uses an offset to record the current consumption position.
In Kafka's design, multiple different groups can consume messages under the same topic at the same time. As shown in the figure, two different groups consume simultaneously; each records its own consumption position as an offset, and they do not interfere with each other.
Within a single group, the number of consumers should not exceed the number of partitions, because each partition can be bound to at most one consumer in the group. That is, a consumer can consume multiple partitions, but a partition can only be consumed by one consumer in the group.
Therefore, if the number of consumers in a group is greater than the number of partitions, the extra consumers will not receive any messages.
2. Installation and use of Kafka
2.1. Download
You can download the latest Kafka release from the official site at http://kafka.apache.org/downloads; choose the binary tgz package. Depending on your network, a proxy may be required. The version chosen here is 0.11.0.1, the latest at the time of writing.
2.2. Installation
Kafka is written in Scala and runs on the JVM. Although it can also run on Windows, Kafka is almost always deployed on Linux servers, so we use Linux for this walkthrough.
First, make sure a Java runtime (JDK) is installed on your machine. Older Kafka versions also required a separately installed zookeeper, but recent releases ship with a built-in zookeeper environment, which we can use directly.
For the simplest possible trial, installation is just extracting the archive to any directory. Here we extract the Kafka package to the /home directory.
2.3. Configuration
Under the kafka decompression directory, there is a config folder in which our configuration files are placed.
consumer.properties: consumer configuration, used by the console consumer started in section 2.5; the defaults are fine here.
producer.properties: producer configuration, used by the console producer started in section 2.5; the defaults are fine here.
server.properties: the Kafka server configuration. Only the most essential settings are introduced here:
1. broker.id: the unique ID of this Kafka server in the cluster. It must be an integer, and each server's id must be unique within the cluster. The default is fine here.
2. listeners: the address and port this Kafka server listens on. If everything runs on the local machine, you do not need to configure this item and the localhost address is used by default. If the server is remote, it must be configured, for example:
listeners=PLAINTEXT://192.168.180.128:9092, and make sure port 9092 on the server is reachable.
3. zookeeper.connect: the address of the zookeeper instance Kafka connects to. Since recent Kafka versions include zookeeper, the default configuration works:
zookeeper.connect=localhost:2181
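Putting the three settings together, a minimal server.properties for a single remote broker might look like the snippet below. The IP address is the example value from above; substitute your own host.

```properties
# unique integer id of this broker within the cluster
broker.id=0
# listen address; required when clients connect from other machines
listeners=PLAINTEXT://192.168.180.128:9092
# zookeeper address (the bundled zookeeper listens on 2181 by default)
zookeeper.connect=localhost:2181
```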
2.4. Running
1. Start zookeeper
Enter the Kafka installation directory (cd) and run
bin/zookeeper-server-start.sh config/zookeeper.properties &
After starting zookeeper successfully, you will see the following output
2. Start kafka
Enter the Kafka installation directory (cd) and run
bin/kafka-server-start.sh config/server.properties
After starting kafka successfully, you will see the following output
2.5. The first message.
2.5.1 create a topic
Kafka manages data of the same kind through topics; using one topic per kind of data makes processing more convenient.
Open the terminal in the kafka decompression directory and type
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Create a topic named test
After creating a topic, you can enter
bin/kafka-topics.sh --list --zookeeper localhost:2181
To view the topic that has been created
2.5.2 create a message consumer
Open a terminal in the Kafka installation directory and enter the following (--from-beginning replays the topic from the start each time; drop the flag if you do not want that):
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
You can create a consumer whose topic is test
After the consumer has been created, no data has been printed here because no data has been sent.
But don't worry, don't close the terminal, open a new terminal, and then let's create the first message producer.
2.5.3 create a message producer
Open a new terminal in the kafka decompression directory and type
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
After the command runs, you are dropped into an input prompt where each line you type is sent as a message.
After sending the message, we can go back to our message consumer terminal and see that the message we just sent has been printed out in the terminal.
Python pseudocode version
Consumer
[root@ip-10-1-2-175 sh]# more cus.py
import time, json
from pykafka import KafkaClient

client = KafkaClient(hosts="10.1.2.175:9092")  # hosts can be a comma-separated list; this is the key point
topic = client.topics['test']  # select a topic
# create a balanced consumer
balanced_consumer = topic.get_balanced_consumer(consumer_group='goods_group', auto_commit_enable=True, zookeeper_connect='localhost:2181')
for message in balanced_consumer:
    print(message)
Producer
[root@ip-10-1-2-175 sh]# more prod.py
import time, json
from pykafka import KafkaClient

def pro():
    client = KafkaClient(hosts="10.1.2.175:9092")
    topic = client.topics['test']  # select a topic
    producer = topic.get_producer()  # create a producer
    goods_dict = {'option_type': 'insert', 'option_obj': {'goods_name': 'goods-1'}}
    goods_json = json.dumps(goods_dict)
    producer.produce(goods_json)  # produce the message (on Python 3, pass bytes: goods_json.encode())
    producer.stop()

if __name__ == '__main__':
    pro()
Start the consumer
[root@ip-10-1-2-175 sh]# python cus.py
Start the producer
[root@ip-10-1-2-175 sh]# python prod.py
View the consumer
[root@ip-10-1-2-175 sh]# python cus.py
That concludes this guide on how to install and use Kafka. I hope the content above was helpful and taught you something new.