Installing Kafka and Storm cluster environments

2025-02-01 Update From: SLTechnology News&Howtos


Preface

The Storm and kafka cluster installations are not necessarily related; I write them together because both are managed by zookeeper and both depend on the JDK environment, so covering them together avoids repeating the shared configuration. If you only need one of them, just read the part you need.

The dependencies of the two are as follows:

Storm cluster: JDK 1.8, Zookeeper 3.4, Storm 1.1.1; Kafka cluster: JDK 1.8, Zookeeper 3.4, Kafka 2.12

Note: Storm 1.0 and Kafka 2.0 require JDK 1.7 or above and Zookeeper 3.0 or above.

Download address:

Zookeeper: https://zookeeper.apache.org/releases.html#download

Storm: http://storm.apache.org/downloads.html

Kafka: http://kafka.apache.org/downloads

JDK installation

JDK must be installed on every machine!

Note: CentOS generally ships with OpenJDK, but here we use Oracle's JDK. So first uninstall OpenJDK, then install the downloaded Oracle JDK. If OpenJDK has already been uninstalled, skip this step.

First enter:

java -version

to check whether a JDK is already installed; if one is installed but the version is not suitable, uninstall it.

Enter:

rpm -qa | grep java

to view the installed java packages.

Then enter:

rpm -e --nodeps "the JDK package you want to uninstall"

For example: rpm -e --nodeps java-1.7.0-openjdk-1.7.0.99-2.6.5.1.el6.x86_64

After confirming that it has been removed, extract the downloaded JDK:

tar -xvf jdk-8u144-linux-x64.tar.gz

Move it to the /opt/java folder (create it if it does not exist), and rename the folder to jdk1.8:

mv jdk1.8.0_144 /opt/java
cd /opt/java
mv jdk1.8.0_144 jdk1.8

Then edit the profile file and add the following configuration

Enter:

vim /etc/profile

Add:

export JAVA_HOME=/opt/java/jdk1.8
export JRE_HOME=/opt/java/jdk1.8/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=.:${JAVA_HOME}/bin:$PATH

After the addition is successful, enter

source /etc/profile
java -version

Check to see if the configuration is successful

Zookeeper environment installation

1. File preparation

Extract the downloaded Zookeeper archive

On linux, enter:

tar -xvf zookeeper-3.4.10.tar.gz

Then move it to /opt/zookeeper (create it if it does not exist), and rename the folder to zookeeper3.4.

Input

mv zookeeper-3.4.10 /opt/zookeeper
cd /opt/zookeeper
mv zookeeper-3.4.10 zookeeper3.4

2. Environment configuration

Edit the /etc/profile file

Enter:

export ZK_HOME=/opt/zookeeper/zookeeper3.4
export PATH=.:${JAVA_HOME}/bin:${ZK_HOME}/bin:$PATH

Enter:

source /etc/profile

Make the configuration effective

3. Modify the configuration file

3.1 Create files and directories

Create these directories on every server in the cluster:

mkdir /opt/zookeeper/data
mkdir /opt/zookeeper/dataLog

And create a myid file in the /opt/zookeeper/data directory

Enter:

touch myid

After creating it, edit the myid file.

For convenience, I set the contents of the myid files on master, slave1, and slave2 to 1, 2, and 3 respectively.
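The per-host myid assignment above can be sketched as a small script. This is a minimal sketch assuming the host names master/slave1/slave2 from this article; it writes the files into a hypothetical local staging directory rather than onto the real machines, which you would then copy out with scp.

```shell
#!/bin/sh
# Sketch only: stage one myid file per node in a temporary directory.
# The server.N entries in zoo.cfg must match these values.
set -e
staging=$(mktemp -d)
i=1
for host in master slave1 slave2; do
    mkdir -p "$staging/$host"
    echo "$i" > "$staging/$host/myid"
    i=$((i + 1))
done
cat "$staging/slave2/myid"   # prints 3
```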

3.2 Create zoo.cfg

Change to the /opt/zookeeper/zookeeper3.4/conf directory

If there is no zoo.cfg file, copy the zoo_sample.cfg file and rename the copy to zoo.cfg.

Modify the newly created zoo.cfg file

dataDir=/opt/zookeeper/data
dataLogDir=/opt/zookeeper/dataLog
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888

Description: clientPort is, as the name implies, the TCP port through which clients connect to the zookeeper service. dataLogDir holds the sequential transaction log (WAL), while dataDir holds snapshots of the in-memory data structure, which makes fast recovery easy. To maximize performance, it is generally recommended to put dataDir and dataLogDir on different disks, so you can take full advantage of sequential disk writes. The dataDir and dataLogDir directories need to be created by you, and you can choose the paths yourself. The 1 in server.1 must correspond to the value in the myid file in the dataDir directory on the master machine; the 2 in server.2 to the myid value on slave1; and the 3 in server.3 to the myid value on slave2. You can use any values as long as they correspond. The port numbers 2888 and 3888 can also be chosen freely; using the same ports on different machines is not a problem.

1. tickTime: client-server heartbeat interval

The interval at which heartbeats are exchanged between Zookeeper servers, and between clients and servers; one heartbeat is sent every tickTime. tickTime is in milliseconds.

tickTime=2000

2. initLimit: leader-follower initial connection time limit

The maximum number of heartbeats (in tickTime units) that can be tolerated while a follower server (F) in the cluster initially connects to the leader server (L).

initLimit=10

3. syncLimit: leader-follower synchronization time limit

The maximum number of heartbeats (in tickTime units) that can be tolerated between a request and a response between a follower server and the leader server in the cluster.

syncLimit=5
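Putting the settings above together, a complete zoo.cfg for this three-node cluster can be assembled and sanity-checked with a short script. This is a sketch using the values from this article; clientPort=2181 is an assumed line (2181 is the zookeeper default, and the kafka section later connects to port 2181).

```shell
#!/bin/sh
# Sketch: write the assembled zoo.cfg to a temporary location and sanity-check it.
set -e
conf=$(mktemp -d)/zoo.cfg
cat > "$conf" <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
dataLogDir=/opt/zookeeper/dataLog
clientPort=2181
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
EOF
grep -c '^server\.' "$conf"   # prints 3 (one server.N line per node)
```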

Then copy zookeeper to the other machines as well; remember to change the myid under /opt/zookeeper/data, which must be different on each machine.

Enter:

scp -r /opt/zookeeper root@slave1:/opt
scp -r /opt/zookeeper root@slave2:/opt

4. Start zookeeper

Because zookeeper elects its own leader, the leader/follower roles are not explicitly assigned as in hadoop; the election mechanism is described in the official documentation.

After successfully configuring zookeeper, start zookeeper on each machine.

Change to the zookeeper directory

cd /opt/zookeeper/zookeeper3.4/bin

Enter:

zkServer.sh start

After it starts successfully, view the status. Enter:

zkServer.sh status

You can see which machine is the zookeeper leader and which machines are followers.

Storm environment installation

1. File preparation

Extract the downloaded storm archive

On linux, enter:

tar -xvf apache-storm-1.1.1.tar.gz

Then move it to /opt/storm (create it if it does not exist), and rename the folder to storm1.1.

Input

mv apache-storm-1.1.1 /opt/storm
cd /opt/storm
mv apache-storm-1.1.1 storm1.1

2. Environment configuration

Edit the /etc/profile file

Add:

export STORM_HOME=/opt/storm/storm1.1
export PATH=.:${JAVA_HOME}/bin:${ZK_HOME}/bin:${STORM_HOME}/bin:$PATH

Enter source /etc/profile to apply the change, then storm version to view the version information.

3. Modify the configuration file

Edit storm/conf/storm.yaml.

Make the following edits:

Enter:

vim storm.yaml

storm.zookeeper.servers:
    - "master"
    - "slave1"
    - "slave2"
storm.local.dir: "/root/storm"
nimbus.seeds: ["master"]
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703

Description:

1. storm.zookeeper.servers specifies the zookeeper service addresses.

Because storm stores its state in zookeeper, the zookeeper service addresses must be configured. If zookeeper runs on a single machine, specify only that one!

2. storm.local.dir specifies the storage directory.

The Nimbus and Supervisor daemons need a directory on the local disk to store a small amount of state (such as jars, confs, etc.). Create it on each machine and grant the appropriate permissions.

3. nimbus.seeds specifies the candidate hosts.

Workers need to know which machines are nimbus candidates (the leader is elected through the zookeeper cluster) so that topology jars and confs can be downloaded.

4. supervisor.slots.ports specifies the worker ports.

For each supervisor machine, this configures how many workers run on it. Each worker uses a separate port to receive messages, and this setting also defines which ports may be used. Defining 5 ports here means at most 5 workers can run on this supervisor node; defining 3 means at most 3. By default (i.e., as configured in defaults.yaml), there are four workers, on ports 6700, 6701, 6702, and 6703.

A supervisor does not launch all of these workers as soon as it starts. It launches workers only when it receives assigned tasks, and the number launched depends on how many workers our Topology needs on this supervisor. If a Topology is executed by only one worker, the supervisor starts one worker, not all of them.

Note: these configuration keys must not have leading spaces! Otherwise an error will be reported. Host names (mapped in /etc/hosts) are used here, but you can also use IPs; in practice it depends on your own setup.
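The storm.yaml above can be generated and sanity-checked with a short script. A minimal sketch using this article's host names; the check only enforces the no-leading-spaces rule for top-level keys mentioned in the note, not full yaml validation.

```shell
#!/bin/sh
# Sketch: write the storm.yaml from this article and verify that the
# top-level keys start in column 1 (storm rejects leading spaces on them).
set -e
conf=$(mktemp -d)/storm.yaml
cat > "$conf" <<'EOF'
storm.zookeeper.servers:
    - "master"
    - "slave1"
    - "slave2"
storm.local.dir: "/root/storm"
nimbus.seeds: ["master"]
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
EOF
grep -c '^[a-z]' "$conf"   # prints 4 (the four top-level keys)
```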

You can use the scp command or ftp software to copy storm to another machine

After successful configuration, you can start Storm, but make sure that JDK and Zookeeper are installed correctly and that Zookeeper is started successfully.

4. Start Storm

Change to the storm/bin directory

Start it on the primary node (master). Enter:

storm nimbus >/dev/null 2>&1 &

To serve the web interface, enter on master:

storm ui >/dev/null 2>&1 &

On the slave nodes (slave1, slave2), enter:

storm supervisor >/dev/null 2>&1 &

Then open port 8080 on master in a browser.

If the interface opens successfully, the environment is configured correctly.

Kafka environment installation

1. File preparation

Extract the downloaded Kafka archive

On linux, enter:

tar -xvf kafka_2.12-1.0.0.tgz

Then move it to /opt/kafka (create it if it does not exist), and rename the folder to kafka2.12.

Input

mv kafka_2.12-1.0.0 /opt/kafka
cd /opt/kafka
mv kafka_2.12-1.0.0 kafka2.12

2. Environment configuration

Edit the /etc/profile file

Enter:

export KAFKA_HOME=/opt/kafka/kafka2.12
export PATH=.:${JAVA_HOME}/bin:${KAFKA_HOME}/bin:${ZK_HOME}/bin:$PATH

Enter:

source /etc/profile

Make the configuration effective

3. Modify the configuration file

Note: on a single machine, kafka's configuration file can actually be left unmodified and kafka started directly from the bin directory. But since we have a cluster here, a few changes are needed.

Change to the kafka/config directory

Edit the server.properties file

What needs to be changed is the address of Zookeeper:

Find the configuration of Zookeeper, specify the address of the Zookeeper cluster, and modify the settings as follows

zookeeper.connect=master:2181,slave1:2181,slave2:2181
zookeeper.connection.timeout.ms=6000

Other settings that can be changed are:

1. num.partitions: the default number of partitions per topic. The default is 1.

2. log.dirs: the path where kafka stores its log data; change it according to your needs.

3. broker.id: a non-negative integer that uniquely identifies the broker; it must be different on each machine.

...

Note: there are other configurations, you can check the official documentation, if there is no special request, use the default.
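Since broker.id must differ on every machine while the rest of server.properties stays the same, the per-broker files can be stamped out with a loop. A hypothetical sketch using this article's host names and one possible id assignment (0, 1, 2); the log.dirs path here is an assumed example, not a value from the article.

```shell
#!/bin/sh
# Sketch: generate one server.properties per broker, varying only broker.id.
set -e
out=$(mktemp -d)
id=0
for host in master slave1 slave2; do
    cat > "$out/server-$host.properties" <<EOF
broker.id=$id
log.dirs=/opt/kafka/kafka-logs
num.partitions=1
zookeeper.connect=master:2181,slave1:2181,slave2:2181
zookeeper.connection.timeout.ms=6000
EOF
    id=$((id + 1))
done
grep '^broker.id' "$out/server-slave2.properties"   # prints broker.id=2
```

You would then scp each file to its host as /opt/kafka/kafka2.12/config/server.properties.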

Once configured, remember to use the scp command to copy kafka to the other machines in the cluster, and change each machine's server.properties file!

4. Start kafka

This must be done on every machine in the cluster!

Change to the kafka/bin directory

Enter:

kafka-server-start.sh ../config/server.properties > /dev/null 2>&1 &

Then enter the jps command to see whether it started successfully:

After successful startup, you can do a simple test

First create a topic

Enter:

kafka-topics.sh --zookeeper master:2181 --create --topic t_test --partitions 5 --replication-factor 2

Note: this creates a topic named t_test with five partitions, each with two replicas. If no partition count is specified, the default from the configuration file is used.

Then produce some data.

Enter:

kafka-console-producer.sh --broker-list master:9092 --topic t_test

Press Ctrl+D to exit.

Then open another terminal (xshell) window and consume the data.

Enter:

kafka-console-consumer.sh --zookeeper master:2181 --topic t_test --from-beginning

Press Ctrl+C to exit.

You can see that the data has been consumed normally.

5. Some common Kafka commands

1. Start and shut down kafka

bin/kafka-server-start.sh config/server.properties >>/dev/null 2>&1 &
bin/kafka-server-stop.sh

2. View the message queues and a specific queue in the kafka cluster

View all topics in the cluster:

kafka-topics.sh --zookeeper master:2181,slave1:2181,slave2:2181 --list

View the information of a topic

kafka-topics.sh --zookeeper master:2181 --describe --topic t_test

3. Create Topic

kafka-topics.sh --zookeeper master:2181 --create --topic t_test --partitions 5 --replication-factor 2

4. Produce and consume data

kafka-console-producer.sh --broker-list master:9092 --topic t_test

(press Ctrl+D to exit)

kafka-console-consumer.sh --zookeeper master:2181 --topic t_test --from-beginning

(press Ctrl+C to exit)

5. Delete a topic

kafka-topics.sh --delete --zookeeper master:2181 --topic t_test

6. Add partitions

kafka-topics.sh --alter --topic t_test --zookeeper master:2181 --partitions 10

Other

For more information on the construction of Storm environment, please see the official documentation:

http://storm.apache.org/releases/1.1.1/Setting-up-a-Storm-cluster.html

For more information on the construction of Kafka environment, please see the official documentation:

http://kafka.apache.org/quickstart

This is the end of this article, thank you for reading!
