What are the steps for Storm installation and deployment

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Many people who are new to Storm are unsure how to install and deploy it. This article walks through the installation and deployment steps, from setting up Zookeeper to submitting a Topology to the cluster.

1. Storm cluster components

There are two types of nodes in a Storm cluster: the master node (Master Node) and the worker nodes (Worker Node). Their roles are as follows:

1. A daemon called Nimbus runs on the master node (Master Node). It is responsible for distributing code within the Storm cluster, assigning tasks to worker machines, and monitoring the running status of the cluster. The role of Nimbus is similar to that of JobTracker in Hadoop.

2. A daemon called Supervisor runs on each worker node (Worker Node). The Supervisor listens for the tasks that Nimbus assigns to its machine and starts or stops the worker processes that execute those tasks. Each worker process executes a subset of a Topology; a running Topology consists of multiple worker processes distributed across different worker nodes.

[Figure: Storm cluster components]

All coordination between the Nimbus and Supervisor nodes goes through the Zookeeper cluster. In addition, both the Nimbus and Supervisor processes are fail-fast and stateless; all state of the Storm cluster is kept either in the Zookeeper cluster or on local disk. This means that you can kill the Nimbus and Supervisor processes with kill -9 and they will continue working after a restart. This design makes the Storm cluster very stable.

2. Install Storm cluster

This section describes in detail how to build a Storm cluster. Here are the installation steps that need to be completed in turn:

1. Set up Zookeeper cluster

2. Install Storm dependent libraries

3. Download and extract the Storm release

4. Modify storm.yaml configuration file

5. Start each background process of Storm.

2.1 Set up Zookeeper cluster

Storm uses Zookeeper to coordinate the cluster. Because Zookeeper is not used for message passing, Storm places very little load on it. In most cases a single-node Zookeeper cluster is sufficient, but a larger Zookeeper cluster may be needed for reliable failover or for a large Storm deployment (the officially recommended minimum for a Zookeeper cluster is 3 nodes). Complete the following installation and deployment steps on each machine in the Zookeeper cluster:

1. Download and install the Java JDK. The official download link is http://java.sun.com/javase/downloads/index.jsp; JDK 6 or later is required.

2. Set the Java heap size according to the load on the Zookeeper cluster, and avoid swapping as far as possible, since swapping degrades Zookeeper performance. As a conservative guideline, a machine with 4 GB of memory can allocate 3 GB of heap to Zookeeper.

3. Download the Zookeeper package, then extract and install it. The official download link is http://hadoop.apache.org/zookeeper/releases.html.

4. Based on the nodes in the Zookeeper cluster, create the Zookeeper configuration file zoo.cfg in the conf directory:

tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

Here dataDir specifies Zookeeper's data file directory. Each server entry has the form server.id=host:port:port, where id is the number of the Zookeeper node and is stored in the myid file under the dataDir directory. zoo1 through zoo3 are the hostnames of the Zookeeper nodes. The first port is used to connect to the leader, and the second port is used for leader election.

5. Create a myid file in the dataDir directory. It contains a single line holding the id from that node's server.id entry.

6. Start the Zookeeper service, either by invoking the JVM directly or through the bundled script:

java -cp zookeeper.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg

bin/zkServer.sh start

7. Test the availability of the service with the Zookeeper client, again either directly or through the bundled script:

java -cp zookeeper.jar:src/java/lib/log4j-1.2.15.jar:conf:src/java/lib/jline-0.9.94.jar org.apache.zookeeper.ZooKeeperMain -server 127.0.0.1:2181

bin/zkCli.sh -server 127.0.0.1:2181
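Steps 4 and 5 have to be repeated on every Zookeeper node, with only the id changing, so they are easy to script. The following is a minimal sketch rather than an official tool: write_zk_config is a hypothetical helper name, and the fixed values (tickTime, ports 2181, 2888, and 3888) are simply copied from the sample zoo.cfg above.

```shell
#!/bin/sh
# Sketch only: write_zk_config is a hypothetical helper, not part of Zookeeper.
# Usage: write_zk_config CONF_DIR DATA_DIR MY_ID HOST1 [HOST2 ...]
write_zk_config() {
    conf_dir=$1; data_dir=$2; my_id=$3
    shift 3

    # MY_ID must be the position of this node in the host list (1-based).
    if ! [ "$my_id" -ge 1 ] 2>/dev/null || ! [ "$my_id" -le "$#" ] 2>/dev/null; then
        echo "MY_ID must be a number between 1 and $#" >&2
        return 1
    fi

    mkdir -p "$conf_dir" "$data_dir" || return 1

    {
        printf 'tickTime=2000\n'
        printf 'dataDir=%s\n' "$data_dir"
        printf 'clientPort=2181\n'
        printf 'initLimit=5\n'
        printf 'syncLimit=2\n'
        id=1
        for host in "$@"; do
            # First port: connections to the leader. Second port: leader election.
            printf 'server.%d=%s:2888:3888\n' "$id" "$host"
            id=$((id + 1))
        done
    } > "$conf_dir/zoo.cfg"

    # myid holds exactly one line: the id of this node.
    printf '%s\n' "$my_id" > "$data_dir/myid"
}
```

On the second node of the sample cluster, for instance, the call would be write_zk_config conf /var/zookeeper 2 zoo1 zoo2 zoo3. The host list must be given in the same order on every node so that the ids agree.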

Note:

Because Zookeeper is fail-fast and its process exits whenever an error occurs, it is best to run Zookeeper under a monitoring program that restarts it automatically after it exits.

While Zookeeper runs, it generates many log and snapshot files in the dataDir directory, and the Zookeeper process itself does not clean up or merge them, so they can consume a lot of disk space. Clean up obsolete log and snapshot files regularly, for example with cron. The command format is:

java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>
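One way to schedule the cleanup is a daily cron entry. The sketch below only prints a candidate crontab line; it does not install anything. purge_cron_entry is a hypothetical helper, the 03:00 schedule is an arbitrary choice, and the classpath is copied from the command above, so the jar names may need adjusting for your Zookeeper release. PurgeTxnLog takes the data directory, the snapshot directory, and the number of snapshots to keep; the minimum of 3 enforced here is a conservative assumption of this sketch.

```shell
#!/bin/sh
# Sketch only: purge_cron_entry is a hypothetical helper that prints a crontab line.
# Usage: purge_cron_entry ZK_HOME DATA_DIR SNAP_DIR KEEP_COUNT
purge_cron_entry() {
    zk_home=$1; data_dir=$2; snap_dir=$3; keep=$4

    # Refuse counts below 3 (and non-numeric input) so recent snapshots survive.
    if ! [ "$keep" -ge 3 ] 2>/dev/null; then
        echo "KEEP_COUNT must be a number >= 3" >&2
        return 1
    fi

    # Run every day at 03:00 from the Zookeeper install directory.
    printf '0 3 * * * cd %s && java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog %s %s -n %s\n' \
        "$zk_home" "$data_dir" "$snap_dir" "$keep"
}
```

The printed line can be reviewed and then added with crontab -e for the user that runs Zookeeper.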

2.2 Install Storm dependent libraries

Next, you need to install Storm's dependent libraries on Nimbus and Supervisor machines, as follows:

1. ZeroMQ 2.1.7. Do not use version 2.1.10, because serious bugs in that version can cause strange problems while the Storm cluster is running. A small number of users encounter an "IllegalArgumentException" with version 2.1.7, which can be fixed by downgrading to version 2.1.4.

2. JZMQ

3. Java 6

4. Python 2.6.6

5. Unzip

The versions of the above dependent libraries have been tested by Storm, and Storm is not guaranteed to run under other versions of Java or Python libraries.
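Before installing anything, it can help to see which of these tools are already on a machine. The sketch below is a hypothetical preflight helper (check_deps is not part of Storm). It only checks that each command is on PATH; it does not verify versions, and it cannot detect the native ZeroMQ and JZMQ libraries.

```shell
#!/bin/sh
# Sketch only: check_deps is a hypothetical helper, not part of Storm.
# Prints one status line per command and returns non-zero if any is missing.
check_deps() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf 'ok       %s\n' "$cmd"
        else
            printf 'MISSING  %s\n' "$cmd"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

A typical call on a Nimbus or Supervisor machine would be check_deps java javac python unzip, followed by java -version and python -V to confirm the versions by hand.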

2.2.1 Install ZMQ 2.1.7

Download ZMQ 2.1.7, then compile and install it.

Note:

If the installation reports that uuid cannot be found, install the uuid library through the following packages:

sudo yum install e2fsprogs

sudo yum install e2fsprogs-devel

2.2.2 Install JZMQ

Download, compile, and install JZMQ:

git clone https://github.com/nathanmarz/jzmq.git

cd jzmq

./autogen.sh

./configure

make

sudo make install

For JZMQ to work properly, you may need to do one or more of the following:

Set the JAVA_HOME environment variable correctly

Install the Java development package

Upgrade autoconf

If you are on Mac OS X, refer to here

Note:

If the ./configure command fails, refer to here.

2.2.3 Install Java 6

1. Download and install JDK 6.

2. Set the JAVA_HOME environment variable.

3. Run the java and javac commands to confirm that Java is installed correctly.

2.2.4 Install Python 2.6.6

1. Download Python 2.6.6:

wget http://www.python.org/ftp/python/2.6.6/Python-2.6.6.tar.bz2

2. Compile and install Python 2.6.6:

tar -jxvf Python-2.6.6.tar.bz2

cd Python-2.6.6

./configure

make

make install

3. Test Python 2.6.6:

python -V

Python 2.6.6

2.2.5 Install unzip

1. On a RedHat-family Linux system, run the following command to install unzip:

yum install unzip

2. On a Debian-family Linux system, run the following command to install unzip:

apt-get install unzip

2.3 Download and extract the Storm release

Next, you need to install the Storm distribution on Nimbus and Supervisor machines.

1. Download the Storm distribution; Storm 0.8.1 is recommended:

wget https://github.com/downloads/nathanmarz/storm/storm-0.8.1.zip

2. Unzip it into the installation directory:

unzip storm-0.8.1.zip

2.4 Modify the storm.yaml configuration file

The unzipped Storm distribution contains a conf/storm.yaml file, which is used to configure Storm. The defaults are defined in defaults.yaml, and any option set in conf/storm.yaml overrides its default. The following options must be set in conf/storm.yaml:

1) storm.zookeeper.servers: the addresses of the Zookeeper cluster used by the Storm cluster, in the following format:

storm.zookeeper.servers:
  - "111.222.333.444"
  - "555.666.777.888"

If the Zookeeper cluster does not use the default port, then the storm.zookeeper.port option is also required.

2) storm.local.dir: the local disk directory where the Nimbus and Supervisor processes store a small amount of state, such as jars and confs. Create this directory in advance and grant it sufficient access permissions, then set it in storm.yaml, for example:

storm.local.dir: "/home/admin/storm/workdir"

3) java.library.path: the load path for the native libraries (ZMQ and JZMQ) used by Storm. The default is "/usr/local/lib:/opt/local/lib:/usr/lib". ZMQ and JZMQ are normally installed under /usr/local/lib, so this option usually does not need to be set.

4) nimbus.host: the address of the Nimbus machine in the Storm cluster. Each Supervisor worker node needs to know which machine is Nimbus so that it can download the jars, confs, and other files of Topologies, for example:

nimbus.host: "111.222.333.444"

5) supervisor.slots.ports: for each Supervisor worker node, the number of workers that node can run. Each worker uses a separate port to receive messages, and this option defines which ports workers may use. By default each node can run four workers, on ports 6700, 6701, 6702, and 6703:

supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703
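Since the same storm.yaml is needed on Nimbus and on every Supervisor machine, generating it from a few parameters helps keep the nodes consistent. The following is a minimal sketch: render_storm_yaml is a hypothetical helper, and it emits only options 1, 2, 4, and 5 described above (java.library.path is left at its default).

```shell
#!/bin/sh
# Sketch only: render_storm_yaml is a hypothetical helper, not part of Storm.
# Usage: render_storm_yaml NIMBUS_HOST LOCAL_DIR "ZK1 ZK2 ..." "PORT1 PORT2 ..."
# The two list arguments are space-separated strings.
render_storm_yaml() {
    nimbus_host=$1; local_dir=$2; zk_servers=$3; slot_ports=$4

    printf 'storm.zookeeper.servers:\n'
    # Intentionally unquoted: split the list on whitespace.
    for zk in $zk_servers; do
        printf '  - "%s"\n' "$zk"
    done

    printf 'storm.local.dir: "%s"\n' "$local_dir"
    printf 'nimbus.host: "%s"\n' "$nimbus_host"

    printf 'supervisor.slots.ports:\n'
    # One worker slot per port.
    for port in $slot_ports; do
        printf '  - %s\n' "$port"
    done
}
```

For example, render_storm_yaml nimbus1 /home/admin/storm/workdir "zk1 zk2" "6700 6701 6702 6703" > conf/storm.yaml writes a four-slot configuration; the host names nimbus1, zk1, and zk2 are placeholders.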

2.5 Start each background process of Storm

The last step is to start all of Storm's background processes. Like Zookeeper, Storm is a fail-fast system, so its processes can be stopped at any time and resume correctly when restarted. This works because Storm keeps no state inside its processes; even if Nimbus or the Supervisors are restarted, running Topologies are not affected.

Here is how to start each background process in Storm:

Nimbus: run "bin/storm nimbus >/dev/null 2>&1 &" on the Storm master node to start the Nimbus daemon in the background.

Supervisor: run "bin/storm supervisor >/dev/null 2>&1 &" on each Storm worker node to start the Supervisor daemon in the background.

UI: run "bin/storm ui >/dev/null 2>&1 &" on the Storm master node to start the UI daemon in the background. Once it is running, open http://{nimbus host}:8080 to view worker resource usage across the cluster, the running status of Topologies, and other information.

Note:

When you start the Storm background process, you need to have write access to the storm.local.dir directory set in the conf/storm.yaml configuration file.

After the Storm background process is started, log files for each process are generated in the logs/ subdirectory under the Storm installation and deployment directory.

Testing shows that the Storm UI must be deployed on the same machine as Storm Nimbus; otherwise the UI does not work properly, because the UI process looks for a Nimbus connection on the local machine.

For convenience, add the directory containing bin/storm to the system PATH environment variable.
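The first note above is a common cause of daemons that exit right after startup, and because output is redirected to /dev/null the failure is easy to miss. The sketch below checks the directory before launching anything. Both function names are hypothetical, and the parser is deliberately simple: it assumes storm.local.dir is written on a single line, as in the example in section 2.4.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of Storm.

# Print the value of storm.local.dir from a storm.yaml file
# (handles the simple one-line form, with or without double quotes).
local_dir_from_yaml() {
    sed -n 's/^storm\.local\.dir:[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Succeed (and print the directory) only if it exists and is writable.
check_local_dir() {
    dir=$(local_dir_from_yaml "$1")
    if [ -z "$dir" ]; then
        echo "storm.local.dir is not set in $1" >&2
        return 1
    fi
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "storm.local.dir $dir is missing or not writable" >&2
        return 1
    fi
    echo "$dir"
}
```

Running check_local_dir conf/storm.yaml as the user that will own the daemons, before the bin/storm commands above, surfaces permission problems while the error is still visible on the terminal.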

At this point, the Storm cluster has been deployed and configured, and the topology can be submitted to the cluster for operation.

3. Submit tasks to the cluster

1. Start Storm Topology:

storm jar allmycode.jar org.me.MyTopology arg1 arg2 arg3

Here allmycode.jar is the jar package containing the Topology implementation, the main method of org.me.MyTopology is the Topology's entry point, and arg1, arg2, and arg3 are the arguments passed to org.me.MyTopology when it runs.

2. Stop Storm Topology:

storm kill {toponame}

Where {toponame} is the name of the Topology task specified when Topology is submitted to the Storm cluster.
