
What is the programming model of Storm in Java?


This article looks at the programming model of Storm in Java from a practical point of view: how a topology is structured, submitted, executed, and tuned. Hopefully you will take something useful away from it.

Overall approach

Storm programming is very similar to Hadoop MapReduce programming. Hadoop MapReduce requires you to implement a map function, a reduce function, and a main (driver) class; Storm requires you to implement a Spout, a Bolt, and a main function. Storm programming therefore takes three steps (a minimal sketch follows the list):

Create a Spout to read the data

Create a Bolt to process the data

Create a main class that builds a topology and a cluster object, and submits the topology to the cluster
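As a concrete illustration, below is a minimal sketch of these three pieces. It assumes the org.apache.storm packages (older releases use backtype.storm instead), and all class, field, and stream names are illustrative only:

import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

// Step 1: a Spout that reads (here: fabricates) data and emits it as a Stream.
class SentenceSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;                 // called once per executor thread
    }

    @Override
    public void nextTuple() {
        collector.emit(new Values("hello storm"));  // called over and over by the framework
        Utils.sleep(100);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("sentence"));
    }
}

// Step 2: a Bolt that processes each Tuple it receives.
class PrintBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        System.out.println(input.getStringByField("sentence"));
        collector.ack(input);                       // mark the Tuple as processed
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) { }
}

// Step 3: a main class that builds the topology and submits it (local mode here).
public class WordTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new SentenceSpout());
        builder.setBolt("print", new PrintBolt()).shuffleGrouping("spout");

        Config conf = new Config();
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("word-topology", conf, builder.createTopology());
    }
}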

Topology run modes

A Topology can run in local mode or distributed mode, and the mode can be set either in the configuration file or in code. In local mode nothing needs to be installed; the Storm jar dependencies are enough.

(1) Submitting in local mode:

LocalCluster cluster = new LocalCluster();
cluster.submitTopology(topologyName, conf, topology);
cluster.killTopology(topologyName);
cluster.shutdown();

(2) Submitting in distributed mode:

StormSubmitter.submitTopology(topologyName, topologyConfig, builder.createTopology());
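A common pattern (used, for example, by the storm-starter samples) is to pick the mode from the command-line arguments: submit to the cluster when a topology name is given, otherwise run locally for a while and then shut down. A sketch, reusing the illustrative SentenceSpout and PrintBolt classes from the sketch above:

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.StormTopology;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.utils.Utils;

public class SubmitMain {
    public static void main(String[] args) throws Exception {
        StormTopology topology = buildTopology();
        Config conf = new Config();

        if (args != null && args.length > 0) {
            // distributed mode: topology name taken from the command line,
            // submitted to the Nimbus configured in storm.yaml
            conf.setNumWorkers(2);
            StormSubmitter.submitTopology(args[0], conf, topology);
        } else {
            // local mode: run in-process for 30 seconds, then shut down
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("test-topology", conf, topology);
            Utils.sleep(30000);
            cluster.killTopology("test-topology");
            cluster.shutdown();
        }
    }

    // wire the spouts and bolts together as in the earlier sketch
    private static StormTopology buildTopology() {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new SentenceSpout());
        builder.setBolt("print", new PrintBolt()).shuffleGrouping("spout");
        return builder.createTopology();
    }
}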

Note that once the Storm code is written, it must be packaged as a jar and submitted on the Nimbus node. When packaging, do not bundle every dependency: if the Storm jar itself is included, duplicate-configuration-file errors occur at run time and the Topology fails to start, because the local storm.yaml configuration file is loaded before the Topology runs.

The command to run on Nimbus is as follows:

storm jar StormTopology.jar mainclass args

Topology running process

There are several points that need to be explained:

(1) After a Topology is submitted, its code is first stored in the inbox directory of the Nimbus node. A stormconf.ser file, generated from the Topology's running configuration, is then placed in the stormdist directory of the Nimbus node, together with the serialized Topology code file.

(2) When declaring the Spouts and Bolts of a Topology, you can set the number of executors and the number of tasks of each Spout and Bolt at the same time. By default, the total number of tasks in a Topology equals the total number of executors. The system then distributes these tasks evenly across the worker processes for execution; which Supervisor node each worker runs on is decided by Storm itself.

(3) After the tasks have been assigned, the Nimbus node submits the task information to the ZooKeeper cluster. The ZooKeeper cluster also contains a workerbeats node, which stores the heartbeat information of every worker process in the current Topology.

(4) The Supervisor nodes constantly poll the ZooKeeper cluster. All of the Topology's task assignments, code storage directories, relationships between tasks and so on are stored in ZooKeeper's assignments node; each Supervisor polls this node to obtain its own tasks and starts worker processes to run them.

(5) Once a Topology is running, its Spouts continuously emit Streams and its Bolts continuously process the Streams they receive; a Stream is unbounded.

This last step continues without interruption unless the Topology is terminated manually.

Topology method call process

Several points about the method-call sequence for Stream processing in a Topology need explaining:

(1) The constructor and the declareOutputFields method of each component (Spout or Bolt) are called only once.

(2) The open method and the prepare method are called multiple times. The parallelism parameter passed to setSpout or setBolt in the entry function is the number of executors, i.e. the number of threads responsible for running the component's tasks, and it is also the number of times these two methods are called: once per executor at startup. They are the equivalent of a thread's constructor.

(3) The nextTuple method and the execute method run all the time: nextTuple keeps emitting Tuples, and the Bolt's execute keeps receiving and processing them. Only by running continuously like this can the unbounded Tuple stream be produced and real-time processing achieved. They are the equivalent of a thread's run method.

(4) After a topology is submitted, Storm creates the Spout/Bolt instances and serializes them. The serialized components are then sent to the machines where the tasks are located (that is, the Supervisor nodes) and deserialized once per task.

(5) Communication between Spouts and Bolts, and between Bolts, is implemented through ZeroMQ message queues (newer Storm versions use Netty for this transport).

(6) The ack method and the fail method are not listed above. After a Tuple has been processed successfully, the ack method must be called to mark success; otherwise the fail method marks failure and the Tuple is reprocessed.
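To tie these points together, here is a sketch of a Bolt annotated with where each lifecycle method fits. The class and field names are illustrative, the input field "sentence" is assumed to come from an upstream Spout like the one sketched earlier, and emitted Tuples are anchored to the input so that a fail leads to a replay:

import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// Illustrative Bolt showing the lifecycle points listed above.
class SplitSentenceBolt extends BaseRichBolt {
    private OutputCollector collector;

    // Called once per executor thread, after deserialization on the Supervisor node.
    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    // Called continuously for every incoming Tuple, like a thread's run loop.
    @Override
    public void execute(Tuple input) {
        try {
            for (String word : input.getStringByField("sentence").split(" ")) {
                // anchor the emitted Tuple to the input so a failure is replayed
                collector.emit(input, new Values(word));
            }
            collector.ack(input);   // mark the Tuple as successfully processed
        } catch (Exception e) {
            collector.fail(input);  // mark failure so the Tuple is reprocessed
        }
    }

    // Called only once, like the constructor.
    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}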

Topology parallelism

Several concepts in the execution units of a Topology relate to parallelism.

(1) Worker: each worker belongs to a specific Topology, each Supervisor node can run multiple workers, and each worker uses a separate port. A worker runs one or more executor threads for the components of its Topology and provides the environment in which tasks run.

(2) Executor: an executor is a thread created inside a worker process; it executes one or more tasks of the same component.

(3) Task: the actual data processing is done by tasks. Over the life cycle of a Topology, the number of tasks of each component does not change, but the number of executors can. The number of executors is less than or equal to the number of tasks; by default the two are equal.

When running a Topology you can set different numbers of workers, executors, and tasks as the situation requires, and the settings can be made in several places (see the sketch after this list).

(1) worker settings:

(1.1) set the topology.workers property in the storm.yaml configuration file

(1.2) set it in code through the setNumWorkers method of Config

(2) executor settings:

Specify the last (parallelism) parameter of the setBolt and setSpout methods in the Topology's entry class; if it is not specified, the default is 1.

(3) task settings:

(3.1) by default, the number of tasks equals the number of executors

(3.2) set the number of tasks of a specific component in code through the setNumTasks method on the declarer returned by TopologyBuilder's setSpout/setBolt
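A sketch of how the three settings can be combined in code, reusing the illustrative SentenceSpout and SplitSentenceBolt classes from the earlier sketches (the numbers are arbitrary):

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class ParallelismExample {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // executors: the last argument of setSpout/setBolt is the parallelism hint
        builder.setSpout("sentences", new SentenceSpout(), 2);   // 2 executors

        // tasks: setNumTasks on the declarer returned by setBolt
        builder.setBolt("split", new SplitSentenceBolt(), 4)     // 4 executors...
               .setNumTasks(8)                                   // ...running 8 tasks (2 per executor)
               .shuffleGrouping("sentences");

        // workers: equivalent to topology.workers in storm.yaml
        Config conf = new Config();
        conf.setNumWorkers(2);

        StormSubmitter.submitTopology("parallelism-example", conf, builder.createTopology());
    }
}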

Terminating a Topology

Terminate a running Topology with the following command on the Nimbus node:

storm kill topologyName

After the kill, you can watch the topology's status in the UI: it first becomes KILLED, and once the local directories and the Topology's information in the ZooKeeper cluster have been cleaned up, the Topology disappears completely.

Topology tracking

After a Topology has been submitted, it can be viewed in the web UI of the Nimbus node; the default address is http://NimbusIp:8080.

That is the Java programming model of Storm. If you have had similar questions, the analysis above may help you understand it; if you want to know more, you are welcome to follow the industry information channel.
