2025-03-09 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains parallelism in Storm in detail, with examples. It is shared here as a practical reference.
1 Basic concepts of Storm parallelism
What makes up a running topology: worker processes, executors, and tasks
In a Storm cluster, a topology is run by the following three kinds of entities:
Worker processes (number of processes)
Executors (number of threads)
Tasks (number of component instances)
A machine in a Storm cluster can run one or more workers, which may belong to one or more topologies. Each worker process runs one or more executor threads, and each worker belongs to exactly one topology. An executor is a single thread. Each executor runs one or more tasks of the same component (spout or bolt). A task performs the actual data processing.
The following is a simple illustration of their relationship.
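The worker / executor / task hierarchy described above can also be sketched as a small toy model. This is only an illustration under stated assumptions: the even, contiguous split and the names `split_evenly`, `layout`, and `green-bolt` are hypothetical and are not Storm's scheduler or API.

```python
def split_evenly(items, buckets):
    """Split items into `buckets` contiguous groups whose sizes differ by at most one."""
    base, extra = divmod(len(items), buckets)
    groups, start = [], 0
    for i in range(buckets):
        size = base + (1 if i < extra else 0)
        groups.append(items[start:start + size])
        start += size
    return groups


def layout(component, num_tasks, num_executors, num_workers):
    """Toy model: map one component's tasks onto executors, and executors onto workers."""
    if not 1 <= num_executors <= num_tasks:
        raise ValueError("need 1 <= executors <= tasks")
    # Every task belongs to the same component (spout or bolt).
    tasks = [f"{component}-task-{i}" for i in range(num_tasks)]
    # Each executor (thread) runs one or more tasks of that component.
    executors = split_evenly(tasks, num_executors)
    # Each worker (process) runs one or more executors.
    workers = split_evenly(executors, num_workers)
    return {f"worker-{w}": execs for w, execs in enumerate(workers)}


plan = layout("green-bolt", num_tasks=4, num_executors=2, num_workers=2)
for worker, executors in plan.items():
    print(worker, executors)
```

In this example each of the two workers ends up with one executor thread, and each executor runs two tasks of the same component.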
2 Whether to increase the number of workers
(1) On a single machine, it is best to use only one worker per topology. The main reason is that this reduces data transfer between workers.
(2) More workers may give better performance, depending on where your bottleneck lies. Each worker has one thread that transfers tuples to other workers, so if your bottleneck is CPU and each worker handles a large number of tuples, adding workers may improve throughput.
So there is no single right answer. Try different configurations for your environment and topology design.
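To get a feel for the inter-worker transfer cost mentioned in point (1), here is a rough back-of-the-envelope sketch. It assumes (this is an assumption, not something stated in the article) that the receiving tasks are spread evenly over all workers and that each tuple is equally likely to go to any of them. The function name `remote_fraction` is hypothetical.

```python
def remote_fraction(num_workers):
    """Expected share of tuples that must leave the sending worker.

    Assumes target tasks are spread evenly over the workers and every
    target task is equally likely to receive a given tuple.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return (num_workers - 1) / num_workers


for w in (1, 2, 4, 8):
    print(f"{w} worker(s): {remote_fraction(w):.3f} of tuples cross a worker boundary")
```

Under these assumptions a single worker sends nothing over the network, while with 8 workers most tuples cross a process boundary. Whether the extra CPU capacity outweighs that cost is what has to be measured.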
3 Number of executors
The number of executors is the actual degree of parallelism (de facto parallelism). The number of tasks is the degree of parallelism you would like to be able to reach.
Initial number of executors = number of spout tasks + number of bolt tasks + number of acker tasks. (By default Storm runs one task per executor, so this sum is also the total number of tasks.)
The numbers of spout, bolt, and acker tasks do not change at run time, but the number of executors can change.
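As a worked example of the formula above, the following sketch adds up the task counts of a made-up topology. All component names and numbers are hypothetical, and the calculation assumes the default of one task per executor.

```python
# Hypothetical task counts per component (assumed values, for illustration only).
components = {
    "spout": 2,
    "bolt-a": 4,
    "bolt-b": 6,
    "acker": 2,
}

# With one task per executor, the initial executor count equals the task count.
total_tasks = sum(components.values())
initial_executors = total_tasks

# If the executors are spread evenly over the workers, each worker runs this many threads.
num_workers = 2  # assumed
threads_per_worker = initial_executors / num_workers

print("tasks:", total_tasks)
print("initial executors:", initial_executors)
print("executor threads per worker:", threads_per_worker)
```

With these assumed numbers the topology starts with 14 executors, 7 per worker.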
4 Whether to increase the number of tasks
Tasks exist only to give the topology flexibility to scale later. The number of tasks has no bearing on the actual parallelism.
A task performs the actual data processing.
A worker process executes a subset of a topology. A worker process belongs to one particular topology and runs one or more executors for one or more components (spouts or bolts) of that topology. A running topology consists of many such processes on many machines in the cluster.
An executor is a thread spawned by a worker process. It may run one or more tasks of the same component (spout or bolt).
A task performs the actual data processing, and each spout or bolt you implement in code runs as many tasks distributed across the cluster. Over the lifetime of a topology, the number of tasks of a component is always the same, but the number of executors (threads) of a component can change over time. This means the following condition always holds: the number of threads is less than or equal to the number of tasks.
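Because each executor runs at least one task, a component can never have more executors than tasks. The sketch below models changing the executor count of one component while its task count stays fixed. It is a toy model: the function name `tasks_per_executor` is hypothetical, and rejecting an oversized request with an error is a choice made for this sketch, not a description of what Storm does.

```python
def tasks_per_executor(num_tasks, num_executors):
    """Return how many tasks each executor runs after an even redistribution.

    The task count is fixed for the lifetime of the topology; only the
    executor count changes. Requests with more executors than tasks are
    rejected in this toy model.
    """
    if not 1 <= num_executors <= num_tasks:
        raise ValueError("need 1 <= executors <= tasks")
    base, extra = divmod(num_tasks, num_executors)
    return [base + (1 if i < extra else 0) for i in range(num_executors)]


# A component created with 4 tasks can run on 1 to 4 threads.
for executors in (2, 3, 4):
    print(executors, "executors ->", tasks_per_executor(4, executors))

try:
    tasks_per_executor(4, 5)
except ValueError as err:
    print("5 executors ->", err)
```

This is why the task count is worth choosing with future growth in mind: it sets the ceiling for how many threads the component can later be spread across.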