2025-02-23 Update From: SLTechnology News&Howtos
This article is about how Spark Streaming can be used to implement a workflow scheduler. I think it is quite practical, so I am sharing it here; I hope you get something out of it.
I mentioned before that I would design a workflow scheduler. Developing a complete workflow scheduler is no easy task, but through Spark Streaming (building on the ideas of the Transformer architecture), we may be able to simplify the work considerably. I have no hands-on experience in this area; this is just an idea in my head.
The following is an architectural diagram of Azkaban:
In other words, to build a stable and reliable Azkaban workflow scheduler, you need:
Two MySQL instances in an active/standby pair
A set of Executor Servers (two or more)
A Web Server
You also have to do the architecture design yourself and think about:
Communication between the Web Server and the Executor Servers
Scalability: can Executors be added or removed dynamically?
Stability: after all, it runs 24 hours a day.
However, we shouldn't have to worry about all of that. What we really care about is:
Web UI
Generation, parsing, execution, and storage of workflows
Everything else is infrastructure. Following the design philosophy of the Transformer architecture, we should be able to find an Estimator to serve as that infrastructure, so that we only need to focus on the two points above and need not worry about deployment, high availability, stability, and so on. We would also like work such as the Web UI not to start from scratch, but to gain new capabilities step by step. With such an Estimator, we only need to do three things:
Implement the business logic, that is, the generation, parsing, execution, and storage of workflows.
Implement the management pages.
Specify the required resources (CPU/memory) and run the Transformer.
Looking around, I found that Spark Streaming is an Estimator that meets this need.
This is because Spark Streaming is, in a sense, already a scheduled task scheduling system, which is what we call micro-batch processing. For a workflow scheduler, it amounts to nothing more than the driver side, every batch duration, scanning MySQL and then distributing and executing the tasks it finds.
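The micro-batch idea above can be illustrated with a minimal plain-Python sketch (this is not Spark code; the in-memory task list standing in for the MySQL table, and all function names, are hypothetical):

```python
import time

def scan_pending(tasks):
    """One micro-batch: scan the task 'table' and pick up pending task ids."""
    due = [t["id"] for t in tasks if t["status"] == "pending"]
    for t in tasks:
        if t["status"] == "pending":
            t["status"] = "dispatched"  # in Spark this would be sent to an Executor
    return due

def driver_loop(tasks, duration_secs, batches):
    """Mimic the Spark Streaming driver: trigger one batch every `duration_secs`."""
    dispatched = []
    for _ in range(batches):
        dispatched.append(scan_pending(tasks))
        time.sleep(duration_secs)
    return dispatched

table = [{"id": 1, "status": "pending"}, {"id": 2, "status": "pending"}]
print(driver_loop(table, 0.01, 2))  # → [[1, 2], []]
```

The first batch picks up both pending tasks; the second batch finds nothing new, exactly like a scheduler polling a table on a fixed interval.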
So, to achieve something similar to what Azkaban does, the three things mentioned earlier correspond to:
1. Implement the business logic, that is, the generation, parsing, execution, and storage of workflows. Generation, parsing, and storage can be placed on the driver side or entirely on the Executor side; in other words, the Driver design can be heavy or light. In the heavy design, the Driver reads from MySQL, parses the rows into workflow tasks, and then sends them to Executors for execution. In the light design, the Driver simply reads MySQL and distributes the ids to the Executors, and each Executor is responsible for parsing, executing, and reporting back.
2. Enhance the Spark Streaming UI with management pages to provide an interface similar to Azkaban's Web Server.
3. Deployment is done by submitting the implementation to the cluster as a standard Spark Streaming program.
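The "light" Driver design from point 1 can be sketched in plain Python, with a thread pool standing in for the Executors (the id list and all function names are hypothetical stand-ins; a real Driver would query MySQL and a real Executor would parse and run the workflow definition):

```python
from concurrent.futures import ThreadPoolExecutor

def read_workflow_ids():
    """Driver side stays light: fetch ids only, no parsing."""
    return [101, 102, 103]  # pretend result of SELECT id FROM workflows

def execute_on_executor(workflow_id):
    """Executor side: fetch, parse, and run the workflow, then report back."""
    return (workflow_id, "success")

def light_driver_batch():
    ids = read_workflow_ids()
    with ThreadPoolExecutor(max_workers=3) as pool:  # stands in for Executors
        return list(pool.map(execute_on_executor, ids))

print(light_driver_batch())  # → [(101, 'success'), (102, 'success'), (103, 'success')]
```

The trade-off is the one the text describes: the heavy design centralizes parsing on the Driver, while the light design ships only ids and pushes the work out to the Executors.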
As you can see, we really only need to focus on implementing the core business logic; deployment, installation, operation, and the rest are provided by the platform (in effect, the Estimator has already done them). We also get fine-grained resource allocation (CPU/memory) instead of using whole servers as the basic unit.
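For the deployment and fine-grained resource allocation described above, submission would look like a standard Spark job; the class and jar names below are hypothetical, while the resource flags are standard spark-submit options:

```shell
# Submit the scheduler as an ordinary Spark Streaming application,
# requesting CPU/memory at a fine granularity rather than whole servers.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.WorkflowScheduler \
  --driver-memory 2g \
  --executor-memory 2g \
  --executor-cores 2 \
  --num-executors 4 \
  workflow-scheduler.jar
```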
In fact, we can also treat a Spark Streaming application as a crontab task, which naturally gives us a distributed crontab system with friendlier management; the tasks themselves can even be integrated into it.
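As a toy illustration of that crontab analogy, each micro-batch can check which entries are due; here each entry fires every N batches (a real system would parse genuine cron expressions, and the job names are made up):

```python
def due_entries(entries, batch_index):
    """Return the names of entries that fire on this micro-batch tick."""
    return [name for name, every in entries if batch_index % every == 0]

jobs = [("cleanup", 2), ("report", 3)]  # hypothetical: fire every 2nd / 3rd batch
schedule = [due_entries(jobs, b) for b in range(1, 7)]
print(schedule)  # → [[], ['cleanup'], ['report'], ['cleanup'], [], ['cleanup', 'report']]
```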
That is how Spark Streaming can implement a workflow scheduler. Some of these ideas may come up in your daily work; I hope you can learn something from this article.