
Example Analysis of Coordinated Sequential Task Execution Based on Azkaban

2025-01-17 Update From: SLTechnology News&Howtos shulou


This article walks through an example of coordinating sequential task execution with Azkaban. I hope you will learn something from it.

I. Overview of Azkaban

1. Task timing

In data service scenarios, a very common business process is to analyze log files on a big data platform and then deliver the resulting data to the business. This process involves many tasks, and it is difficult to predict exactly when each task will finish, yet the whole task chain should finish and release its resources as soon as possible.

The approximate order of execution is as follows:

Business log files are synchronized to the HDFS file system

Perform the process of analysis and calculation through Hadoop

The result data is imported into the data warehouse

Finally, you need to synchronize the data in the warehouse to the business database.

In an ordinary business system, a process like this does not demand much of task scheduling: the run time is basically predictable, and it is enough to leave sufficient time between tasks. A big data task chain, however, usually needs each task to start as soon as the previous one ends, so as to reduce the time cost. When I first joined a data service company, I ran into a case where the synchronization task had finished but a few of the final CSV data files had not been generated, which affected the synchronous update of nearly one million analysis records to the business database.
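The chain described above can be sketched in plain shell to show the two properties that matter: each step starts only when the previous one succeeds, and the last step refuses to run when the expected result file is missing. This is a minimal illustration, not Azkaban itself; the step functions, the result.csv file name, and the temporary directory are all hypothetical.

```shell
# Sketch only: every function name and path below is a hypothetical
# stand-in for a real step in the task chain.
set -eu

workdir="$(mktemp -d)"
result_csv="$workdir/result.csv"

sync_logs_to_hdfs()   { echo "step 1: sync logs to HDFS"; }
run_hadoop_analysis() { echo "step 2: analyze"; printf 'id,value\n1,42\n' > "$result_csv"; }
load_into_warehouse() { echo "step 3: load into warehouse"; }
sync_to_business_db() {
  # Guard against the failure described above: never sync when the
  # result file is absent or empty.
  if [ ! -s "$result_csv" ]; then
    echo "missing or empty $result_csv, aborting" >&2
    return 1
  fi
  echo "step 4: sync to business database"
}

# "&&" starts each step only if the previous one exited successfully.
chain_output="$(sync_logs_to_hdfs && run_hadoop_analysis && load_into_warehouse && sync_to_business_db)"
echo "$chain_output"
```

A scheduler such as Azkaban takes over this "start the next job when the previous one succeeds" logic, so it does not have to be hand-written for every chain.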

2. Introduction to Azkaban

Azkaban is a batch workflow job scheduler developed by LinkedIn. It runs a set of jobs and processes in a specific order within a workflow. Azkaban uses job configuration files to establish dependencies between tasks and provides an easy-to-use web user interface for maintaining and tracking your workflows.

Characteristics and advantages of Azkaban

Provides a clear, easy-to-use web UI

Simple job configuration with clear dependencies between jobs

Provides extensible components

Developed in Java, which makes secondary development easy

With Oozie, configuring a workflow means writing a large amount of XML, the code complexity is relatively high, and secondary development is difficult. By comparison, Azkaban is lightweight, and its features and usage are relatively simple.

II. Service installation

1. Core packages

Web service

azkaban-web-server-2.5.0.tar.gz

Executor service

azkaban-executor-server-2.5.0.tar.gz

SQL script

azkaban-sql-script-2.5.0.tar.gz

2. Installation path

Upload the three installation packages above and extract them.

    [root@hop01 azkaban]# pwd
    /opt/azkaban
    [root@hop01 azkaban]# tar -zxvf azkaban-web-server-2.5.0.tar.gz
    [root@hop01 azkaban]# tar -zxvf azkaban-executor-server-2.5.0.tar.gz
    [root@hop01 azkaban]# tar -zxvf azkaban-sql-script-2.5.0.tar.gz
    [root@hop01 azkaban]# mv azkaban-web-2.5.0/ server
    [root@hop01 azkaban]# mv azkaban-executor-2.5.0/ executor

3. MySQL import script

    [root@hop01 ~]# mysql -uroot -p123456
    mysql> create database azkaban_test;
    mysql> use azkaban_test;
    mysql> source /opt/azkaban/azkaban-2.5.0/create-all-sql-2.5.0.sql

View the tables

4. SSL configuration

    [root@hop01 opt]# keytool -keystore keystore -alias jetty -genkey -keyalg RSA

Generated file: keystore

Move it to the Azkaban web server directory:

    [root@hop01 opt]# mv keystore /opt/azkaban/server/

5. Web service configuration

Basic configuration

    [root@hop01 conf]# pwd
    /opt/azkaban/server/conf
    [root@hop01 conf]# vim azkaban.properties

Core modifications: MySQL and Jetty.

    default.timezone.id=Asia/Shanghai
    # Azkaban MySQL server properties.
    database.type=mysql
    mysql.port=3306
    mysql.host=localhost
    mysql.database=azkaban_test
    mysql.user=root
    mysql.password=123456
    mysql.numconnections=100
    # Azkaban Jetty server properties.
    jetty.maxThreads=25
    jetty.ssl.port=8443
    jetty.port=8081
    jetty.keystore=keystore
    jetty.password=123456
    jetty.keypassword=123456
    jetty.truststore=keystore
    jetty.trustpassword=123456

Set these values to match your local environment.
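Because a missing or misspelled key in azkaban.properties is easy to overlook, a quick check before starting the service can help. The following is a minimal sketch: it writes a sample file into a temporary directory (a stand-in for /opt/azkaban/server/conf) and verifies that a handful of keys are present. Which keys count as "required" is an assumption made for illustration.

```shell
# Sketch only: the temp directory stands in for the real conf directory,
# and the list of keys to check is an illustrative assumption.
set -eu

conf_dir="$(mktemp -d)"
props="$conf_dir/azkaban.properties"

cat > "$props" <<'EOF'
default.timezone.id=Asia/Shanghai
database.type=mysql
mysql.port=3306
mysql.host=localhost
mysql.database=azkaban_test
jetty.ssl.port=8443
jetty.keystore=keystore
EOF

missing=""
for key in default.timezone.id database.type mysql.host mysql.port \
           mysql.database jetty.ssl.port jetty.keystore; do
  # Match "key=" at the start of a line; dots are escaped for grep.
  pattern="^$(printf '%s' "$key" | sed 's/\./\\./g')="
  if ! grep -q "$pattern" "$props"; then
    missing="$missing $key"
  fi
done

if [ -n "$missing" ]; then
  echo "missing keys:$missing" >&2
else
  echo "all checked keys are present"
fi
```

To run the same check against a real installation, point props at the actual azkaban.properties file instead of the generated sample.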

User configuration

    [root@hop01 conf]# vim azkaban-users.xml

Add an administrator user:
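The user entry itself is not shown here. As a hedged illustration only, the sketch below writes a sample azkaban-users.xml into a temporary directory. The layout follows the default file shipped with Azkaban as best recalled, and the "admin" account with its placeholder password is hypothetical; compare it with the file in your own conf directory before copying anything.

```shell
# Sketch only: file layout is assumed from the stock azkaban-users.xml,
# and the "admin" user and its password are hypothetical placeholders.
set -eu

conf_dir="$(mktemp -d)"   # stand-in for /opt/azkaban/server/conf
users_xml="$conf_dir/azkaban-users.xml"

cat > "$users_xml" <<'EOF'
<azkaban-users>
  <user username="azkaban" password="azkaban" roles="admin" groups="azkaban" />
  <user username="metrics" password="metrics" roles="metrics" />
  <!-- Hypothetical administrator account; choose your own credentials -->
  <user username="admin" password="change-me" roles="admin,metrics" />
  <role name="admin" permissions="ADMIN" />
  <role name="metrics" permissions="METRICS" />
</azkaban-users>
EOF

# Count the user entries whose role list starts with "admin".
admin_count="$(grep -c 'roles="admin' "$users_xml")"
echo "users with an admin role: $admin_count"
```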

6. Executor service configuration

    [root@hop01 conf]# pwd
    /opt/azkaban/executor/conf
    [root@hop01 conf]# vim azkaban.properties

Core modifications: MySQL and time zone.

    default.timezone.id=Asia/Shanghai
    # Azkaban MySQL server properties.
    database.type=mysql
    mysql.port=3306
    mysql.host=localhost
    mysql.database=azkaban_test
    mysql.user=root
    mysql.password=123456
    mysql.numconnections=100

7. Start the services

Web service

    [root@hop01 bin]# pwd
    /opt/azkaban/server/bin
    [root@hop01 bin]# ll
    total 16
    -rwxr-xr-x 1 root root  161 Apr 21  2014 azkaban-web-shutdown.sh
    -rwxr-xr-x 1 root root 1275 Apr 21  2014 azkaban-web-start.sh

These are the shutdown and startup scripts, respectively.

    [root@hop01 bin]# /opt/azkaban/server/bin/azkaban-web-start.sh

Executor service

    [root@hop01 bin]# /opt/azkaban/executor/bin/azkaban-executor-start.sh

Startup log

Key lines at the end of the logs of the two services:

    Azkaban Server running on ssl port 8443.
    Azkaban Executor Server started on port 12321
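A startup check can look for these two lines instead of relying on a manual read of the log. The sketch below is a minimal example under stated assumptions: the log files are generated samples in a temporary directory, since the real log locations depend on how the services were started, and log_has_line is a hypothetical helper.

```shell
# Sketch only: log paths are temporary stand-ins and log_has_line is a
# hypothetical helper, not part of Azkaban.
set -eu

log_dir="$(mktemp -d)"
web_log="$log_dir/web.log"
exec_log="$log_dir/executor.log"
printf '%s\n' 'Azkaban Server running on ssl port 8443.' > "$web_log"
printf '%s\n' 'Azkaban Executor Server started on port 12321' > "$exec_log"

# Succeeds when the file ($1) contains the fixed string ($2).
log_has_line() { grep -qF -- "$2" "$1"; }

status="down"
if log_has_line "$web_log" 'Server running on ssl port' &&
   log_has_line "$exec_log" 'Executor Server started on port'; then
  status="up"
fi
echo "azkaban services: $status"
```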

Login Page

Note that this is based on the https protocol:

https://hop01:8443/

III. Operation cases

1. Getting-started case

Create a command-type job

    [root@hop01 flow_01]# pwd
    /opt/azkaban/testJob/flow_01
    [root@hop01 flow_01]# vim simple.job
    type=command
    command=echo 'mySimpleJob'

Package it as a zip file.

    [root@hop01 flow_01]# zip -q -r simpleJob.zip simple.job
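The same two steps can be scripted rather than typed into an editor. This is a minimal sketch that writes simple.job with a here-document and packages it; the temporary directory is a stand-in for /opt/azkaban/testJob/flow_01, and the zip step is skipped when the zip command is not installed.

```shell
# Sketch only: the temp directory stands in for the real flow directory.
set -eu

flow_dir="$(mktemp -d)"
cd "$flow_dir"

# A command-type job needs only a type and the command to run.
cat > simple.job <<'EOF'
type=command
command=echo 'mySimpleJob'
EOF

if command -v zip >/dev/null 2>&1; then
  zip -q -r simpleJob.zip simple.job
  echo "created $flow_dir/simpleJob.zip"
else
  echo "zip is not installed; simple.job was written to $flow_dir" >&2
fi
```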

Create a project

Upload task package

Perform a task

2. Sequential execution of tasks

Create Task A

    [root@hop01 flow_02]# vim simpleA.job
    type=command
    command=echo 'simplejobA'

Create Task B

    [root@hop01 flow_02]# vim simpleB.job
    type=command
    dependencies=simpleA
    command=echo 'simplejobB'

Package the tasks

    [root@hop01 flow_02]# zip -q -r simpleTwoJob.zip simpleA.job simpleB.job

As in the previous case, both jobs go into one zip package, which is uploaded through the web service. Task B declares a dependency on Task A, so B runs only after A finishes; the execution order can be observed in the web UI.
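A misspelled name in a dependencies line is an easy mistake to make, so it can be worth checking a flow directory before zipping it. The sketch below recreates the two job files in a temporary directory and verifies that every name listed under dependencies (a comma-separated list of job names) has a matching .job file. The check itself is an illustration, not an Azkaban feature.

```shell
# Sketch only: a local pre-upload check, not something Azkaban provides.
set -eu

flow_dir="$(mktemp -d)"   # stand-in for the flow_02 directory

cat > "$flow_dir/simpleA.job" <<'EOF'
type=command
command=echo 'simplejobA'
EOF

cat > "$flow_dir/simpleB.job" <<'EOF'
type=command
dependencies=simpleA
command=echo 'simplejobB'
EOF

broken=""
for job in "$flow_dir"/*.job; do
  # Read the comma-separated dependency list, if the job declares one.
  deps="$(sed -n 's/^dependencies=//p' "$job" | tr ',' ' ')"
  for dep in $deps; do
    if [ ! -f "$flow_dir/$dep.job" ]; then
      broken="$broken $(basename "$job"):$dep"
    fi
  done
done

if [ -n "$broken" ]; then
  echo "unresolved dependencies:$broken" >&2
else
  echo "all dependencies resolve"
fi
```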

After reading this article, you should have a basic understanding of how Azkaban coordinates sequential task execution. Thank you for reading!
