
Compilation and installation steps of azkaban3.9.0

2025-04-05 Update From: SLTechnology News&Howtos (Shulou.com)

This article walks through the steps of compiling and installing Azkaban, then configuring it, starting it, and running a sample workflow on it.

1 Azkaban introduction

Big data processing often involves analysis scenarios like the following:

Task A: clean the collected data through a series of rules, and then store it in Hive table a.

Task B: join table b with table c, which already exists in Hive, to get table d.

Task C: join table a obtained in task A with table d obtained in task B to get the analysis result table e.

Task D: finally, import table e from Hive into the relational database MySQL through sqoop for web queries.

Task C depends on the results of task A and task B, and task D depends on the result of task C. Without a scheduler, we would open two terminals to run task A and task B, start task C once both have finished, and start task D once task C has finished. Throughout the flow we must make sure that tasks A and B complete before task C runs, and that task C completes before task D runs. Every step therefore needs manual intervention, and keeping an eye on the progress of each task is laborious.

The scenario above is one large task divided into four sub-tasks A, B, C and D. If a task scheduler runs task A and task B, then task C, and finally task D automatically, we no longer need to watch whether a task has finished and whether the next one should start. Azkaban is a workflow scheduler that solves exactly this kind of problem.
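To make the ordering concrete, here is a minimal Python sketch of what a scheduler has to work out for tasks A to D. It is not Azkaban code; it only uses the standard-library `graphlib` module (Python 3.9 or later), and the names `DEPENDENCIES` and `execution_batches` are made up for this illustration.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (from the scenario above).
DEPENDENCIES = {
    "A": set(),
    "B": set(),
    "C": {"A", "B"},
    "D": {"C"},
}


def execution_batches(dependencies):
    """Return batches of tasks; tasks in the same batch can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # get_ready() returns every task whose dependencies are all done.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


for number, batch in enumerate(execution_batches(DEPENDENCIES), start=1):
    print(f"batch {number}: {', '.join(batch)}")
# batch 1: A, B
# batch 2: C
# batch 3: D
```

Tasks A and B land in the same batch because neither depends on the other, which matches running them in two terminals at once.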

2 Installation of Azkaban

2.1 Azkaban consists of three key components

Azkaban is a batch workflow scheduler written in Java. It runs a set of tasks and processes in a defined order within a workflow, and provides a convenient web UI for monitoring task scheduling, which makes flow scheduling tasks easy to manage.

Azkaban consists of three key components:

AzkabanWebServer:

Mainly responsible for project management, user login and permission authentication, scheduled execution of workflows, tracking the progress of submitted workflows, access to execution history, and saving the state of execution plans.

AzkabanExecutorServer:

Mainly responsible for submitting and executing workflows, retrieving and updating the data of the current execution plan, and processing the logs of the execution plan.

Relational database:

Mainly stores the metadata of the workflows.

Next, let's build an Azkaban task flow scheduling system from scratch.

2.2 Install Azkaban

Download the Azkaban source package from https://github.com/azkaban/azkaban/releases

2.2.1 Environmental preparation

Installing Azkaban on Linux requires a JDK and MySQL on the system. JDK 8 and MySQL 5.1 are used here. You also need git, an open source distributed version control system that is widely used for project version control. Azkaban needs it because the build fetches dependency packages through git.

Install git:

yum -y install git

2.2.2 Install Azkaban

1. Upload the downloaded Azkaban package to the server and decompress it into the /software/azkaban-temp folder

[root@mynode5 software]# tar -zxvf ./azkaban-3.90.0.tar.gz

[root@mynode5 software]# mv ./azkaban-3.90.0 azkaban-temp

2. Go to the azkaban-temp directory and compile

[root@node4 azkaban]# ./gradlew distTar

...

BUILD SUCCESSFUL in 4m 6s

54 actionable tasks: 40 executed, 14 from cache

Note: the build may fail because of network delays while dependencies are being downloaded. If that happens, run the build again; a few retries usually solve the problem.
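If you would rather not rerun the build by hand, the retry can be scripted. The following is only a sketch: `run_with_retries` is a hypothetical helper written for this article, and the attempt count and delay are arbitrary defaults.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay_seconds=5, cwd=None):
    """Run command until it exits with status 0 or the attempts are used up.

    Returns the number of the attempt that succeeded, or 0 if all failed.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(command, cwd=cwd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Give a flaky network a moment before the next try.
            time.sleep(delay_seconds)
    return 0
```

For the build above, the call would be `run_with_retries(["./gradlew", "distTar"], cwd="/software/azkaban-temp")`, assuming the source tree was unpacked to that path.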

3. Create a new azkaban directory and copy the compiled files to this directory

[root@node4 software]# mkdir ./azkaban

[root@node4 software]# cd azkaban

[root@node4 azkaban]# cp /software/azkaban-temp/azkaban-db/build/distributions/azkaban-db-0.1.0-SNAPSHOT.tar.gz /software/azkaban

[root@node4 azkaban]# cp /software/azkaban-temp/azkaban-web-server/build/distributions/azkaban-web-server-0.1.0-SNAPSHOT.tar.gz /software/azkaban

[root@node4 azkaban]# cp /software/azkaban-temp/azkaban-exec-server/build/distributions/azkaban-exec-server-0.1.0-SNAPSHOT.tar.gz /software/azkaban

4. Extract each compiled package in the azkaban directory and rename it

[root@node4 azkaban]# tar -zxvf azkaban-db-0.1.0-SNAPSHOT.tar.gz

[root@node4 azkaban]# tar -zxvf azkaban-web-server-0.1.0-SNAPSHOT.tar.gz

[root@node4 azkaban]# tar -zxvf azkaban-exec-server-0.1.0-SNAPSHOT.tar.gz

[root@node4 azkaban]# mv azkaban-db-0.1.0-SNAPSHOT azkaban-db

[root@node4 azkaban]# mv azkaban-web-server-0.1.0-SNAPSHOT azkaban-web

[root@node4 azkaban]# mv azkaban-exec-server-0.1.0-SNAPSHOT azkaban-exec

Azkaban has now been downloaded and compiled, and the basic preparation for the installation is complete. The next step is to configure Azkaban so that it can run.
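The extract-and-rename step can also be scripted. The sketch below mirrors the tar and mv commands above using Python's standard library; the function name `unpack_distributions` is hypothetical, and it assumes the three tarballs have already been copied into the target directory.

```python
import tarfile
from pathlib import Path

# Archive prefix -> final directory name, as used in the steps above.
RENAMES = {
    "azkaban-db": "azkaban-db",
    "azkaban-web-server": "azkaban-web",
    "azkaban-exec-server": "azkaban-exec",
}
VERSION = "0.1.0-SNAPSHOT"


def unpack_distributions(target_dir):
    """Extract the three tarballs found in target_dir and rename the results."""
    target = Path(target_dir)
    created = []
    for prefix, final_name in RENAMES.items():
        archive = target / f"{prefix}-{VERSION}.tar.gz"
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target)
        # Each archive unpacks to a directory named <prefix>-<version>.
        extracted = target / f"{prefix}-{VERSION}"
        extracted.rename(target / final_name)
        created.append(final_name)
    return created
```

Calling `unpack_distributions("/software/azkaban")` would leave the azkaban-db, azkaban-web and azkaban-exec directories used in the rest of this article.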

2.2.3 Import the database

Azkaban keeps its metadata in a relational database. The compiled azkaban-db package contains the SQL script that creates this metadata schema, and the script has to be imported into a relational database, MySQL in this case. The MySQL database can be installed on the same node as Azkaban or on a different one. In this setup, Azkaban is installed on node4 and MySQL on node1.

1. Log in to the mysql database and create the azkaban database

[root@mynode2]# mysql -u root -p

mysql> create database azkaban default character set latin1;

Note: latin1 encoding is recommended when creating the azkaban database. With utf8 encoding some indexes become too long: the maximum supported index key length is 1000 bytes.
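The arithmetic behind the latin1 recommendation can be sketched as follows. The limit of 1000 is the figure from the note above, read here as bytes. MySQL's utf8 character set needs up to 3 bytes per character and latin1 needs 1. The 512-character index is a hypothetical example, not a column taken from the Azkaban schema.

```python
# Maximum number of bytes one character can occupy in each MySQL character set.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3}

# Limit quoted in the note above, interpreted as bytes (assumption).
MAX_KEY_BYTES = 1000


def index_key_bytes(indexed_chars, charset):
    """Worst-case size in bytes of an index over indexed_chars characters."""
    return indexed_chars * MAX_BYTES_PER_CHAR[charset]


def fits(indexed_chars, charset, limit=MAX_KEY_BYTES):
    """Return True if the worst-case index size stays within the limit."""
    return index_key_bytes(indexed_chars, charset) <= limit


# Hypothetical index covering 512 characters in total.
for charset in ("latin1", "utf8"):
    print(charset, index_key_bytes(512, charset), fits(512, charset))
# latin1 512 True
# utf8 1536 False
```

Under these assumptions the same index that fits comfortably in latin1 exceeds the limit in utf8, which is why table creation can fail in a utf8 database.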

2. Prepare the sql file

Copy create-all-sql-0.1.0-SNAPSHOT.sql from the /software/azkaban/azkaban-db directory on node4 to the /software/test directory on node1.

[root@node4 azkaban-db]# scp /software/azkaban/azkaban-db/create-all-sql-0.1.0-SNAPSHOT.sql root@node1:/software/test

create-all-sql-0.1.0-SNAPSHOT.sql 100% 12KB 11.8KB/s 00:00

3. Import the sql file into MySQL

mysql> use azkaban;

mysql> source /software/test/create-all-sql-0.1.0-SNAPSHOT.sql

4. Check the imported database tables

mysql> show tables;

2.3 Configure and run Azkaban

2.3.1 Create ssl configuration

HTTP stands for Hypertext Transfer Protocol, and HTTPS for Hypertext Transfer Protocol Secure. HTTPS is built on HTTP and exchanges information over the Secure Sockets Layer (SSL); put simply, it is the secure version of HTTP. Azkaban supports https access, but this requires an ssl configuration.

Run keytool -keystore keystore -alias jetty -genkey -keyalg RSA in the /software/azkaban directory to create the ssl configuration.

[root@mynode5 azkaban]# keytool -keystore keystore -alias jetty -genkey -keyalg RSA

Enter the keystore password:

Enter the new password again:

Enter the key password

(press Enter if it is the same as the keystore password):

After you answer the prompts, the command generates a keystore file in the current directory. Move this file to the root directory of the Azkaban web server.

[root@node4 azkaban]# mv /software/azkaban/keystore /software/azkaban/azkaban-web

2.3.2 Azkaban web server configuration

Go to the /software/azkaban/azkaban-web/conf directory and edit the azkaban.properties file:

[root@node4 conf]# cd /software/azkaban/azkaban-web/conf/

[root@node4 conf]# vim azkaban.properties

Edit the content as follows:

# Azkaban Personalization Settings
azkaban.name=My Azkaban
azkaban.label=My Local Azkaban
azkaban.color=#FF3601
azkaban.default.servlet.path=/index
web.resource.dir=/root/soft/azkaban/azkaban-web/web/
default.timezone.id=Asia/Shanghai
# Azkaban UserManager class
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=/root/soft/azkaban/azkaban-web/conf/azkaban-users.xml
# Loader for projects
executor.global.properties=/root/soft/azkaban/azkaban-web/conf/global.properties
azkaban.project.dir=projects
# Velocity dev mode
velocity.dev.mode=false
# Azkaban Jetty server properties.
jetty.use.ssl=false
jetty.maxThreads=25
jetty.port=8081
jetty.keystore=/root/soft/azkaban/azkaban-web/keystore
jetty.password=123456
jetty.keypassword=azkaban
jetty.truststore=/root/soft/azkaban/azkaban-web/keystore
jetty.trustpassword=123456
jetty.ssl.port=8443
executor.connector.stats=true
executor.port=12312
# Azkaban Executor settings
# mail settings
mail.sender=
mail.host=
# User facing web server configurations used to construct the user facing server URLs. They are useful when there is a reverse proxy between Azkaban web servers and users.
# enduser -> myazkabanhost:443 -> proxy -> localhost:8081
# when this parameters set then these parameters are used to generate email links.
# if these parameters are not set then jetty.hostname and jetty.port (if ssl configured jetty.ssl.port) are used.
# azkaban.webserver.external_hostname=myazkabanhost.com
# azkaban.webserver.external_ssl_port=443
# azkaban.webserver.external_port=8081
job.failure.email=
job.success.email=
lockdown.create.projects=false
cache.directory=cache
# JMX stats
jetty.connector.stats=true
executor.connector.stats=true
# Azkaban mysql settings by default. Users should configure their own username and password.
database.type=mysql
mysql.port=3306
mysql.host=192.168.3.175
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban
mysql.numconnections=100
#Multiple Executor
azkaban.use.multiple.executors=true
azkaban.executorselector.filters=StaticRemainingFlowSize,MinimumFreeMemory,CpuStatus
azkaban.executorselector.comparator.NumberOfAssignedFlowComparator=1
azkaban.executorselector.comparator.Memory=1
azkaban.executorselector.comparator.LastDispatched=1
azkaban.executorselector.comparator.CpuUsage=1

2.3.3 Azkaban executor server configuration

Go to the /software/azkaban/azkaban-exec/conf directory and edit the azkaban.properties file:

# Azkaban Personalization Settings
azkaban.name=My Azkaban
azkaban.label=My Local Azkaban
azkaban.color=#FF3601
azkaban.default.servlet.path=/index
web.resource.dir=/root/soft/azkaban/azkaban-web/web/
default.timezone.id=Asia/Shanghai
# Azkaban UserManager class
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=/root/soft/azkaban/azkaban-web/conf/azkaban-users.xml
# Loader for projects
executor.global.properties=/root/soft/azkaban/azkaban-web/conf/global.properties
azkaban.project.dir=projects
# Velocity dev mode
velocity.dev.mode=false
# Azkaban Jetty server properties.
#jetty.use.ssl=false
#jetty.maxThreads=25
#jetty.port=8081
# Where the Azkaban webserver is located
azkaban.webserver.url=https://localhost:8443
# mail settings
mail.sender=
mail.host=
# User facing webserver configurations used to construct the user facing server URLs. They are useful when there is a reverse proxy between Azkaban web servers and users.
# enduser -> myazkabanhost:443 -> proxy -> localhost:8081
# when this parameters set then these parameters are used to generate email links.
# if these parameters are not set then jetty.hostname and jetty.port (if ssl configured jetty.ssl.port) are used.
# azkaban.webserver.external_hostname=myazkabanhost.com
# azkaban.webserver.external_ssl_port=443
# azkaban.webserver.external_port=8081
job.failure.email=
job.success.email=
lockdown.create.projects=false
cache.directory=cache
# JMX stats
jetty.connector.stats=true
executor.connector.stats=true
# Azkaban plugin settings
azkaban.jobtype.plugin.dir=/root/soft/azkaban/azkaban-exec/plugins/jobtypes
# Azkaban mysql settings by default. Users should configure their own username and password.
database.type=mysql
mysql.port=3306
mysql.host=192.168.3.175
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban
mysql.numconnections=100
# Azkaban Executor settings
executor.maxThreads=50
executor.flow.threads=30
executor.port=12321
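Both azkaban.properties files above carry the same mysql.* values, and a typo in one of them is easy to miss. The following sketch shows one way to compare them. The parser handles only plain key=value lines and # comment lines, which is all these two files use; the helper names and the sample snippets (including the deliberately wrong port 3307) are made up for illustration.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values such as #FF3601 stay intact.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def mismatched_keys(web, executor, prefix="mysql."):
    """Return the sorted keys starting with prefix whose values differ."""
    keys = {k for k in web.keys() | executor.keys() if k.startswith(prefix)}
    return sorted(k for k in keys if web.get(k) != executor.get(k))


# Shortened, hypothetical excerpts of the two files.
WEB_SAMPLE = """\
# Azkaban mysql settings
azkaban.color=#FF3601
mysql.host=192.168.3.175
mysql.port=3306
"""
EXEC_SAMPLE = """\
mysql.host=192.168.3.175
mysql.port=3307
"""

print(mismatched_keys(parse_properties(WEB_SAMPLE), parse_properties(EXEC_SAMPLE)))
# ['mysql.port']
```

In practice you would read the two real files with `open(...).read()` and expect an empty list.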

2.3.4 Start Azkaban

1. Start AzkabanExecutorServer

Go to the /software/azkaban/azkaban-exec/bin directory, start AzkabanExecutorServer, and check the running processes with jps. If the AzkabanExecutorServer process is listed, the startup succeeded.

[root@node4 bin]# cd /software/azkaban/azkaban-exec/bin

[root@node4 bin]# ./start-exec.sh

[root@node4 bin]# jps

2. Activate AzkabanExecutor

AzkabanExecutor must be activated every time it is started or restarted. Open the following URL in a browser to activate AzkabanExecutor:

http://node4:12321/executor?action=activate
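The activation request does not have to come from a browser. As a sketch, the same URL can be requested from a script with Python's standard library. The function names are hypothetical, and the host and port are the ones used in this article.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def activation_url(host, port):
    """Build the executor activation URL shown above."""
    query = urlencode({"action": "activate"})
    return f"http://{host}:{port}/executor?{query}"


def activate_executor(host, port, timeout=10):
    """Send the activation request and return the response body as text."""
    with urlopen(activation_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(activation_url("node4", 12321))
# http://node4:12321/executor?action=activate
```

Calling `activate_executor("node4", 12321)` after each executor start or restart would replace the manual browser step.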

3. Start AzkabanWebServer

Go to the /software/azkaban/azkaban-web/bin directory, start AzkabanWebServer, and check the running processes with jps. If the AzkabanWebServer process is listed, the startup succeeded.

[root@node4 bin]# cd /software/azkaban/azkaban-web/bin

[root@node4 bin]# ./start-web.sh

[root@mynode5 bin]# jps

The Azkaban setup is now complete. Next, let's check that Azkaban is running.

2.4 Verify the operation of Azkaban

To verify that Azkaban started successfully, open its web UI by entering http://node4:8081 in the browser.

If ssl is configured, visit https://node4:port instead.

If the Azkaban login page appears, the configuration is correct and Azkaban has started successfully. The default user name and password are both azkaban. After logging in, you can submit task flows and manage and schedule them.

Now that Azkaban has been set up successfully, let's simulate a task flow and use Azkaban to schedule it.

3 Build Workflow

The previous section described the installation and deployment of Azkaban. In this section we design a simulated task flow, and use it to learn how to write jobs for Azkaban and how to view flow scheduling and status in the web UI.

First, the relationship among projects, flows and jobs in Azkaban: a project can contain one or more flows, and a flow can contain multiple jobs. A job is a process that Azkaban runs; it can be a simple linux command, a shell script, a sql script, and so on. One job can depend on another, and the dependencies among multiple jobs form a flow, that is, a task flow.

3.1 Design workflow

Suppose there are five jobs: job1, job2, job3, job4 and job5. Each job executes a shell script. job3 depends on the results of job1 and job2, job4 depends on the result of job3, and job5 depends on the result of job4.

To meet these requirements, we design one task flow (flow) containing the five jobs. Following the dependencies above, we write a simple task flow and submit it to Azkaban for scheduling and execution.

3.2 Writing jobs for each stage

Writing a job is easy: create a text file whose name ends with ".job", in the following format:

type=command

command=<script or command to execute>

type=command tells Azkaban to run the command or script as a native unix command, and command= specifies the command or script that the current job executes. If the current job depends on other jobs, add "dependencies=<names of the jobs it depends on>" to the file. Write only the job names, without the ".job" suffix.
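Because every job file has the same three-line shape, the files can also be generated. A minimal sketch follows; `render_job` is a hypothetical helper, and job3 with its two dependencies comes from the design in section 3.1.

```python
def render_job(command, dependencies=()):
    """Return the text of a command-type .job file."""
    lines = ["type=command", f"command={command}"]
    if dependencies:
        # Dependencies are job names without the ".job" suffix.
        lines.append("dependencies=" + ",".join(dependencies))
    return "\n".join(lines) + "\n"


print(render_job("sh job3.sh", ["job1", "job2"]), end="")
# type=command
# command=sh job3.sh
# dependencies=job1,job2
```

Writing the returned text to a file named job3.job gives a job definition in the format described above.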

To keep the demonstration simple, each job designed here calls a script on linux, and each script just prints a few messages with echo. The five jobs and their scripts are as follows:

1. Write job tasks

job1.job:

type=command

command=sh job1.sh

job2.job:

type=command

command=sh job2.sh

job3.job:

type=command

command=sh job3.sh

dependencies=job1,job2

job4.job:

type=command

command=sh job4.sh

dependencies=job3

job5.job:

type=command

command=sh job5.sh

dependencies=job4

The job files above can be written in a local Windows environment. Once they are written, compress the five job files into one archive, which is later submitted to Azkaban for execution.

2. Write script content

job1.sh:

echo "starts to execute job1......"

echo "executing job1......"

echo "execution completes job1......"

job2.sh:

echo "starts to execute job2......"

echo "executing job2......"

echo "execution completes job2......"

job3.sh:

echo "starts to execute job3......"

echo "executing job3......"

echo "execution completes job3......"

job4.sh:

echo "starts to execute job4......"

echo "executing job4......"

echo "execution completes job4......"

job5.sh:

echo "starts to execute job5......"

echo "executing job5......"

echo "execution completes job5......"

The scripts above are written on linux, and job1.sh, job2.sh, job3.sh, job4.sh and job5.sh all need to be given execute permission.

3.3 Configure workflow and execute

To submit a flow to Azkaban, all job files must be compressed into one zip file, which is then uploaded to Azkaban for execution. First compress the five job files above into a zip file, then log in to Azkaban and click Create Project in the upper right corner to create a project.
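The zip file can be produced with any archive tool. As a sketch, the following Python function packs the files at the root of the archive using the standard `zipfile` module. `package_flow` is a hypothetical name, and including the .sh scripts next to the .job files is an assumption made here so that a relative command such as `sh job1.sh` can find its script.

```python
import zipfile
from pathlib import Path


def package_flow(source_dir, archive_path, suffixes=(".job", ".sh")):
    """Zip every file in source_dir whose suffix is listed in suffixes.

    Returns the sorted list of file names written to the archive.
    """
    source = Path(source_dir)
    names = sorted(
        path.name
        for path in source.iterdir()
        if path.is_file() and path.suffix in suffixes
    )
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            # Store files at the archive root, without the directory prefix.
            archive.write(source / name, arcname=name)
    return names
```

To package only the job files, as the text above describes, pass `suffixes=(".job",)`.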


Fill in the project name and project description in the pop-up box.

Click Create Project, then click Upload to upload the zip file.

After the upload completes, check the task flow. By default, Azkaban names a flow after its final job, that is, the job that no other job depends on.
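For the five jobs in this article, that final job can be worked out from the dependencies alone. The sketch below reads the naming rule as "the job that no other job lists as a dependency", which is an interpretation of the sentence above rather than Azkaban's own code; `terminal_jobs` is a hypothetical helper.

```python
# Job name -> jobs it depends on, as in the five .job files above.
DEPENDENCIES = {
    "job1": [],
    "job2": [],
    "job3": ["job1", "job2"],
    "job4": ["job3"],
    "job5": ["job4"],
}


def terminal_jobs(dependencies):
    """Return the jobs that no other job lists as a dependency."""
    depended_on = {dep for deps in dependencies.values() for dep in deps}
    return sorted(job for job in dependencies if job not in depended_on)


print(terminal_jobs(DEPENDENCIES))
# ['job5']
```

Under this reading the uploaded flow would appear under the name job5.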


Click Execute Flow to see the detailed dependencies of the task flow.

Click Execute to run the flow.

To run the flow at regular intervals, click Schedule, configure the schedule time, and then click Schedule again to confirm.

3.4 Workflow execution monitoring

After the flow finishes, the page automatically jumps to the execution result view, which shows that the flow succeeded.

3.5 Azkaban problem

If a submitted flow stays in the running state and never completes, check the log in the Azkaban web UI for "Cannot request memory (Xms 0 kb, Xmx 0 kb) from system for job job1, sleep for 60 secs and retry, at …". Before executing a job, the executor checks whether the node has at least 3G of free memory, and reports this error if it does not. To skip the check, edit the commonprivate.properties file in the .../azkaban-exec/plugins/jobtypes directory, add "memCheck.enabled=false" to it, and restart Azkaban.
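Editing the properties file by hand is fine for one node. For several executors, the change can be scripted. The following sketch sets a key in a key=value file, replacing an existing entry or appending a new one; `set_property` is a hypothetical helper, and the path you pass depends on where azkaban-exec is installed.

```python
from pathlib import Path


def set_property(path, key, value):
    """Set key=value in a properties file, replacing or appending the entry."""
    file_path = Path(path)
    lines = file_path.read_text().splitlines() if file_path.exists() else []
    updated = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        # Skip comments; match on the text before the first "=".
        if not stripped.startswith("#") and stripped.partition("=")[0].strip() == key:
            lines[index] = f"{key}={value}"
            updated = True
    if not updated:
        lines.append(f"{key}={value}")
    file_path.write_text("\n".join(lines) + "\n")
```

For the fix above, the call would be `set_property(path_to_commonprivate_properties, "memCheck.enabled", "false")`, followed by a restart of Azkaban.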
