How to compile Oozie

Updated 2025-01-21. From: SLTechnology News&Howtos (shulou)

Shulou (Shulou.com), 06/01

This article explains how to compile Oozie from source and then install and deploy it. Many people run into difficulties when doing this in practice, so the sections below walk through each stage in order.

I. What is a workflow?

A workflow (WorkFlow) is a computational model of a business process: the logic and rules by which the pieces of work are organized are represented in a suitable model and executed by a computer. The main problem a workflow solves is how to pass work automatically among multiple participants, according to predefined rules, in order to achieve a business goal. Let's take an employee leave request as an example.

This example covers a complete leave-request process. The process starts, the employee fills in a leave request, and the request goes to the department manager for approval. If the manager rejects it, the process returns to the step where the employee fills in the request; if the manager approves it, the process moves on to the next node, and so on until the process ends. In Java, several frameworks can help implement such a process. The three mainstream Java workflow engines are Shark, OSWorkflow, and jBPM.

II. What is Oozie?

Oozie is a workflow scheduling tool that serves the Hadoop ecosystem. What distinguishes it most from other scheduling tools is the platform its jobs run on; the idea behind its implementation is almost exactly the same as that of general-purpose schedulers. Oozie workflows are written in hPDL, an XML-based process definition language similar to JBoss jBPM's jPDL. The actions in an Oozie workflow run on remote systems (such as Hadoop or Pig). When an action completes, the remote system calls back an Oozie interface to report that the action is done, and Oozie then runs the next action in the workflow in the same way until every action in the workflow has completed (completion includes failure). Oozie provides several action types to cover different needs, such as Hadoop MapReduce, Hadoop file system, Pig, SSH, HTTP, email, Java, and Oozie sub-workflows. Oozie also supports custom extensions to these action types.
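To make the hPDL description more concrete, here is a minimal sketch of a workflow definition, written out from the shell. The workflow name demo-wf, the action name make-dir, the ${nameNode} parameter, the target path, and the local directory are all illustrative assumptions and do not come from this article; the uri:oozie:workflow:0.5 schema is one that Oozie 4.x accepts.

```shell
# Sketch only: write a minimal hPDL workflow with a single file-system action.
# All names (demo-wf, make-dir, the mkdir path) are hypothetical.
WF_DIR="${WF_DIR:-./demo-wf}"
mkdir -p "$WF_DIR"

# The quoted heredoc keeps ${nameNode} and ${wf:...} literal so that Oozie,
# not the shell, resolves them when the workflow runs.
cat > "$WF_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.5" name="demo-wf">
    <start to="make-dir"/>
    <action name="make-dir">
        <fs>
            <mkdir path="${nameNode}/tmp/oozie-demo"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF
echo "wrote $WF_DIR/workflow.xml"
```

The start, action, kill, and end nodes mirror the flow described above: each action names the node to go to on success (ok) and on failure (error).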

A working Oozie system must contain the following four modules: the Oozie client, the Oozie server, a database, and a Hadoop cluster.

The Oozie client can submit workflow jobs to the Oozie server through the Web Service API, the Java API, or the command line. It can obtain a job's logs from the Oozie server through the REST API or the web GUI. The workflow configuration file, the workflow properties file, and the workflow libraries usually reside on the client side.
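As a rough illustration of the command-line route, the sketch below writes a workflow properties file and wraps the oozie CLI in two helper functions. The server URL and the HDFS address reuse values that appear later in this article; the jobTracker port, the application path, and the function names submit_job and show_job are assumptions made for the example.

```shell
# Sketch only: a job.properties file plus helper functions around the oozie CLI.
# Host names, ports, and paths are assumptions for illustration.
OOZIE_URL="${OOZIE_URL:-http://192.168.88.111:11000/oozie}"

# Quoted heredoc: ${nameNode} and ${user.name} are resolved by Oozie, not the shell.
cat > job.properties <<'EOF'
nameNode=hdfs://hadoop111:9000
jobTracker=hadoop111:8032
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/demo-wf
EOF

# Submit and start the workflow described by job.properties.
submit_job() {
    oozie job -oozie "$OOZIE_URL" -config job.properties -run
}

# Show the status of a job, then its log; $1 is the job ID printed by submit_job.
show_job() {
    oozie job -oozie "$OOZIE_URL" -info "$1"
    oozie job -oozie "$OOZIE_URL" -log "$1"
}
```

The functions are only defined here, not called, since they need a running Oozie server and a workflow already uploaded to HDFS.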

Oozie Server is responsible for receiving client requests, scheduling work tasks, and monitoring the execution status of workflows. Oozie itself does not execute a specific Job, but instead sends the configuration information of the Job to the execution environment.

The database stores the Action and Job information of Bundle, Coordinator, and Workflow jobs, and records Oozie system information. Put simply, everything except the Oozie runtime log, which is kept on the local disk, is stored in the database.

The Hadoop cluster is the entity that actually runs Oozie workflows. It handles the jobs submitted by the Oozie server, including jobs for Hadoop components such as HDFS, MapReduce, Hive, and Sqoop.

III. Compiling Oozie

The versions used are as follows:

Hadoop 2.4.1
JDK 1.7
Maven 3.5.0
Oozie 4.3

In the directory extracted from the Oozie source package, compile Oozie by running:

bin/mkdistro.sh -DskipTests -Dhadoop.version=2.4.1

Note: on a first build, Maven automatically downloads the dependency jars, which may take a long time.

If the build fails with an out-of-memory error, Maven has run out of memory. Set the following environment variable and recompile:

export MAVEN_OPTS="-Xmx512m -XX:MaxPermSize=128m"

When the compilation completes successfully, a success prompt is displayed.
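One way to confirm the build independently of the prompt is to look for the distribution tarball. This is a minimal sketch, assuming the build leaves the tarball under distro/target/ in the source tree, which is where Oozie builds normally place it; find_distro is a hypothetical helper name.

```shell
# Sketch only: report whether the build produced a distro tarball.
# find_distro is a hypothetical helper; $1 is the Oozie source directory.
find_distro() {
    src_dir="${1:-.}"
    for f in "$src_dir"/distro/target/oozie-*-distro.tar.gz; do
        # If nothing matches, the glob stays literal and this test fails.
        if [ -f "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    echo "no distro tarball under $src_dir/distro/target" >&2
    return 1
}
```

Running find_distro from the source directory prints the tarball path if the build succeeded and returns a non-zero status otherwise.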

IV. Install and deploy Oozie

Extract the installation package

tar -zxvf oozie-4.3.0-distro.tar.gz -C ~/training/

Set environment variables
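The variables themselves are not listed here. As a hedged sketch, a typical setup points OOZIE_HOME at the unpacked distribution and puts its bin directory on PATH; the install path below is an assumption based on the /root/training/oozie-4.3.0 paths used later in this article, and the variable name OOZIE_HOME is a common convention rather than something this article specifies.

```shell
# Sketch only: environment variables for the unpacked distribution.
# The install path is an assumption based on paths used later in this article.
export OOZIE_HOME=/root/training/oozie-4.3.0
export PATH="$OOZIE_HOME/bin:$PATH"
```

Adding these two lines to a shell profile such as ~/.bash_profile would make oozie-setup.sh, ooziedb.sh, and oozied.sh available by name in new sessions.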

Create the MySQL database

create database oozie;
create user 'oozieowner'@'%' identified by 'password';
grant all on oozie.* TO 'oozieowner'@'%';
grant all on oozie.* TO 'oozieowner'@'localhost' identified by 'password';

Modify file: conf/oozie-site.xml
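The exact changes are not shown here. As a sketch under stated assumptions, pointing Oozie at the MySQL database created above usually means setting the JPAService JDBC properties and telling Oozie where the Hadoop configuration lives. The property names below are standard Oozie ones; the JDBC URL (localhost:3306), the Hadoop configuration path, and the scratch file name are assumptions for illustration.

```shell
# Sketch only: write the MySQL-related properties to a scratch file so they can
# be reviewed and merged into conf/oozie-site.xml by hand.
# The JDBC URL, Hadoop conf path, and output file name are assumptions.
FRAGMENT="${FRAGMENT:-./oozie-site-fragment.xml}"

cat > "$FRAGMENT" <<'EOF'
<property>
    <name>oozie.service.JPAService.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.url</name>
    <value>jdbc:mysql://localhost:3306/oozie</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.username</name>
    <value>oozieowner</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.password</name>
    <value>password</value>
</property>
<property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=/root/training/hadoop-2.4.1/etc/hadoop</value>
</property>
EOF
echo "wrote $FRAGMENT"
```

The properties belong inside the configuration element of conf/oozie-site.xml; writing them to a separate file first avoids overwriting an existing configuration.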

Configure the Oozie web console

(*) Create a directory:

mkdir /root/training/oozie-4.3.0/libext

(*) Upload ext-2.2.zip and the MySQL JDBC driver to this directory.

(*) Copy $HADOOP_HOME/share/hadoop/*/*.jar and $HADOOP_HOME/share/hadoop/*/lib/*.jar to Oozie's libext directory.

(*) Because some Hadoop jars conflict with the Tomcat jars bundled with Oozie, the conflicting jars must be renamed so that they are not loaded. Run the following commands:

cd /root/training/oozie-4.3.0/libext
mv servlet-api-2.5.jar servlet-api-2.5.jar.bak
mv jsp-api-2.1.jar jsp-api-2.1.jar.bak
mv jasper-compiler-5.5.23.jar jasper-compiler-5.5.23.jar.bak
mv jasper-runtime-5.5.23.jar jasper-runtime-5.5.23.jar.bak
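The rename step fails partway if one of the jars is missing or has already been renamed. A small sketch of a more forgiving version follows; disable_conflicting_jars is a hypothetical helper name, and the jar list is the one given above.

```shell
# Sketch only: rename the conflicting jars, skipping any that are absent.
# disable_conflicting_jars is a hypothetical helper; $1 is the libext directory.
disable_conflicting_jars() {
    libext_dir="$1"
    for jar in servlet-api-2.5.jar jsp-api-2.1.jar \
               jasper-compiler-5.5.23.jar jasper-runtime-5.5.23.jar; do
        if [ -f "$libext_dir/$jar" ]; then
            mv "$libext_dir/$jar" "$libext_dir/$jar.bak"
            echo "disabled $jar"
        fi
    done
}
```

For example, disable_conflicting_jars /root/training/oozie-4.3.0/libext can be run more than once without error, because jars that are already renamed are simply skipped.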

Initialize Oozie

(*) Generate the war package for the Oozie web console:

oozie-setup.sh prepare-war

(*) Initialize the database:

ooziedb.sh create -sqlfile oozie.sql -run

(*) Upload the shared jars that the different job types depend on to HDFS:

oozie-setup.sh sharelib create -fs hdfs://hadoop111:9000

(*) Modify oozie-4.3.0/oozie-server/conf/server.xml and comment out the following records:

Start Oozie and the Hadoop history server

oozied.sh start
mr-jobhistory-daemon.sh start historyserver

Access URL address: http://192.168.88.111:11000/oozie/
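Besides opening the console in a browser, the server can be checked from the shell. The sketch below assumes curl is installed and uses the URL given above; status_url and check_oozie are hypothetical helper names, while /v1/admin/status is the admin status endpoint of the Oozie REST API.

```shell
# Sketch only: helpers to check that the Oozie server is answering.
OOZIE_URL="${OOZIE_URL:-http://192.168.88.111:11000/oozie}"

# Build the REST status endpoint from the base URL (a trailing slash is tolerated).
status_url() {
    printf '%s/v1/admin/status\n' "${1%/}"
}

# Query the server; prints the JSON reply or a failure note.
# Defined only; call it by hand once the server is running.
check_oozie() {
    curl --silent --fail --max-time 5 "$(status_url "$OOZIE_URL")" \
        || echo "Oozie did not respond at $OOZIE_URL" >&2
}
```

Calling check_oozie against a healthy server should return a small JSON document reporting the system mode, normally NORMAL. The command oozie admin -oozie "$OOZIE_URL" -status asks the same question through the CLI.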

That concludes this walkthrough of how to compile Oozie. Thank you for reading.
