Apache Oozie is a server-based workflow scheduling system to manage Hadoop jobs.
Workflows in Oozie are defined as a collection of control flow and action nodes in a directed acyclic graph. Control flow nodes define the beginning and the end of a workflow (start, end, and failure nodes) as well as a mechanism to control the workflow execution path (decision, fork, and join nodes). Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions including Hadoop MapReduce, Hadoop distributed file system operations, Pig, SSH, and email. Oozie can also be extended to support additional types of actions.
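The graph structure described above can be illustrated with a minimal Python sketch. It is not Oozie's own implementation or data model: the function `topological_order`, the dictionary `workflow`, and the node names are hypothetical, chosen only to show how a set of control flow and action nodes forms a directed acyclic graph that can be checked for cycles and walked in a valid execution order.

```python
from collections import deque


def topological_order(transitions):
    """Return the nodes in a valid execution order (Kahn's algorithm).

    `transitions` maps each node name to the list of nodes it leads to.
    Raises ValueError if the graph has a cycle, i.e. is not a DAG.
    """
    # Collect every node, including ones that only appear as targets.
    nodes = set(transitions)
    for targets in transitions.values():
        nodes.update(targets)

    # Count incoming edges for each node.
    indegree = {node: 0 for node in nodes}
    for targets in transitions.values():
        for target in targets:
            indegree[target] += 1

    # Start from the nodes with no incoming edges.
    ready = deque(sorted(node for node in nodes if indegree[node] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in transitions.get(node, []):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)

    # Any node left unvisited sits on a cycle.
    if len(order) != len(nodes):
        raise ValueError("workflow contains a cycle")
    return order


# Hypothetical workflow: a fork runs two action nodes in parallel,
# and a join waits for both before the workflow ends.
workflow = {
    "start": ["fork"],
    "fork": ["mapreduce-step", "pig-step"],
    "mapreduce-step": ["join"],
    "pig-step": ["join"],
    "join": ["end"],
    "end": [],
}
```

Calling `topological_order(workflow)` yields an order in which the fork precedes both action nodes and the join follows them, while a definition in which two nodes point at each other is rejected with a `ValueError`.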
Oozie workflows can be parameterized using variables such as ${inputDir} within the workflow definition. When submitting a workflow job, values for the parameters must be provided. If properly parameterized (using different output directories), several identical workflow jobs can run concurrently.
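The substitution step can be sketched in Python as follows. This is a simplified stand-in rather than Oozie's actual expression handling: the function `resolve_parameters`, the parameter name `outputDir`, and the paths are assumptions made for illustration. The sketch replaces each `${name}` placeholder with a supplied value and fails if a value is missing, mirroring the requirement that parameter values be provided at submission.

```python
import re

# Matches placeholders of the form ${name}; the allowed name characters
# are an assumption made for this sketch.
_PARAM = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_parameters(text, values):
    """Replace every ${name} in `text` with values[name].

    Raises KeyError if a placeholder has no supplied value.
    """
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for parameter {name!r}")
        return str(values[name])

    return _PARAM.sub(replace, text)


# Two submissions of the same definition with different output directories
# resolve to distinct paths, so the jobs do not write to the same location.
template = "${outputDir}/result"
first = resolve_parameters(template, {"outputDir": "/jobs/run-1"})
second = resolve_parameters(template, {"outputDir": "/jobs/run-2"})
```

Here `first` and `second` resolve to different paths, which is the property that allows identical workflow jobs to run concurrently.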
Oozie is implemented as a Java web application that runs in a Java servlet container and is distributed under the Apache License 2.0.