Introducing Oozie Editor/Dashboard

The Oozie Editor/Dashboard application allows you to define Oozie workflow and coordinator applications, run workflow and coordinator jobs, and view the status of jobs. For information about Oozie, see Oozie Documentation.

A workflow application is a collection of actions arranged in a directed acyclic graph (DAG). It includes control flow nodes (start, end, fork, join, decision, and kill) and action nodes (MapReduce, Streaming, Java, Pig, Hive, Sqoop, Shell, Ssh, DistCp, Fs, Email, Sub-workflow, and Generic).
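To make the DAG idea concrete, here is a minimal sketch in Python. It is not how Oozie or Hue represents workflows internally: the node names and the `execution_order` helper are hypothetical, chosen only to show control flow nodes and action nodes arranged so that no path loops back on itself.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical workflow: each node maps to the nodes it transitions to.
workflow = {
    "start": ["fork"],
    "fork": ["pig-action", "hive-action"],
    "pig-action": ["join"],
    "hive-action": ["join"],
    "join": ["end"],
    "end": [],
}

def execution_order(graph):
    """Return one valid ordering of the nodes; raise CycleError if the graph has a cycle."""
    # TopologicalSorter expects node -> predecessors, so invert the edges.
    predecessors = {node: set() for node in graph}
    for node, successors in graph.items():
        for successor in successors:
            predecessors[successor].add(node)
    return list(TopologicalSorter(predecessors).static_order())

order = execution_order(workflow)
print(order)
```

A graph in which two nodes point at each other would raise `CycleError`, which is the property "acyclic" rules out.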

A coordinator application allows you to define and execute recurrent and interdependent workflow jobs. The coordinator application defines the conditions under which the execution of workflows can occur.

Oozie Editor/Dashboard Installation and Configuration

Oozie Editor/Dashboard is one of the applications that is installed as part of Hue.

Note
To run DistCp, Streaming, Pig, Sqoop, and Hive jobs as part of a workflow, Oozie must be configured to use the Oozie ShareLib. See Oozie Installation.

Starting Oozie Editor/Dashboard

To start Oozie Editor/Dashboard, click the Oozie Editor/Dashboard icon in the navigation bar at the top of the Hue browser page. Oozie Editor/Dashboard opens with the following screens:

Installing Oozie Editor/Dashboard Samples

The Oozie Editor/Dashboard sample workflows and coordinators can help you learn how to use Oozie Editor/Dashboard. To install the samples:

  1. Click the Workflows tab.
  2. Click the Setup App button. This action adds samples demonstrating all the types of actions to the Workflows Editor and one sample to the Coordinator Editor. It also creates workspaces and deployment directories required by the samples in /user/hue/oozie.

Filtering Lists in Oozie Editor/Dashboard

The Dashboard, Workflows, Coordinators, and History screens contain lists of workflows, coordinators, and jobs. When you type in the Filter field on these screens, the lists are dynamically filtered to display only those rows containing text that matches the specified substring.
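The filtering behavior described above can be sketched as a substring match over the cells of each row. This is an illustration only, not Hue's code: the `filter_rows` helper and the sample rows are hypothetical, and matching without regard to case is an assumption.

```python
def filter_rows(rows, text):
    """Keep the rows in which any cell contains `text` as a substring."""
    needle = text.lower()  # assumption: the match ignores case
    return [
        row for row in rows
        if any(needle in str(cell).lower() for cell in row)
    ]

# Hypothetical list contents: (name, status)
jobs = [
    ("MapReduce sample", "RUNNING"),
    ("Pig sample", "SUCCEEDED"),
    ("Hive sample", "KILLED"),
]

matches = filter_rows(jobs, "pig")
print(matches)
```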

Permissions in Oozie Editor/Dashboard

In the Dashboard, workflows and coordinators can be viewed, submitted, and modified only by their owner or a superuser.

Editor permissions for performing actions on workflows and coordinators are summarized in the following table:

Action    Superuser or Owner    All
View      Y                     Only if "Is shared" is set
Submit    Y                     Only if "Is shared" is set
Modify    Y                     N
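The permission table can also be read as a small decision function. The sketch below only restates the table; the function name `editor_permission` and its parameters are hypothetical and do not correspond to any Hue API.

```python
def editor_permission(action, is_owner_or_superuser, is_shared):
    """Return True if the user may perform `action` on a workflow or coordinator."""
    action = action.lower()
    if action not in {"view", "submit", "modify"}:
        raise ValueError(f"unknown action: {action}")
    if is_owner_or_superuser:
        return True  # owners and superusers may do everything
    if action in {"view", "submit"}:
        return is_shared  # other users need "Is shared" to be set
    return False  # other users may never modify

print(editor_permission("submit", is_owner_or_superuser=False, is_shared=True))
```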

Oozie Dashboard

Oozie Dashboard shows a summary of the running and completed workflow and coordinator jobs.

You can view jobs for a period up to the last 30 days.

You can filter the list by date (1, 7, 15, or 30 days) or status (Succeeded, Running, or Killed). The date and status buttons are toggles.

Workflows

Click the Workflows tab to view the running and completed workflow jobs for the filters you have specified.

Click a workflow row in the Running or Completed table to view detailed information about that workflow job.

For the selected job, the following information is available.

Coordinators

Click the Coordinators tab to view the running and completed coordinator jobs for the filters you have specified.

For the selected job, the following information is available.

Workflow Manager

In Workflow Manager you create Oozie workflows and submit them for execution.

Click the Workflows tab to open the Workflow Manager.

Each row shows a workflow: its name, description, and the timestamp of its last modification. It also shows:

In Workflow Editor you edit workflows that include MapReduce, Streaming, Java, Pig, Hive, Sqoop, Shell, Ssh, DistCp, Fs, Email, Sub-workflow, and Generic actions. You can configure these actions in the Workflow Editor, or you can import job designs from Job Designer to be used as actions in your workflow. For information about defining workflows, see the Workflow Specification.

Installing the Sample Workflows

  1. Click the Setup Examples button at the top right.

Opening a Workflow

To open a workflow, in Workflow Manager, click the workflow. Proceed with Editing a Workflow.

Creating a Workflow

  1. Click the Create button at the top right.
  2. In the Name field, type a name.
  3. Click advanced to specify whether the workflow is shared, the deployment directory, or a job.xml file.
  4. Click Save. The Workflow Editor opens. Proceed with Editing a Workflow.

Importing a Workflow

  1. Click the Import button at the top right.
  2. In the Name field, type a name.
  3. In the Local workflow.xml file field, click Choose File and select a workflow file.
  4. Click advanced to specify whether the workflow is shared, the deployment directory, or a job.xml file.
  5. Click Save. The Workflow Editor opens. Proceed with Editing a Workflow.

Submitting a Workflow

To submit a workflow for execution, do one of the following:

The workflow job is submitted and the Dashboard displays the workflow job.

To view the output of the job, click View the logs.

Suspending a Running Job

  1. In the pane on the left, click the Suspend button.
  2. Verify that you want to suspend the job.

Resuming a Suspended Job

  1. In the pane on the left, click the Resume button.
  2. Verify that you want to resume the job.

Rerunning a Workflow

  1. In the pane on the left, click the Rerun button.
  2. Check the checkboxes next to the actions to rerun.
  3. Specify required variables.
  4. Click Submit.

Scheduling a Workflow

To schedule a workflow for recurring execution, do one of the following:

A coordinator is created and opened in the Coordinator Editor. Proceed with Editing a Coordinator.

Editing a Workflow

In the Workflow Editor you can easily perform operations on Oozie action and control nodes.

Action Nodes

The Workflow Editor supports dragging and dropping action nodes. As you move an action over other actions and forks, highlights indicate the active areas. If the workflow already contains actions, the active areas are the actions themselves and the areas above and below them. If you drop an action on an existing action, a fork and a join are added to the workflow.
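The effect of dropping one action on another can be sketched as a graph rewrite. This is a hedged illustration rather than the editor's implementation: the `drop_on_action` helper, the dictionary representation, and the generated `fork-`/`join-` names are all hypothetical.

```python
def drop_on_action(graph, target, new_action):
    """Return a new graph in which `target` and `new_action` run in parallel."""
    fork, join = f"fork-{target}", f"join-{target}"
    updated = {}
    for node, successors in graph.items():
        # Every edge that pointed at the target now points at the fork.
        updated[node] = [fork if s == target else s for s in successors]
    updated[fork] = [target, new_action]  # the fork starts both actions
    updated[target] = [join]              # both actions end at the join
    updated[new_action] = [join]
    updated[join] = list(graph[target])   # the join inherits the target's old transitions
    return updated

# Hypothetical workflow with a single action between start and end.
before = {"start": ["pig-action"], "pig-action": ["end"], "end": []}
after = drop_on_action(before, "pig-action", "hive-action")
print(after)
```

The original graph is left unchanged, so the two versions can be compared side by side.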

Control Nodes

Uploading Workflow Files

In the Workflow Editor, click the Upload button.

Editing Workflow Properties

  1. In the Workflow Editor, click the link under the Name or Description fields in the left pane.
  2. To share the workflow with all users, check the Is shared checkbox.
  3. To set advanced execution options, click advanced and edit the deployment directory, add parameters and job properties, or specify a job.xml file.
  4. Click Save.

Displaying the History of a Workflow

  1. Do one of the following:

Coordinator Manager

In Coordinator Manager you create Oozie coordinator applications and submit them for execution.

Click the Coordinators tab to open the Coordinator Manager.

Each row shows a coordinator: its name, description, and the timestamp of its last modification. It also shows:

In Coordinator Editor, you edit coordinators and the datasets required by the coordinators. For information about defining coordinators and datasets, see the Coordinator Specification.

Opening a Coordinator

To open a coordinator, in Coordinator Manager, click the coordinator. Proceed with Editing a Coordinator.

Creating a Coordinator

To create a coordinator, in Coordinator Manager:

  1. Click the Create button at the top right. The Coordinator wizard opens. Proceed with Editing a Coordinator.

Submitting a Coordinator

To submit a coordinator for execution, click the radio button next to the coordinator and click the Submit button.

Editing a Coordinator

In the Coordinator Editor you specify coordinator properties and the datasets on which the workflow scheduled by the coordinator will operate by stepping through screens in a wizard. You can also advance to particular steps and revisit steps by clicking the Step "tabs" above the screens. The following instructions walk you through the coordinator wizard.

  1. Type a name, select the workflow, check the Is shared checkbox to share the job, and click Next. If the Coordinator Editor was opened after scheduling a workflow, the workflow will be set.
  2. Select how many times the coordinator will run for each specified unit, the start and end times of the coordinator, and the timezone of the start and end times, and then click Next. The start and end times must be expressed as UTC times. For example, to run at 10 pm PST, specify a start time of 6 am UTC of the following day (+8 hours) and set the Timezone field to America/Los_Angeles.
  3. Click Add to select an input dataset and click Next. If no datasets exist, follow the procedure in Creating a Dataset.
  4. Click Add to select an output dataset. Click Save coordinator or click Next to specify advanced settings.
  5. To share the coordinator with all users, check the Is shared checkbox.
  6. Fill in parameters to pass to Oozie, properties that determine how long a coordinator will wait before timing out, how many coordinators can run and wait concurrently, and the coordinator execution policy.
  7. Click Save coordinator.
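Because the start and end times in the wizard must be entered as UTC, it can help to compute the conversion rather than do it by hand. The following sketch reproduces the 10 pm PST example using Python's standard library; the `to_utc` helper and the date are hypothetical, and the code is independent of Hue and Oozie.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(local_time, timezone_name):
    """Interpret a naive local time in the named zone and convert it to UTC."""
    localized = local_time.replace(tzinfo=ZoneInfo(timezone_name))
    return localized.astimezone(timezone.utc)

# 10 pm on a hypothetical winter date, when Los Angeles observes PST (UTC-8).
start_utc = to_utc(datetime(2013, 1, 15, 22, 0), "America/Los_Angeles")
print(start_utc.strftime("%Y-%m-%d %H:%M"))  # 2013-01-16 06:00
```

While daylight saving time is in effect, the offset for America/Los_Angeles is 7 hours rather than 8, so the same local time maps to 5 am UTC.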

Creating a Dataset

  1. In the Coordinator Editor, do one of the following:

Displaying Datasets

  1. In the Coordinator Editor, click Show existing in the pane at the left.
  2. To edit a dataset, click the dataset name in the Existing datasets table. Proceed with Editing a Dataset.

Editing a Dataset

  1. Type a name for the dataset.
  2. In the Start and Frequency fields, specify when and how often the dataset will be available.
  3. In the URI field, specify a URI template for the location of the dataset. To construct URIs and URI paths containing dates and timestamps, you can specify the variables ${YEAR}, ${MONTH}, ${DAY}, ${HOUR}, and ${MINUTE}. For example: hdfs://foo:9000/usr/app/stats/${YEAR}/${MONTH}/data.
  4. In the Instance field, click a button to choose the default instance, a single instance, or a range of data instances. For example, if frequency==DAY, a rolling window of the last 5 days (not including today) would be expressed as start: -5 and end: -1. Check the advanced checkbox to display a field where you can specify a coordinator EL function.
  5. Specify the timezone of the start date.
  6. In the Done flag field, specify the flag that identifies when input datasets are ready.
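The URI template and instance range in the steps above can be illustrated with a short sketch. It mimics the substitution with Python's `string.Template`, which happens to use the same `${NAME}` syntax. It does not call Oozie, the helper names `resolve_uri` and `instance_days` are hypothetical, and the zero-padded values and the treatment of offsets as whole days (frequency DAY) are assumptions.

```python
from datetime import date, timedelta
from string import Template

def resolve_uri(template, day):
    """Fill the date variables of a dataset URI template for one day."""
    return Template(template).substitute(
        YEAR=f"{day.year:04d}",
        MONTH=f"{day.month:02d}",  # assumption: values are zero-padded
        DAY=f"{day.day:02d}",
        HOUR="00",
        MINUTE="00",
    )

def instance_days(today, start, end):
    """Days covered by an instance range given as day offsets from today."""
    return [today + timedelta(days=offset) for offset in range(start, end + 1)]

today = date(2013, 3, 3)  # hypothetical run date
window = instance_days(today, -5, -1)  # last 5 days, not including today
uris = [
    resolve_uri("hdfs://foo:9000/usr/app/stats/${YEAR}/${MONTH}/data", d)
    for d in window
]
for day, uri in zip(window, uris):
    print(day, uri)
```

Because the example template contains only ${YEAR} and ${MONTH}, days in the same month resolve to the same location; adding ${DAY} to the template would give each day its own path.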

Displaying the History of a Coordinator

  1. Do one of the following: