Job Designer

The Job Designer application enables you to create jobs and submit them to the Hadoop cluster. You can include variables in a job so that you and other users can supply values for them when the job is run. The Job Designer supports the actions supported by Oozie: MapReduce, Streaming, Java, Pig, Hive, Sqoop, Shell, Ssh, DistCp, Fs, and Email.

Job Designer Installation and Configuration

Job Designer is one of the applications installed as part of Hue.

To run DistCp, Streaming, Pig, Sqoop, and Hive jobs, Oozie must be configured to use the Oozie ShareLib. See Oozie Installation in http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.0/bk_installing_manually_book/content/rpm-chap8.html.

Starting Job Designer

Click the Job Designer icon in the navigation bar at the top of the Hue web page. The Job Designs page opens in the browser.

Job Designs

A job design specifies several meta-level properties of a job, including the job design name, description, the executable scripts or classes, and any parameters for those scripts or classes.

Filtering Job Designs

You can filter the job designs that appear in the list by owner, name, type, and description. 

To filter the Job Designs list:

  1. In the Job Designs window, click Designs.
  2. Enter text in the Filter text box at the top of the Job Designs window. When you type in the Filter field, the designs are dynamically filtered to display only those rows containing text that matches the specified substring.
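The filtering behavior described above can be pictured with a short sketch. It is only an illustration, not Hue's implementation: the `filter_designs` function and the dictionary keys are hypothetical, and case-insensitive matching is an assumption.

```python
# Illustrative model of substring filtering over the Job Designs list.
# The field names mirror the columns named above; case-insensitive
# matching is an assumption, not documented behavior.
FILTER_FIELDS = ("owner", "name", "type", "description")


def filter_designs(designs, text):
    """Return the designs in which any filterable field contains `text`."""
    needle = text.lower()
    return [
        design
        for design in designs
        if any(needle in str(design.get(field, "")).lower() for field in FILTER_FIELDS)
    ]


designs = [
    {"owner": "alice", "name": "grep_example", "type": "mapreduce", "description": "Sample grep job"},
    {"owner": "bob", "name": "nightly_import", "type": "sqoop", "description": "Import orders table"},
]

for design in filter_designs(designs, "sqoop"):
    print(design["name"])
```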

Creating a Job Design

  1. In the Job Designs window, click New Action > Action, where Action is MapReduce, Streaming, Java, Pig, Hive, Sqoop, Shell, Ssh, DistCp, Fs, or Email.
  2. In the Job Design (Action type) window, specify the common and job type specific information.
  3. Click Save to save the job settings.

Deleting and Restoring Job Designs

You can move job designs to the trash and later restore or permanently delete them.

Deleting Job Designs

  1. In a Manager screen, check the checkbox next to one or more job designs.
  2. Choose one of the following:
    • Delete > Move to trash
    • Delete > Delete forever

Restoring Job Designs

  1. In a Manager screen, click Trash.
  2. Check the checkbox next to one or more job designs.
  3. Click Restore.

Job Design Settings

Job Design Common Settings

Most job design types support all the settings listed in the following table.  For job type specific settings, see: MapReduce, Streaming, Java, Pig, Hive, Sqoop, Shell, Ssh, DistCp, Fs, and Email.

All job design settings except Name and Description support the use of variables of the form $variable_name. When you run the job, a dialog box will appear to enable you to specify the values of the variables.
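As a rough illustration of how $variable_name placeholders behave, the following sketch finds the variables in a set of settings and substitutes values for them. It uses Python's `string.Template`, which happens to share the `$name` syntax; the setting names, paths, and helper functions are hypothetical, and the code does not reflect how the Job Designer itself is implemented.

```python
import re
from string import Template

# Matches placeholders of the form $variable_name.
VARIABLE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def find_variables(settings):
    """List each distinct variable name used in the settings, in order of appearance."""
    names = []
    for value in settings.values():
        for name in VARIABLE.findall(value):
            if name not in names:
                names.append(name)
    return names


def fill_variables(settings, values):
    """Return a copy of the settings with every $variable_name replaced."""
    return {key: Template(value).substitute(values) for key, value in settings.items()}


# Hypothetical design settings; only `args` contains a variable.
design = {"jar_path": "/user/demo/app.jar", "args": "/user/demo/input $output_dir"}

print(find_variables(design))
print(fill_variables(design, {"output_dir": "/user/demo/out"})["args"])
```

In this model, `find_variables` plays the role of the dialog box that asks for values, and `fill_variables` plays the role of the substitution performed when the job is submitted.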

Name

Identifies the job and its collection of properties and parameters.

Description

A description of the job. The description is displayed in the dialog box that appears if you specify variables for the job.

Advanced

  • Is shared - Indicate whether to share the action with all users.
  • Oozie parameters -

Prepare

Specifies paths to create or delete before starting the workflow job.

Params

Parameters to pass to a script or command. The parameters are expressed using the JSP 2.0 Specification (JSP.2.3) Expression Language, which allows variables, functions, and complex expressions as parameters.

Job Properties

Job properties. To set a property value, click Add Property.  

  • Property name - a configuration property name. This field provides autocompletion, so you can type the first few characters of a property name and then select the one you want from the drop-down list. 
  • Value - the property value.

Files

Files to pass to the job. Equivalent to the Hadoop -files option. 

Archives

Archives to pass to the job. Equivalent to the Hadoop -archives option.

MapReduce Job Design

A MapReduce job design consists of MapReduce functions written in Java. You can create a MapReduce job design from existing mapper and reducer classes without having to write a main Java class. You must specify the mapper and reducer classes as well as other MapReduce properties in the Job Properties setting.

Jar path

The fully-qualified path to a JAR file containing the classes that implement the Mapper and Reducer functions.
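For orientation, the sketch below lists the kind of entries that might be added under Job Properties for a MapReduce design. It assumes the Hadoop 1.x (old `mapred` API) property names and uses the stock `TokenCountMapper` and `LongSumReducer` classes as stand-ins for your own classes; the paths and the `missing_properties` helper are hypothetical.

```python
# Assumed Hadoop 1.x property names; the class names and paths are examples only.
job_properties = {
    "mapred.mapper.class": "org.apache.hadoop.mapred.lib.TokenCountMapper",
    "mapred.reducer.class": "org.apache.hadoop.mapred.lib.LongSumReducer",
    "mapred.output.key.class": "org.apache.hadoop.io.Text",
    "mapred.output.value.class": "org.apache.hadoop.io.LongWritable",
    "mapred.input.dir": "/user/demo/input",
    "mapred.output.dir": "$output_dir",  # a variable, filled in at submit time
}

# The two properties this section says must be specified.
REQUIRED = ("mapred.mapper.class", "mapred.reducer.class")


def missing_properties(properties):
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED if not properties.get(name)]


print(missing_properties(job_properties))
```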

Streaming Job Design

Hadoop streaming jobs enable you to write MapReduce functions in any non-Java language that can read from standard input and write to standard output. For more information about Hadoop streaming jobs, see Hadoop Streaming.

Mapper

The path to the mapper script or class. If the mapper file is not already present on the cluster machines, use the Files option to pass it as part of the job submission. Equivalent to the Hadoop -mapper option.

Reducer

The path to the reducer script or class. If the reducer file is not already present on the cluster machines, use the Files option to pass it as part of the job submission. Equivalent to the Hadoop -reducer option.
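To make the mapper and reducer roles concrete, here is a minimal word-count sketch in Python. It relies on the usual streaming conventions: records are lines, key and value are separated by a tab, and the reducer receives its input sorted by key. The function names and the `run` helper are hypothetical.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each word; input must be sorted by word."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run(role, stdin=sys.stdin, stdout=sys.stdout):
    """Read records from stdin and write results to stdout for the given role."""
    step = map_lines if role == "map" else reduce_lines
    for record in step(stdin):
        stdout.write(record + "\n")


# A script used as a streaming mapper or reducer would end with a call
# such as run("map") or run("reduce").
```

Because both functions only read lines and yield lines, they can be tried locally by sorting the mapper output and feeding it to the reducer before the script is used in a job design.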

Java Job Design

A Java job design consists of a main class written in Java.

Jar path

The fully-qualified path to a JAR file containing the main class.

Main class

The class whose main method is invoked to start the program.

Args

The arguments to pass to the main class.

Java opts

The options to pass to the JVM.
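The relationship between the four Java settings can be sketched as the local command line they loosely correspond to. This is an analogy only: Oozie launches the Java action on the cluster rather than running this command, and the `java_command` helper, JAR path, and class name are hypothetical.

```python
import shlex


def java_command(jar_path, main_class, args="", java_opts=""):
    """Combine the Jar path, Main class, Args, and Java opts settings into an argv list."""
    return [
        "java",
        *shlex.split(java_opts),  # Java opts: options for the JVM
        "-cp",
        jar_path,                 # Jar path: where the main class is found
        main_class,               # Main class: entry point of the program
        *shlex.split(args),       # Args: passed to the main class
    ]


print(java_command("/user/demo/app.jar", "com.example.Main", "in out", "-Xmx512m"))
```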

Pig Job Design

A Pig job design consists of a Pig script.

Script name

Script name or path to the Pig script.

Hive Job Design

A Hive job design consists of a Hive script.

Script name

Script name or path to the Hive script.

Sqoop Job Design

A Sqoop job design consists of a Sqoop command.

Command

The Sqoop command.

Shell Job Design

A Shell job design consists of a shell command.

Command

The shell command.

Capture output

Indicate whether to capture the output of the command.

Ssh Job Design

An Ssh job design consists of an ssh command.

User

The name of the user to run the command as.

Host

The name of the host to run the command on.

Command

The ssh command.

Capture output

Indicate whether to capture the output of the command.

DistCp Job Design

A DistCp job design consists of a DistCp command.

Fs Job Design

An Fs job design consists of a command that operates on HDFS.

Delete path

The path to delete. If the path is a directory, its contents are deleted recursively and then the directory itself is deleted.

Create directory

The path of a directory to create.

Move file

The source and destination paths of the file to be moved.

Change permissions

The path whose permissions are to be changed, the permissions, and an indicator of whether to change permission recursively.

Email Job Design

An Email job design consists of an email message.

To addresses

The recipients of the email message.

CC addresses (optional)

The CC recipients of the email message.

Subject

The subject of the email message.

Body

The body of the email message.
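The four Email settings correspond to the standard parts of a mail message. The sketch below assembles such a message with Python's standard library purely to show how the fields fit together; Oozie composes and sends the real message itself, and the `build_message` helper and the addresses are hypothetical.

```python
from email.message import EmailMessage


def build_message(to_addresses, subject, body, cc_addresses=()):
    """Assemble a message from the To addresses, CC addresses, Subject, and Body settings."""
    message = EmailMessage()
    message["To"] = ", ".join(to_addresses)
    if cc_addresses:  # CC addresses are optional
        message["Cc"] = ", ".join(cc_addresses)
    message["Subject"] = subject
    message.set_content(body)
    return message


message = build_message(
    ["ops@example.com", "data@example.com"],
    "Nightly import finished",
    "The import job completed.",
)
print(message)
```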

Submitting a Job Design

Note: A job's input files must be uploaded to the cluster before you can submit the job.

To submit a job design:

  1. In the Job Designs window, click Designs in the upper left corner. Your jobs and other users' jobs are displayed in the Job Designs window.
  2. Check the checkbox next to the job you want to submit.
  3. Click the Submit button.
    1. If the job contains variables, enter the information requested in the dialog box that appears. For example, the sample grep MapReduce design displays a dialog where you specify the output directory.
    2. Click Submit to submit the job.

After the job is complete, the Job Designer displays the results of the job. For information about displaying job results, see Displaying the Results of Submitting a Job.

Copying, Editing, and Deleting a Job Design

If you want to edit and use a job but you don't own it, you can make a copy of it and then edit and use the copied job.

Copy

  1. In the Job Designs window, click Designs. The jobs are displayed in the Job Designs window.
  2. Check the checkbox next to the job you want to copy.
  3. Click the Copy button.
  4. In the Job Design Editor window, change the settings and then click Save to save the job settings.

Edit

  1. In the Job Designs window, click Designs. The jobs are displayed in the Job Designs window.
  2. Check the checkbox next to the job you want to edit.
  3. Click the Edit button.
  4. In the Job Design window, change the settings and then click Save to save the job settings.

Delete

  1. In the Job Designs window, click Designs. The jobs are displayed in the Job Designs window.
  2. Check the checkbox next to the job you want to delete.
  3. Click the Delete button.
  4. Click OK to confirm the deletion.