scientific-skills/dnanexus-integration/references/job-execution.md
Jobs are the fundamental execution units on DNAnexus. When an applet or app runs, a job is created and executed on a worker node in an isolated Linux environment with constant API access.
Jobs can be created in several ways: initially by users or automated systems, by directly launching an executable (app or applet), or spawned by parent jobs for parallel processing and sub-workflows.
Basic execution:
import dxpy
# Run an applet
job = dxpy.DXApplet("applet-xxxx").run({
"input1": {"$dnanexus_link": "file-yyyy"},
"input2": "parameter_value"
})
print(f"Job ID: {job.get_id()}")
Using command line:
dx run applet-xxxx -i input1=file-yyyy -i input2="value"
# Run an app by name
job = dxpy.DXApp(name="my-app").run({
"reads": {"$dnanexus_link": "file-xxxx"},
"quality_threshold": 30
})
job = dxpy.DXApplet("applet-xxxx").run(
applet_input={
"input_file": {"$dnanexus_link": "file-yyyy"}
},
project="project-zzzz", # Output project
folder="/results", # Output folder
name="My Analysis Job", # Job name
instance_type="mem2_hdd2_x4", # Override instance type
priority="high" # Job priority
)
job = dxpy.DXJob("job-xxxx")
state = job.describe()["state"]
# States: idle, waiting_on_input, runnable, running, done, failed, terminated
print(f"Job state: {state}")
Using command line:
dx watch job-xxxx
# Block until job completes
job.wait_on_done()
# Check if successful
if job.describe()["state"] == "done":
output = job.describe()["output"]
print(f"Job completed: {output}")
else:
print("Job failed")
job = dxpy.DXJob("job-xxxx")
# Wait for completion
job.wait_on_done()
# Get outputs
output = job.describe()["output"]
output_file_id = output["result_file"]["$dnanexus_link"]
# Download result
dxpy.download_dxfile(output_file_id, "result.txt")
Create references to job outputs before they complete:
# Launch first job
job1 = dxpy.DXApplet("applet-1").run({"input": "..."})
# Launch second job using output reference
job2 = dxpy.DXApplet("applet-2").run({
"input": dxpy.dxlink(job1.get_output_ref("output_name"))
})
Command line:
dx watch job-xxxx --get-streams
Programmatically:
import sys
# Get job logs
job = dxpy.DXJob("job-xxxx")
log = dxpy.api.job_get_log(job.get_id())
for log_entry in log["loglines"]:
print(log_entry)
@dxpy.entry_point('main')
def main(input_files):
    """Parent entry point: scatter one subjob per input file, then gather.

    Args:
        input_files: List of input file references to process in parallel.

    Returns:
        dict with key "all_results": a list of job-based object references
        to each subjob's "processed_file" output. The references resolve
        once the subjobs finish, so this entry point returns without
        blocking on them.
    """
    # Scatter: launch one subjob per input file, all running in parallel.
    subjobs = []
    for input_file in input_files:
        subjob = dxpy.new_dxjob(
            fn_input={"file": input_file},
            fn_name="process_file"
        )
        subjobs.append(subjob)

    # Gather: collect output references (not values) from each subjob;
    # the platform substitutes the real outputs when the subjobs complete.
    results = []
    for subjob in subjobs:
        result = subjob.get_output_ref("processed_file")
        results.append(result)
    return {"all_results": results}
@dxpy.entry_point('process_file')
def process_file(file):
    """Subjob entry point: process a single input file.

    Args:
        file: Input file reference passed by the parent job via fn_input.
            (Note: the parameter name shadows the ``file`` builtin, but it
            must match the app's input spec, so it is kept as-is.)

    Returns:
        dict with key "processed_file": the processed output file.
    """
    # Placeholder for real per-file processing. The actual implementation
    # must produce ``output_file`` (e.g. via dxpy.upload_local_file)
    # before returning.
    # ...
    return {"processed_file": output_file}
# Scatter: Process items in parallel
scatter_jobs = []
for item in items:
job = dxpy.new_dxjob(
fn_input={"item": item},
fn_name="process_item"
)
scatter_jobs.append(job)
# Gather: Combine results
gather_job = dxpy.new_dxjob(
fn_input={
"results": [job.get_output_ref("result") for job in scatter_jobs]
},
fn_name="combine_results"
)
Workflows combine multiple apps/applets into multi-step pipelines.
# Create workflow
workflow = dxpy.new_dxworkflow(
name="My Analysis Pipeline",
project="project-xxxx"
)
# Add stages
stage1 = workflow.add_stage(
dxpy.DXApplet("applet-1"),
name="Quality Control",
folder="/qc"
)
stage2 = workflow.add_stage(
dxpy.DXApplet("applet-2"),
name="Alignment",
folder="/alignment"
)
# Connect stages
stage2.set_input("reads", stage1.get_output_ref("filtered_reads"))
# Close workflow
workflow.close()
# Run workflow
analysis = workflow.run({
"stage-xxxx.input1": {"$dnanexus_link": "file-yyyy"}
})
# Monitor analysis (collection of jobs)
analysis.wait_on_done()
# Get workflow outputs
outputs = analysis.describe()["output"]
Using command line:
dx run workflow-xxxx -i stage-1.input=file-yyyy
Jobs run in a workspace project with cloned input data:
The job is granted CONTRIBUTE permission to its workspace and VIEW access to the source projects. Jobs cannot start until their input data objects reach the closed state. Output objects must reach the closed state before workspace cleanup.
Created → Waiting on Input → Runnable → Running → Done/Failed
States:
idle: Job created but not yet queued
waiting_on_input: Waiting for input data objects to close
runnable: Ready to run, waiting for resources
running: Currently executing
done: Completed successfully
failed: Execution failed
terminated: Manually stopped
job = dxpy.DXJob("job-xxxx")
job.wait_on_done()
desc = job.describe()
if desc["state"] == "failed":
print(f"Job failed: {desc.get('failureReason', 'Unknown')}")
print(f"Failure message: {desc.get('failureMessage', '')}")
# Rerun failed job
new_job = dxpy.DXApplet(desc["applet"]).run(
desc["originalInput"],
project=desc["project"]
)
# Stop a running job
job = dxpy.DXJob("job-xxxx")
job.terminate()
Using command line:
dx terminate job-xxxx
Specify computational resources:
# Run with specific instance type
job = dxpy.DXApplet("applet-xxxx").run(
{"input": "..."},
instance_type="mem3_ssd1_v2_x8" # 8 cores, high memory, SSD
)
Common instance types:
mem1_ssd1_v2_x4 - 4 cores, standard memory
mem2_ssd1_v2_x8 - 8 cores, high memory
mem3_ssd1_v2_x16 - 16 cores, very high memory
mem1_ssd1_v2_x36 - 36 cores for parallel workloads
Set maximum execution time:
job = dxpy.DXApplet("applet-xxxx").run(
{"input": "..."},
timeout="24h" # Maximum runtime
)
job = dxpy.DXApplet("applet-xxxx").run(
{"input": "..."},
tags=["experiment1", "batch2", "production"]
)
job = dxpy.DXApplet("applet-xxxx").run(
{"input": "..."},
properties={
"experiment": "exp001",
"sample": "sample1",
"batch": "batch2"
}
)
# Find jobs by tag
jobs = dxpy.find_jobs(
project="project-xxxx",
tags=["experiment1"],
describe=True
)
for job in jobs:
print(f"{job['describe']['name']}: {job['id']}")