scientific-skills/dnanexus-integration/references/app-development.md
Apps and applets are executable programs that run on the DNAnexus platform. They can be written in Python or Bash and are deployed with all necessary dependencies and configuration.
Both are created identically until the final build step. Applets can be converted to apps later.
Generate a skeleton app directory structure:
```bash
dx-app-wizard
```

This creates:

- `dxapp.json` - Configuration file
- `src/` - Source code directory
- `resources/` - Bundled dependencies
- `test/` - Test files

Build an applet:

```bash
dx build
```

Build an app:

```bash
dx build --app
```
The build process packages the following directory structure:

```
my-app/
├── dxapp.json          # Metadata and configuration
├── src/
│   ├── my-app.py       # Main executable (Python)
│   └── my-app.sh       # Or Bash script
├── resources/          # Bundled files and dependencies
│   ├── tools/
│   └── data/
└── test/               # Test data and scripts
    └── test.json
```
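To make the configuration concrete, here is a minimal sketch of a `dxapp.json` for such an applet. The field names are the standard ones; the specific values (`my-app`, `reads_file`, `trimmed_reads`) are illustrative placeholders:

```json
{
  "name": "my-app",
  "title": "My App",
  "summary": "Example applet",
  "dxapi": "1.0.0",
  "version": "0.0.1",
  "inputSpec": [
    {"name": "reads_file", "class": "file", "help": "FASTQ input"}
  ],
  "outputSpec": [
    {"name": "trimmed_reads", "class": "file"}
  ],
  "runSpec": {
    "interpreter": "python3",
    "file": "src/my-app.py",
    "distribution": "Ubuntu",
    "release": "24.04"
  }
}
```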
Python apps use the `@dxpy.entry_point()` decorator to define functions:

```python
import dxpy

@dxpy.entry_point('main')
def main(input1, input2):
    # Process inputs to compute result1, result2...

    # Return outputs
    return {
        "output1": result1,
        "output2": result2
    }

dxpy.run()
```
Inputs: DNAnexus data objects are represented as dicts containing links:

```python
@dxpy.entry_point('main')
def main(reads_file):
    # Convert link to handler
    reads_dxfile = dxpy.DXFile(reads_file)

    # Download to local filesystem
    dxpy.download_dxfile(reads_dxfile.get_id(), "reads.fastq")

    # Process file...
```
Outputs: Return primitive types directly; convert file outputs to links:

```python
    # Upload result file
    output_file = dxpy.upload_local_file("output.fastq")
    return {
        "trimmed_reads": dxpy.dxlink(output_file)
    }
```
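For reference, the link dict that `dxpy.dxlink` produces for a plain object ID is the standard `$dnanexus_link` structure. This pure-Python sketch mirrors that format without requiring the platform; `make_dxlink` is an illustrative helper, not part of dxpy:

```python
def make_dxlink(object_id, project_id=None):
    """Build a DNAnexus link dict in the standard $dnanexus_link
    format (illustrative stand-in for dxpy.dxlink on plain IDs)."""
    if project_id is None:
        # Bare link: just the object ID
        return {"$dnanexus_link": object_id}
    # Project-qualified link form
    return {"$dnanexus_link": {"project": project_id, "id": object_id}}

# A job output dict like the one above would then look like:
output = {"trimmed_reads": make_dxlink("file-xxxx")}
```

This is why returned file outputs are JSON-serializable: a link is just a small dict wrapping the object ID.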
Bash apps use a simpler shell script approach:

```bash
#!/bin/bash
set -e -x -o pipefail

main() {
    # Download inputs
    dx download "$reads_file" -o reads.fastq

    # Process
    process_reads reads.fastq > output.fastq

    # Upload outputs
    trimmed_reads=$(dx upload output.fastq --brief)

    # Set job output
    dx-jobutil-add-output trimmed_reads "$trimmed_reads" --class=file
}
```
Download → Process → Upload pattern (a fragment from inside an entry point; `subprocess` must be imported at module level):

```python
# Download input
dxpy.download_dxfile(input_file_id, "input.fastq")

# Run analysis
subprocess.check_call(["tool", "input.fastq", "output.bam"])

# Upload result
output = dxpy.upload_local_file("output.bam")
return {"aligned_reads": dxpy.dxlink(output)}
```
```python
# Process multiple inputs
for file_link in input_files:
    file_handler = dxpy.DXFile(file_link)
    local_path = file_handler.name
    dxpy.download_dxfile(file_handler.get_id(), local_path)
    # Process each file...
```
Apps can spawn subjobs for parallel execution:
```python
# Create subjobs
subjobs = []
for item in input_list:
    subjob = dxpy.new_dxjob(
        fn_input={"input": item},
        fn_name="process_item"
    )
    subjobs.append(subjob)

# Collect results as output references resolved when subjobs finish
results = [job.get_output_ref("result") for job in subjobs]
```
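The scatter/gather shape above can be simulated locally with the standard library. This sketch stands in for `new_dxjob` and output references using a thread pool and futures; `process_item` and `input_list` mirror the names in the example, and the doubling logic is purely illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def process_item(item):
    # Stand-in for the subjob's entry point
    return item * 2

input_list = [1, 2, 3]

with ThreadPoolExecutor() as pool:
    # "Scatter": launch one unit of work per input, keeping a
    # future per item (analogous to a list of subjob references)
    futures = [pool.submit(process_item, item) for item in input_list]

    # "Gather": block until every result is ready
    results = [f.result() for f in futures]

print(results)  # [2, 4, 6]
```

On the platform, the gather step is typically done by a downstream entry point that declares the output references as its inputs, so the system handles the waiting.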
Apps run in isolated Linux VMs (Ubuntu 24.04) with the working directory set to `/home/dnanexus`.

Test app logic locally before deploying:
```bash
cd my-app
python src/my-app.py
```
Run the applet on the platform:

```bash
dx run applet-xxxx -i input1=file-yyyy
```

Monitor job execution:

```bash
dx watch job-zzzz
```

View job logs:

```bash
dx watch job-zzzz --get-streams
```
Ensure files are fully downloaded before accessing them:

```python
dxpy.download_dxfile(file_id, local_path)
# Now safe to open local_path
```
- Insufficient memory or CPU: specify a larger instance type in the `dxapp.json` `systemRequirements` section
- Job timeouts: increase the timeout in `dxapp.json`, or split the work into smaller jobs
- Permission errors: ensure the app requests the necessary permissions in `dxapp.json`
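The fixes above all live in `dxapp.json`. A hedged sketch of the relevant sections — the instance type, hours, and access values shown are examples to adapt, not recommendations:

```json
{
  "runSpec": {
    "systemRequirements": {
      "*": {"instanceType": "mem2_ssd1_v2_x8"}
    },
    "timeoutPolicy": {
      "*": {"hours": 12}
    }
  },
  "access": {
    "network": ["*"],
    "project": "CONTRIBUTE"
  }
}
```

The `"*"` key applies the setting to all entry points; per-entry-point keys (e.g. `"main"`) can override it.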