scientific-skills/dnanexus-integration/references/app-development.md
Apps and applets are executable programs that run on the DNAnexus platform. They can be written in Python or Bash and are deployed with all necessary dependencies and configuration.
Both are created identically until the final build step. Applets can be converted to apps later.
Generate a skeleton app directory structure:
```bash
dx-app-wizard
```

This creates:

- `dxapp.json` - Configuration file
- `src/` - Source code directory
- `resources/` - Bundled dependencies
- `test/` - Test files

Build an applet:

```bash
dx build
```

Build an app:

```bash
dx build --app
```
The build process packages the following directory structure:

```
my-app/
├── dxapp.json          # Metadata and configuration
├── src/
│   ├── my-app.py       # Main executable (Python)
│   └── my-app.sh       # Or Bash script
├── resources/          # Bundled files and dependencies
│   ├── tools/
│   └── data/
└── test/               # Test data and scripts
    └── test.json
```
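To make the configuration concrete, here is a minimal sketch of a `dxapp.json` for such an applet. The field names are the standard ones; the specific values (`my-app`, `reads_file`, `trimmed_reads`) are illustrative placeholders:

```json
{
  "name": "my-app",
  "title": "My App",
  "summary": "Example applet",
  "dxapi": "1.0.0",
  "version": "0.0.1",
  "inputSpec": [
    {"name": "reads_file", "class": "file", "help": "FASTQ input"}
  ],
  "outputSpec": [
    {"name": "trimmed_reads", "class": "file"}
  ],
  "runSpec": {
    "interpreter": "python3",
    "file": "src/my-app.py",
    "distribution": "Ubuntu",
    "release": "24.04"
  }
}
```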
Python apps use the `@dxpy.entry_point()` decorator to define functions:

```python
import dxpy

@dxpy.entry_point('main')
def main(input1, input2):
    # Process inputs to compute result1, result2...

    # Return outputs
    return {
        "output1": result1,
        "output2": result2
    }

dxpy.run()
```
Inputs: DNAnexus data objects are represented as dicts containing links:

```python
@dxpy.entry_point('main')
def main(reads_file):
    # Convert link to handler
    reads_dxfile = dxpy.DXFile(reads_file)

    # Download to local filesystem
    dxpy.download_dxfile(reads_dxfile.get_id(), "reads.fastq")

    # Process file...
```
Outputs: Return primitive types directly; convert file outputs to links:

```python
    # Upload result file
    output_file = dxpy.upload_local_file("output.fastq")
    return {
        "trimmed_reads": dxpy.dxlink(output_file)
    }
```
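For reference, the link dict that `dxpy.dxlink` produces for a plain object ID is the standard `$dnanexus_link` structure. This pure-Python sketch mirrors that format without requiring the platform; `make_dxlink` is an illustrative helper, not part of dxpy:

```python
def make_dxlink(object_id, project_id=None):
    """Build a DNAnexus link dict in the standard $dnanexus_link
    format (illustrative stand-in for dxpy.dxlink on plain IDs)."""
    if project_id is None:
        # Bare link: just the object ID
        return {"$dnanexus_link": object_id}
    # Project-qualified link form
    return {"$dnanexus_link": {"project": project_id, "id": object_id}}

# A job output dict like the one above would then look like:
output = {"trimmed_reads": make_dxlink("file-xxxx")}
```

This is why returned file outputs are JSON-serializable: a link is just a small dict wrapping the object ID.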
Bash apps use a simpler shell script approach:

```bash
#!/bin/bash
set -e -x -o pipefail

main() {
    # Download inputs
    dx download "$reads_file" -o reads.fastq

    # Process
    process_reads reads.fastq > output.fastq

    # Upload outputs
    trimmed_reads=$(dx upload output.fastq --brief)

    # Set job output
    dx-jobutil-add-output trimmed_reads "$trimmed_reads" --class=file
}
```
Download → Process → Upload pattern (a fragment from inside an entry point; `subprocess` must be imported at module level):

```python
# Download input
dxpy.download_dxfile(input_file_id, "input.fastq")

# Run analysis
subprocess.check_call(["tool", "input.fastq", "output.bam"])

# Upload result
output = dxpy.upload_local_file("output.bam")
return {"aligned_reads": dxpy.dxlink(output)}
```
```python
# Process multiple inputs
for file_link in input_files:
    file_handler = dxpy.DXFile(file_link)
    local_path = file_handler.name
    dxpy.download_dxfile(file_handler.get_id(), local_path)
    # Process each file...
```
Apps can spawn subjobs for parallel execution:
```python
# Create subjobs
subjobs = []
for item in input_list:
    subjob = dxpy.new_dxjob(
        fn_input={"input": item},
        fn_name="process_item"
    )
    subjobs.append(subjob)

# Collect results as output references resolved when subjobs finish
results = [job.get_output_ref("result") for job in subjobs]
```
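The scatter/gather shape above can be simulated locally with the standard library. This sketch stands in for `new_dxjob` and output references using a thread pool and futures; `process_item` and `input_list` mirror the names in the example, and the doubling logic is purely illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def process_item(item):
    # Stand-in for the subjob's entry point
    return item * 2

input_list = [1, 2, 3]

with ThreadPoolExecutor() as pool:
    # "Scatter": launch one unit of work per input, keeping a
    # future per item (analogous to a list of subjob references)
    futures = [pool.submit(process_item, item) for item in input_list]

    # "Gather": block until every result is ready
    results = [f.result() for f in futures]

print(results)  # [2, 4, 6]
```

On the platform, the gather step is typically done by a downstream entry point that declares the output references as its inputs, so the system handles the waiting.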
Apps run in isolated Linux VMs (Ubuntu 24.04) with the working directory set to `/home/dnanexus`.

Test app logic locally before deploying:
```bash
cd my-app
python src/my-app.py
```
Run the applet on the platform:

```bash
dx run applet-xxxx -i input1=file-yyyy
```

Monitor job execution:

```bash
dx watch job-zzzz
```

View job logs:

```bash
dx watch job-zzzz --get-streams
```
Ensure files are fully downloaded before accessing them:

```python
dxpy.download_dxfile(file_id, local_path)
# Now safe to open local_path
```
- Insufficient memory or CPU: specify a larger instance type in the `dxapp.json` `systemRequirements` section
- Job timeouts: increase the timeout in `dxapp.json`, or split the work into smaller jobs
- Permission errors: ensure the app requests the necessary permissions in `dxapp.json`
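The fixes above all live in `dxapp.json`. A hedged sketch of the relevant sections — the instance type, hours, and access values shown are examples to adapt, not recommendations:

```json
{
  "runSpec": {
    "systemRequirements": {
      "*": {"instanceType": "mem2_ssd1_v2_x8"}
    },
    "timeoutPolicy": {
      "*": {"hours": 12}
    }
  },
  "access": {
    "network": ["*"],
    "project": "CONTRIBUTE"
  }
}
```

The `"*"` key applies the setting to all entry points; per-entry-point keys (e.g. `"main"`) can override it.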