skills/pacsomatic/references/pacsomatic_guide.md
This guide summarizes the official nf-core/pacsomatic usage and how this skill helps an agent validate, prepare, and launch runs across compute platforms.
nf-core/pacsomatic is a Nextflow pipeline for matched tumor/normal PacBio HiFi somatic analysis.
Typical upstream command from official docs:
nextflow run nf-core/pacsomatic \
-profile <docker/singularity/.../institute> \
--input samplesheet.csv \
--outdir <OUTDIR> \
--genome GRCh38
Important notes from docs:
- The samplesheet columns are patient,sample,status,bam[,pbi].
- status uses 1 for tumor and 0 for normal.
- Pass pipeline parameters with -params-file, not -c.

This skill can be reused by other agents in the same workspace.
- Keep .github/skills/pacsomatic intact when reusing.
- Use scripts/run_pacsomatic.py to generate a backend-aware launch script, or copy SKILL.md, references/, and scripts/ together.
- Adjust site-specific settings (profiles such as singularity,sanger, queue/project names, and network access for remote BAM URLs).

Samplesheet format:
patient,sample,status,bam,pbi
ID1,ID1_tumor,1,/path/ID1_tumor.bam,/path/ID1_tumor.bam.pbi
ID1,ID1_normal,0,/path/ID1_normal.bam,/path/ID1_normal.bam.pbi
pbi is optional. If not available, leave it blank.
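The samplesheet rules above can be checked before launching. The following is a minimal sketch, not part of the helper or the pipeline: the function name `validate_samplesheet` is hypothetical, and the rule that every patient needs one tumor and one normal row is an assumption drawn from the pipeline being a matched tumor/normal workflow.

```python
import csv
import io

# Columns that must be present; pbi is optional and may be blank or absent.
REQUIRED_COLUMNS = ["patient", "sample", "status", "bam"]


def validate_samplesheet(text):
    """Return a list of problems found in samplesheet CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    statuses = {}  # patient -> set of status values seen for that patient
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        status = (row["status"] or "").strip()
        if status not in {"0", "1"}:
            problems.append(f"line {line_no}: status must be 0 or 1, got {status!r}")
            continue
        if not (row["bam"] or "").strip():
            problems.append(f"line {line_no}: bam is empty")
        statuses.setdefault(row["patient"], set()).add(status)

    # Assumption: matched analysis needs both a tumor (1) and a normal (0) row.
    for patient, seen in statuses.items():
        if seen != {"0", "1"}:
            problems.append(f"patient {patient}: needs both a tumor (1) and a normal (0) row")
    return problems
```

Calling `validate_samplesheet(open("samplesheet.csv").read())` returns an empty list when the sheet passes these checks; it does not verify that the BAM paths or URLs are reachable.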
By default, the helper generates artifacts. With --run, it executes/submits
using the selected --executor backend.
It produces:
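- a launch script for the selected backend (bash, bsub, sbatch, qsub)
- a running or submitted job, only when --run is enabled

Generate artifacts with a reference FASTA and explicit resources:
python .github/skills/pacsomatic/scripts/run_pacsomatic.py \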
--tumor-bam /data/P1_tumor.bam \
--normal-bam /data/P1_normal.bam \
--patient-id P1 \
--tumor-sample-id P1_tumor \
--normal-sample-id P1_normal \
--fasta /refs/GRCh38.fa \
--outdir /results/p1 \
--profile singularity \
--executor local \
--queue normal \
--cpus 16 \
--memory-gb 64 \
--walltime 48:00
Dry run with the local executor:
python .github/skills/pacsomatic/scripts/run_pacsomatic.py \
--tumor-bam /data/P1_tumor.bam \
--normal-bam /data/P1_normal.bam \
--patient-id P1 \
--tumor-sample-id P1_tumor \
--normal-sample-id P1_normal \
--genome GRCh38 \
--outdir /results/p1 \
--profile singularity \
--executor local \
--dry-run
Submit to LSF:
python .github/skills/pacsomatic/scripts/run_pacsomatic.py \
--tumor-bam /data/P1_tumor.bam \
--normal-bam /data/P1_normal.bam \
--patient-id P1 \
--tumor-sample-id P1_tumor \
--normal-sample-id P1_normal \
--genome GRCh38 \
--outdir /results/p1 \
--profile singularity \
--executor lsf \
--run
Submit to Slurm with an explicit queue and resources:
python .github/skills/pacsomatic/scripts/run_pacsomatic.py \
--tumor-bam /data/P1_tumor.bam \
--normal-bam /data/P1_normal.bam \
--patient-id P1 \
--tumor-sample-id P1_tumor \
--normal-sample-id P1_normal \
--genome GRCh38 \
--outdir /results/p1 \
--profile singularity \
--executor slurm \
--queue compute \
--cpus 16 \
--memory-gb 64 \
--run
The helper is a standalone CLI and can be run directly from shell scripts, terminal sessions, CI jobs, or workflow launch wrappers.
Supported HPC schedulers via --executor:
- lsf (uses bsub)
- slurm (uses sbatch)
- pbs (uses qsub)
- sge (uses qsub)

Local direct execution is also supported with:
- --executor local (runs the generated script with bash)

If users request a ready-to-submit LSF script directly, provide a .lsf.sh file that can be submitted as-is.
Example file: submit_pacsomatic_hg008.lsf.sh
#!/usr/bin/env bash
#BSUB -J Somatic_singularity
#BSUB -P Somatic_singularity
#BSUB -q heavy_io
#BSUB -n 16
#BSUB -M 64000
#BSUB -W 48:00
#BSUB -o out%J.out
#BSUB -e err%J.err
set -euo pipefail
RUN_DIR="${RUN_DIR:-$PWD/pacsomatic_hg008_run}"
OUTDIR="${OUTDIR:-$RUN_DIR/results}"
WORKDIR="${WORKDIR:-$RUN_DIR/work}"
SAMPLESHEET="$RUN_DIR/samplesheet.csv"
mkdir -p "$RUN_DIR" "$OUTDIR" "$WORKDIR"
cat > "$SAMPLESHEET" << 'CSV'
patient,sample,status,bam,pbi
Patient_HG008,DS_MT_T,1,https://raw.githubusercontent.com/nf-core/test-datasets/pacsomatic/testdata/HG008_Downsample_MT_tumor.bam,
Patient_HG008,DS_MT_N,0,https://raw.githubusercontent.com/nf-core/test-datasets/pacsomatic/testdata/HG008_Downsample_MT_normal.bam,
CSV
module load nextflow/21.10.5
export NXF_WORK="$WORKDIR"
nextflow run nf-core/pacsomatic \
-profile singularity,sanger \
--input "$SAMPLESHEET" \
--outdir "$OUTDIR" \
--genome GRCh38 \
-with-report "$OUTDIR/HiFi_Somatic_Nextflow_Run_Report.html" \
-with-dag "$OUTDIR/HiFi_Somatic_Flowchart.png" \
-resume
Submit with:
bsub < submit_pacsomatic_hg008.lsf.sh
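The submission step differs slightly between backends: as shown above, bsub reads the script from standard input so that the #BSUB directives are honored, while sbatch, qsub, and bash take the script path as an argument. The sketch below illustrates that dispatch. It is an assumption about how such a step could look, not the helper's actual code, and `submit_plan` and `submit` are hypothetical names.

```python
import subprocess

# Program used for each --executor value, following the list in this guide.
SUBMIT_PROGRAMS = {
    "local": "bash",
    "lsf": "bsub",
    "slurm": "sbatch",
    "pbs": "qsub",
    "sge": "qsub",
}


def submit_plan(executor, script_path):
    """Return (argv, stdin_path) describing how to run or submit the script."""
    if executor not in SUBMIT_PROGRAMS:
        raise ValueError(f"unsupported executor: {executor}")
    program = SUBMIT_PROGRAMS[executor]
    if executor == "lsf":
        # Equivalent to `bsub < script`: the script is fed on stdin.
        return [program], script_path
    return [program, script_path], None


def submit(executor, script_path):
    """Run the plan; raises CalledProcessError if the command fails."""
    argv, stdin_path = submit_plan(executor, script_path)
    if stdin_path is None:
        return subprocess.run(argv, check=True)
    with open(stdin_path, "rb") as handle:
        return subprocess.run(argv, stdin=handle, check=True)
```

Keeping the plan separate from the execution makes a dry run trivial: print the result of `submit_plan` instead of calling `submit`.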
The script supports your style of submission, including -P, queue switching,
module load nextflow/21.10.5, -resume, and report/DAG outputs.
Built-in defaults now match your common combo:
- project (-P): Somatic_singularity
- queue: heavy_io
- module command: module load nextflow/21.10.5

Default LSF output naming now follows your style:
- out%J.out
- err%J.err

You can override these with --stdout-file and --stderr-file, and optionally set --logdir to place them under a specific directory.
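To make the relationship between these options and the script header concrete, here is a rough sketch of how the #BSUB lines could be rendered. It is illustrative only: `lsf_header` is a hypothetical function, the conversion of gigabytes to the -M value (64 becomes 64000) simply mirrors the example script, and the unit LSF applies to -M depends on the site's LSF_UNIT_FOR_LIMITS setting.

```python
import posixpath


def lsf_header(job_name, project, queue, cpus, memory_gb, walltime,
               stdout_file="out%J.out", stderr_file="err%J.err", logdir=None):
    """Render the shebang and #BSUB directive lines for an LSF script."""
    if logdir:
        # Place both log files under the requested directory.
        stdout_file = posixpath.join(logdir, stdout_file)
        stderr_file = posixpath.join(logdir, stderr_file)
    lines = [
        "#!/usr/bin/env bash",
        f"#BSUB -J {job_name}",
        f"#BSUB -P {project}",
        f"#BSUB -q {queue}",
        f"#BSUB -n {cpus}",
        # Assumption: -M is expressed as memory_gb * 1000, as in the example.
        f"#BSUB -M {memory_gb * 1000}",
        f"#BSUB -W {walltime}",
        f"#BSUB -o {stdout_file}",
        f"#BSUB -e {stderr_file}",
    ]
    return "\n".join(lines)
```

With the defaults described above, `lsf_header("Somatic_singularity", "Somatic_singularity", "heavy_io", 16, 64, "48:00")` reproduces the header of the example submit_pacsomatic_hg008.lsf.sh file.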
Example with an explicit project, queue, job name, and report/DAG outputs:
python .github/skills/pacsomatic/scripts/run_pacsomatic.py \
--tumor-bam /data/P1_tumor.bam \
--normal-bam /data/P1_normal.bam \
--patient-id P1 \
--tumor-sample-id P1_tumor \
--normal-sample-id P1_normal \
--genome GRCh38 \
--outdir /results/p1 \
--project Somatic_test \
--queue heavy_io \
--memory-gb 20 \
--job-name Somatic_test \
--module-load "module load nextflow/21.10.5" \
--with-report HiFi_Somatic_Nextflow_Run_Report.html \
--with-dag HiFi_Somatic_Flowchart.png
To use the Sanger institutional config, combine profiles, for example:
--profile singularity,sanger
Reference: https://nf-co.re/configs/sanger/
- Pin --pipeline-version for reproducibility.
- Set --pipeline-version when using fixed test datasets to avoid schema drift across pipeline revisions.
- Use --params-file for large parameter sets and keep script options minimal.
- Use a container profile (singularity or docker) on HPC.
- Set an NXF_OPTS memory ceiling if Nextflow launcher memory spikes.
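The first three recommendations can be combined in one place. The sketch below serializes the pipeline parameters to JSON (Nextflow's -params-file accepts JSON or YAML) and builds a command pinned to a revision with Nextflow's -r option. The function names are hypothetical, the parameter keys are the ones from the upstream command above (input, outdir, genome), and how the helper's --pipeline-version and --params-file options are implemented internally is not shown in this guide.

```python
import json


def params_json(params):
    """Serialize pipeline parameters for use with `-params-file`."""
    return json.dumps(params, indent=2, sort_keys=True) + "\n"


def nextflow_command(params_file, profile, revision=None, resume=True):
    """Build the nextflow argv; `revision` maps to Nextflow's `-r` option."""
    argv = ["nextflow", "run", "nf-core/pacsomatic"]
    if revision:
        argv += ["-r", revision]  # pin the pipeline release for reproducibility
    argv += ["-profile", profile, "-params-file", params_file]
    if resume:
        argv.append("-resume")
    return argv


if __name__ == "__main__":
    text = params_json({
        "input": "samplesheet.csv",
        "outdir": "results",
        "genome": "GRCh38",
    })
    with open("params.json", "w", encoding="utf-8") as handle:
        handle.write(text)
    # "REVISION" is a placeholder; replace it with a real release tag.
    print(" ".join(nextflow_command("params.json", "singularity", "REVISION")))
```

Keeping parameters in a file rather than on the command line also leaves a record of exactly what was run alongside the results.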