scientific-skills/latchbio-integration/references/resource-configuration.md
Latch SDK provides flexible resource configuration for workflow tasks, enabling efficient execution on appropriate compute infrastructure including CPU, GPU, and memory-optimized instances.
The SDK provides pre-configured task decorators for common resource requirements:
Default configuration for lightweight tasks:

```python
from latch import small_task

@small_task
def lightweight_processing():
    """Minimal resource requirements"""
    pass
```

Use cases: input validation, report generation, and other lightweight glue steps.
Increased CPU and memory for intensive computations:

```python
from latch import large_task

@large_task
def intensive_computation():
    """Higher CPU and memory allocation"""
    pass
```

Use cases: alignment, assembly, and other CPU- or memory-intensive analysis stages.
GPU-enabled with minimal CPU and memory:

```python
from latch import small_gpu_task

@small_gpu_task
def gpu_inference():
    """GPU-enabled task with basic resources"""
    pass
```

Use cases: model inference and other GPU workloads with modest CPU and memory needs.
GPU-enabled with maximum resources:

```python
from latch import large_gpu_task

@large_gpu_task
def gpu_training():
    """GPU with maximum CPU and memory"""
    pass
```

Use cases: model training and GPU-accelerated tools with high memory demands, such as variant calling or structure prediction.
For precise control, use the `@custom_task` decorator:

```python
from latch import custom_task

@custom_task(
    cpu=8,
    memory=32,        # GB
    storage_gib=100,  # GB of ephemeral storage
    timeout=3600,     # seconds
)
def custom_processing():
    """Task with custom resource specifications"""
    pass
```
```python
from latch import custom_task

@custom_task(
    cpu=16,
    memory=64,
    storage_gib=500,
    timeout=7200,
    gpu=1,
    gpu_type="nvidia-tesla-a100",
)
def alphafold_prediction():
    """AlphaFold with A100 GPU and high memory"""
    pass
```
Available GPU options include `nvidia-tesla-v100` and `nvidia-tesla-a100`. Multiple GPUs can be requested per task:

```python
from latch import custom_task

@custom_task(
    cpu=32,
    memory=128,
    gpu=4,
    gpu_type="nvidia-tesla-v100",
)
def multi_gpu_training():
    """Distributed training across multiple GPUs"""
    pass
```
Memory-Intensive Tasks:

```python
@custom_task(cpu=4, memory=128)  # High memory, moderate CPU
def genome_assembly():
    pass
```

CPU-Intensive Tasks:

```python
@custom_task(cpu=64, memory=32)  # High CPU, moderate memory
def parallel_alignment():
    pass
```

I/O-Intensive Tasks:

```python
@custom_task(cpu=8, memory=16, storage_gib=1000)  # Large ephemeral storage
def data_preprocessing():
    pass
```
Quick Validation:

```python
@small_task
def validate_inputs():
    """Fast input validation"""
    pass
```

Main Computation:

```python
@large_task
def primary_analysis():
    """Resource-intensive analysis"""
    pass
```

Result Aggregation:

```python
@small_task
def aggregate_results():
    """Lightweight result compilation"""
    pass
```
```python
from latch import workflow, small_task, large_task, large_gpu_task
from latch.types import LatchFile

@small_task
def quality_control(fastq: LatchFile) -> LatchFile:
    """QC doesn't need many resources"""
    qc_output = ...  # run QC tooling
    return qc_output

@large_task
def alignment(fastq: LatchFile) -> LatchFile:
    """Alignment benefits from more CPU"""
    bam_output = ...  # run aligner
    return bam_output

@large_gpu_task
def variant_calling(bam: LatchFile) -> LatchFile:
    """GPU-accelerated variant caller"""
    vcf_output = ...  # run variant caller
    return vcf_output

@small_task
def generate_report(vcf: LatchFile) -> LatchFile:
    """Simple report generation"""
    report = ...  # compile report
    return report

@workflow
def genomics_pipeline(input_fastq: LatchFile) -> LatchFile:
    """Resource-optimized genomics pipeline"""
    qc = quality_control(fastq=input_fastq)
    aligned = alignment(fastq=qc)
    variants = variant_calling(bam=aligned)
    return generate_report(vcf=variants)
```
```python
from latch import custom_task

@custom_task(
    cpu=8,
    memory=32,
    timeout=10800,  # 3 hours in seconds
)
def long_running_analysis():
    """Analysis with extended timeout"""
    pass
```
Configure temporary storage for intermediate files:

```python
@custom_task(
    cpu=8,
    memory=32,
    storage_gib=500,  # 500 GB temporary storage
)
def process_large_dataset():
    """Task with large intermediate files"""
    # Ephemeral storage is available at /tmp
    temp_file = "/tmp/intermediate_data.bam"
    ...
```
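The ephemeral volume behaves like ordinary local disk, so standard Python file APIs apply. A minimal sketch (plain Python, independent of the Latch SDK; `stage_intermediate` is a hypothetical helper) of staging an intermediate file under the temp directory:

```python
import os
import tempfile

def stage_intermediate(data: bytes) -> str:
    """Write intermediate data to ephemeral storage and return its path."""
    # delete=False keeps the file for downstream steps in the same task;
    # dir=tempfile.gettempdir() resolves to /tmp inside the task container.
    with tempfile.NamedTemporaryFile(
        dir=tempfile.gettempdir(), suffix=".bam", delete=False
    ) as fh:
        fh.write(data)
        return fh.name

path = stage_intermediate(b"intermediate records")
print(os.path.getsize(path))  # size of the staged file in bytes
```

Remember that this storage is ephemeral: anything a later task needs should be returned as a `LatchFile`, not left in `/tmp`.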
```python
# INEFFICIENT: all tasks use large resources
@large_task
def validate_input():  # Over-provisioned
    pass

@large_task
def simple_transformation():  # Over-provisioned
    pass
```

```python
# EFFICIENT: right-sized resources
@small_task
def validate_input():  # Appropriate
    pass

@small_task
def simple_transformation():  # Appropriate
    pass

@large_task
def intensive_analysis():  # Appropriate
    pass
```
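To see what right-sizing saves, some illustrative arithmetic (the CPU counts and runtimes here are hypothetical, not Latch pricing or actual instance sizes):

```python
# Hypothetical CPU counts and runtimes for a three-step pipeline.
LARGE_CPUS, SMALL_CPUS = 32, 2

steps = {  # step -> runtime in hours
    "validate_input": 0.1,
    "simple_transformation": 0.2,
    "intensive_analysis": 2.0,
}

# Everything over-provisioned on large instances:
all_large = sum(LARGE_CPUS * hours for hours in steps.values())

# Right-sized: small instances for the light steps.
right_sized = (
    SMALL_CPUS * steps["validate_input"]
    + SMALL_CPUS * steps["simple_transformation"]
    + LARGE_CPUS * steps["intensive_analysis"]
)

print(f"{all_large:.1f} vs {right_sized:.1f} CPU-hours")  # 73.6 vs 64.6
```

The savings grow with the number of lightweight steps, which is why validation and reporting stages are the usual first candidates for `small_task`.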
During workflow execution, monitor CPU, memory, and storage utilization alongside task runtimes to spot over- and under-provisioned tasks.
Out of Memory (OOM):

```python
# Solution: increase memory allocation
@custom_task(cpu=8, memory=64)  # Increased from 32 to 64 GB
def memory_intensive_task():
    pass
```

Timeout:

```python
# Solution: increase timeout
@custom_task(cpu=8, memory=32, timeout=14400)  # 4 hours
def long_task():
    pass
```

Insufficient Storage:

```python
# Solution: increase ephemeral storage
@custom_task(cpu=8, memory=32, storage_gib=1000)
def large_intermediate_files():
    pass
```
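The three fixes above share a pattern: double the failing dimension and retry. A plain-Python sketch of that pattern (`bump` and the cap values are hypothetical, not part of the Latch SDK):

```python
CAPS = {"memory": 512, "timeout": 86400, "storage_gib": 4000}  # assumed caps

def bump(config: dict, failure: str) -> dict:
    """Return a new config with the failing resource doubled, up to a cap."""
    key = {"oom": "memory", "timeout": "timeout", "storage": "storage_gib"}[failure]
    new = dict(config)
    new[key] = min(new[key] * 2, CAPS[key])
    return new

config = {"cpu": 8, "memory": 32, "timeout": 3600, "storage_gib": 100}
config = bump(config, "oom")      # memory: 32 -> 64
config = bump(config, "timeout")  # timeout: 3600 -> 7200
print(config)
```

Since decorator arguments are static, the new values go back into the `@custom_task(...)` call rather than being applied at runtime.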
Dynamically allocate resources based on input:

```python
from latch import custom_task, workflow
from latch.types import LatchFile

def get_resource_config(file_size_gb: float):
    """Determine resources based on file size"""
    if file_size_gb < 10:
        return {"cpu": 4, "memory": 16}
    elif file_size_gb < 100:
        return {"cpu": 16, "memory": 64}
    else:
        return {"cpu": 32, "memory": 128}

# Note: resource decorators must be static, so define one task
# variant per size class rather than passing a config dict to a
# decorator at runtime.
@custom_task(cpu=4, memory=16)
def process_small(file: LatchFile) -> LatchFile:
    pass

@custom_task(cpu=16, memory=64)
def process_medium(file: LatchFile) -> LatchFile:
    pass

@custom_task(cpu=32, memory=128)
def process_large(file: LatchFile) -> LatchFile:
    pass
```
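The selection between variants is then ordinary Python. A minimal sketch of the dispatch logic, mirroring the size thresholds in `get_resource_config` (returning variant names here rather than calling Latch tasks, so the logic is testable standalone):

```python
def pick_variant(file_size_gb: float) -> str:
    """Map an input size to the name of the right-sized task variant."""
    if file_size_gb < 10:
        return "process_small"
    elif file_size_gb < 100:
        return "process_medium"
    return "process_large"

print([pick_variant(s) for s in (1, 50, 500)])
# → ['process_small', 'process_medium', 'process_large']
```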
The Latch platform enforces per-task resource limits and per-workspace quotas; check your current limits and quotas in the platform before requesting very large allocations. Contact Latch support for quota increases if needed.