backend/onyx/background/README.md
The background jobs take care of Onyx's asynchronous work. They are split across the following Celery workers:
| Worker | File | Queues |
|---|---|---|
| Primary | apps/primary.py | celery |
| Light | apps/light.py | vespa_metadata_sync, connector_deletion, doc_permissions_upsert, checkpoint_cleanup, index_attempt_cleanup |
| Heavy | apps/heavy.py | connector_pruning, connector_doc_permissions_sync, connector_external_group_sync, csv_generation, sandbox |
| Docprocessing | apps/docprocessing.py | docprocessing |
| Docfetching | apps/docfetching.py | connector_doc_fetching |
| User File Processing | apps/user_file_processing.py | user_file_processing, user_file_project_sync, user_file_delete |
| Monitoring | apps/monitoring.py | monitoring |
| Background (consolidated) | apps/background.py | All queues above except celery |
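
As a sketch of what this split means in Celery terms, each app declares the queues its worker consumes. This is illustrative, not the actual wiring (which lives in the apps/*.py files above); queue names are taken from the Light row:

```python
from celery import Celery
from kombu import Queue

app = Celery("light_worker")

# A worker started with this app consumes only the queues declared here.
app.conf.task_queues = [
    Queue("vespa_metadata_sync"),
    Queue("connector_deletion"),
    Queue("doc_permissions_upsert"),
    Queue("checkpoint_cleanup"),
    Queue("index_attempt_cleanup"),
]
```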
In addition to the workers, there are two supporting apps:

| App | File | Purpose |
|---|---|---|
| Beat | beat.py | Celery beat scheduler with DynamicTenantScheduler that generates per-tenant periodic task schedules |
| Client | client.py | Minimal app for task submission from non-worker processes (e.g., API server) |
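
A minimal sketch of submission through the client app from a non-worker process; the broker URL, task name, and payload here are hypothetical, not actual Onyx task signatures:

```python
from celery import Celery

# Minimal client app: registers no tasks, only submits them.
client_app = Celery("client", broker="redis://localhost:6379/0")  # broker URL is an assumption

# Route a job to a specific worker queue by name (hypothetical task/payload).
client_app.send_task(
    "connector_deletion",
    kwargs={"connector_id": 42},
    queue="connector_deletion",
)
```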
app_base.py provides:

- TenantAwareTask - Base task class that sets tenant context

Primary is the single worker that handles tasks from the default celery queue. It is kept a singleton by the PRIMARY_WORKER Redis lock, which it touches every CELERY_PRIMARY_WORKER_LOCK_TIMEOUT / 8 seconds (via Celery Bootsteps).
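
A minimal sketch of that singleton pattern, assuming a redis-py lock plus a Celery timer bootstep; the lock key, timeout value, and class name are illustrative:

```python
import redis
from celery import Celery, bootsteps

CELERY_PRIMARY_WORKER_LOCK_TIMEOUT = 120  # seconds; illustrative value

app = Celery("primary")
rds = redis.Redis()

class PrimaryLockKeepalive(bootsteps.StartStopStep):
    """Holds the PRIMARY_WORKER lock and re-touches it so the TTL never lapses."""

    requires = {"celery.worker.components:Timer"}

    def __init__(self, worker, **kwargs):
        super().__init__(worker, **kwargs)
        self.lock = rds.lock("PRIMARY_WORKER", timeout=CELERY_PRIMARY_WORKER_LOCK_TIMEOUT)
        self.timer_entry = None

    def start(self, worker):
        # Refuse to start if another Primary worker already holds the lock.
        if not self.lock.acquire(blocking=False):
            raise SystemExit("another Primary worker already holds the lock")
        # Touch the lock every TIMEOUT / 8 seconds, as described above.
        self.timer_entry = worker.timer.call_repeatedly(
            CELERY_PRIMARY_WORKER_LOCK_TIMEOUT / 8, self.lock.reacquire
        )

    def stop(self, worker):
        if self.timer_entry is not None:
            self.timer_entry.cancel()
        self.lock.release()

app.steps["worker"].add(PrimaryLockKeepalive)
```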
On startup it acquires the PRIMARY_WORKER lock. It then cycles through its tasks as scheduled by Celery Beat:
| Task | Frequency | Description |
|---|---|---|
| check_for_indexing | 15s | Scans for connectors needing indexing → dispatches to the DOCFETCHING queue |
| check_for_vespa_sync_task | 20s | Finds stale documents/document sets → dispatches sync tasks to the VESPA_METADATA_SYNC queue |
| check_for_pruning | 20s | Finds connectors due for pruning → dispatches to the CONNECTOR_PRUNING queue |
| check_for_connector_deletion | 20s | Processes deletion requests → dispatches to the CONNECTOR_DELETION queue |
| check_for_user_file_processing | 20s | Checks for user uploads → dispatches to the USER_FILE_PROCESSING queue |
| check_for_checkpoint_cleanup | 1h | Cleans up old indexing checkpoints |
| check_for_index_attempt_cleanup | 30m | Cleans up old index attempts |
| celery_beat_heartbeat | 1m | Heartbeat for the Beat watchdog |
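
A static sketch of how this schedule maps onto Celery Beat configuration; the real DynamicTenantScheduler generates these entries per tenant at runtime, and the fully qualified task names are assumptions:

```python
from celery import Celery

app = Celery("primary")
app.conf.beat_schedule = {
    "check-for-indexing": {"task": "check_for_indexing", "schedule": 15.0},
    "check-for-vespa-sync": {"task": "check_for_vespa_sync_task", "schedule": 20.0},
    "check-for-pruning": {"task": "check_for_pruning", "schedule": 20.0},
    "check-for-checkpoint-cleanup": {"task": "check_for_checkpoint_cleanup", "schedule": 3600.0},
    "celery-beat-heartbeat": {"task": "celery_beat_heartbeat", "schedule": 60.0},
}
```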
The watchdog is a separate Python process, managed by supervisord, that runs alongside the Celery workers. It checks the ONYX_CELERY_BEAT_HEARTBEAT_KEY in Redis to ensure Celery Beat is not dead: Beat schedules the celery_beat_heartbeat task, which the Primary worker runs to touch the key and signal that Beat is still alive. See supervisord.conf for the watchdog config.
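
A sketch of the watchdog check, assuming the heartbeat task writes a timestamp under the key; the max-age threshold is an assumption, and the real check lives in the supervisord-managed watchdog process:

```python
import sys
import time
import redis

HEARTBEAT_KEY = "ONYX_CELERY_BEAT_HEARTBEAT_KEY"
MAX_AGE_SECONDS = 300  # assumption: how stale the heartbeat may be

rds = redis.Redis()

def beat_is_alive() -> bool:
    raw = rds.get(HEARTBEAT_KEY)
    if raw is None:
        return False  # key expired or never written: Beat is presumed dead
    return time.time() - float(raw) < MAX_AGE_SECONDS

if __name__ == "__main__":
    sys.exit(0 if beat_is_alive() else 1)
```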
Light handles fast, short-lived tasks that are not resource intensive. It runs with high concurrency: up to 24 concurrent workers, each with a prefetch of 8, for a total of 192 tasks in flight at once. The tasks it handles are those on its queues above: Vespa metadata syncs, connector deletion, document permission upserts, checkpoint cleanup, and index attempt cleanup.
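
The concurrency math in Celery configuration terms, as a sketch (the exact settings live in apps/light.py):

```python
from celery import Celery

app = Celery("light")
app.conf.worker_concurrency = 24         # 24 worker processes
app.conf.worker_prefetch_multiplier = 8  # each prefetches 8 -> 24 * 8 = 192 in flight
```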
Heavy handles long-running, resource-intensive tasks: pruning, permission syncs, CSV generation, and sandbox operations. It runs with low concurrency: a maximum of 4 with a prefetch of 1.

- It does not interact with the Document Index; it handles syncs with external systems, making large volumes of API calls for pruning, permission fetching, etc.
- It generates CSV exports, which can take a long time when there is significant data in Postgres.
- Sandbox (a new feature) runs Next.js, a Python virtual environment, and the OpenCode AI Agent, with access to knowledge files.
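
The mirror-image configuration sketch for Heavy; the time limits are assumptions added to illustrate how long-running jobs are usually bounded, not settings confirmed by the source:

```python
from celery import Celery

app = Celery("heavy")
app.conf.worker_concurrency = 4          # few, long-running tasks
app.conf.worker_prefetch_multiplier = 1  # don't hoard work that can't start yet
# Generous time limits are an assumption for long pruning / CSV jobs:
app.conf.task_soft_time_limit = 3 * 60 * 60
app.conf.task_time_limit = 4 * 60 * 60
```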
Docprocessing and Docfetching together handle document indexing: Docfetching retrieves documents from connectors (the connector_doc_fetching queue) and hands them off to Docprocessing for indexing (the docprocessing queue).
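
A hypothetical sketch of that fetch → process handoff; the task names, helper, and payload shape are illustrative, not the actual Onyx task signatures:

```python
from celery import Celery, shared_task

app = Celery("docfetching")

def pull_document_batches(connector_id: int):
    """Hypothetical stand-in for a connector's document iterator."""
    yield from ()

@shared_task(name="docfetching.fetch_connector_docs")
def fetch_connector_docs(connector_id: int) -> None:
    for batch in pull_document_batches(connector_id):
        # Hand each batch to the docprocessing queue for indexing.
        app.send_task(
            "docprocessing.process_batch",
            kwargs={"connector_id": connector_id, "batch": batch},
            queue="docprocessing",
        )
```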
The Monitoring worker handles observability and metrics collection:
Workers can expose Prometheus metrics via a standalone HTTP server. Currently docfetching and docprocessing have push-based task lifecycle metrics; the monitoring worker runs pull-based collectors for queue depth and connector health.
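
A sketch of push-based task lifecycle metrics plus the standalone HTTP server, using Celery signals and prometheus_client; the metric names and port are illustrative (see docs/METRICS.md for the real ones):

```python
from celery.signals import task_prerun, task_postrun
from prometheus_client import Counter, start_http_server

TASKS_STARTED = Counter("onyx_tasks_started_total", "Tasks started", ["task"])
TASKS_FINISHED = Counter("onyx_tasks_finished_total", "Tasks finished", ["task"])

@task_prerun.connect
def on_task_start(sender=None, task_id=None, task=None, **kwargs):
    TASKS_STARTED.labels(task=task.name).inc()

@task_postrun.connect
def on_task_done(sender=None, task_id=None, task=None, **kwargs):
    TASKS_FINISHED.labels(task=task.name).inc()

# Expose /metrics for Prometheus to scrape; the port is an assumption.
start_http_server(9090)
```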
For the full metric reference, integration guide, and PromQL examples, see docs/METRICS.md.