docs/devguide/how-tos/Workers/scaling-workers.md
Workers execute business logic outside the Conductor server. Keeping them healthy requires two things: monitoring queue and worker state, and scaling based on what the data tells you.
Conductor tracks queue size and worker poll activity for every task type. Use this data to detect backlogs, stalled workers, and capacity issues.
Navigate to Home > Task Queues (or `<your UI server URL>/taskQueue`) to see queue size and worker poll activity for each task type. The same information is available from the CLI:
```shell
# List all tasks with queue info
conductor task list

# Get details for a specific task
conductor task get <TASK_NAME>
```
Get the number of tasks waiting in a queue:
```shell
curl '{{ server_host }}{{ api_prefix }}/tasks/queue/sizes?taskType=<TASK_NAME>' \
  -H 'accept: */*'
```
Get worker poll data (which workers are polling, last poll time):
```shell
curl '{{ server_host }}{{ api_prefix }}/tasks/queue/polldata?taskType=<TASK_NAME>' \
  -H 'accept: */*'
```
!!! note
    Replace `<TASK_NAME>` with your task name.
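As a sketch, the queue-size endpoint above can feed a simple backlog check. Everything beyond the endpoint path is an assumption: the server URL, the response shape (taken here to be a map of task type to queue depth), and the threshold are placeholders to adapt to your deployment.

```python
"""Minimal backlog check against Conductor's queue-size endpoint (sketch)."""
import json
import urllib.request

SERVER = "http://localhost:8080/api"   # assumed server_host + api_prefix
THRESHOLD = 100                        # hypothetical backlog threshold

def queue_size(task_type: str) -> int:
    """Fetch the queue depth for one task type (assumed response shape)."""
    url = f"{SERVER}/tasks/queue/sizes?taskType={task_type}"
    with urllib.request.urlopen(url) as resp:
        sizes = json.load(resp)        # assumed: {"my_task": 42}
    return sizes.get(task_type, 0)

def check_backlog(sizes: dict[str, int], threshold: int = THRESHOLD) -> list[str]:
    """Return the task types whose queue depth exceeds the threshold."""
    return [task for task, depth in sizes.items() if depth > threshold]
```

A check like this can run on a schedule and page when `check_backlog` returns anything.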
Conductor publishes metrics that feed dashboards, alerts, and autoscaling policies. All metrics include taskType as a tag so you can monitor per-task.
Queue depth — the number of tasks waiting per task type:

```
max(task_queue_depth{taskType="my_task"})
```

Throughput — completed tasks per second:

```
rate(task_completed_seconds_count{taskType="my_task"}[$__rate_interval])
```

Queue wait time (p99) — how long tasks sit in the queue before a worker picks them up:

```
max(task_queue_wait_time_seconds{quantile="0.99", taskType="my_task"})
```

If p99 wait time exceeds a few seconds, add worker instances or reduce the polling interval.
!!! warning
    Reducing the polling interval increases API requests to the server. Balance responsiveness against server load.
| Signal | Action |
|---|---|
| Queue depth growing steadily | Add worker instances |
| Queue wait time > 5s at p99 | Add worker instances or reduce polling interval |
| Throughput dropping while queue grows | Investigate worker health (CPU, memory, downstream dependencies) |
| Queue consistently empty, workers idle | Scale down to save resources |
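One way to turn the first rows of that table into an autoscaling policy is to size the pool from the current backlog and arrival rate: enough workers to keep up with new tasks and drain the existing queue within a target window. This is a sketch under assumed inputs; arrival and service rates would come from the metrics above, and every number below is illustrative.

```python
import math

def desired_workers(queue_depth: int,
                    arrival_rate: float,            # tasks/sec entering the queue
                    service_rate_per_worker: float, # tasks/sec one worker completes
                    drain_seconds: float = 60.0,    # target time to clear backlog
                    min_workers: int = 1,
                    max_workers: int = 50) -> int:
    """Workers needed to absorb arrivals AND drain the backlog in drain_seconds."""
    needed = (arrival_rate + queue_depth / drain_seconds) / service_rate_per_worker
    return max(min_workers, min(max_workers, math.ceil(needed)))
```

For example, a backlog of 600 tasks, 10 tasks/s arriving, and 2 tasks/s per worker yields `desired_workers(600, 10, 2) == 10`. Clamping to a min/max keeps the policy from flapping to zero or scaling without bound.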
Add more worker instances. Conductor distributes tasks automatically — every worker polling the same task type competes for work from the same queue. No configuration changes needed on the Conductor server.
The polling interval controls how frequently workers check for new tasks. Shorter intervals mean lower latency but higher server load.
| Scenario | Recommended interval |
|---|---|
| Latency-sensitive tasks | 100–500ms |
| Standard processing | 1–5s |
| Batch / background work | 5–30s |
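The trade-off in the table shows up clearly in a stripped-down poll loop: the interval is pure added latency whenever the queue is non-empty at the moment the worker sleeps. `poll_fn` and `work_fn` are hypothetical stand-ins for an SDK's poll and execute calls; real Conductor SDKs manage this loop for you.

```python
import time

def poll_loop(poll_fn, work_fn, interval_s: float, max_iterations: int) -> None:
    """Generic worker loop: execute a task if one is available,
    otherwise sleep for one polling interval before checking again."""
    for _ in range(max_iterations):
        task = poll_fn()          # returns a task, or None if the queue is empty
        if task is not None:
            work_fn(task)
        else:
            time.sleep(interval_s)  # idle wait; this is the latency knob
```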
Each worker instance can run multiple polling threads. A good starting point:
```
threads = (task_throughput × avg_task_duration) / num_worker_instances
```
For I/O-bound tasks (HTTP calls, database queries), use more threads than CPU cores. For CPU-bound tasks, match thread count to available cores.
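Applying the starting-point formula with illustrative numbers: 50 tasks/s at 0.4 s average duration spread over 4 instances suggests 5 polling threads per instance.

```python
import math

def thread_count(tasks_per_second: float, avg_task_seconds: float,
                 num_instances: int) -> int:
    """Per-instance thread count from the rule of thumb above (min 1)."""
    return max(1, math.ceil(tasks_per_second * avg_task_seconds / num_instances))
```

Treat the result as a starting point and tune from queue wait time: raise it for I/O-bound tasks, cap it at core count for CPU-bound ones.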
If downstream services have rate limits, configure task-level rate limits to prevent workers from overwhelming them:
```json
{
  "name": "call_external_api",
  "rateLimitPerFrequency": 100,
  "rateLimitFrequencyInSeconds": 60
}
```
This limits the task to 100 executions per 60-second window across all workers.
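To illustrate the counting model (not Conductor's actual implementation, which enforces the limit server-side across all workers), here is a minimal fixed-window limiter with the same shape: at most `limit` executions per `window_s`-second window, then rejections until the window rolls over.

```python
import time

class FixedWindowLimiter:
    """Sketch of fixed-window rate limiting: `limit` calls per `window_s` seconds."""

    def __init__(self, limit: int, window_s: float, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock            # injectable for testing
        self.window_start = clock()
        self.count = 0

    def allow(self) -> bool:
        now = self.clock()
        if now - self.window_start >= self.window_s:
            self.window_start = now   # new window: reset the counter
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False                  # over the limit for this window
```

With `limit=100, window_s=60` this mirrors the task definition above; calls beyond the 100th in a window are refused until the next window begins.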
Use task-to-domain to route tasks to specific worker pools. This prevents noisy neighbors — a high-volume workflow won't starve workers serving a latency-sensitive one.
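For example, a start-workflow request can pin a task to a dedicated pool via the `taskToDomain` map. The workflow name and the `premium` domain below are illustrative; workers serving that pool must also poll with the matching domain, which is SDK-specific configuration.

```json
{
  "name": "order_workflow",
  "taskToDomain": {
    "call_external_api": "premium"
  }
}
```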