This is the production setup we recommend. It is sized from a single number — your peak concurrent flows — and everything else follows from there.
One flow per worker. A small fleet of those. A thin tier of apps in front. Managed Postgres, Redis, and S3 behind — all in the same region.
| Component | Size each | How many |
|---|---|---|
| Worker | 0.5 vCPU / 1 GB, concurrency 1 | one per concurrent flow |
| App | 1 vCPU / 2 GB | one per ten workers |
| Postgres | 2 vCPU / 4 GB, managed | one, grows with the fleet |
| Redis | 1 vCPU / 1 GB, managed | one |
| Object storage (S3) | same region, signed URLs on | required |
S3 is a hard requirement, not a nice-to-have: without it, every flow bundle and piece archive funnels through the app tier and the throughput numbers below no longer hold. (Walkthrough: S3 Storage.)
Copy this — it's the exact configuration the benchmark below was measured on:
AP_WORKER_CONCURRENCY=1
AP_REUSE_SANDBOX=true
AP_EXECUTION_MODE=SANDBOX_CODE_ONLY
AP_FILE_STORAGE_LOCATION=S3
AP_S3_USE_SIGNED_URLS=true
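To catch drift from these settings before a deploy, a small check along the following lines may help. It is a minimal sketch: the variable names and values are copied from the block above, while `RECOMMENDED` and `config_drift` are hypothetical helper names and not part of the product.

```python
import os
from typing import Dict, Mapping, Optional, Tuple

# Values copied from the recommended configuration above.
RECOMMENDED: Dict[str, str] = {
    "AP_WORKER_CONCURRENCY": "1",
    "AP_REUSE_SANDBOX": "true",
    "AP_EXECUTION_MODE": "SANDBOX_CODE_ONLY",
    "AP_FILE_STORAGE_LOCATION": "S3",
    "AP_S3_USE_SIGNED_URLS": "true",
}


def config_drift(env: Mapping[str, str]) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {name: (actual, expected)} for every setting that differs.

    A variable that is not set at all is reported with actual=None.
    """
    return {
        name: (env.get(name), expected)
        for name, expected in RECOMMENDED.items()
        if env.get(name) != expected
    }


if __name__ == "__main__":
    # Hypothetical usage: compare the current process environment.
    for name, (actual, expected) in config_drift(os.environ).items():
        print(f"{name}: expected {expected!r}, found {actual!r}")
```

The comparison is an exact string match, so a value such as `TRUE` would be reported as drift; relax that if your deployment tooling normalizes case.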
A concurrency-1 worker is busy for a flow's whole duration (up to 10 min), so size by concurrent flows, not trigger rate:
workers = peak concurrent flows
apps = ceil(workers / 10)
At 50 concurrent flows: 50 workers (25 vCPU / 50 GB) + 5 apps (5 vCPU / 10 GB). Overflow queues in Redis and drains as slots free.
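The sizing rule can be sketched as a short calculation. This is an illustration only: the per-component sizes are taken from the component table on this page, and `size_fleet` is a hypothetical helper name.

```python
import math

# Per-instance sizes from the component table on this page.
WORKER_VCPU, WORKER_GB = 0.5, 1
APP_VCPU, APP_GB = 1, 2
WORKERS_PER_APP = 10


def size_fleet(peak_concurrent_flows: int) -> dict:
    """Apply workers = peak concurrent flows, apps = ceil(workers / 10)."""
    if peak_concurrent_flows < 1:
        raise ValueError("peak_concurrent_flows must be at least 1")
    workers = peak_concurrent_flows
    apps = math.ceil(workers / WORKERS_PER_APP)
    return {
        "workers": workers,
        "apps": apps,
        "worker_vcpu": workers * WORKER_VCPU,
        "worker_gb": workers * WORKER_GB,
        "app_vcpu": apps * APP_VCPU,
        "app_gb": apps * APP_GB,
    }


if __name__ == "__main__":
    print(size_fleet(50))
```

Because the app count rounds up, 51 concurrent flows already needs a sixth app; Postgres and Redis are left out here because the table sizes them as single managed instances.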
<Tip> Size **statically for peak**: autoscaling's boot and scheduling lag can't defend the 30 s sync-webhook budget. A pre-sized fleet keeps a slot warm and waiting. </Tip>

And it scales with your fleet:
Full methodology and the ratio comparison: Benchmark.
| Limit | Default | Env var |
|---|---|---|
| Flow run timeout | 600 s | AP_FLOW_TIMEOUT_SECONDS |
| Sync webhook response | 30 s | AP_WEBHOOK_TIMEOUT_SECONDS |
| Max webhook payload | 25 MB | AP_MAX_WEBHOOK_PAYLOAD_SIZE_MB |
| Step file size | 25 MB | AP_MAX_FILE_SIZE_MB |
| Flow run log size | 50 MB | AP_MAX_FLOW_RUN_LOG_SIZE_MB |
The complete table lives in Limits.
<Note> Need to reserve dedicated capacity for specific tenants? See [Worker Groups](./worker-groups). </Note>

Upgrading breaks nothing: the default AP_WORKER_CONCURRENCY=5 keeps each container running five flows at once. Reshaping to one flow per worker is opt-in, at the same total capacity. Keep total slots constant: slots = containers × concurrency.
| | Before | After |
|---|---|---|
| Per worker | ~2.5 vCPU / 5 GB, concurrency 5 | 0.5 vCPU / 1 GB, concurrency 1 |
| For 50 slots | 10 workers | 50 workers |
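The slot arithmetic behind this reshaping (slots = containers × concurrency, held constant) can be checked with a few lines. This is a sketch under that one assumption; `reshape_workers` is a hypothetical helper name.

```python
import math


def reshape_workers(containers: int, concurrency: int, target_concurrency: int = 1) -> int:
    """Return how many containers keep total slots constant at the new concurrency.

    slots = containers * concurrency; the result is rounded up so that
    capacity never drops below the original slot count.
    """
    if containers < 1 or concurrency < 1 or target_concurrency < 1:
        raise ValueError("all arguments must be at least 1")
    slots = containers * concurrency
    return math.ceil(slots / target_concurrency)


if __name__ == "__main__":
    # 10 containers at concurrency 5 hold 50 slots.
    print(reshape_workers(10, 5))
```

With the default target of one flow per worker, the result is simply the original slot count, which matches the 10-to-50 row in the table.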