packages/cloud-infra/cloud/RAILWAY.md
Where each piece of the Eliza Cloud backend actually runs today, and where it is heading.
| Surface | Runtime | Repo path | Config |
|---|---|---|---|
| cloud-frontend (dashboard SPA) | Cloudflare Pages | packages/cloud-frontend/ | Wrangler / Pages project |
| cloud-api (REST + auth + billing) | Cloudflare Worker | packages/cloud-api/ | apps/api/wrangler.toml (env vars, secrets via wrangler secret) |
| headscale (Tailscale coordination server for customer tunnels) | Railway | packages/cloud-services/headscale/ | railway.toml, Dockerfile |
| tunnel-proxy (public HTTPS -> tailnet bridge) | Railway | packages/cloud-services/tunnel-proxy/ | railway.toml, Dockerfile |
| gateway-discord | Cloudflare Worker | packages/cloud-services/gateway-discord/ | own wrangler.toml |
| gateway-webhook | Cloudflare Worker | packages/cloud-services/gateway-webhook/ | own wrangler.toml |
| agent-server (per-customer agent runtime) | Hetzner containers | packages/cloud-services/agent-server/ | provisioned via container-control-plane |
| container-control-plane (provisioning API) | Hetzner / VPS | packages/cloud-services/container-control-plane/ | env-driven |
| Database migrations | GitHub Actions -> Neon (Postgres) | packages/cloud-api/db/ | .github/workflows/cloud-deploy-backend.yml |
The deprecated agent VPS deploy still exists behind the
deploy_legacy_vps workflow_dispatch input on
cloud-deploy-backend.yml. It is off by default and only runs when an
operator explicitly opts in. New code should not target it.

headscale

- Healthcheck: GET /health on listen_addr (port 8080). Headscale v0.28 serves this natively.
- Volume: /var/lib/headscale (SQLite db + generated keys).
- Domain: headscale.elizacloud.ai.
- Deploy guide: packages/cloud-services/headscale/DEPLOY.md.

tunnel-proxy

- Healthcheck: GET /health (served by main.go line 117).
- Volume: /var/lib/tunnel-proxy (tsnet node identity).
- Domains: tunnel.elizacloud.ai + wildcard *.tunnel.elizacloud.ai.
- Deploy guide: packages/cloud-services/headscale/DEPLOY.md (covers both services).

The strategic direction is to retire AWS and move central services to
Railway, with container-based workloads provisioned on Hetzner via the
container-control-plane. Concretely:

- For each service moved to Railway: add a railway.toml next to its Dockerfile, point the healthcheck at a real endpoint the service serves, and document it here.
- Container-based workloads: provision through container-control-plane.

Full classification, plan, owners, and outstanding items live in
AWS_RETIREMENT.md. Quick map:
| AWS thing | Status | Target |
|---|---|---|
| @aws-sdk/client-s3 (cloud-shared) | Keep | Cloudflare R2 / Supabase / generic S3 endpoint (the SDK is provider-agnostic) |
| @aws-sdk/client-kms (cloud-shared encryption) | Keep (optional) | LocalKMSProvider (AES-256-GCM with SECRETS_MASTER_KEY) is the default. The AWS KMS provider is used only when AWS_KMS_KEY_ID is set. |
| legacy-gateway-discord-aws/ terraform | Deleted | n/a (was a stale duplicate) |
| cloud-services/gateway-discord/terraform/ (EKS) | Retire | Gateway-discord is a Docker/Bun service; redeploy on Railway / Hetzner. Terraform + CI workflow kept until the Railway path lands. |
| packages/examples/aws/ Lambda example | Keep | Documentation example for users who want to deploy elizaOS on Lambda. Not part of Eliza Cloud infra. |
| AWS ECR/ECS code | Already removed | Replaced by container-control-plane + Hetzner. README references are stale and have been pruned. |
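
The KMS row above describes a selection rule (AWS KMS only when AWS_KMS_KEY_ID is set, otherwise local AES-256-GCM). The following is a minimal sketch of that rule and of a local AES-256-GCM round trip using Node's built-in crypto module. It is not the actual LocalKMSProvider from cloud-shared: the function names `chooseKmsProvider`, `encryptLocal`, and `decryptLocal`, the `iv | tag | ciphertext` layout, and passing the master key as a 32-byte Buffer are all assumptions made for illustration.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical names; the real provider in cloud-shared may be shaped differently.
type KmsChoice = "aws-kms" | "local";

/** Mirrors the rule in the table: AWS KMS only when AWS_KMS_KEY_ID is set. */
function chooseKmsProvider(env: Record<string, string | undefined>): KmsChoice {
  return env.AWS_KMS_KEY_ID ? "aws-kms" : "local";
}

/** Encrypts with AES-256-GCM. Output layout (iv | tag | ciphertext) is an assumption. */
function encryptLocal(plaintext: string, masterKey: Buffer): Buffer {
  if (masterKey.length !== 32) throw new Error("AES-256 needs a 32-byte key");
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", masterKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // getAuthTag() returns a 16-byte tag by default for GCM.
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

/** Reverses encryptLocal; throws if the key is wrong or the blob was tampered with. */
function decryptLocal(blob: Buffer, masterKey: Buffer): string {
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const ciphertext = blob.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", masterKey, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

How SECRETS_MASTER_KEY is encoded (hex, base64, raw) is not stated here, so the sketch takes an already-decoded key rather than guessing at the parsing step.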

railway.toml

packages/cloud-infra/cloud/railway.toml used to deploy the old Next.js
fullstack cloud app to Railway. Its healthcheck pointed at /login, a
Next.js page route. That deployment is gone: cloud-frontend is a Vite SPA on
Cloudflare Pages and cloud-api is a Cloudflare Worker. The file has been
removed; nothing in the repo or in CI referenced it.
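
The old /login healthcheck probed a page route, whereas headscale and tunnel-proxy each expose a dedicated GET /health. As an illustration of the second pattern, here is a minimal sketch of a dedicated health endpoint for a Node or Bun service. It is not code from any service in this repo (tunnel-proxy's handler lives in its Go main.go); the names `route` and `createHealthServer` and the plain-text "ok" body are assumptions.

```typescript
import { createServer } from "node:http";
import type { Server } from "node:http";

interface Reply {
  status: number;
  body: string;
}

/** Pure routing decision, kept separate so it can be checked without opening a socket. */
function route(method: string | undefined, url: string | undefined): Reply {
  if (method === "GET" && url === "/health") {
    return { status: 200, body: "ok" };
  }
  return { status: 404, body: "not found" };
}

/** Wraps route() in an HTTP server; the caller decides when and where to listen. */
function createHealthServer(): Server {
  return createServer((req, res) => {
    const { status, body } = route(req.method, req.url);
    res.writeHead(status, { "Content-Type": "text/plain" });
    res.end(body);
  });
}
```

A service would call `createHealthServer().listen(port)` with whatever port its platform config expects. Keeping the endpoint separate from page routes means the probe does not depend on rendering, redirects, or auth.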