Tuist operates a globally distributed cache service that stores and serves cache artifacts to speed up developers' workflows. It does so from servers deployed across multiple geographic regions.
We manage the host machines using NixOS, which allows us to define the system configuration declaratively and reproduce the same environment across all servers. Even though the cache application itself runs in a Docker container, we rely on NixOS to keep the underlying host environment (kernel, networking, system services, nginx, Docker runtime, and observability) consistent and predictable.
The cache service runs on dedicated servers across multiple regions to minimize latency for developers worldwide. Unlike the main Tuist server (which runs on Render), the cache service is deployed on self-hosted machines that we operate and manage ourselves.
NixOS gives us an infrastructure-as-code workflow for those machines, so host-level configuration changes are versioned, reviewable, and reproducible across all regions.
| Component | Technology | Purpose |
|---|---|---|
| OS & Configuration | NixOS with Nix Flakes | Declarative, reproducible server configuration |
| Deployment | Colmena | Multi-machine NixOS deployment orchestration |
| Application | Elixir/Phoenix in Docker | Cache service API (deployed via Kamal) |
| Reverse Proxy | nginx | HTTP/2, TLS termination, static file serving |
| Secrets | opnix (1Password) | Secure secrets management |
| Observability | Grafana Alloy | Metrics and logs to Grafana Cloud |
The cache service is deployed to the following regions:
| Server | Environment | Region |
|---|---|---|
| cache-eu-central | Production | Europe (Central) |
| cache-us-east | Production | US East |
| cache-us-west | Production | US West |
| cache-ap-southeast | Production | Asia Pacific (Southeast) |
| cache-eu-central-staging | Staging | Europe (Central) |
| cache-us-east-staging | Staging | US East |
| cache-eu-central-canary | Canary | Europe (Central) |
All servers are accessible at cache-<region>.tuist.dev (e.g., cache-eu-central.tuist.dev). Non-production environments include an explicit suffix: cache-<region>-<env>.tuist.dev (e.g., cache-eu-central-staging.tuist.dev).
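
The naming scheme above can be captured in a small shell function. This is only a sketch: `cache_hostname` is a hypothetical helper, not something that exists in the repository, and it assumes that production is the only environment without a suffix.

```bash
# Hypothetical helper: build a cache hostname from a region and an optional
# environment. Production hosts carry no environment suffix.
cache_hostname() {
  local region="$1" env="${2:-production}"
  if [ -z "$region" ]; then
    echo "usage: cache_hostname <region> [env]" >&2
    return 1
  fi
  if [ "$env" = "production" ]; then
    printf 'cache-%s.tuist.dev\n' "$region"
  else
    printf 'cache-%s-%s.tuist.dev\n' "$region" "$env"
  fi
}

cache_hostname eu-central            # cache-eu-central.tuist.dev
cache_hostname eu-central staging    # cache-eu-central-staging.tuist.dev
```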
The NixOS configuration lives in cache/platform/ and is structured as follows:
cache/platform/
├── flake.nix # Nix Flake entry point, defines machines and Colmena config
├── configuration.nix # Base system configuration (kernel, networking, Docker, etc.)
├── disk-config.nix # Declarative disk partitioning via disko
├── hardware-configuration.nix # Hardware-specific settings
├── nginx.nix # nginx reverse proxy configuration
├── secrets.nix # 1Password secrets integration via opnix
├── users.nix # User accounts and SSH keys
└── alloy.nix # Grafana Alloy observability configuration
flake.nix

> [!NOTE]
> A Nix flake is a standardized way to package Nix projects and their dependencies. It pins inputs (via flake.lock) and makes builds and environments more reproducible and easier to share. See the Nix flakes documentation for details.

Defines the Nix Flake with:
- Machine hostnames of the form cache-<region>(-<env>).tuist.dev

configuration.nix

Base system configuration, including kernel, networking, and Docker settings.

disk-config.nix

Declarative disk layout using disko:
- /dev/sda: Boot partition (128 MB ESP) + root filesystem (ext4)
- /dev/sdb: Dedicated /cas volume for cache artifacts (ext4 with optimized mount options)

nginx.nix

nginx configuration optimized for cache performance:
- Static file serving from /cas for read operations (bypasses Phoenix after auth)

secrets.nix

1Password integration via opnix.

alloy.nix

Grafana Alloy configuration for shipping metrics and logs to Grafana Cloud.

nixos-anywhere allows you to install NixOS on a remote machine over SSH from any Linux system (including a rescue/recovery environment). It automatically handles disk partitioning using disko and installs the complete NixOS configuration in a single command.
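Before provisioning a new server, make sure you have:
- A server with two disks (/dev/sda for system, /dev/sdb for cache storage)
- A hostname for the server (e.g., cache-eu-central.tuist.dev)
- Access to the cache vault in 1Password

Add the server to the configuration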
Edit cache/platform/flake.nix and add the new hostname to the machines list:
machines = [
"cache-eu-central"
"cache-us-east"
# ... existing machines
"cache-new-region" # Add new server
];
Run nixos-anywhere
From your local machine with Nix installed:
cd cache/platform
# Install NixOS on the target server
nix run github:nix-community/nixos-anywhere -- \
--flake .#cache-new-region \
root@<server-ip-or-hostname>
This command will:
- Partition the disks according to disk-config.nix (this destroys all data on the target disks)

Configure secrets
After the server reboots, SSH in and set up the 1Password token:
ssh root@<server-ip-or-hostname>
# Create the opnix token file (get token from 1Password)
echo "YOUR_1PASSWORD_SERVICE_ACCOUNT_TOKEN" > /etc/opnix-token
chmod 600 /etc/opnix-token
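
Writing the token with `echo` leaves it in shell history, and the file exists briefly with default permissions before `chmod` runs. One possible alternative, sketched below, reads the token from stdin and creates the file under a restrictive umask. `write_token_file` is a hypothetical helper, not part of the repository, and the demo writes to a throwaway directory rather than `/etc/opnix-token`.

```bash
# Hypothetical helper (not part of the repository): read a token from stdin
# and write it to a file that only its owner can read.
write_token_file() {
  local path="$1"
  if [ -z "$path" ]; then
    echo "usage: write_token_file <path> < token" >&2
    return 1
  fi
  (
    umask 077          # files created in this subshell get mode 600
    cat > "$path"      # the token comes from stdin, not from an argument
  ) || return 1
  chmod 600 "$path"    # tighten the mode if the file already existed
}

# Demo against a throwaway directory; on a server the path would be /etc/opnix-token.
demo_dir="$(mktemp -d)"
printf '%s\n' "example-token" | write_token_file "$demo_dir/opnix-token"
ls -l "$demo_dir/opnix-token"
rm -rf "$demo_dir"
```

On a real host, running `write_token_file /etc/opnix-token`, pasting the token, and pressing Ctrl-D would keep the token out of the shell history.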
Apply the full configuration
The initial nixos-anywhere installation includes the base configuration. Run Colmena to ensure everything is up to date and secrets are properly loaded:
cd cache/platform
colmena apply --on cache-new-region
Deploy the Phoenix application
Add the new server to the Kamal deploy configuration (cache/config/deploy.yml), then deploy:
cd cache
kamal deploy -c config/deploy.yml
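Before deploying, make sure you have:
- SSH access to the servers (your user must be defined in users.nix)
- Access to the cache vault in 1Password

NixOS configuration changes are deployed using Colmena: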
# Deploy to all machines
cd cache/platform
colmena apply
# Deploy to a specific machine
colmena apply --on cache-eu-central
# Build without deploying (dry run)
colmena build
Colmena builds the configuration on the target machine (buildOnTarget = true) to ensure architecture compatibility.
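
Because `colmena apply --on` accepts a comma-separated list of nodes, a host-level change can be rolled out in stages. The sketch below groups the servers from the table above into canary, staging, and production. The rollout order and the `staged_rollout` function are assumptions made for illustration, not a documented Tuist procedure. By default it only prints the commands it would run.

```bash
# Host groups taken from the server table; the rollout order itself is an
# assumption, not a documented policy.
CANARY="cache-eu-central-canary"
STAGING="cache-eu-central-staging,cache-us-east-staging"
PRODUCTION="cache-eu-central,cache-us-east,cache-us-west,cache-ap-southeast"

# Print the Colmena command for each stage; set RUN=1 to execute instead.
staged_rollout() {
  local stage
  for stage in "$CANARY" "$STAGING" "$PRODUCTION"; do
    if [ "${RUN:-0}" = "1" ]; then
      colmena apply --on "$stage" || return 1   # stop at the first failing stage
    else
      echo "colmena apply --on $stage"
    fi
  done
}

staged_rollout
```

Calling `RUN=1 staged_rollout` from cache/platform would execute the three stages in order and stop as soon as one of them fails.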
The Phoenix application runs in a Docker container and is deployed separately using Kamal:
# Production deployment
kamal deploy -d production
# Staging deployment
kamal deploy -d staging
Kamal handles:
Connect to servers using:
ssh <username>@<hostname>.tuist.dev
Authorized users are defined in users.nix. All users in the wheel group have passwordless sudo access.
Metrics and logs are shipped to Grafana Cloud via Alloy. Access dashboards at grafana.com with the Tuist organization account.
Key metrics exported:
- Metrics exposed at /metrics

To add a new user:
- Edit users.nix to add the user with their SSH public key
- Add the user to the appropriate groups (wheel for sudo, docker for container access)
- Deploy the change with colmena apply

Secrets are managed in 1Password under the cache vault. The opnix integration automatically fetches secrets at service startup.
To rotate a secret:
- Restart the affected service so that it fetches the new value: systemctl restart grafana-alloy (or restart the Docker container for app secrets)