doc/administration/reference_architectures/cloud_native_first.md
Cloud Native First Reference Architectures are designed for modern cloud-native deployment patterns with four standardized sizes (S/M/L/XL) based on workload characteristics. These architectures deploy all GitLab application components in Kubernetes, while PostgreSQL, Redis, and Object Storage are provided by external third-party solutions, such as managed cloud services or on-premises deployments.
[!note] These architectures are in beta. We encourage feedback and will continue refining specifications based on production usage data.
Cloud Native First architectures deploy GitLab components across Kubernetes and external services:
@startuml kubernetes
skinparam linetype ortho
card "Kubernetes via Helm Charts" as kubernetes {
collections "**Webservice Pods**\n//Auto-scaling//" as web #32CD32
collections "**Sidekiq Pods**\n//Auto-scaling//" as sidekiq #ff8dd1
collections "**Gitaly Pods**\n//StatefulSets//" as gitaly #FF8C00
collections "**Supporting Pods**\n//NGINX, Toolbox//" as support #e76a9b
}
card "External Services" as external {
collections "**PostgreSQL**" as database #4EA7FF
collections "**Redis Cache**" as redis_cache #FF6347
collections "**Redis Persistent**" as redis_persistent #FF6347
cloud "**Object Storage**" as object_storage #white
}
kubernetes -[hidden]---> external
web -[#32CD32,norank]--> object_storage
web -[#32CD32,norank]--> redis_cache
web -[#32CD32,norank]--> redis_persistent
web -[#32CD32,norank]--> database
sidekiq -[#ff8dd1,norank]--> object_storage
sidekiq -[#ff8dd1,norank]--> redis_cache
sidekiq -[#ff8dd1,norank]--> redis_persistent
sidekiq -[#ff8dd1,norank]--> database
@enduml
Kubernetes components:
[!note] Gitaly on Kubernetes is deployed as Gitaly Sharded (non-Cluster) only and does not support zero-downtime upgrades. Each Gitaly pod is a single point of failure for the repositories it serves. Gitaly Cluster (Praefect) is not supported in Kubernetes.
If you require Gitaly high availability with automatic failover, consider Cloud Native Hybrid architectures, which deploy Gitaly Cluster on virtual machines while running stateless components in Kubernetes. For Gitaly on Kubernetes requirements and limitations, see Gitaly on Kubernetes.
External services:
For recommended managed service providers (GCP Cloud SQL, AWS RDS, Azure Database, etc.), see recommended cloud providers and services.
These architectures are designed around target RPS ranges representing typical production workload patterns. RPS targets serve as starting points; your specific capacity needs depend on workload composition and usage patterns. For guidance on RPS composition and when adjustments are needed, see Understanding RPS composition.
| Size | Target RPS | Intended Workload |
|---|---|---|
| S | ≤100 | Teams with light development activity and minimal automation |
| M | ≤200 | Organizations with moderate development velocity and standard CI/CD usage |
| L | ≤500 | Large teams with heavy development activity and significant automation |
| XL | ≤1000 | Enterprise deployments with intensive workloads and extensive integrations |
For detailed guidance on determining your expected load and selecting the appropriate size, see the reference architecture sizing guide.
Cloud Native First architectures provide:
Before deploying a Cloud Native First architecture, ensure you have:
For complete requirements including networking, machine types, and cloud provider services, see reference architecture requirements.
For Gitaly on Kubernetes specific requirements and limitations, see Gitaly on Kubernetes requirements.
Target load: ≤100 RPS | Light overall load
Workload characteristics:
| Component | Per Pod Resources | Min Pods/Workers | Max Pods/Workers | Example Node Configuration |
|---|---|---|---|---|
| Webservice | 2 vCPU, 3 GB (request), 4 GB (limit) | 12 pods (24 workers) | 18 pods (36 workers) | GCP: 6 × n2-standard-8<br/>AWS: 6 × c6i.2xlarge |
| Sidekiq | 900m vCPU, 2 GB (request), 4 GB (limit) | 8 workers | 12 workers | GCP: 3 × n2-standard-4<br/>AWS: 3 × m6i.xlarge |
| Gitaly | 7 vCPU, 30 GB (request and limit) | 3 pods | 3 pods | GCP: 3 × n2-standard-8<br/>AWS: 3 × m6i.2xlarge |
| Supporting | Variable per service | 12 vCPU, 48 GB | 12 vCPU, 48 GB | GCP: 3 × n2-standard-4<br/>AWS: 3 × c6i.xlarge |

| Component | Min → Max Pods | Min → Max Workers | Per Pod Resources | Workers per Pod |
|---|---|---|---|---|
| Webservice | 12 → 18 | 24 → 36 | 2 vCPU, 3 GB (request), 4 GB (limit) | 2 |
| Sidekiq | 8 → 12 | 8 → 12 | 900m vCPU, 2 GB (request), 4 GB (limit) | 1 |
| Gitaly | 3 (no autoscaling) | not applicable | 7 vCPU, 30 GB (request and limit) | not applicable |
Gitaly notes: Git cgroups: 27 GB, Buffer: 3 GB. Repository cgroups set to 1. See Gitaly cgroups configuration for tuning guidance.
| Service | Configuration | GCP Equivalent | AWS Equivalent |
|---|---|---|---|
| PostgreSQL | 8 vCPU, 32 GB | n2-standard-8 | m6i.2xlarge |
| Redis - Cache | 2 vCPU, 8 GB | n2-standard-2 | m6i.large |
| Redis - Persistent | 2 vCPU, 8 GB | n2-standard-2 | m6i.large |
| Object Storage | Cloud provider service | Google Cloud Storage | Amazon S3 |
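To illustrate how these specifications translate into GitLab Helm chart values, the following sketch shows the S size. This example is not an official configuration: it assumes the standard `gitlab.webservice`, `gitlab.sidekiq`, and `gitlab.gitaly` value keys, and you should verify key names and defaults against the GitLab Charts documentation.

```yaml
# Illustrative Helm values sketch for the S size (verify key names against
# the GitLab Charts documentation before use).
gitlab:
  webservice:
    minReplicas: 12
    maxReplicas: 18
    resources:
      requests:
        cpu: 2
        memory: 3Gi
      limits:
        memory: 4Gi
  sidekiq:
    minReplicas: 8
    maxReplicas: 12
    resources:
      requests:
        cpu: 900m
        memory: 2Gi
      limits:
        memory: 4Gi
  gitaly:
    resources:
      requests:
        cpu: 7
        memory: 30Gi
      limits:
        cpu: 7
        memory: 30Gi
```

The same pattern applies to the M, L, and XL sizes by substituting the replica counts and per-pod resources from the corresponding tables.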
Target load: ≤200 RPS | Moderate overall load
Workload characteristics:
| Component | Per Pod Resources | Min Pods/Workers | Max Pods/Workers | Example Node Configuration |
|---|---|---|---|---|
| Webservice | 2 vCPU, 3 GB (request), 4 GB (limit) | 28 pods (56 workers) | 42 pods (84 workers) | GCP: 6 × n2-standard-16<br/>AWS: 6 × c6i.4xlarge |
| Sidekiq | 900m vCPU, 2 GB (request), 4 GB (limit) | 16 workers | 24 workers | GCP: 3 × n2-standard-8<br/>AWS: 3 × m6i.2xlarge |
| Gitaly | 15 vCPU, 62 GB (request and limit) | 3 pods | 3 pods | GCP: 3 × n2-standard-16<br/>AWS: 3 × m6i.4xlarge |
| Supporting | Variable per service | 12 vCPU, 48 GB | 12 vCPU, 48 GB | GCP: 3 × n2-standard-4<br/>AWS: 3 × c6i.xlarge |

| Component | Min → Max Pods | Min → Max Workers | Per Pod Resources | Workers per Pod |
|---|---|---|---|---|
| Webservice | 28 → 42 | 56 → 84 | 2 vCPU, 3 GB (request), 4 GB (limit) | 2 |
| Sidekiq | 16 → 24 | 16 → 24 | 900m vCPU, 2 GB (request), 4 GB (limit) | 1 |
| Gitaly | 3 (no autoscaling) | not applicable | 15 vCPU, 62 GB (request and limit) | not applicable |
Gitaly notes: Git cgroups: 56 GB, Buffer: 6 GB. Repository cgroups set to 1. See Gitaly cgroups configuration for tuning guidance.
| Service | Configuration | GCP Equivalent | AWS Equivalent |
|---|---|---|---|
| PostgreSQL | 16 vCPU, 64 GB | n2-standard-16 | m6i.4xlarge |
| Redis - Cache | 2 vCPU, 8 GB | n2-standard-2 | m6i.large |
| Redis - Persistent | 2 vCPU, 8 GB | n2-standard-2 | m6i.large |
| Object Storage | Cloud provider service | Google Cloud Storage | Amazon S3 |
Target load: ≤500 RPS | Heavy overall load
Workload characteristics:
| Component | Per Pod Resources | Min Pods/Workers | Max Pods/Workers | Example Node Configuration |
|---|---|---|---|---|
| Webservice | 2 vCPU, 3 GB (request), 4 GB (limit) | 56 pods (112 workers) | 84 pods (168 workers) | GCP: 6 × n2-standard-32<br/>AWS: 6 × c6i.8xlarge |
| Sidekiq | 900m vCPU, 2 GB (request), 4 GB (limit) | 32 workers | 48 workers | GCP: 6 × n2-standard-8<br/>AWS: 6 × m6i.2xlarge |
| Gitaly | 31 vCPU, 126 GB (request and limit) | 3 pods | 3 pods | GCP: 3 × n2-standard-32<br/>AWS: 3 × m6i.8xlarge |
| Supporting | Variable per service | 12 vCPU, 48 GB | 12 vCPU, 48 GB | GCP: 3 × n2-standard-4<br/>AWS: 3 × c6i.xlarge |

| Component | Min → Max Pods | Min → Max Workers | Per Pod Resources | Workers per Pod |
|---|---|---|---|---|
| Webservice | 56 → 84 | 112 → 168 | 2 vCPU, 3 GB (request), 4 GB (limit) | 2 |
| Sidekiq | 32 → 48 | 32 → 48 | 900m vCPU, 2 GB (request), 4 GB (limit) | 1 |
| Gitaly | 3 (no autoscaling) | not applicable | 31 vCPU, 126 GB (request and limit) | not applicable |
Gitaly notes: Git cgroups: 120 GB, Buffer: 6 GB. Repository cgroups set to 1. See Gitaly cgroups configuration for tuning guidance.
| Service | Configuration | GCP Equivalent | AWS Equivalent |
|---|---|---|---|
| PostgreSQL | 32 vCPU, 128 GB | n2-standard-32 | m6i.8xlarge |
| Redis - Cache | 2 vCPU, 16 GB | n2-highmem-2 | r6i.large |
| Redis - Persistent | 2 vCPU, 16 GB | n2-highmem-2 | r6i.large |
| Object Storage | Cloud provider service | Google Cloud Storage | Amazon S3 |
Target load: ≤1000 RPS | Intensive overall load
Workload characteristics:
| Component | Per Pod Resources | Min Pods/Workers | Max Pods/Workers | Example Node Configuration |
|---|---|---|---|---|
| Webservice | 2 vCPU, 3 GB (request), 4 GB (limit) | 110 pods (220 workers) | 165 pods (330 workers) | GCP: 6 × n2-standard-64<br/>AWS: 6 × c6i.16xlarge |
| Sidekiq | 900m vCPU, 2 GB (request), 4 GB (limit) | 64 workers | 96 workers | GCP: 6 × n2-standard-16<br/>AWS: 6 × m6i.4xlarge |
| Gitaly | 63 vCPU, 254 GB (request and limit) | 3 pods | 3 pods | GCP: 3 × n2-standard-64<br/>AWS: 3 × m6i.16xlarge |
| Supporting | Variable per service | 24 vCPU, 96 GB | 24 vCPU, 96 GB | GCP: 3 × n2-standard-8<br/>AWS: 3 × c6i.2xlarge |

| Component | Min → Max Pods | Min → Max Workers | Per Pod Resources | Workers per Pod |
|---|---|---|---|---|
| Webservice | 110 → 165 | 220 → 330 | 2 vCPU, 3 GB (request), 4 GB (limit) | 2 |
| Sidekiq | 64 → 96 | 64 → 96 | 900m vCPU, 2 GB (request), 4 GB (limit) | 1 |
| Gitaly | 3 (no autoscaling) | not applicable | 63 vCPU, 254 GB (request and limit) | not applicable |
Gitaly notes: Git cgroups: 248 GB, Buffer: 6 GB. Repository cgroups set to 1. See Gitaly cgroups configuration for tuning guidance.
| Service | Configuration | GCP Equivalent | AWS Equivalent |
|---|---|---|---|
| PostgreSQL | 64 vCPU, 256 GB | n2-standard-64 | m6i.16xlarge |
| Redis - Cache | 2 vCPU, 16 GB | n2-highmem-2 | r6i.large |
| Redis - Persistent | 2 vCPU, 16 GB | n2-highmem-2 | r6i.large |
| Object Storage | Cloud provider service | Google Cloud Storage | Amazon S3 |
This section provides supplementary guidance for deploying and operating Cloud Native First architectures, including machine type selection, component-specific considerations, and scaling strategies.
The machine types shown are examples used in validation and testing. You can use:
Do not use burstable instance types due to inconsistent performance.
For more information, see supported machine types.
In Cloud Native First architectures, Gitaly runs on Kubernetes as StatefulSets with the following specifications:
Gitaly deployment mode:
By design, Gitaly (non-Cluster) on Kubernetes is a single point of failure for the repositories stored on each pod: data is sourced and served from a single instance per pod. Each Gitaly pod manages its own set of repositories, providing horizontal scaling of Git storage through repository distribution.
Gitaly Cluster (Praefect) is not supported in Cloud Native First architectures. For context on Gitaly deployment limitations in Kubernetes, see Gitaly on Kubernetes.
Repository distribution:
With multiple Gitaly storages configured (for example, `default`, `storage1`, `storage2`), GitLab defaults to creating all new repositories on the `default` storage. To distribute repositories across all Gitaly pods, configure storage weights to balance load.
For guidance on configuring repository storage weights, see configure where new repositories are stored.
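For illustration, multiple Gitaly storages can be declared in Helm values similar to the following sketch. The storage names are examples, and the key layout (assumed here to be `global.gitaly.internal.names`) should be verified against the GitLab Charts documentation. Storage weights themselves are configured in the GitLab Admin Area or through the API, not in Helm values.

```yaml
# Illustrative sketch: declare multiple internal Gitaly storages so that
# repositories can be distributed across Gitaly pods. Names are examples.
global:
  gitaly:
    internal:
      names:
        - default
        - storage1
        - storage2
```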
Gitaly uses cgroups to protect against resource exhaustion from individual Git operations. The default configuration sets repository cgroup count to 1, which provides a starting point that allows any single repository to use full pod resources through oversubscription.
However, this configuration may not be optimal for all workloads. For environments with many active repositories or specific resource isolation requirements, you should tune the cgroups configuration based on observed usage patterns. This includes adjusting repository cgroup counts and memory allocations.
For detailed guidance on measuring, tuning, and configuring Gitaly cgroups, see Gitaly cgroups.
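As a starting point, the default behavior described above might be expressed in Helm values along the lines of the following sketch, shown here with the S-size figures from the Gitaly notes (27 GB for Git cgroups, 3 GB buffer, repository cgroup count of 1). The key names under `gitlab.gitaly.cgroups` are assumptions; confirm them against the Gitaly chart and cgroups documentation before use.

```yaml
# Illustrative sketch of Gitaly cgroups settings for the S size.
# Verify key names and semantics against the Gitaly chart documentation.
gitlab:
  gitaly:
    cgroups:
      enabled: true
      memoryBytes: 28991029248     # 27 GiB for Git cgroups, leaving a ~3 GB buffer
      repositories:
        count: 1                   # one repository cgroup: any single repository may use full resources
        memoryBytes: 28991029248
```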
For large monorepos (over 2 GB) or intensive Git workloads, additional Gitaly adjustments may be required. See reference architecture sizing guide for detailed guidance.
All architectures use Kubernetes Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler to manage capacity:
Based on internal testing, minimum pod counts are set at approximately 2/3 of maximum to balance cost efficiency with performance reliability and to achieve the following goals:

If you have well-understood load patterns, you can adjust minimums based on your needs:
Cloud Native First architectures are designed to scale beyond their base specifications. You may need to adjust capacity if your environment has:
Scaling strategies differ by component type.
For increased capacity, scale horizontally by adjusting maximum replica counts and node pool capacity:
- Webservice: increase `maxReplicas` in Helm values and add corresponding nodes to the Webservice node pool.
- Sidekiq: increase `maxReplicas` to handle higher job throughput and add nodes to the Sidekiq node pool.

Horizontal scaling is the recommended approach for these stateless components.
For stateful components, increase instance or pod specifications:
By default, Sidekiq processes all job types in a single queue. For environments with diverse workload patterns, you can configure separate queues based on job characteristics:
Queue separation can improve job processing reliability and prevent low-priority jobs from blocking time-sensitive operations, particularly in larger environments (L, XL) with heavy automation workloads.
For more information about configuring Sidekiq queues, see processing specific job classes.
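For illustration, queue separation can be expressed in the chart by defining multiple Sidekiq deployments, each processing a subset of queues. The following sketch assumes the `gitlab.sidekiq.pods` list supported by the GitLab chart; the queue names and pod split are examples only, and the linked documentation describes the supported selectors and routing rules.

```yaml
# Illustrative sketch: split Sidekiq into two deployments with separate queues.
# Queue names and replica counts are examples only.
gitlab:
  sidekiq:
    pods:
      - name: urgent
        queues: urgent_cpu_bound,urgent_other
        minReplicas: 2
        maxReplicas: 4
      - name: catchall
        queues: default,mailers
        minReplicas: 6
        maxReplicas: 8
```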
GitLab Duo Agent Platform introduces additional infrastructure requirements beyond standard GitLab workloads. For detailed guidance on monitoring and scaling for Agent Platform adoption, see Scaling for GitLab Duo Agent Platform.
When scaling any component significantly:
For comprehensive scaling guidance, see scaling an environment.
Cloud Native First architectures can be deployed using Helm charts and external service providers directly or through the GitLab Environment Toolkit.
The GitLab Environment Toolkit provides automated deployment with:
For deployment instructions, see the GitLab Environment Toolkit documentation.
Prerequisites for manual deployment:
For detailed prerequisites and secret configuration, see GitLab chart prerequisites and configure secrets.
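For example, credentials for the external PostgreSQL instance are supplied as Kubernetes Secrets that the chart references from its values. The following sketch uses placeholder names and assumes the `global.psql` keys documented for external databases; adapt it to your environment and secret management practices.

```yaml
# Illustrative sketch: a Secret holding the external PostgreSQL password,
# together with the Helm values excerpt that references it. Names are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-postgres-password
type: Opaque
stringData:
  password: "replace-with-generated-password"
---
# Corresponding Helm values excerpt:
# global:
#   psql:
#     host: postgres.example.internal
#     database: gitlabhq_production
#     username: gitlab
#     password:
#       secret: gitlab-postgres-password
#       key: password
```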
For manual deployment using Helm charts, configure your Helm values for the target architecture size and deploy the release with `helm install`.

For detailed manual deployment steps, see installing GitLab on Kubernetes.
For complete Helm Chart configuration examples and detailed deployment guidance, see the GitLab Charts repository.
Key configuration areas for Cloud Native First architectures:
- Node pools and pod scheduling for each component type (`webservice`, `sidekiq`, `gitaly`, `support`)

For architecture-specific replica counts and resource values, refer to the specifications in each size section above.
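As an illustration of scheduling components onto dedicated node pools, per-component node selectors might look like the following sketch. The `workload` label and its values are placeholder node-pool labels, and the `nodeSelector` keys assume the standard chart layout.

```yaml
# Illustrative sketch: pin each component to its node pool with nodeSelector.
# The "workload" label and its values are example node-pool labels.
gitlab:
  webservice:
    nodeSelector:
      workload: webservice
  sidekiq:
    nodeSelector:
      workload: sidekiq
  gitaly:
    nodeSelector:
      workload: gitaly
```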
[!note] Cloud Native First architectures are in Beta. Specific Helm Chart configuration examples will be added to the Charts repository as the feature progresses toward General Availability. Use the specifications in each architecture size section above to construct your Helm values configuration.
After deployment, environments typically require monitoring and tuning to match actual workload patterns.
Reference architectures are starting points. Many environments benefit from adjustments based on:
See Advanced scaling for component-specific adjustment guidance.
You may want to configure additional optional features of GitLab depending on your requirements. See Steps after installing GitLab for more information.
[!note] Additional capacity may be required for optional features. See the feature-specific documentation for requirements.