docs/sources/setup/migrate/ssd-to-distributed/_index.md
This guide provides instructions for migrating from a simple scalable deployment (SSD) to a distributed microservices deployment of Loki. Before starting the migration, make sure you have read the considerations section.
{{< admonition type="note" >}} Simple Scalable Deployment (SSD) mode is being deprecated. The timeline for the deprecation is to be determined (TBD), but will happen before Loki 4.0 is released. You should plan to migrate from SSD to distributed before Loki 4.0 releases. {{< /admonition >}}
{{< admonition type="note" >}} This guide uses an AWS deployment as an example. However, the migration process is the same for other cloud providers, because no changes are required to the underlying object storage. {{< /admonition >}}
Migrating from a simple scalable deployment to a distributed deployment with zero downtime is possible but requires careful planning. The following considerations should be taken into account:
- If you run Loki with the pattern ingester enabled (`pattern_ingester` set to enabled), you will need to spin up pattern ingesters before shutting down the SSD ingesters. This is primarily needed for the Grafana Logs Drilldown feature.

Before starting the migration process, make sure you have the following prerequisites:
- kubectl installed and configured for the cluster running Loki.

This example uses the following SSD deployment as a reference:
{{< admonition type="note" >}}
This example is only a reference on the parameters that need to be changed. There will be other parameters within your own config such as limits_config, gateway, compactor, etc. These can remain the same.
{{< /admonition >}}
```yaml
---
loki:
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  storage_config:
    aws:
      region: eu-central-1
      bucketnames: aws-chunks-bucket
      s3forcepathstyle: false
  ingester:
    chunk_encoding: snappy
  ruler:
    enable_api: true
    storage:
      type: s3
      s3:
        region: eu-central-1
        bucketnames: aws-ruler-bucket
        s3forcepathstyle: false
    alertmanager_url: http://prom:9093
  querier:
    max_concurrent: 4

  storage:
    type: s3
    bucketNames:
      chunks: "aws-chunks-bucket"
      ruler: "aws-ruler-bucket"
    s3:
      region: eu-central-1

deploymentMode: SimpleScalable

# SSD
backend:
  replicas: 2
read:
  replicas: 3
write:
  replicas: 3

# Distributed Loki
ingester:
  replicas: 0
  zoneAwareReplication:
    enabled: false
querier:
  replicas: 0
  maxUnavailable: 0
queryFrontend:
  replicas: 0
  maxUnavailable: 0
queryScheduler:
  replicas: 0
distributor:
  replicas: 0
  maxUnavailable: 0
compactor:
  replicas: 0
indexGateway:
  replicas: 0
  maxUnavailable: 0
ruler:
  replicas: 0
  maxUnavailable: 0

# Single binary Loki
singleBinary:
  replicas: 0

minio:
  enabled: false
```
In this stage, you deploy the distributed Loki components alongside the SSD components and change the `deploymentMode` to `SimpleScalable<->Distributed`. The `SimpleScalable<->Distributed` migration mode allows for a zero-downtime transition between the Simple Scalable and fully distributed architectures. During the migration, both deployment types run simultaneously, sharing the same object storage backend.
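The switch between stages is driven by a single top-level Helm value; this fragment shows only that key, taken from the full values file used in this stage:

```yaml
deploymentMode: SimpleScalable<->Distributed
```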
The following table outlines which components take over the responsibilities of the SSD components:
| Simple Scalable Components | Distributed Components |
|---|---|
| write (Deployment) | Distributor + Ingester |
| read (StatefulSet) | Query Frontend + Querier |
| backend (StatefulSet) | Compactor + Ruler + Index Gateway |
How Loki handles request routing during the migration: the Gateway (nginx) routes requests based on the endpoint type:

- Write requests (`/loki/api/v1/push`) are routed to the write path.
- Read requests (`/loki/api/v1/query`) are routed to the read path.
To start the migration process, create a copy of your existing `values.yaml` file and name it `values-migration.yaml`:

```bash
cp values.yaml values-migration.yaml
```
Next, modify the following parameters: `deploymentMode`, `ingester`, and the component replica counts, based on the annotations below.
```yaml
---
loki:
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  storage_config:
    aws:
      region: eu-central-1
      bucketnames: aws-chunks-bucket
      s3forcepathstyle: false
  ingester:
    chunk_encoding: snappy
    # Add this to the ingester config; it forces ingesters to flush before shutting down
    wal:
      flush_on_shutdown: true
  ruler:
    enable_api: true
    storage:
      type: s3
      s3:
        region: eu-central-1
        bucketnames: aws-ruler-bucket
        s3forcepathstyle: false
    alertmanager_url: http://prom:9093
  querier:
    max_concurrent: 4

  storage:
    type: s3
    bucketNames:
      chunks: "aws-chunks-bucket"
      ruler: "aws-ruler-bucket"
    s3:
      region: eu-central-1

# Important: Make sure to change this to SimpleScalable<->Distributed
deploymentMode: SimpleScalable<->Distributed

# SSD
backend:
  replicas: 2
read:
  replicas: 3
write:
  replicas: 3

# Distributed Loki
# Spin up the distributed components
ingester:
  replicas: 3
  zoneAwareReplication:
    enabled: false
querier:
  replicas: 3
  maxUnavailable: 0
queryFrontend:
  replicas: 2
  maxUnavailable: 0
queryScheduler:
  replicas: 2
distributor:
  replicas: 2
  maxUnavailable: 0
compactor:
  replicas: 1
indexGateway:
  replicas: 2
  maxUnavailable: 0
ruler:
  replicas: 1
  maxUnavailable: 0

# Single binary Loki
singleBinary:
  replicas: 0

minio:
  enabled: false
```
Here is a breakdown of the changes:

- `ingester.wal.flush_on_shutdown: true`: forces the ingesters to flush before shutting down. This is important to prevent data loss.
- `deploymentMode: SimpleScalable<->Distributed`: allows the SSD and distributed components to run simultaneously.

Deploy the distributed components using the following command:

```bash
helm upgrade --values values-migration.yaml loki grafana/loki -n loki
```
{{< admonition type="caution" >}}
It is important to allow all components to fully spin up before proceeding to the next stage. You can check the status of the components using the following command:

```bash
kubectl get pods -n loki
```

Let all components reach the `Running` state before proceeding to the next stage.
{{< /admonition >}}
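Instead of polling the pod list manually, you can block until every pod reports Ready. This is a sketch, assuming the release runs in the `loki` namespace; adjust the namespace and timeout to your environment:

```shell
# Wait until every pod in the namespace is Ready, or fail after 10 minutes.
kubectl wait pods --all --for=condition=Ready -n loki --timeout=600s
```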
The final stage of the migration transitions all traffic to the distributed components. This is done by scaling down the SSD components and switching the `deploymentMode` to `Distributed`. To do this, create a copy of `values-migration.yaml` and name it `values-distributed.yaml`:

```bash
cp values-migration.yaml values-distributed.yaml
```
Next, modify the following parameters: `deploymentMode` and the component replica counts, based on the annotations below.
```yaml
---
loki:
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  storage_config:
    aws:
      region: eu-central-1
      bucketnames: aws-chunks-bucket
      s3forcepathstyle: false
  ingester:
    chunk_encoding: snappy
    wal:
      flush_on_shutdown: true
  ruler:
    enable_api: true
    storage:
      type: s3
      s3:
        region: eu-central-1
        bucketnames: aws-ruler-bucket
        s3forcepathstyle: false
    alertmanager_url: http://prom:9093
  querier:
    max_concurrent: 4

  storage:
    type: s3
    bucketNames:
      chunks: "aws-chunks-bucket"
      ruler: "aws-ruler-bucket"
    s3:
      region: eu-central-1

# Important: Make sure to change this to Distributed
deploymentMode: Distributed

# SSD
# Scale down the SSD components
backend:
  replicas: 0
read:
  replicas: 0
write:
  replicas: 0

# Distributed Loki
ingester:
  replicas: 3
  zoneAwareReplication:
    enabled: false
querier:
  replicas: 3
  maxUnavailable: 0
queryFrontend:
  replicas: 2
  maxUnavailable: 0
queryScheduler:
  replicas: 2
distributor:
  replicas: 2
  maxUnavailable: 0
compactor:
  replicas: 1
indexGateway:
  replicas: 2
  maxUnavailable: 0
ruler:
  replicas: 1
  maxUnavailable: 0

# Single binary Loki
singleBinary:
  replicas: 0

minio:
  enabled: false
```
Here is a breakdown of the changes:

- `deploymentMode: Distributed`: runs the distributed components on their own.
- `backend`, `read`, and `write` set to `replicas: 0`: scales down the SSD components.

Deploy the final configuration using the following command:

```bash
helm upgrade --values values-distributed.yaml loki grafana/loki -n loki
```
Once the deployment is complete, you can verify that all components are running using the following command:

```bash
kubectl get pods -n loki
```

You should see all distributed components running; the SSD components have now been removed.
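As an additional check, you can push a test log line through the gateway and query it back, exercising both the distributed write path (distributor, ingester) and read path (query frontend, querier). This is a hypothetical smoke test: it assumes the gateway service is named `loki-gateway`, that auth is disabled, and that no `X-Scope-OrgID` header is required; adjust these for your setup.

```shell
# Forward the gateway to localhost (assumes the service is named loki-gateway).
kubectl port-forward svc/loki-gateway 3100:80 -n loki &
sleep 2

# Push a test log line via the write path.
curl -s -X POST http://localhost:3100/loki/api/v1/push \
  -H "Content-Type: application/json" \
  -d "{\"streams\":[{\"stream\":{\"job\":\"migration-test\"},\"values\":[[\"$(date +%s)000000000\",\"smoke test\"]]}]}"

# Query the line back via the read path.
curl -s -G http://localhost:3100/loki/api/v1/query_range \
  --data-urlencode 'query={job="migration-test"}'
```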
Loki in distributed mode is inherently more complex than SSD mode. It is recommended to meta-monitor your Loki deployment to ensure that everything is running smoothly. You can do this by following the meta-monitoring guide.
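As a starting point for meta-monitoring, the Helm chart can also deploy the Loki canary, which continuously writes and reads test log lines to validate the cluster end to end. A minimal values fragment (check the values reference for your chart version):

```yaml
lokiCanary:
  enabled: true
```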