apps/opik-documentation/documentation/fern/docs-v2/self-host/backup.mdx
This guide covers the two backup options available for ClickHouse in Opik's Kubernetes deployment:
- SQL-based backup: ClickHouse's native `BACKUP` command with S3
- ClickHouse Backup Tool: the dedicated `clickhouse-backup` tool

ClickHouse backup is essential for data protection and disaster recovery. Opik provides two different approaches to handle backups, each with its own advantages.
This is the default backup method that uses ClickHouse's native BACKUP command to create backups directly to S3-compatible storage.
It uses the `BACKUP ALL EXCEPT DATABASE system` command.

Create a Kubernetes secret with your S3 credentials:
kubectl create secret generic clickhouse-backup-secret \
  --from-literal=access_key_id=YOUR_ACCESS_KEY \
  --from-literal=access_key_secret=YOUR_SECRET_KEY
Then configure the backup:
clickhouse:
  backup:
    enabled: true
    bucketURL: "https://your-bucket.s3.region.amazonaws.com"
    secretName: "clickhouse-backup-secret"
    schedule: "0 0 * * *"
For AWS EKS clusters, you can use IAM roles instead of access keys:
clickhouse:
  serviceAccount:
    create: true
    name: "opik-clickhouse"
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT:role/clickhouse-backup-role"
  backup:
    enabled: true
    bucketURL: "https://your-bucket.s3.region.amazonaws.com"
    schedule: "0 0 * * *"
Required IAM Policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": ["arn:aws:s3:::your-bucket", "arn:aws:s3:::your-bucket/*"]
    }
  ]
}
Trust Relationship Policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/oidc.eks.REGION.amazonaws.com/id/OIDCPROVIDERID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.REGION.amazonaws.com/id/OIDCPROVIDERID:sub": "system:serviceaccount:YOUR_NAMESPACE:opik-clickhouse",
          "oidc.eks.REGION.amazonaws.com/id/OIDCPROVIDERID:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
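The `:sub` condition in the trust policy has to match the namespace and service account name exactly, so it can help to print the expected value and compare it with what is deployed. The following is a minimal sketch, not part of the Opik chart: the function names `irsa_subject` and `service_account_role_arn` are hypothetical, and the second function assumes `kubectl` access to the cluster.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative assumptions, not Opik tooling.

# Print the subject string that the trust policy's ":sub" condition must equal.
irsa_subject() {
  local namespace="$1" service_account="$2"
  printf 'system:serviceaccount:%s:%s\n' "$namespace" "$service_account"
}

# Print the role ARN annotated on the service account (empty if none is set).
# Requires kubectl access to the cluster.
service_account_role_arn() {
  local namespace="$1" service_account="$2"
  kubectl get serviceaccount "$service_account" -n "$namespace" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}
```

For example, `irsa_subject YOUR_NAMESPACE opik-clickhouse` prints the value to place in the `StringEquals` condition, and `service_account_role_arn YOUR_NAMESPACE opik-clickhouse` should print the role ARN configured in the Helm values.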
You can customize the backup command if needed:
clickhouse:
  backup:
    enabled: true
    bucketURL: "https://your-bucket.s3.region.amazonaws.com"
    command:
      - /bin/bash
      - "-cx"
      - |-
        export backupname=backup$(date +'%Y%m%d%H%M')
        echo "BACKUP ALL EXCEPT DATABASE system TO S3('${CLICKHOUSE_BACKUP_BUCKET}/${backupname}/', '$ACCESS_KEY', '$SECRET_KEY');" > /tmp/backQuery.sql
        clickhouse-client -h clickhouse-opik-clickhouse --send_timeout 600000 --receive_timeout 600000 --port 9000 --queries-file=/tmp/backQuery.sql
The SQL-based backup:
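- Creates a timestamped backup name (format: `backupYYYYMMDDHHMM`)
- Runs the `BACKUP ALL EXCEPT DATABASE system TO S3(...)` command
- Backs up every database except the `system` database to S3

To restore from a SQL-based backup: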
# Connect to ClickHouse
kubectl exec -it deployment/clickhouse-opik-clickhouse -- clickhouse-client
# Restore from S3 backup
RESTORE ALL FROM S3('https://your-bucket.s3.region.amazonaws.com/backup202401011200/', 'ACCESS_KEY', 'SECRET_KEY');
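Because backup names follow the `backupYYYYMMDDHHMM` pattern, the restore statement can be generated from a bucket URL and a backup name rather than typed by hand. The sketch below is one possible way to do that; the function names are hypothetical, and it assumes the credentials are available in the `ACCESS_KEY` and `SECRET_KEY` environment variables, as in the backup command above.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not part of Opik or ClickHouse.

# Succeed only if the name matches the backupYYYYMMDDHHMM pattern (12 digits).
is_valid_backup_name() {
  [[ "$1" =~ ^backup[0-9]{12}$ ]]
}

# Print a RESTORE statement for the given bucket URL and backup name.
# Credentials are read from ACCESS_KEY and SECRET_KEY in the environment.
build_restore_query() {
  local bucket_url="${1%/}" backup_name="$2"   # drop one trailing slash, if any
  if ! is_valid_backup_name "$backup_name"; then
    echo "unexpected backup name: $backup_name" >&2
    return 1
  fi
  printf "RESTORE ALL FROM S3('%s/%s/', '%s', '%s');\n" \
    "$bucket_url" "$backup_name" "${ACCESS_KEY:-}" "${SECRET_KEY:-}"
}
```

Rejecting names that do not match the pattern guards against pasting a wrong or partial path into the statement. The printed statement can then be run in `clickhouse-client`.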
The ClickHouse Backup Tool provides more advanced backup features including compression, incremental backups, and better restore capabilities.
clickhouse:
  backupServer:
    enabled: true
    image: "altinity/clickhouse-backup:2.6.23"
    port: 7171
    env:
      LOG_LEVEL: "info"
      ALLOW_EMPTY_BACKUPS: true
      API_LISTEN: "0.0.0.0:7171"
      API_CREATE_INTEGRATION_TABLES: true
Set up S3 configuration for the backup tool:
clickhouse:
  backupServer:
    enabled: true
    env:
      S3_BUCKET: "your-backup-bucket"
      S3_ACCESS_KEY: "your-access-key" # Can be omitted when using an IAM role
      S3_SECRET_KEY: "your-secret-key" # Can be omitted when using an IAM role
      S3_REGION: "us-west-2"
      S3_ENDPOINT: "https://s3.us-west-2.amazonaws.com" # Optional: for S3-compatible storage
Use Kubernetes secrets for sensitive data (this step can be skipped when using IAM roles):
kubectl create secret generic clickhouse-backup-tool-secret \
  --from-literal=S3_ACCESS_KEY=YOUR_ACCESS_KEY \
  --from-literal=S3_SECRET_KEY=YOUR_SECRET_KEY
clickhouse:
  backupServer:
    enabled: true
    env:
      S3_BUCKET: "your-backup-bucket"
      S3_REGION: "us-west-2"
    envFrom:
      - secretRef:
          name: "clickhouse-backup-tool-secret"
# Port-forward to access the backup server
kubectl port-forward svc/chi-opik-clickhouse-cluster-0-0 7171:7171
# Create a backup
curl -X POST "http://localhost:7171/backup/create?name=backup-$(date +%Y%m%d-%H%M%S)"
# List available backups
curl "http://localhost:7171/backup/list"
# Upload backup to S3
curl -X POST "http://localhost:7171/backup/upload/backup-20240101-120000"
# Download backup from S3
curl -X POST "http://localhost:7171/backup/download/backup-20240101-120000"
# Restore backup
curl -X POST "http://localhost:7171/backup/restore/backup-20240101-120000"
You can create a custom CronJob to automate the backup tool:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: clickhouse-backup-tool-job
spec:
  schedule: "0 2 * * *" # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup-tool
              image: altinity/clickhouse-backup:2.6.23
              command:
                - /bin/bash
                - -c
                - |
                  BACKUP_NAME="backup-$(date +%Y%m%d-%H%M%S)"
                  curl -X POST "http://clickhouse-opik-clickhouse:7171/backup/create?name=$BACKUP_NAME"
                  sleep 30
                  curl -X POST "http://clickhouse-opik-clickhouse:7171/backup/upload/$BACKUP_NAME"
          restartPolicy: OnFailure
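The fixed `sleep 30` between create and upload may be too short for large datasets. One alternative is to poll the backup server until the create operation finishes. The sketch below assumes that `GET /backup/actions` returns one JSON object per line with a `status` field whose values include `in progress`, `success`, and `error`; verify this against the `clickhouse-backup` version you run. The function names and the `API_URL` variable are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GET /backup/actions returns JSON lines that contain a
# "status" field ("in progress", "success", "error"). Names are illustrative.

API_URL="${API_URL:-http://clickhouse-opik-clickhouse:7171}"

# Extract the "status" value from the last line of an actions response.
last_status() {
  printf '%s\n' "$1" | tail -n 1 \
    | sed -n 's/.*"status":[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll until the most recent operation reports success or error, or time out.
wait_for_backup() {
  local timeout_seconds="${1:-600}" interval=5 waited=0 status
  while (( waited < timeout_seconds )); do
    status="$(last_status "$(curl -fsS "$API_URL/backup/actions")")"
    case "$status" in
      success) return 0 ;;
      error)   echo "backup operation failed" >&2; return 1 ;;
    esac
    sleep "$interval"
    waited=$(( waited + interval ))
  done
  echo "timed out waiting for backup" >&2
  return 1
}
```

In the CronJob script, `wait_for_backup 1800` could replace `sleep 30`, so the upload only starts once the create step has reported success.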
| Feature | SQL-based Backup | ClickHouse Backup Tool |
|---|---|---|
| Setup Complexity | Simple | Moderate |
| Compression | No | Yes |
| Incremental Backups | No | Yes |
| Backup Validation | Basic | Advanced |
| REST API | No | Yes |
| Restore Flexibility | Basic | Advanced |
| Resource Usage | Low | Moderate |
| S3 Compatibility | Native | Native |
# Check backup job logs
kubectl logs -l app=clickhouse-backup
# Check CronJob status
kubectl get cronjobs
kubectl describe cronjob clickhouse-backup
# Test S3 connectivity
kubectl exec -it deployment/clickhouse-opik-clickhouse -- \
  clickhouse-client --query "SELECT * FROM system.disks WHERE name='s3'"
# Check backup server logs
kubectl logs -l app=clickhouse-backup-server
# Test API connectivity
kubectl port-forward svc/clickhouse-opik-clickhouse 7171:7171
curl "http://localhost:7171/backup/list"
Set up monitoring for backup operations:
# Example Prometheus alerts
- alert: ClickHouseBackupFailed
  expr: increase(kube_job_status_failed{job_name=~".*clickhouse-backup.*"}[5m]) > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: "ClickHouse backup job failed"
    description: "ClickHouse backup job {{ $labels.job_name }} has failed"
Enable the backup server:
clickhouse:
  backupServer:
    enabled: true
Create an initial backup with the tool.
Disable SQL-based backup:
clickhouse:
  backup:
    enabled: false
Disable backup server:
clickhouse:
  backupServer:
    enabled: false
Enable SQL-based backup:
clickhouse:
  backup:
    enabled: true
For additional help with ClickHouse backups: