This Helm chart deploys OpenViking on Kubernetes, providing a scalable and production-ready RAG (Retrieval-Augmented Generation) and semantic search service.
OpenViking is an open-source RAG and semantic search engine that serves as a Context Database MCP (Model Context Protocol) server. This chart deploys it to a Kubernetes cluster, with optional LoadBalancer annotations for GCP and AWS.
helm repo add openviking https://volcengine.github.io/openviking
helm repo update
# Clone the repository
git clone https://github.com/volcengine/OpenViking.git
cd OpenViking/deploy/helm
# Install with default values
helm install openviking ./openviking
# Install with custom values
helm install openviking ./openviking -f my-values.yaml
# GCP deployment
helm install openviking ./openviking \
  --set cloudProvider=gcp \
  --set openviking.config.embedding.dense.api_key=YOUR_API_KEY
# AWS deployment
helm install openviking ./openviking \
  --set cloudProvider=aws \
  --set openviking.config.embedding.dense.api_key=YOUR_API_KEY
The chart supports automatic LoadBalancer annotation configuration for major cloud providers:
| Provider | Configuration Value |
|---|---|
| Google Cloud Platform | cloudProvider: gcp |
| Amazon Web Services | cloudProvider: aws |
| Other/Generic | cloudProvider: "" (default) |
| Parameter | Description | Default |
|---|---|---|
| cloudProvider | Cloud provider for LoadBalancer annotations | "" |
| replicaCount | Number of replicas | 1 |
| image.repository | Container image repository | ghcr.io/astral-sh/uv |
| image.tag | Container image tag | python3.12-bookworm |
| service.type | Kubernetes service type | LoadBalancer |
| service.port | Service port | 1933 |
| openviking.config.server.api_key | API key for authentication | null |
| openviking.config.embedding.dense.api_key | Volcengine API key | null |
All OpenViking configuration options from ov.conf are available under openviking.config. See values.yaml for the complete default configuration.
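To make the mapping concrete, here is a minimal sketch of how the settings nested under `openviking.config` could end up as an `ov.conf` document. It assumes `ov.conf` is JSON (the secret example in this README uses a JSON literal) and that the subtree is serialized unchanged. `render_ov_conf` is a hypothetical helper written for illustration; it is not part of the chart or of OpenViking, and the real template may differ.

```python
import json


def render_ov_conf(values: dict) -> str:
    """Serialize the openviking.config subtree of a values dict as JSON.

    Hypothetical helper: the real chart may render ov.conf differently.
    """
    config = values.get("openviking", {}).get("config", {})
    return json.dumps(config, indent=2, sort_keys=True)


# Example values mirroring the parameters table (placeholder API key).
values = {
    "replicaCount": 1,
    "openviking": {
        "config": {
            "server": {"api_key": None},
            "embedding": {"dense": {"api_key": "your-api-key-here"}},
        }
    },
}

ov_conf = render_ov_conf(values)
print(ov_conf)
```

Only keys under `openviking.config` appear in the output; chart-level settings such as `replicaCount` stay out of the rendered file.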
The embedding service requires a Volcengine API key:
openviking:
  config:
    embedding:
      dense:
        api_key: "your-api-key-here"
        api_base: "https://ark.cn-beijing.volces.com/api/v3"
        model: "doubao-embedding-vision-251215"
For vision-language model support:
openviking:
  config:
    vlm:
      api_key: "your-api-key-here"
      api_base: "https://ark.cn-beijing.volces.com/api/v3"
      model: "doubao-seed-2-0-pro-260215"
By default, the chart uses emptyDir volumes for data storage. This suits development and testing, but the data is lost whenever a pod is deleted or rescheduled.
To enable persistent storage with PVC:
openviking:
  dataVolume:
    enabled: true
    usePVC: true
    size: 50Gi
    storageClassName: standard
    accessModes:
      - ReadWriteOnce
Enable API key authentication to secure your OpenViking server:
openviking:
  config:
    server:
      api_key: "your-secure-api-key"
      cors_origins:
        - "https://your-domain.com"
For production deployments, use Kubernetes secrets or external secret management:
# Create secret from literal
kubectl create secret generic openviking-config \
  --from-literal=ov.conf='{"server":{"api_key":"secret"}}'
# Or mount existing secret
helm install openviking ./openviking \
  --set existingSecret=openviking-config
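If you prefer to keep the secret as a manifest rather than create it imperatively, the sketch below builds an equivalent Secret object. It assumes the chart reads the `ov.conf` key from the secret named by `existingSecret`, as the commands above imply. `build_secret_manifest` is a hypothetical helper, and the `"secret"` API key is a placeholder.

```python
import base64
import json


def build_secret_manifest(name: str, ov_conf: dict) -> dict:
    """Build a Kubernetes Secret manifest that stores ov.conf.

    Values under `data` must be base64-encoded; `kubectl create secret
    generic --from-literal` performs the same encoding for you.
    """
    raw = json.dumps(ov_conf, separators=(",", ":")).encode("utf-8")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {"ov.conf": base64.b64encode(raw).decode("ascii")},
    }


manifest = build_secret_manifest(
    "openviking-config", {"server": {"api_key": "secret"}}
)
print(json.dumps(manifest, indent=2))
```

`kubectl apply -f` accepts JSON manifests, so the printed output can be saved to a file and applied directly. Avoid committing the resulting file, since base64 is an encoding and not encryption.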
Enable Horizontal Pod Autoscaler for production workloads:
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 80
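To get a feel for how these settings interact, the sketch below applies the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: `ceil(currentReplicas * currentUtilization / targetUtilization)` per metric, taking the largest proposal and clamping it to the replica bounds. It is a simplification that ignores the controller's tolerance band, stabilization windows, and pod readiness handling. The function names and utilization figures are made up for illustration.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Per-metric proposal: ceil(current_replicas * current / target)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)


def hpa_recommendation(current_replicas, metrics, min_replicas=2, max_replicas=10):
    """Take the largest per-metric proposal and clamp it to [min, max].

    `metrics` maps a metric name to (current_utilization, target_utilization),
    both in percent of the pod's resource request.
    """
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics.values()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


# Targets of 80% match the values shown above; utilization figures are invented.
scale_up = hpa_recommendation(2, {"cpu": (120, 80), "memory": (160, 80)})
scale_down = hpa_recommendation(4, {"cpu": (20, 80), "memory": (40, 80)})
capped = hpa_recommendation(4, {"cpu": (400, 80), "memory": (80, 80)})

print(scale_up, scale_down, capped)
```

Because utilization is measured against resource requests, autoscaling on CPU or memory only works when the pods declare requests for those resources.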
Default resource configuration:
resources:
  limits:
    cpu: 2000m
    memory: 4Gi
  requests:
    cpu: 500m
    memory: 1Gi
Adjust based on your workload requirements.
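When adjusting these numbers, it can help to estimate what the cluster must provide at full scale. The sketch below parses the default quantities and multiplies them by a replica count of 10 (the `maxReplicas` value from the autoscaling example). The parsers are deliberately minimal and only handle the `m` CPU suffix and binary memory suffixes; `parse_cpu` and `parse_memory` are hypothetical helpers, not a Kubernetes API.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity with a binary suffix (or plain bytes) to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


# Default chart resources and an assumed ceiling of 10 replicas.
requests = {"cpu": "500m", "memory": "1Gi"}
limits = {"cpu": "2000m", "memory": "4Gi"}
max_replicas = 10

total_request_cpu = parse_cpu(requests["cpu"]) * max_replicas
total_request_mem_gib = parse_memory(requests["memory"]) * max_replicas / 2**30
total_limit_cpu = parse_cpu(limits["cpu"]) * max_replicas
total_limit_mem_gib = parse_memory(limits["memory"]) * max_replicas / 2**30

print(f"requests at full scale: {total_request_cpu} CPU, {total_request_mem_gib} GiB")
print(f"limits at full scale:   {total_limit_cpu} CPU, {total_limit_mem_gib} GiB")
```

The scheduler places pods according to requests, so the request totals indicate the capacity the cluster needs to schedule every replica, while the limit totals bound worst-case consumption.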
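# Get the LoadBalancer IP
# (AWS load balancers typically report a hostname; use .hostname instead of .ip there)
export OPENVIKING_IP=$(kubectl get svc openviking -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# Create CLI configuration (the directory may not exist yet)
mkdir -p ~/.openviking
cat > ~/.openviking/ovcli.conf <<EOF
{
  "url": "http://$OPENVIKING_IP:1933",
  "api_key": null,
  "output": "table"
}
EOF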
# Test connection
openviking health
import openviking as ov
# Get service endpoint
# kubectl get svc openviking
client = ov.OpenViking(url="http://<load-balancer-ip>:1933", api_key="your-key")
client.initialize()
# Add a resource
client.add_resource(path="./document.pdf")
client.wait_processed()
# Search
results = client.find("your search query")
print(results)
client.close()
Check the pod logs:
kubectl logs -l app.kubernetes.io/name=openviking
Verify the configuration:
kubectl get secret openviking-config -o jsonpath='{.data.ov\.conf}' | base64 -d
Wait for the cloud provider to provision the load balancer:
kubectl get svc openviking -w
Check cloud provider-specific annotations in values.yaml.
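While the load balancer is still being provisioned, the service's `status.loadBalancer.ingress` list is absent, and once it is ready the entry may carry either an `ip` or a `hostname` depending on the provider. The sketch below shows one way to read the address from the output of `kubectl get svc openviking -o json`. The sample documents, addresses, and the `load_balancer_address` helper are invented for illustration.

```python
import json


def load_balancer_address(service: dict):
    """Return the first external IP or hostname of a Service, or None if pending."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None


# Sample documents shaped like `kubectl get svc openviking -o json` output.
pending = json.loads('{"status": {"loadBalancer": {}}}')
ip_style = json.loads(
    '{"status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}}}'
)
hostname_style = json.loads(
    '{"status": {"loadBalancer": {"ingress": [{"hostname": "lb.example.com"}]}}}'
)

for label, svc in [("pending", pending), ("ip", ip_style), ("hostname", hostname_style)]:
    print(label, load_balancer_address(svc))
```

A `None` result means the external address has not been assigned yet, so keep watching the service before concluding that the annotations are wrong.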
helm uninstall openviking
To remove persistent data (if PVC was enabled):
kubectl delete pvc openviking-data
Contributions are welcome! Please see the OpenViking repository for contribution guidelines.
This Helm chart is licensed under the Apache License 2.0, matching the OpenViking project license.