# LiteLLM Health Check Client

A health check tool for testing all configured models on a LiteLLM proxy. It tests each model with a completion or embedding request and reports health status, errors, and response times.
## Option 1: Fetch models from the proxy API

```bash
export LITELLM_BASE_URL="https://litellm.example.com"
export LITELLM_API_KEY="your-api-key"
python scripts/health_check/health_check_client.py
```
## Option 2: Use a YAML config file

```bash
export LITELLM_BASE_URL="https://litellm.example.com"
export LITELLM_API_KEY="your-api-key"
export LITELLM_MODELS_YAML="/path/to/config.yaml"
python scripts/health_check/health_check_client.py
```
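For reference, the file pointed to by `LITELLM_MODELS_YAML` follows the LiteLLM proxy config format with a `model_list`. A minimal illustrative sketch — the model names and the optional `mode` hint are placeholders, not values this tool requires:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
  - model_name: text-embedding-3-small
    litellm_params:
      model: openai/text-embedding-3-small
    model_info:
      mode: embedding  # marks this entry as an embedding model
```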
## Option 3: Use a custom authentication header

```bash
export LITELLM_BASE_URL="https://litellm.example.com"
export LITELLM_API_KEY="your-api-key"
export LITELLM_CUSTOM_AUTH_HEADER="x-custom-auth-header"
python scripts/health_check/health_check_client.py
```
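To illustrate the custom-header behavior, here is a hedged sketch (not the script's actual code) of how the request headers end up: the API key is sent as `Bearer <api_key>` under either the default `Authorization` header or the name given in `LITELLM_CUSTOM_AUTH_HEADER`:

```python
import os

def build_auth_headers() -> dict:
    """Build request headers, honoring LITELLM_CUSTOM_AUTH_HEADER if set."""
    api_key = os.environ["LITELLM_API_KEY"]
    # Default to the standard Authorization header; use the custom
    # header name instead when LITELLM_CUSTOM_AUTH_HEADER is set.
    header_name = os.environ.get("LITELLM_CUSTOM_AUTH_HEADER", "Authorization")
    return {header_name: f"Bearer {api_key}"}
```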
## Docker

Build the image:

```bash
docker build -f docker/Dockerfile.health_check -t litellm/litellm-health-check:latest .
```

Run the container:

```bash
docker run --rm \
  -e LITELLM_BASE_URL="https://litellm.example.com" \
  -e LITELLM_API_KEY="your-api-key" \
  litellm/litellm-health-check:latest
```

Run with a custom authentication header:

```bash
docker run --rm \
  -e LITELLM_BASE_URL="https://litellm.example.com" \
  -e LITELLM_API_KEY="your-api-key" \
  -e LITELLM_CUSTOM_AUTH_HEADER="x-custom-auth-header" \
  litellm/litellm-health-check:latest
```
## Parallel health checks

Run multiple health check containers in parallel.

**PowerShell:**

```powershell
$env:LITELLM_BASE_URL="https://litellm.example.com"
$env:LITELLM_API_KEY="your-api-key"
.\scripts\health_check\run_parallel_health_checks.ps1 16
```

**Bash:**

```bash
export LITELLM_BASE_URL="https://litellm.example.com"
export LITELLM_API_KEY="your-api-key"
./scripts/health_check/run_parallel_health_checks.sh 16
```

**With a custom auth header (PowerShell):**

```powershell
$env:LITELLM_BASE_URL="https://litellm.example.com"
$env:LITELLM_API_KEY="your-api-key"
$env:LITELLM_CUSTOM_AUTH_HEADER="x-custom-auth-header"
.\scripts\health_check\run_parallel_health_checks.ps1 16
```

**With a custom Docker image (PowerShell):**

```powershell
$env:LITELLM_BASE_URL="https://litellm.example.com"
$env:LITELLM_API_KEY="your-api-key"
$env:LITELLM_CUSTOM_AUTH_HEADER="x-custom-auth-header"
.\scripts\health_check\run_parallel_health_checks.ps1 -NumParallelJobs 16 -ImageName "your-registry/your-image:tag"
```

**Bash with a custom image:**

```bash
export LITELLM_BASE_URL="https://litellm.example.com"
export LITELLM_API_KEY="your-api-key"
export LITELLM_CUSTOM_AUTH_HEADER="x-custom-auth-header"
./scripts/health_check/run_parallel_health_checks.sh 16 "your-registry/your-image:tag"
```
## Environment variables

- `LITELLM_BASE_URL` (required): Base URL of the LiteLLM proxy, e.g. `https://litellm.example.com`
- `LITELLM_API_KEY` (required): API key for authentication
- `LITELLM_CUSTOM_AUTH_HEADER` (optional): Custom header name for authentication. Defaults to the `Authorization` header; e.g. `x-custom-auth-header` (the API key will be sent as `Bearer <api_key>` in this header)
- `LITELLM_MODELS_YAML` (optional): Path to a YAML config file with a `model_list`, e.g. `/path/to/config.yaml`
- `LITELLM_TIMEOUT` (optional): Request timeout in seconds (default: 120)
- `LITELLM_COMPLETION_PROMPT` (optional): Test prompt for chat/completion models (default: ~100k characters)
- `LITELLM_EMBEDDING_TEXT` (optional): Test text for embedding models (default: ~100k characters)
- `LITELLM_JSON_OUTPUT` (optional): Output results as JSON (default: false)

## Script arguments

PowerShell (`run_parallel_health_checks.ps1`):

- `-NumParallelJobs` (optional): Number of parallel containers to run (default: 16)
- `-ImageName` (optional): Docker image to use (default: `litellm/litellm-health-check:latest`)
- `-ContainerRuntime` (optional): Container runtime to use (default: `docker`)

Bash (`run_parallel_health_checks.sh`):

- `[num_parallel_jobs]` (optional): Number of parallel containers to run (default: 16)
- `[image_name]` (optional): Docker image to use (default: `litellm/litellm-health-check:latest`)
- `[container_runtime]` (optional): Container runtime to use (default: `docker`)

## Example output
```
============================================================
Starting health check queries
---- gpt-4o ----
✅ Success. Response:
This is a test
---- text-embedding-3-small ----
✅ Success. Generated embedding vector with 1536 dimensions.
---- gpt-5-codex ----
❌ ERROR: HTTP 503: Service unavailable
============================================================
Health Check Summary
============================================================
Total models: 47
Healthy: 45
Unhealthy: 2
============================================================
```
Exit code: 0 if all models are healthy, 1 if any models are unhealthy.
When `LITELLM_JSON_OUTPUT=true`, the script outputs JSON:

```json
{
  "gpt-4o": {
    "model": "gpt-4o",
    "healthy": true,
    "error": null,
    "response_time_ms": 245.67,
    "mode": "chat",
    "response_text": "This is a test"
  },
  "text-embedding-3-small": {
    "model": "text-embedding-3-small",
    "healthy": true,
    "error": null,
    "response_time_ms": 123.45,
    "mode": "embedding",
    "dimensions": 1536
  }
}
```
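The JSON form is convenient for post-processing. As a hypothetical example (not part of the tool), this snippet filters the result map down to unhealthy models and reproduces the exit-code convention, using the field names shown above:

```python
import json

def summarize(results: dict) -> int:
    """Return 0 if every model is healthy, 1 otherwise, printing failures."""
    unhealthy = {name: r for name, r in results.items() if not r["healthy"]}
    for name, r in unhealthy.items():
        print(f"{name}: {r['error']}")
    print(f"Total: {len(results)}, healthy: {len(results) - len(unhealthy)}, "
          f"unhealthy: {len(unhealthy)}")
    return 1 if unhealthy else 0

# Sample shaped like the JSON output above:
sample = json.loads('''
{
  "gpt-4o": {"model": "gpt-4o", "healthy": true, "error": null},
  "gpt-5-codex": {"model": "gpt-5-codex", "healthy": false,
                  "error": "HTTP 503: Service unavailable"}
}
''')
exit_code = summarize(sample)
```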
## How it works

- **Model discovery:** if `LITELLM_MODELS_YAML` is set, models are read from the YAML config file; otherwise the proxy is queried via `/v1/models` (OpenAI-compatible) or `/model/info` to get all configured models.
- **Mode detection:** uses the `mode` field from the YAML config, or falls back to model name patterns (`embedding`, `embed`, `text-embedding`).
- **Chat models:** tested with `POST /v1/chat/completions` and a configurable prompt (default: "Say this is a test").
- **Embedding models:** tested with `POST /v1/embeddings` and configurable text (default: "This is a test for vectorization.").

## Scheduled runs

Run as a cron job or scheduled task:

```bash
# Cron job: run every 5 minutes
*/5 * * * * /path/to/health_check.sh
```
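The name-pattern fallback for mode detection mentioned above can be sketched like this — an illustration of the documented patterns, not the script's exact implementation:

```python
# Substrings that mark a model as an embedding model when no explicit
# `mode` is configured; checked case-insensitively.
EMBEDDING_PATTERNS = ("text-embedding", "embedding", "embed")

def infer_mode(model_name: str) -> str:
    """Fall back to name patterns when the config supplies no `mode` field."""
    name = model_name.lower()
    if any(pattern in name for pattern in EMBEDDING_PATTERNS):
        return "embedding"
    return "chat"
```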
Run multiple health checks in parallel:

**PowerShell:**

```powershell
# Using the default image
.\scripts\health_check\run_parallel_health_checks.ps1 16

# Using a custom image
.\scripts\health_check\run_parallel_health_checks.ps1 -NumParallelJobs 16 -ImageName "your-registry/your-image:tag"
```

**Bash:**

```bash
# Using the default image
./scripts/health_check/run_parallel_health_checks.sh 16

# Using a custom image
./scripts/health_check/run_parallel_health_checks.sh 16 "your-registry/your-image:tag"
```
Add to your deployment pipeline:

```yaml
# GitHub Actions example
- name: Health Check
  run: |
    docker run --rm \
      -e LITELLM_BASE_URL="${{ secrets.LITELLM_BASE_URL }}" \
      -e LITELLM_API_KEY="${{ secrets.LITELLM_API_KEY }}" \
      litellm/litellm-health-check:latest
```
Deploy as a Kubernetes CronJob:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: litellm-health-check
spec:
  schedule: "*/5 * * * *"  # Every 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: health-check
              image: litellm/litellm-health-check:latest
              env:
                - name: LITELLM_BASE_URL
                  value: "https://litellm.example.com"
                - name: LITELLM_API_KEY
                  valueFrom:
                    secretKeyRef:
                      name: litellm-secrets
                      key: api-key
          restartPolicy: OnFailure
```
## Troubleshooting

- Check that `LITELLM_BASE_URL` is correct
- Check that the `LITELLM_MODELS_YAML` path is correct
- Increase `LITELLM_TIMEOUT` for slower models (default is 120s)
- Check that `LITELLM_API_KEY` is correct

## License

Same as the LiteLLM project.