# Docker Model Runner
Docker Model Runner makes it easy to manage, run, and deploy AI models using Docker. Designed for developers, Docker Model Runner streamlines the process of pulling, running, and serving large language models (LLMs) and other AI models directly from Docker Hub or any OCI-compliant registry.
Pull a model from Docker Hub:

```sh
docker model pull ai/llama3.2:3B-Q4_K_M
```
Then run an example eval:

```sh
npx promptfoo@latest eval -c https://raw.githubusercontent.com/promptfoo/promptfoo/main/examples/integration-docker/basic/promptfooconfig.comparison.simple.yaml
```
For an eval comparing several models with `llm-rubric` and similar assertions, see https://raw.githubusercontent.com/promptfoo/promptfoo/main/examples/integration-docker/basic/promptfooconfig.comparison.advanced.yaml.
The provider supports the following formats:

```
docker:chat:<model_name>
docker:completion:<model_name>
docker:embeddings:<model_name>
docker:embedding:<model_name> # Alias for embeddings
docker:<model_name> # Defaults to chat
```
Note: Both `docker:embedding:` and `docker:embeddings:` prefixes are supported for embedding models and will work identically.
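For example, an embedding model can back a `similar` assertion by overriding the embedding provider for your tests. A minimal sketch (the model name `ai/mxbai-embed-large` is illustrative; substitute any embedding-capable model you have pulled):

```yaml
defaultTest:
  options:
    provider:
      embedding:
        id: docker:embeddings:ai/mxbai-embed-large
tests:
  - vars:
      topic: container networking
    assert:
      - type: similar
        value: Docker bridge networks
        threshold: 0.75
```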
For a list of curated models on Docker Hub, visit the Docker Hub Models page.
Docker Model Runner can pull supported models from Hugging Face (i.e. models in GGUF format). For a complete list of all supported models on Hugging Face, visit this HF search page.
Use the same formats with an `hf.co/` prefix:

```
docker:chat:hf.co/<model_name>
docker:completion:hf.co/<model_name>
docker:embeddings:hf.co/<model_name>
docker:embedding:hf.co/<model_name> # Alias for embeddings
docker:hf.co/<model_name> # Defaults to chat
```
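For example, to evaluate a GGUF model from Hugging Face, reference its repository in the provider ID. A sketch (the repository name is illustrative; use any GGUF model Docker Model Runner supports):

```yaml
providers:
  - id: docker:hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```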
Configure the provider in your promptfoo configuration file:
```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: docker:ai/smollm3:Q4_K_M
    config:
      temperature: 0.7
```
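A complete minimal config might look like the following sketch (the prompt and test case are illustrative):

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
prompts:
  - 'Summarize in one sentence: {{text}}'
providers:
  - id: docker:ai/smollm3:Q4_K_M
    config:
      temperature: 0.7
tests:
  - vars:
      text: Docker Model Runner serves models through an OpenAI-compatible API.
```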
Supported environment variables:
- `DOCKER_MODEL_RUNNER_BASE_URL` - (optional) protocol, host name, and port. Defaults to `http://localhost:12434`. Set to `http://model-runner.docker.internal` when running within a container.
- `DOCKER_MODEL_RUNNER_API_KEY` - (optional) API key passed as the Bearer token in the `Authorization` header when calling the API. Defaults to `dmr` to satisfy OpenAI API validation (not used by Docker Model Runner).

Standard OpenAI parameters are supported:
| Parameter | Description |
| --- | --- |
| `temperature` | Controls randomness (0.0 to 2.0) |
| `max_tokens` | Maximum number of tokens to generate |
| `top_p` | Nucleus sampling parameter |
| `frequency_penalty` | Penalizes frequent tokens |
| `presence_penalty` | Penalizes new tokens based on presence |
| `stop` | Sequences where the API will stop generating |
| `stream` | Enable streaming responses |
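These parameters go under the provider's `config` block. A sketch with illustrative values:

```yaml
providers:
  - id: docker:ai/smollm3:Q4_K_M
    config:
      temperature: 0.2
      max_tokens: 256
      top_p: 0.9
      stop:
        - '###'
```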
Local models can be resource-intensive. If you run into timeouts or memory pressure, limit concurrency to one request at a time with `promptfoo eval -j 1`.