# Cloudflare Workers AI (`provider-cloudflare/ai`)

Cloudflare Workers AI evaluation with OpenAI-compatible endpoints.
You can run this example with:

```sh
npx promptfoo@latest init --example provider-cloudflare/ai
cd provider-cloudflare/ai
```
## Environment variables

This example requires the following environment variables:

- `CLOUDFLARE_ACCOUNT_ID` - Your Cloudflare account ID (found in your Cloudflare dashboard)
- `CLOUDFLARE_API_KEY` - Your Cloudflare API key with Workers AI permissions

Set these in your environment:

```sh
export CLOUDFLARE_ACCOUNT_ID=your_account_id_here
export CLOUDFLARE_API_KEY=your_api_key_here
```
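With the credentials exported, a minimal `promptfooconfig.yaml` sketch shows the shape of a Cloudflare AI eval. This is illustrative only: the prompt and test variable are assumptions, while the provider IDs match the models used in this example.

```yaml
# Minimal sketch - not the full config shipped with this example
providers:
  - cloudflare-ai:chat:@cf/openai/gpt-oss-120b
  - cloudflare-ai:chat:@cf/meta/llama-4-scout-17b-16e-instruct

prompts:
  - 'Summarize {{topic}} in one sentence.'

tests:
  - vars:
      topic: edge computing
```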
## Example configurations

### Chat comparison (`chat_config.yaml`)

Compares the latest flagship chat models:
- `cloudflare-ai:chat:@cf/openai/gpt-oss-120b` - OpenAI's production, general-purpose, high-reasoning model
- `cloudflare-ai:chat:@cf/meta/llama-4-scout-17b-16e-instruct` - Meta's Llama 4 Scout with native multimodal capabilities

### Advanced chat parameters (`chat_advanced_configuration.yaml`)

Demonstrates OpenAI-compatible parameters with Cloudflare AI:
- `@cf/mistralai/mistral-small-3.1-24b-instruct` (enhanced vision understanding and 128K context)
- `max_tokens`, `temperature`, and `seed` parameters

### Embeddings (`embedding_configuration.yaml`)

Shows embedding generation and similarity testing:
- `@cf/meta/llama-3.3-70b-instruct-fp8-fast` (optimized 70B model with fp8 quantization)
- `@cf/google/embeddinggemma-300m` (state-of-the-art embedding model trained on 100+ languages)

## Provider types

Cloudflare AI supports three provider types:
- `cloudflare-ai:chat:model-name` - Conversational AI and instruction following
- `cloudflare-ai:completion:model-name` - Text completion and generation
- `cloudflare-ai:embedding:model-name` - Text embeddings for similarity and search

## Featured models

This example showcases the latest flagship models:
- `@cf/openai/gpt-oss-120b` - Production-ready, high reasoning capabilities
- `@cf/meta/llama-4-scout-17b-16e-instruct` - Multimodal with mixture-of-experts
- `@cf/meta/llama-3.3-70b-instruct-fp8-fast` - Speed-optimized 70B model
- `@cf/mistralai/mistral-small-3.1-24b-instruct` - Enhanced vision and 128K context
- `@cf/google/embeddinggemma-300m` - Multilingual embeddings (100+ languages)

## OpenAI-compatible parameters

All examples use Cloudflare's OpenAI-compatible endpoints, supporting standard parameters:
- `temperature` - Response randomness control
- `max_tokens` - Output length limits
- `top_p` - Nucleus sampling
- `frequency_penalty` - Repetition reduction
- `presence_penalty` - Topic diversity

## Local testing

For local testing with your development version:
```sh
npm run local -- eval -c examples/provider-cloudflare/ai/chat_config.yaml
```
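The embedding provider type described earlier can also back promptfoo's `similar` assertion, which scores an output against a reference text by embedding similarity. Below is a sketch under assumed values: the prompt, reference text, and threshold are illustrative, while the model IDs come from this example.

```yaml
# Sketch: embedding-backed similarity check (values are assumptions)
providers:
  - cloudflare-ai:chat:@cf/openai/gpt-oss-120b

prompts:
  - 'Describe {{topic}} in one sentence.'

tests:
  - vars:
      topic: serverless inference
    assert:
      - type: similar
        value: Running AI models close to users without managing servers
        threshold: 0.7
        provider: cloudflare-ai:embedding:@cf/google/embeddinggemma-300m
```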