documentation/docs/guides/tanzu-ai-services.md
import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';
VMware Tanzu Platform provides enterprise-managed LLM access through AI Services. goose connects to VMware Tanzu Platform as an OpenAI-compatible provider, supporting both single-model and multi-model service plans with streaming enabled by default.
Before you begin, make sure that:

- The genai service is available in the marketplace
- The Cloud Foundry CLI (cf) is installed and authenticated (cf login)

First, verify the genai service is available in your marketplace and review the available plans:
cf marketplace -e genai
You will see output similar to:
broker: genai-service
plan description free or paid
tanzu-Qwen3-Coder-30B-A3B-vllm-v1 Access to: Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8. free
tanzu-gpt-oss-120b-vllm-v1 Access to: openai/gpt-oss-120b. free
tanzu-all-models Access to: Qwen3.5-122B, Qwen3-Coder-30B, gpt-oss... free
Each plan corresponds to a different model or set of models. Single-model plans give access to one model. Multi-model plans (e.g., tanzu-all-models) give access to multiple models behind a single endpoint.
Create a service instance using a single-model plan:
cf create-service genai tanzu-Qwen3-Coder-30B-A3B-vllm-v1 my-qwen-coder --wait
Create a service instance using the multi-model plan:
cf create-service genai tanzu-all-models my-all-models --wait
Verify the instance was created:
cf services
Create a service key to generate API credentials:
cf create-service-key my-qwen-coder my-goose-key --wait
Then retrieve the credentials:
cf service-key my-qwen-coder my-goose-key
For a single-model plan, the output includes model metadata at the top level:
{
"credentials": {
"api_base": "https://genai-proxy.sys.example.com/tanzu-my-model-abc1234/openai",
"api_key": "eyJhbGciOi...",
"endpoint": {
"api_base": "https://genai-proxy.sys.example.com/tanzu-my-model-abc1234",
"api_key": "eyJhbGciOi...",
"config_url": "https://genai-proxy.sys.example.com/tanzu-my-model-abc1234/config/v1/endpoint",
"name": "tanzu-my-model-abc1234"
},
"model_capabilities": ["chat", "tools"],
"model_name": "Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8",
"wire_format": "openai"
}
}
For a multi-model plan, the output only contains the endpoint object:
{
"credentials": {
"endpoint": {
"api_base": "https://genai-proxy.sys.example.com/tanzu-all-models-abc1234",
"api_key": "eyJhbGciOi...",
"config_url": "https://genai-proxy.sys.example.com/tanzu-all-models-abc1234/config/v1/endpoint",
"name": "tanzu-all-models-abc1234"
}
}
}
From the service key output, you need two values from the credentials.endpoint object:
| Value | JSON Path | Example |
|---|---|---|
| Endpoint URL | credentials.endpoint.api_base | https://genai-proxy.sys.example.com/tanzu-my-model-abc1234 |
| API Key | credentials.endpoint.api_key | eyJhbGciOi... (JWT token) |
:::warning Use credentials.endpoint.api_base, not credentials.api_base
Single-model plans include a top-level credentials.api_base field that has an /openai suffix. Do not use this value. Always use credentials.endpoint.api_base (without /openai), because goose automatically appends the correct path.
Using the wrong value would produce a double-path URL like .../openai/openai/v1/chat/completions.
:::
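If you prefer to pull these two values out on the command line instead of copying them by hand, the following is a minimal sketch. It assumes jq is installed and that the JSON portion of the cf service-key output begins at the first line starting with `{`; adjust the sed expression if your cf CLI version formats the output differently.

```bash
# Print the endpoint URL and API key goose needs from the service key.
# sed skips the human-readable header lines cf prints before the JSON.
cf service-key my-qwen-coder my-goose-key | sed -n '/^{/,$p' \
  | jq -r '.credentials.endpoint.api_base, .credentials.endpoint.api_key'
```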
Configure goose with the two values from the service key: the credentials.endpoint.api_base URL and the credentials.endpoint.api_key JWT token. In the CLI, run goose configure and enter the endpoint URL for TANZU_AI_ENDPOINT and the API key for TANZU_AI_API_KEY when prompted. Alternatively, set the following environment variables before launching goose:
export TANZU_AI_ENDPOINT="https://genai-proxy.sys.example.com/tanzu-my-model-abc1234"
export TANZU_AI_API_KEY="eyJhbGciOi..."
Then start goose:
goose session
:::tip
Add these exports to your shell profile (~/.bashrc, ~/.zshrc, etc.) to persist them across sessions.
:::
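Before starting a session, you can double-check that both variables are visible to your current shell:

```bash
# Both variables should print non-empty values;
# TANZU_AI_ENDPOINT should not end in /openai
env | grep '^TANZU_AI_'
```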
goose dynamically fetches available models from your Tanzu endpoint. After configuring the provider, select a model from the list (e.g., Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8). To change models later, use Settings > Models > Switch models in Desktop, or run goose configure in the CLI.
:::note
Embedding-only models (e.g., nomic-ai/nomic-embed-text-v2-moe) will appear in the model list but cannot be used as a chat model.
:::
If requests fail with an authentication error, the API key is not being sent correctly. Common causes:

- Wrong api_base: Make sure you used credentials.endpoint.api_base (without /openai), not credentials.api_base.
- Stale or revoked credentials: Generate a new service key with cf create-service-key.

You can test connectivity with curl:
# Test model discovery
curl -H "Authorization: Bearer $TANZU_AI_API_KEY" \
"$TANZU_AI_ENDPOINT/openai/v1/models"
# Test chat completions
curl -X POST "$TANZU_AI_ENDPOINT/openai/v1/chat/completions" \
-H "Authorization: Bearer $TANZU_AI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"YOUR_MODEL_NAME","messages":[{"role":"user","content":"hello"}]}'
Streaming is enabled by default. If your endpoint does not support streaming, you can disable it by unchecking the Streaming checkbox in the provider configuration UI, or by setting the TANZU_AI_STREAMING environment variable to false.
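For example, to disable streaming for CLI sessions using the environment variable mentioned above:

```bash
# Disable streaming responses for this shell, then start goose as usual
export TANZU_AI_STREAMING=false
goose session
```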
If the model you selected returns an error, verify available models on your plan:
curl -H "Authorization: Bearer $TANZU_AI_API_KEY" \
"$TANZU_AI_ENDPOINT/openai/v1/models"
Ensure the model name matches exactly (including the prefix, e.g., Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8).
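If you only want the model identifiers, you can filter the response with jq. This is a sketch that assumes the endpoint returns the standard OpenAI-style model list (objects under data with an id field):

```bash
# Print only the model IDs from the /v1/models response
curl -s -H "Authorization: Bearer $TANZU_AI_API_KEY" \
  "$TANZU_AI_ENDPOINT/openai/v1/models" | jq -r '.data[].id'
```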
To remove a service instance and its keys:
cf delete-service-key my-qwen-coder my-goose-key -f
cf delete-service my-qwen-coder -f
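To confirm the cleanup, list the remaining service instances; the deleted instance should no longer appear:

```bash
cf services
```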