docs/Azure-AI-Gateway.md
The Azure AI Gateway plugin enables Fabric to access multiple AI providers through a single Azure API Management (APIM) Gateway endpoint. This allows organizations using Azure APIM as a central AI gateway to leverage Fabric with any supported backend provider using a single subscription key.
Azure AI Gateway acts as a unified proxy that routes requests to different AI providers:

- AWS Bedrock (Anthropic Claude models)
- Azure OpenAI (GPT and other deployments)
- Google Vertex AI (Gemini models)
All backends share the same authentication mechanism (Azure APIM subscription key) and gateway endpoint, simplifying credential management and access control.
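As an illustration of what this buys you, here is a minimal sketch of the per-backend auth headers documented later in this page. The `auth_header` function is hypothetical, not Fabric's actual code; only the header names and backend identifiers come from this document.

```python
# Illustrative sketch: one APIM subscription key, three backend-specific
# auth headers. Header names follow this document; the helper itself is
# hypothetical, not part of the Fabric codebase.
def auth_header(backend: str, subscription_key: str) -> dict:
    if backend == "bedrock":
        return {"Authorization": f"Bearer {subscription_key}"}
    if backend == "azure-openai":
        return {"api-key": subscription_key}
    if backend == "vertex-ai":
        return {"x-goog-api-key": subscription_key}
    raise ValueError(f"unknown backend: {backend}")
```

Note that the same subscription key is sent in every case; only the header name changes per backend.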
Run `fabric --setup` and select `AzureAIGateway` from the vendor list.
You'll be prompted for:
- Backend Type (`backend`): `bedrock`, `azure-openai`, or `vertex-ai` (default: `bedrock`)
- Gateway URL (`gateway_url`): e.g. `https://gateway.company.com`
- Subscription Key (`subscription_key`): your Azure APIM subscription key
- API Version (`api_version`): Azure OpenAI backend only (default: `2025-04-01-preview`; e.g. `2024-08-01-preview`, `2024-10-21`, etc.)

Authentication: `Authorization: Bearer <subscription-key>`
API Format: Anthropic Messages API
Models:
```bash
fabric --listmodels
# Returns Claude models available via Bedrock:
# - us.anthropic.claude-3-5-sonnet-20241022-v2:0
# - us.anthropic.claude-3-5-haiku-20241022-v1:0
# - us.anthropic.claude-3-opus-20240229-v1:0
# - etc.
```
Endpoint Pattern: `/model/{model-id}/invoke`
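As a hedged sketch of what this pattern expands to, a client could assemble the full invoke URL like so (the `bedrock_invoke_url` helper is illustrative, not part of the plugin):

```python
# Hypothetical helper: build the Bedrock invoke URL from the gateway URL
# and a full Bedrock model ID, per the endpoint pattern above.
def bedrock_invoke_url(gateway_url: str, model_id: str) -> str:
    return f"{gateway_url.rstrip('/')}/model/{model_id}/invoke"
```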
Configuration:
```bash
fabric --setup
# Select: AzureAIGateway
# Backend: bedrock
# Gateway URL: https://gateway.company.com
# Subscription Key: your-apim-key
```
Authentication: `api-key: <subscription-key>`
API Format: OpenAI Chat Completions API
Models:
```bash
fabric --listmodels
# Returns Azure OpenAI deployment names:
# - gpt-4o
# - gpt-4o-mini
# - gpt-4-turbo
# - gpt-35-turbo
# - o1
# - o1-mini
# - DeepSeek-R1
```
Endpoint Pattern: `/openai/deployments/{deployment-name}/chat/completions?api-version={version}`
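A sketch of the full URL this pattern produces for a given deployment and API version (the `azure_openai_url` helper is illustrative, not part of the plugin; the default version matches this document):

```python
# Hypothetical helper: build the chat-completions URL for an Azure OpenAI
# deployment behind the gateway, per the endpoint pattern above.
def azure_openai_url(gateway_url: str, deployment: str,
                     api_version: str = "2025-04-01-preview") -> str:
    return (f"{gateway_url.rstrip('/')}/openai/deployments/"
            f"{deployment}/chat/completions?api-version={api_version}")
```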
Configuration:
```bash
fabric --setup
# Select: AzureAIGateway
# Backend: azure-openai
# Gateway URL: https://gateway.company.com
# Subscription Key: your-apim-key
# API Version: (press Enter for default 2025-04-01-preview)
```
Custom API Version:

```bash
# During setup, specify a custom version:
# API Version: 2024-10-21
```
Authentication: `x-goog-api-key: <subscription-key>`
API Format: Gemini API
Models:
```bash
fabric --listmodels
# Returns Gemini models:
# - gemini-2.0-flash-exp
# - gemini-1.5-pro
# - gemini-1.5-flash
# - gemini-pro
# - gemini-pro-vision
```
Endpoint Pattern: `/publishers/google/models/{model-id}:generateContent`
Note: The endpoint path differs from the direct Vertex AI API (`/v1beta/models/...`) because Azure APIM Gateway uses publisher-based routing.
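A sketch of the publisher-routed URL this pattern yields (the `vertex_generate_url` helper is illustrative, not part of the plugin):

```python
# Hypothetical helper: build the Vertex AI generateContent URL using the
# publisher-based routing described above.
def vertex_generate_url(gateway_url: str, model_id: str) -> str:
    return (f"{gateway_url.rstrip('/')}/publishers/google/models/"
            f"{model_id}:generateContent")
```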
Configuration:
```bash
fabric --setup
# Select: AzureAIGateway
# Backend: vertex-ai
# Gateway URL: https://gateway.company.com
# Subscription Key: your-apim-key
```
```bash
# Bedrock (Claude)
echo "Explain quantum computing" | fabric --model us.anthropic.claude-3-5-sonnet-20241022-v2:0 --pattern explain

# Azure OpenAI (GPT-4o)
echo "Explain quantum computing" | fabric --model gpt-4o --pattern explain

# Vertex AI (Gemini)
echo "Explain quantum computing" | fabric --model gemini-2.0-flash-exp --pattern explain

# Extract wisdom from a YouTube video (Bedrock)
fabric --youtube "https://youtube.com/watch?v=example" --model us.anthropic.claude-3-5-sonnet-20241022-v2:0 --pattern extract_wisdom

# Summarize an article (Azure OpenAI)
curl -s https://example.com/article | fabric --model gpt-4o --pattern summarize

# Create content from a prompt (Vertex AI)
fabric --model gemini-1.5-pro --pattern write_essay
```
```bash
# Reconfigure to use a different backend
fabric --setup
# Select: AzureAIGateway
# Change Backend from bedrock to azure-openai
```
Symptom: `Request failed with status 401`
Causes:
Solutions:
- Run `fabric --setup` and enter the new key

Symptom: `Request failed with status 404`
Causes:
Solutions:
- Verify the model name with `fabric --listmodels`
- Use the full model ID (e.g. `us.anthropic.claude-3-5-sonnet-20241022-v2:0`)

Symptom: `connection refused` or timeout errors
Causes:
Solutions:
- Test gateway connectivity: `curl -I https://gateway.company.com`

Symptom: `API version not supported` or `400 Bad Request`
Causes:
Solutions:
- Run `fabric --setup` and leave the API version empty

Symptom: Unexpected response format or empty responses
Causes:
Solutions:
- Run `fabric --setup` and select the correct backend type

Symptom: Request times out after 5 minutes
Causes:
Solutions:
Azure APIM Gateway doesn't support Server-Sent Events (SSE) pass-through for streaming responses. The plugin automatically falls back to buffered responses.
Impact: the `--stream` flag is ignored; the full response is returned after the model completes.
Workaround: None - this is an APIM Gateway architectural limitation.
Maximum request timeout is 300 seconds (5 minutes). Long-running model inference may timeout.
Solutions:
Model names must exactly match backend expectations:

- Bedrock: full model IDs (e.g. `us.anthropic.claude-3-5-sonnet-20241022-v2:0`)
- Vertex AI: exact model names (e.g. `gemini-2.0-flash-exp`)

Use `fabric --listmodels` to see available options for your configured backend.
The subscription key is stored in the Fabric configuration file (`~/.config/fabric/.env`).

The plugin rejects HTTP gateway URLs to prevent plaintext credential transmission. Your gateway URL must use HTTPS.
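A minimal sketch of what the HTTPS-only rule amounts to (the `is_valid_gateway_url` function is illustrative; the actual validation lives inside the plugin):

```python
from urllib.parse import urlparse

# Hypothetical check mirroring the plugin's HTTPS-only rule: reject any
# gateway URL whose scheme is not https.
def is_valid_gateway_url(url: str) -> bool:
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc)
```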
Responses are limited to 10MB to prevent memory exhaustion attacks. This is sufficient for all normal AI model responses.
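A sketch of what a capped read amounts to (the `read_capped` helper and `MAX_RESPONSE_BYTES` name are assumptions for illustration; only the 10MB figure comes from this document):

```python
import io

MAX_RESPONSE_BYTES = 10 * 1024 * 1024  # 10MB cap described in this document

# Hypothetical sketch of a capped read: stop at the limit instead of
# buffering an unbounded response body into memory.
def read_capped(stream, limit: int = MAX_RESPONSE_BYTES) -> bytes:
    data = stream.read(limit + 1)  # read one extra byte to detect overflow
    if len(data) > limit:
        raise ValueError("response exceeds size limit")
    return data
```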
To use multiple APIM gateways or backends, run `fabric --setup` and reconfigure when switching contexts.
Fabric configuration is stored in `~/.config/fabric/.env`. Manual editing is supported but not recommended:

```bash
# Example configuration
AZUREAIGATEWAY_BACKEND=bedrock
AZUREAIGATEWAY_GATEWAY_URL=https://gateway.company.com
AZUREAIGATEWAY_SUBSCRIPTION_KEY=your-key-here
AZUREAIGATEWAY_API_VERSION=2025-04-01-preview
```
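If you script around this file, a minimal parser sketch for these entries may help (Fabric reads the file itself; `parse_env` here is only illustrative):

```python
# Hypothetical .env parser: skip comments and blank lines, split each
# remaining line on the first "=". Only the AZUREAIGATEWAY_* key names
# come from this document.
def parse_env(text: str) -> dict:
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        cfg[key] = value
    return cfg
```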
For issues specific to the Azure AI Gateway plugin:
For APIM gateway configuration issues, consult: