# LlamaIndex LLMs Integration: OVHcloud AI Endpoints
This integration allows you to use OVHcloud AI Endpoints with LlamaIndex. OVHcloud AI Endpoints provides OpenAI-compatible API endpoints for various models.
OVHcloud is a global player and the leading European cloud provider, operating over 450,000 servers within 40 data centers across 4 continents and serving 1.6 million customers in over 140 countries. Its AI Endpoints product offers access to a range of models with sovereignty, data privacy, and GDPR compliance.
## Installation

Install the required packages:

```bash
pip install llama-index llama-index-llms-ovhcloud
```
OVHcloud AI Endpoints can be used in two ways:

- **Free tier (with rate limits):** you can call the API without an API key, or with an empty string as the key. This provides free access, subject to rate limits.
- **With an API key:** for higher rate limits and production use, generate an API key from the OVHcloud manager.
## Usage

To use OVHcloud AI Endpoints with LlamaIndex, first initialize the LLM:

```python
from llama_index.llms.ovhcloud import OVHcloud

# Using with an API key
llm = OVHcloud(
    model="gpt-oss-120b",
    api_key="YOUR_API_KEY",  # or an empty string for the free tier (rate-limited)
)
```
You can find available models in the OVHcloud AI Endpoints catalog.
Generate a simple completion:

```python
response = llm.complete("The capital of France is")
print(response.text)
```
Use chat-style interactions:

```python
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are a helpful assistant"),
    ChatMessage(role="user", content="What is the capital of France?"),
]
response = llm.chat(messages)
print(response)
```
Stream completions in real time:

```python
# Streaming completion
response = llm.stream_complete("The capital of France is")
for r in response:
    print(r.delta, end="")

# Streaming chat
messages = [
    ChatMessage(role="system", content="You are a helpful assistant"),
    ChatMessage(role="user", content="What is the capital of France?"),
]
response = llm.stream_chat(messages)
for r in response:
    print(r.delta, end="")
```
You can dynamically fetch the list of available models:

```python
llm = OVHcloud(model="gpt-oss-120b")
available = llm.available_models  # List[Model], fetched dynamically
model_ids = [model.id for model in available]
print(f"Available models: {model_ids}")
```
For more information, see the OVHcloud AI Endpoints documentation.