# OCI Raw API Integration (`browser_use/llm/oci_raw`)
This module provides direct integration with Oracle Cloud Infrastructure's Generative AI service using raw API calls, without Langchain dependencies.
## Installation

Make sure you have the required OCI dependencies installed:

```bash
pip install oci
```

## Basic Usage

```python
import asyncio

from browser_use import Agent
from browser_use.llm import ChatOCIRaw

# Configure the model
model = ChatOCIRaw(
    model_id="ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceya...",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.tenancy.oc1..aaaaaaaayeiis5uk2nuubznrekd...",
    provider="meta",  # or "cohere"
    temperature=1.0,
    max_tokens=600,
    top_p=0.75,
    auth_type="API_KEY",
    auth_profile="DEFAULT",
)

# Use with the browser-use Agent
agent = Agent(
    task="Search for Python tutorials and summarize them",
    llm=model,
)

# Run with asyncio
history = asyncio.run(agent.run())
```
## Provider-Specific Configuration

### Meta models

```python
meta_model = ChatOCIRaw(
    model_id="ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceya...",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.tenancy.oc1..aaaaaaaayeiis5uk2nuubznrekd...",
    provider="meta",  # Uses GenericChatRequest
    temperature=0.7,
    max_tokens=800,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    top_p=0.9,
)
```

### Cohere models

```python
cohere_model = ChatOCIRaw(
    model_id="ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceya...",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.tenancy.oc1..aaaaaaaayeiis5uk2nuubznrekd...",
    provider="cohere",  # Uses CohereChatRequest
    temperature=1.0,
    max_tokens=600,
    frequency_penalty=0.0,
    top_p=0.75,
    top_k=0,  # Cohere-specific parameter
)
```

### xAI models

```python
xai_model = ChatOCIRaw(
    model_id="ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceya...",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.tenancy.oc1..aaaaaaaayeiis5uk2nuubznrekd...",
    provider="xai",  # Uses GenericChatRequest
    temperature=1.0,
    max_tokens=20000,
    top_p=1.0,
    top_k=0,
)
```
## Structured Output

The integration supports structured output via Pydantic models:

```python
from pydantic import BaseModel

class SearchResult(BaseModel):
    title: str
    summary: str
    relevance_score: float

# Use structured output (must be awaited inside an async function)
response = await model.ainvoke(messages, output_format=SearchResult)
result = response.completion  # This is a SearchResult instance
```
## Supported Models

For the complete list of available models in Oracle Cloud Infrastructure Generative AI, refer to the official documentation: OCI Generative AI Pretrained Models.
**Important:** Only models that support tool calling (function calling) are compatible with browser-use. Tool calling is essential because the agent must call browser automation functions (click, type, scroll, etc.) to interact with web pages.

According to Oracle's documentation, tool calling functionality is available exclusively through the API and is not supported for browser-based use. This is not a problem here: when using browser-use with OCI models through this integration, tool calling happens at the application level via the API (not in the browser), so it remains compatible.
## Vision Support

Several OCI models support image processing, which is useful when browser-use needs to analyze webpage screenshots. These vision-enabled models are particularly helpful for tasks that require understanding page content visually, such as reading rendered text, locating UI elements, or verifying the state of a page after an action.
## Request Formats

Different model providers in OCI use different API request formats:

- **Meta and xAI models** use `GenericChatRequest` with a `messages` array; supported parameters: `temperature`, `max_tokens`, `frequency_penalty`, `presence_penalty`, `top_p`.
- **Cohere models** use `CohereChatRequest` with a single `message` string; supported parameters: `temperature`, `max_tokens`, `frequency_penalty`, `top_p`, `top_k`.

The integration automatically detects the correct format based on the `provider` parameter and handles the conversion transparently, as sketched below.
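For reference, here is a minimal sketch of the two request shapes built directly with the `oci` Python SDK. The class names come from the `oci.generative_ai_inference.models` package; verify them against your installed SDK version:

```python
from oci.generative_ai_inference.models import (
    ChatDetails,
    CohereChatRequest,
    GenericChatRequest,
    OnDemandServingMode,
    TextContent,
    UserMessage,
)

# Generic format (Meta, xAI): a list of role-tagged messages
generic_request = GenericChatRequest(
    messages=[UserMessage(content=[TextContent(text="Hello!")])],
    temperature=0.7,
    max_tokens=800,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    top_p=0.9,
)

# Cohere format: a single message string
cohere_request = CohereChatRequest(
    message="Hello!",
    temperature=1.0,
    max_tokens=600,
    frequency_penalty=0.0,
    top_p=0.75,
    top_k=0,
)

# Either request is then wrapped in ChatDetails before calling client.chat(...)
details = ChatDetails(
    compartment_id="ocid1.tenancy.oc1..aaaaaaaayeiis5uk2nuubznrekd...",
    serving_mode=OnDemandServingMode(model_id="ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceya..."),
    chat_request=generic_request,
)
```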
## Authentication

The integration supports multiple OCI authentication methods:

- `API_KEY`: Uses API key authentication (default)
- `INSTANCE_PRINCIPAL`: Uses instance principal authentication
- `RESOURCE_PRINCIPAL`: Uses resource principal authentication
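These map to standard OCI SDK signers. A minimal sketch of how `auth_type` and `auth_profile` could translate into a client (the `make_client` helper is hypothetical, for illustration only):

```python
import oci
from oci.generative_ai_inference import GenerativeAiInferenceClient

def make_client(service_endpoint: str, auth_type: str = "API_KEY", auth_profile: str = "DEFAULT"):
    # Hypothetical helper mirroring the auth_type/auth_profile parameters above
    if auth_type == "API_KEY":
        config = oci.config.from_file(profile_name=auth_profile)  # reads ~/.oci/config
        return GenerativeAiInferenceClient(config, service_endpoint=service_endpoint)
    if auth_type == "INSTANCE_PRINCIPAL":
        signer = oci.auth.signers.InstancePrincipalsSecurityTokenSigner()
    elif auth_type == "RESOURCE_PRINCIPAL":
        signer = oci.auth.signers.get_resource_principals_signer()
    else:
        raise ValueError(f"Unsupported auth_type: {auth_type}")
    return GenerativeAiInferenceClient({}, signer=signer, service_endpoint=service_endpoint)
```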
## Configuration Parameters

- `model_id`: The OCID of your OCI GenAI model
- `service_endpoint`: The OCI service endpoint URL
- `compartment_id`: The OCID of your OCI compartment
- `provider`: Model provider (`"meta"`, `"cohere"`, or `"xai"`)
- `temperature`: Response randomness (0.0-2.0)
- `max_tokens`: Maximum number of tokens in the response
- `top_p`: Top-p (nucleus) sampling parameter
- `frequency_penalty`: Frequency penalty for repetition
- `presence_penalty`: Presence penalty for repetition
- `top_k`: Top-k sampling parameter (used by Cohere models)

## Error Handling

The integration provides proper error handling with specific exception types:

- `ModelRateLimitError`: For rate limiting (429 errors)
- `ModelProviderError`: For other API errors (4xx, 5xx)
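These exceptions can be caught to implement retry logic. A minimal sketch (the import path `browser_use.llm.exceptions` and the `invoke_with_retry` helper are assumptions for illustration):

```python
import asyncio

from browser_use.llm.exceptions import ModelRateLimitError  # assumed import path

async def invoke_with_retry(model, messages, retries: int = 3):
    # Hypothetical helper: retry 429s with exponential backoff;
    # other ModelProviderError instances (4xx/5xx) propagate to the caller.
    for attempt in range(retries):
        try:
            return await model.ainvoke(messages)
        except ModelRateLimitError:
            if attempt == retries - 1:
                raise
            await asyncio.sleep(2 ** attempt)
```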
## Comparison with Langchain Integration

| Feature | OCI Raw API | Langchain Integration |
|---|---|---|
| Dependencies | OCI SDK only | Langchain + OCI SDK |
| Performance | Direct API calls | Additional abstraction layer |
| Control | Full control over requests | Limited by Langchain interface |
| Updates | Direct OCI SDK updates | Dependent on Langchain updates |
| Complexity | Lower complexity | Higher complexity |
## Response Format

The OCI GenAI API returns responses in this format:
```json
{
  "chat_response": {
    "api_format": "GENERIC",
    "choices": [
      {
        "finish_reason": "stop",
        "index": 0,
        "message": {
          "content": [
            {
              "text": "Response text here",
              "type": "TEXT"
            }
          ],
          "role": "ASSISTANT"
        }
      }
    ],
    "usage": {
      "completion_tokens": 18,
      "prompt_tokens": 38,
      "total_tokens": 56
    }
  }
}
```
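For illustration, pulling the text and token usage out of that structure via the SDK's response objects might look like this (attribute names mirror the JSON fields above; `response` is assumed to come from `GenerativeAiInferenceClient.chat(...)`):

```python
chat_response = response.data.chat_response  # return value of GenerativeAiInferenceClient.chat(...)

# First choice, first content block: "Response text here"
text = chat_response.choices[0].message.content[0].text

usage = chat_response.usage
print(f"{usage.prompt_tokens} prompt + {usage.completion_tokens} completion = {usage.total_tokens} tokens")
```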
## Troubleshooting

Make sure to set the correct `provider` parameter:

- `provider="meta"` for Meta Llama models
- `provider="cohere"` for Cohere models
- `provider="xai"` for xAI models

To debug requests, enable verbose logging by setting the `verbose` parameter to `True` (not implemented in this version, but it can be added).
## Contributing

When contributing to this module, follow the existing patterns for provider-specific request handling and error mapping.