docs/agent/llm-integration.md
Multi-provider AI support, JSON handling, and prompt guidelines.
The backend uses LiteLLM to support multiple providers through a unified API:
| Provider | Type | Notes |
|---|---|---|
| Ollama | Local | Free, runs on your machine |
| OpenAI | Cloud | GPT-5 Nano, GPT-4o |
| Anthropic | Cloud | Claude Haiku 4.5 |
| Google Gemini | Cloud | Gemini 3 Flash |
| OpenRouter | Cloud | Access to multiple models |
| DeepSeek | Cloud | DeepSeek Chat |
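With LiteLLM, the provider is selected by a prefix on the model string. The identifiers below are illustrative examples of LiteLLM's naming convention, not necessarily the models this app configures by default:

```python
# Provider -> example LiteLLM model string (the prefix routes the request)
EXAMPLE_MODELS = {
    "ollama": "ollama/llama3",
    "openai": "gpt-4o",                      # OpenAI models need no prefix
    "gemini": "gemini/gemini-1.5-flash",
    "openrouter": "openrouter/meta-llama/llama-3-8b-instruct",
    "deepseek": "deepseek/deepseek-chat",
}
```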
API keys are passed directly to `litellm.acompletion()` via the `api_key` parameter (not via `os.environ`) to avoid race conditions in async contexts:
```python
# Correct
await litellm.acompletion(
    model=model,
    messages=messages,
    api_key=api_key,  # Direct parameter
)

# Incorrect - don't use os.environ in async code
os.environ["OPENAI_API_KEY"] = key  # Race condition risk
```
The `complete_json()` function automatically enables `response_format={"type": "json_object"}` for providers that support it.

JSON completions include two automatic retries with progressively lower temperature.
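The retry shape can be sketched as follows. This is a minimal illustration, not the actual implementation: `completion_fn` stands in for the real `litellm.acompletion` call, and the temperature schedule is an assumption:

```python
import asyncio
import json

async def complete_json(completion_fn, temps=(0.7, 0.3, 0.0)):
    """Retry a JSON completion with progressively lower temperature.

    Sketch only: `completion_fn` is an async callable standing in for the
    underlying LLM call; three temperatures = first attempt + 2 retries.
    """
    last_err = None
    for temp in temps:
        raw = await completion_fn(temperature=temp)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as e:
            last_err = e  # lower the temperature and try again
    raise ValueError("LLM returned invalid JSON") from last_err
```

Lowering the temperature on each retry nudges the model toward more deterministic, well-formed output.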
A robust bracket-matching algorithm in `_extract_json()` handles malformed responses, including the case where a `{` opens but its matching `}` is never found.

LLM functions log detailed errors server-side but return generic messages to clients:
```python
except Exception as e:
    logger.error(f"LLM completion failed: {e}")
    raise ValueError("LLM completion failed. Please check your API configuration.")
```
Add new prompt templates to `apps/backend/app/prompts/templates.py`. Use `{variable}` for substitution (single braces):

```python
IMPROVE_BULLET = """
Improve this resume bullet point for a {job_title} position.
Current: {current_bullet}
Output ONLY the improved bullet point, no explanations.
"""
```
Users configure their preferred AI provider via the `/settings` page, which calls `PUT /api/v1/config/llm-api-key`. The `/api/v1/health` endpoint validates LLM connectivity.

Note: Docker health checks must use `/api/v1/health` (not `/health`).
All LLM calls have configurable timeouts:
| Operation | Timeout |
|---|---|
| Health checks | 30s |
| Completions | 120s |
| JSON operations | 180s |
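One way to enforce the budgets above is `asyncio.wait_for`. The helper name and wiring here are illustrative, not the app's actual code:

```python
import asyncio

# Per-operation timeouts from the table above (seconds)
TIMEOUTS = {"health": 30.0, "completion": 120.0, "json": 180.0}

async def with_timeout(operation: str, coro):
    """Illustrative helper: cancel the awaited call if it exceeds
    the budget configured for this operation type."""
    return await asyncio.wait_for(coro, timeout=TIMEOUTS[operation])
```

`wait_for` cancels the underlying task on timeout and raises `asyncio.TimeoutError`, which the caller can map to a generic client-facing error.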
| File | Purpose |
|---|---|
| `apps/backend/app/llm.py` | LiteLLM wrapper with JSON mode |
| `apps/backend/app/prompts/templates.py` | Prompt templates |
| `apps/backend/app/prompts/enrichment.py` | Enrichment-specific prompts |
| `apps/backend/app/config.py` | Provider configuration |