Back to Resume Matcher

LLM Integration Guide

docs/agent/llm-integration.md

1.2.03.0 KB
Original Source

LLM Integration Guide

Multi-provider AI support, JSON handling, and prompt guidelines.

Multi-Provider Support

Backend uses LiteLLM to support multiple providers through a unified API:

ProviderTypeNotes
OllamaLocalFree, runs on your machine
OpenAICloudGPT-5 Nano, GPT-4o
AnthropicCloudClaude Haiku 4.5
Google GeminiCloudGemini 3 Flash
OpenRouterCloudAccess to multiple models
DeepSeekCloudDeepSeek Chat

API Key Handling

API keys are passed directly to litellm.acompletion() via the api_key parameter (not via os.environ) to avoid race conditions in async contexts.

python
# Correct
await litellm.acompletion(
    model=model,
    messages=messages,
    api_key=api_key  # Direct parameter
)

# Incorrect - don't use os.environ in async code
os.environ["OPENAI_API_KEY"] = key  # Race condition risk

JSON Mode

The complete_json() function automatically enables response_format={"type": "json_object"} for providers that support it:

  • OpenAI
  • Anthropic
  • Gemini
  • DeepSeek
  • Major OpenRouter models

Retry Logic

JSON completions include 2 automatic retries with progressively lower temperature:

  • Attempt 1: temperature 0.1
  • Attempt 2: temperature 0.0

JSON Extraction

Robust bracket-matching algorithm in _extract_json() handles:

  • Malformed responses
  • Markdown code blocks
  • Edge cases
  • Infinite recursion protection when content starts with { but matching fails

Error Handling Pattern

LLM functions log detailed errors server-side but return generic messages to clients:

python
except Exception as e:
    logger.error(f"LLM completion failed: {e}")
    raise ValueError("LLM completion failed. Please check your API configuration.")

Adding Prompts

Add new prompt templates to apps/backend/app/prompts/templates.py.

Prompt Guidelines

  1. Use {variable} for substitution (single braces)
  2. Include example JSON schemas for structured outputs
  3. Keep instructions concise: "Output ONLY the JSON object, no other text"

Example

python
IMPROVE_BULLET = """
Improve this resume bullet point for a {job_title} position.

Current: {current_bullet}

Output ONLY the improved bullet point, no explanations.
"""

Provider Configuration

Users configure their preferred AI provider via:

  • Settings page: /settings
  • API: PUT /api/v1/config/llm-api-key

Health Checks

The /api/v1/health endpoint validates LLM connectivity.

Note: Docker health checks must use /api/v1/health (not /health).

Timeouts

All LLM calls have configurable timeouts:

OperationTimeout
Health checks30s
Completions120s
JSON operations180s

Key Files

FilePurpose
apps/backend/app/llm.pyLiteLLM wrapper with JSON mode
apps/backend/app/prompts/templates.pyPrompt templates
apps/backend/app/prompts/enrichment.pyEnrichment-specific prompts
apps/backend/app/config.pyProvider configuration