Pro users get access to managed cloud services that work out of the box:
| Service | Description | Status |
|---|---|---|
| `/chat/completions` | LLM endpoint for AI features (summaries, notes, chat) | Available |
| `/mcp` | MCP server with web search and URL reading tools | Available |
Pro includes a curated set of AI models. Your requests are proxied through our servers with automatic API key management. If you would rather use a specific LLM provider, you can bring your own API key (BYOK) under Settings > Intelligence.
When you use Pro's curated intelligence, Char's server selects from these models automatically. You don't choose a specific model — the server decides which pool of models to use based on the type of request, then OpenRouter picks the fastest available model from that pool.
There are two pools of models, and the server picks one based on a single condition: does your request need tool calling?
If the desktop app sends tool definitions with the request (e.g., for web search or URL reading during note generation) and `tool_choice` is not set to `"none"`, the server uses the tool-calling model pool. This happens when the request includes the `exa-search` or `read-url` tools:

| Model | Provider |
|---|---|
| anthropic/claude-haiku-4.5 | Anthropic (via OpenRouter) |
| openai/gpt-oss-120b:exacto | OpenAI (via OpenRouter) |
| moonshotai/kimi-k2-0905:exacto | Moonshot AI (via OpenRouter) |
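The routing rule above can be sketched as follows. This is an illustrative reconstruction, not the actual `llm-proxy` code; the function and constant names are assumptions.

```rust
// Hypothetical sketch of the pool-selection condition (names are
// assumptions, not the real llm-proxy identifiers).

const TOOL_POOL: &[&str] = &[
    "anthropic/claude-haiku-4.5",
    "openai/gpt-oss-120b:exacto",
    "moonshotai/kimi-k2-0905:exacto",
];

const DEFAULT_POOL: &[&str] = &[
    "anthropic/claude-sonnet-4.5",
    "openai/gpt-5.2-chat",
    "moonshotai/kimi-k2-0905",
];

/// Pick the tool-calling pool only when the request carries tool
/// definitions and `tool_choice` is not `"none"`.
fn select_pool(has_tools: bool, tool_choice: Option<&str>) -> &'static [&'static str] {
    if has_tools && tool_choice != Some("none") {
        TOOL_POOL
    } else {
        DEFAULT_POOL
    }
}

fn main() {
    // Note generation with web-search tools attached:
    assert_eq!(select_pool(true, None), TOOL_POOL);
    // Plain chat completion, no tools:
    assert_eq!(select_pool(false, None), DEFAULT_POOL);
    // Tools present but explicitly disabled:
    assert_eq!(select_pool(true, Some("none")), DEFAULT_POOL);
}
```

Everything else about the request (messages, temperature, streaming) is passed through unchanged; only the model list differs between the two branches.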
For standard requests without tools — such as generating summaries, enhancing notes, or regular chat completions — the server uses the default model pool:
| Model | Provider |
|---|---|
| anthropic/claude-sonnet-4.5 | Anthropic (via OpenRouter) |
| openai/gpt-5.2-chat | OpenAI (via OpenRouter) |
| moonshotai/kimi-k2-0905 | Moonshot AI (via OpenRouter) |
Within each pool, you don't get a fixed model. All models in the pool are sent to OpenRouter, which picks the one with the lowest latency at that moment. This means the actual model serving your request can vary between calls — if Anthropic's endpoint is fastest right now, you'll get Claude; if OpenAI responds faster, you'll get GPT.
Here is the routing condition in the server — it checks whether the request includes tool definitions:
<GithubCode url="https://github.com/fastrepl/char/blob/main/crates/llm-proxy/src/handler/mod.rs#L177-L184" />

And here are the two model pools defined in the server config:
<GithubCode url="https://github.com/fastrepl/char/blob/main/crates/llm-proxy/src/config.rs#L43-L52" />

The server sends your request to OpenRouter with `provider.sort = "latency"` to pick the fastest available model:
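For illustration, a request routed through the default pool might look roughly like this. The shape follows OpenRouter's documented `models` fallback array and `provider` routing preference; the exact payload the proxy assembles may differ.

```json
{
  "models": [
    "anthropic/claude-sonnet-4.5",
    "openai/gpt-5.2-chat",
    "moonshotai/kimi-k2-0905"
  ],
  "provider": { "sort": "latency" },
  "messages": [
    { "role": "user", "content": "Summarize this meeting transcript..." }
  ],
  "stream": true
}
```

With `sort` set to `"latency"`, OpenRouter orders the listed models by current response latency and serves your request from the fastest one, which is why the answering model can change between calls.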
Sent to OpenRouter / model provider:

- temperature
- max_tokens
- stream

NOT sent to OpenRouter / model provider:
Char logs metadata about each LLM request to PostHog for usage tracking and billing. No message content is ever logged.
<GithubCode url="https://github.com/fastrepl/char/blob/main/crates/llm-proxy/src/analytics.rs#L31-L39" />

Logged: provider name, model name, token counts, latency, cost, HTTP status.

Not logged: message content, conversation history, user prompts.
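A metadata-only event of this kind can be sketched as a struct like the one below. The field names are assumptions for illustration, not the exact schema in `analytics.rs`; the point is that the event type has no field for message content at all.

```rust
// Illustrative sketch of a metadata-only analytics event.
// Field names are assumptions, not the real analytics.rs schema.

#[derive(Debug)]
struct LlmRequestEvent {
    provider: String,       // e.g. "anthropic"
    model: String,          // e.g. "anthropic/claude-sonnet-4.5"
    prompt_tokens: u32,
    completion_tokens: u32,
    latency_ms: u64,
    cost_usd: f64,
    http_status: u16,
    // Deliberately absent: messages, prompts, completions. With no place
    // to put content, it cannot be logged by accident.
}

fn main() {
    let event = LlmRequestEvent {
        provider: "anthropic".into(),
        model: "anthropic/claude-sonnet-4.5".into(),
        prompt_tokens: 1200,
        completion_tokens: 450,
        latency_ms: 900,
        cost_usd: 0.0123,
        http_status: 200,
    };
    // Only these numeric and identifying fields ever reach PostHog.
    assert_eq!(event.http_status, 200);
    println!("{event:?}");
}
```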
The MCP server provides two built-in tools:
- `exa-search`: Search the web via Exa and get page text and highlights in results. Useful for researching topics mentioned in your meetings.
- `read-url`: Visit any URL and return the content as markdown. Great for pulling in context from links shared during meetings.
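Both tools are invoked over the standard MCP `tools/call` method. A request might look roughly like this (the JSON-RPC envelope follows the MCP specification; the query is just an example):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "exa-search",
    "arguments": { "query": "quarterly roadmap planning" }
  }
}
```

The desktop app issues calls like this against the `/mcp` endpoint during note generation, which is also what triggers the tool-calling model pool described above.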
While Char aims to be fully transparent and controllable, cloud services help in two ways:
The cloud server (pro.hyprnote.com) is open-source and deployed in our Kubernetes cluster on AWS via GitHub Actions.
Data handling:
All requests are rate-limited and authenticated using your Pro subscription.
All Pro LLM requests go through OpenRouter, which routes to the actual model provider (OpenAI, Anthropic, Moonshot AI).
We have enabled Zero Data Retention (ZDR) on our OpenRouter account. This means all Pro requests are routed exclusively to endpoints that have a Zero Data Retention policy — model providers cannot store your prompts or completions, even temporarily.
| Policy | Details |
|---|---|
| Data retention | Zero — ZDR is enforced on our account, so only ZDR-compliant endpoints are used |
| Training | Does not train on API data |
| Compliance | SOC 2 |
| Data location | US (default) |
> "OpenRouter does not store your prompts or responses, unless you have explicitly opted in to prompt logging in your account settings."
Official docs: Privacy Policy · Data Collection · Logging Policies · Zero Data Retention Guide
If you prefer to run AI locally instead, see Local LLM Setup for LLMs and Local Models for speech-to-text.