ui/docs/PRD.md
PrivateGPT is an open-source local AI API project. Its value proposition is not local inference itself, but the higher-level application layer built on top of any OpenAI-compatible local inference backend.
PrivateGPT aims to provide a local implementation of the capabilities developers and users expect from modern Claude-style APIs and applications, including:
The demonstrator UI, tentatively called PrivateGPT Workbench, exists to make these API capabilities tangible.
For non-technical users, it should show PrivateGPT as:
A free, local AI assistant I can use to query documents, knowledge bases, websites, CSVs, and databases without relying on a cloud API key.
For developers, it should show PrivateGPT as:
A local Claude-compatible API layer I can build applications on top of.
This UI should remain a lightweight demonstrator, not the main product. It now lives inside the PrivateGPT repository under ./ui and must not become a heavy frontend application or a maintenance burden.
The API contract is defined by the Fern-generated OpenAPI file, located relative to the repository root at:
./fern/openapi/openapi.json
From the ./ui directory, the same file resolves as:
../fern/openapi/openapi.json
The Fern-generated OpenAPI file is the primary source of truth for:
The implementation must inspect and follow the Fern-generated OpenAPI schema rather than relying only on this PRD's examples. Examples in this PRD are illustrative and should be corrected wherever the API contract differs.
Important current endpoints include:
POST /v1/messages
POST /v1/messages/count_tokens
POST /v1/messages/validate
GET /v1/models
POST /v1/artifacts/ingest
GET /v1/artifacts/list?collection=<collection>
POST /v1/artifacts/delete
POST /v1/artifacts/content
POST /v1/artifacts/chunked-content
POST /v1/primitives/search
POST /v1/tools/semantic-search
POST /v1/tools/tabular-data-analysis
POST /v1/tools/database-query
POST /v1/tools/web-fetch
POST /v1/tools/web-search
The implementation should treat POST /v1/messages as the central endpoint. Most of the product experience should flow through chat, with Context providing the inputs and Debugger explaining the underlying API interactions.
Before implementing request builders, parse or manually inspect ../fern/openapi/openapi.json and align all payloads with the current schemas, especially:
ChatBody
MessageInput
ToolSpecBody
ContextFilter
FileArtifact
SqlDatabaseArtifact
McpServerConfig
The product and implementation requirements in this PRD should be paired with the visual direction in the style guide, located relative to the repository root at:
./ui/docs/STYLE_GUIDE.md
From the ./ui directory, that file resolves as:
./docs/STYLE_GUIDE.md
That style guide includes repo-local reference images copied from the current brand/UI explorations:
./ui/references/primary-chat-layout.png
./ui/references/search-overlay.png
./ui/references/chat-tools-composer.png
./ui/references/context-knowledge-base.png
Relative to the style guide in ./ui/docs, those files resolve as:
../references/primary-chat-layout.png
../references/search-overlay.png
../references/chat-tools-composer.png
../references/context-knowledge-base.png
Use the style guide as the source of truth for layout, glass surfaces, background treatment, sidebar behavior, chat composer treatment, Context rows, and Debugger visual density.
PrivateGPT Workbench is a lightweight demonstrator UI for the PrivateGPT API. It should prove PrivateGPT's value as a local Claude-compatible AI application backend while staying simple enough to live inside the PrivateGPT repo as a non-core demo.
The app is not intended to become a full product, admin console, or design-system-heavy frontend. It is a practical local UI for trying the API, showing non-technical users what PrivateGPT can do, and helping developers understand how to build on top of it.
localStorage.
Recommended initial implementation:
ui/
  index.html
Single-file static app containing:
Browser storage:
localStorage for persistent app state.
Default PrivateGPT API base URL:
http://127.0.0.1:8001
Allow user override from inside the web application. Users must not need to edit config files to point the demonstrator at a different PrivateGPT deployment.
Connection settings should include:
If auth is configured, requests should include:
Authorization: Basic <base64(username:password)>
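As an illustration of the header above, here is a minimal sketch of how Workbench might build it. The function name buildBasicAuthHeader is hypothetical, and treating an empty username as "no auth configured" is an assumption rather than something the API contract states.

```javascript
// Hypothetical helper: returns the Authorization header value, or null when
// no username is configured (assumption: empty username means auth is off).
function buildBasicAuthHeader(username, password) {
  if (!username) return null;
  // btoa() only accepts Latin-1 input, so encode the credentials as UTF-8
  // bytes first and convert those bytes to a binary string.
  const bytes = new TextEncoder().encode(`${username}:${password ?? ""}`);
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return `Basic ${btoa(binary)}`;
}
```

Encoding through TextEncoder keeps passwords with non-ASCII characters from throwing inside btoa.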
There are two separate connection concepts:
PrivateGPT API URL and auth
http://127.0.0.1:8001.
LLM Gateway URL and auth
http://127.0.0.1:11434, but that belongs to PrivateGPT backend configuration, not Workbench.
All Workbench API calls should target the configured PrivateGPT API base URL. Do not call the LLM gateway directly from the browser UI.
For development and automated testing, a working PrivateGPT deployment may be available through local environment variables:
PGPT_BASE_URL
PGPT_TOKEN
The implementation may read these at runtime in local test scripts or dev-server setup, but must never store, print, commit, or hardcode the actual values.
Browser/CORS behavior should prioritize the local case first. The app should work when served locally against a local PrivateGPT API. If deployed elsewhere, it should still allow the user to configure the API URL and credentials, but any required cross-origin server policy must be handled by the PrivateGPT deployment.
The app has a persistent left sidebar and a main content area.
Sidebar
  Context
  New Chat
  Chats
    Contract review
    CSV analysis
    Database demo
    Custom tool test
  API Debugger
  Settings
  GitHub
  Not for Production
Main
  If first launch or onboarding restarted:
    Guided onboarding overlay
      Step 1: URL + collection + live checks
      Step 2: Optional look-and-feel customization
  If Context selected:
    Context configuration screen
  If Settings selected:
    PrivateGPT API connection settings and assistant behavior
  If API Debugger selected:
    Session-level API request/response trace
  If Chat selected:
    Chat interface
Store in localStorage:
{
  privateGptBaseUrl: string,
  privateGptUsername: string,
  privateGptPassword: string,
  systemPrompt: string,
  useCitations: boolean,
  selectedModel: string | null,
  uiAppearance: {
    brief: string,
    brandName: string,
    welcomeTitle: string,
    welcomeSubtitle: string,
    customInstructions: string,
    palette: {
      accent: string,
      secondary: string,
      surface: string,
      background: string
    },
    features: {
      databases: boolean,
      web: boolean,
      mcp: boolean,
      skills: boolean,
      customTools: boolean,
      apiDebugger: boolean,
      github: boolean,
      productionNotice: boolean
    }
  },
  onboarding: {
    completed: boolean,
    step: 1 | 2,
    appearanceSkipped: boolean,
    lastCheck: {
      ok: boolean,
      testedAt: string,
      summary: string,
      steps: Array<{ label: string, detail: string, ok: boolean | null }>
    } | null
  },
  context: {
    documents: {
      defaultCollection: string
    },
    databases: DatabaseConfig[],
    mcpServers: McpServerConfig[],
    skills: SkillConfig[],
    customTools: CustomToolConfig[]
  },
  chats: ChatSession[],
  activeChatId: string | null
}
Chat session:
type ChatSession = {
  id: string;
  title: string;
  createdAt: string;
  updatedAt: string;
  messages: ChatMessage[];
  settings: {
    enabledDocuments: boolean;
    enabledDatabases: string[];
    enabledWeb: boolean;
    enabledMcpServers: string[];
    enabledSkills: string[];
    enabledCustomTools: string[];
    model: string | null;
  };
};
Do not store debugger data.
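The persistence rules above could be sketched roughly as follows. The storage key, the debuggerEvents field name, and the reduced set of defaults are all assumptions made for illustration; the storage object is passed in so the same code can run against window.localStorage or a test double.

```javascript
// Hypothetical storage key; the PRD does not name one.
const STORAGE_KEY = "privategpt-workbench-state";

// Only a subset of the state model is shown here to keep the sketch short.
function defaultState() {
  return {
    privateGptBaseUrl: "http://127.0.0.1:8001",
    privateGptUsername: "",
    privateGptPassword: "",
    systemPrompt: "",
    useCitations: true,
    selectedModel: null,
    chats: [],
    activeChatId: null,
  };
}

// Merge saved values over defaults; fall back to defaults on corrupt JSON.
function loadState(storage) {
  try {
    const raw = storage.getItem(STORAGE_KEY);
    const saved = raw ? JSON.parse(raw) : {};
    return { ...defaultState(), ...saved };
  } catch {
    return defaultState();
  }
}

// Strip the (hypothetical) in-memory debuggerEvents field before saving,
// so debugger data never reaches localStorage.
function saveState(storage, state) {
  const { debuggerEvents, ...persistable } = state;
  storage.setItem(STORAGE_KEY, JSON.stringify(persistable));
}
```

In the browser this would be called as loadState(window.localStorage) at startup and saveState(window.localStorage, state) after each state change.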
On reload:
Sidebar items:
Context
New Chat
New chat.
Chat List
updatedAt DESC.
API Debugger
Settings
GitHub
Not for Production
No projects or grouping.
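The updatedAt DESC ordering for the Chat List could be sketched as below. The function name is hypothetical, and the sketch assumes updatedAt holds a timestamp string that Date.parse understands, such as ISO 8601.

```javascript
// Return a new array with the most recently updated chat first.
// The input array is copied so stored state is not mutated by sorting.
function sortChatsForSidebar(chats) {
  return [...chats].sort(
    (a, b) => Date.parse(b.updatedAt) - Date.parse(a.updatedAt)
  );
}
```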
The Settings screen owns Workbench-level configuration that is not part of assistant context.
Settings include:
The PrivateGPT API settings should not live in the Context screen. Context is for sources/tools the assistant can use; Settings is for how Workbench connects to PrivateGPT.
The system prompt also belongs in Settings because it controls global assistant behavior. Send it through the top-level system field in ChatBody, not as a system role message. If the user leaves it empty, do not send system prompt text. A system object may still be sent without text when needed for request-level options such as citations.enabled.
The Use citations toggle controls whether document-enabled chats request citation-annotated answers. It should be enabled by default.
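A minimal sketch of how the top-level system field might be assembled from these two settings follows. The function name and the documentsEnabled parameter are hypothetical, and any other system options defined by the OpenAPI schema are left out.

```javascript
// Build the top-level `system` field for ChatBody, or undefined when there
// is nothing to send. Prompt text is only included when non-empty; the
// citations option can be present even without prompt text.
function buildSystemField({ systemPrompt, useCitations, documentsEnabled }) {
  const system = {};
  const text = (systemPrompt ?? "").trim();
  if (text) system.text = text;
  if (useCitations && documentsEnabled) {
    system.citations = { enabled: true };
  }
  return Object.keys(system).length > 0 ? system : undefined;
}
```

Returning undefined lets the caller omit the field entirely, since JSON.stringify drops undefined properties.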
The Clear local data action removes this Workbench's saved chats, settings, credentials, context, and preferences from browser localStorage. It must not imply deletion of data stored in PrivateGPT itself, such as ingested documents or backend configuration.
On first launch, Workbench should open an onboarding overlay before normal use.
Step 1 requirements:
GET /v1/models
GET /v1/artifacts/list?collection=<collection>
GET /v1/skills?collection=<collection>
Step 2 requirements:
POST /v1/messages to generate a starting appearance proposal.
The sidebar should include a persistent Not for Production disclosure below the GitHub widget.
Clicking it opens a closable glass-style modal titled:
This demonstrator is not intended for Production use
The modal should explain that Workbench is useful for trying API capabilities, debugging requests, and exploring local AI workflows, but should not be published as a production application.
The disclosure should cover four concise risks:
localStorage.
End the modal with a commercial Zylon CTA:
Zylon to https://zylon.ai.
book a demo with our team to https://cal.com/zylon/demo?source=privategptui.
Zylon is an enterprise AI platform delivering on-premise generative AI infrastructure for regulated industries, enabling secure deployment without external cloud dependencies.
The Context screen defines what the assistant can use.
Sections:
Documents
Databases
Web
MCP
Skills
Custom Tools
A compact tab or accordion layout is acceptable.
Purpose: manage ingested local knowledge.
Capabilities:
POST /v1/artifacts/ingest.
GET /v1/artifacts/list?collection=<collection>.
POST /v1/artifacts/delete.
POST /v1/tools/semantic-search.
POST /v1/artifacts/content.
The collection field is important and must be user-configurable. Some PrivateGPT deployments create collections on the fly, while others sit behind a gateway that restricts each bearer token to one or more allowed collections. If a deployment enforces allowed collections, ingest/list/search calls must use one of those allowed collection ids or the API may reject the request.
The Collection field lives in Settings, not in the Documents panel. It applies globally and is used for:
There is one active collection for the Workbench. Do not expose a separate chat-level collection selector in v1.
Upload behavior:
file_name.
Example ingest body:
{
  "artifact": "contract-2026-05-14",
  "collection": "default",
  "input": {
    "type": "file",
    "value": "<base64>"
  },
  "metadata": {
    "file_name": "contract.pdf"
  }
}
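A body of this shape could be produced from an uploaded file roughly as follows. Both function names are hypothetical, and the field layout simply mirrors the example above; the Fern-generated OpenAPI schema remains the authority.

```javascript
// Convert raw bytes to base64. btoa() expects a binary string, so each
// byte is mapped to one character first.
function bytesToBase64(bytes) {
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

// Assemble an ingest body shaped like the example in this PRD.
function buildIngestBody({ artifact, collection, fileName, bytes }) {
  return {
    artifact,
    collection,
    input: { type: "file", value: bytesToBase64(bytes) },
    metadata: { file_name: fileName },
  };
}

// In the browser, the bytes would typically come from a File object:
//   const bytes = new Uint8Array(await file.arrayBuffer());
```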
When a chat has Documents enabled, the /v1/messages call should enable the semantic search tool and scope it to the configured Documents collection.
Use this request pattern:
{
  "model": "default",
  "messages": [
    {
      "role": "user",
      "content": "Find the property address in the documents. Answer just with the address, no extra text."
    }
  ],
  "tools": [
    {
      "name": "semantic_search",
      "type": "semantic_search_v1"
    }
  ],
  "tool_context": [
    {
      "type": "ingested_artifact",
      "context_filter": {
        "collection": "<configured-collection>",
        "artifacts": []
      }
    }
  ]
}
An empty artifacts array means search all documents within the configured collection.
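The request pattern above could be generated with a small builder like this one. The function name is hypothetical and the sketch covers only the Documents case; other tools and context would be merged in separately.

```javascript
// Build a /v1/messages body that enables semantic search over one collection.
// An empty `artifacts` array searches every document in that collection.
function buildDocumentsChatBody({ model, messages, collection, artifacts = [] }) {
  return {
    model: model ?? "default",
    messages,
    tools: [{ name: "semantic_search", type: "semantic_search_v1" }],
    tool_context: [
      {
        type: "ingested_artifact",
        context_filter: { collection, artifacts },
      },
    ],
  };
}
```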
Purpose: define SQL database artifacts available to chat.
Stored locally only.
Fields:
id
name
connection_string
description
schemas, optional comma-separated list
ssl
enable_tables
enable_views
enable_functions
enable_procedures
When selected in chat, convert database configs into tool_context artifacts:
{
  "type": "sql_database",
  "connection_string": "...",
  "schemas": null,
  "ssl": false,
  "enable_tables": true,
  "enable_views": true,
  "enable_functions": true,
  "enable_procedures": true,
  "description": "Local sales database"
}
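The conversion from a locally stored database config to this artifact might look like the sketch below. The function names are hypothetical. Sending schemas as an array of names (or null when the field is empty) is an assumption, since the example above only shows null; the same goes for omitting the local-only id and name fields.

```javascript
// Turn "public, sales" into ["public", "sales"]; empty input becomes null.
function parseSchemas(value) {
  const names = (value ?? "")
    .split(",")
    .map((name) => name.trim())
    .filter(Boolean);
  return names.length > 0 ? names : null;
}

// Map a stored DatabaseConfig to a sql_database tool_context artifact.
// The enable_* flags default to true when the config leaves them unset.
function toSqlDatabaseArtifact(config) {
  return {
    type: "sql_database",
    connection_string: config.connection_string,
    schemas: parseSchemas(config.schemas),
    ssl: Boolean(config.ssl),
    enable_tables: config.enable_tables ?? true,
    enable_views: config.enable_views ?? true,
    enable_functions: config.enable_functions ?? true,
    enable_procedures: config.enable_procedures ?? true,
    description: config.description ?? "",
  };
}
```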
Purpose: explain web capabilities and let chat-level Tools decide whether to use them.
Do not collect web provider names, API keys, or extra web configuration in Workbench.
The current OpenAPI exposes:
POST /v1/tools/web-search
POST /v1/tools/web-fetch
The web search provider and its credentials belong in PrivateGPT backend config. In Workbench, show static explanatory text in Context > Web and let the chat Tools menu decide whether web_search and web_extract are included in a chat request. The direct diagnostic endpoint may still be named /v1/tools/web-fetch; for /v1/messages, use the chat tool spec { "name": "web_extract", "type": "web_extract_v1" }.
Purpose: configure MCP connectors.
Fields:
id
name
server_config_json
allowed_tools, optional list
The OpenAPI supports mcp_servers on ChatBody. The UI should allow raw JSON editing initially to avoid over-designing unknown MCP variants.
Example UI:
Purpose: configure available skills.
Use the backend skills API scoped to the single active Workbench collection.
Fields:
id
display_title
collection
latest_version
source
loading
readonly
Operations:
GET /v1/skills?collection=<collection> to list skills for the active collection.
POST /v1/skills multipart create for new skills.
POST /v1/skills/{skill_id}/versions multipart create for new versions.
DELETE /v1/skills/{skill_id}?collection=<collection> for non-readonly skills.
When chat requests are built, selected skills should be represented as a tool_context artifact:
{
  "type": "skill",
  "skill_filter": {
    "collection": "<configured-collection>",
    "skill_or_version_ids": ["<selected-skill-id>"]
  }
}
Purpose: let users define Claude-style custom tools and browser-executed JavaScript handlers.
Fields:
id
name
description
input_schema_json
javascript_handler
test_input_json
last_test_result
Tool definition shape:
{
  "name": "currency_converter",
  "description": "Convert USD to EUR using a locally configured exchange rate.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "amount": {
        "type": "number"
      }
    },
    "required": ["amount"]
  }
}
Handler shape:
async function handle(input, context) {
  const rate = Number(context.localStorage.getItem("usd_eur_rate") || "0.92");
  return {
    type: "text",
    text: `${input.amount} USD is approximately ${input.amount * rate} EUR.`
  };
}
Handler execution context:
{
  fetch: window.fetch.bind(window),
  localStorage: window.localStorage,
  privateGptBaseUrl: string,
  currentChatId: string,
  currentCollection: string,
  log: (message: string, data?: unknown) => void
}
Use browser execution directly. No extra sandbox is required for v1, because the user is explicitly authoring local browser code.
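One way to run a stored handler directly in the browser is sketched below. The runCustomTool name is hypothetical, and the sketch assumes the stored javascript_handler source declares a function named handle, as in the handler shape above.

```javascript
// Compile user-authored handler source and invoke it.
// new Function() runs the code with full page privileges, which matches
// the "no extra sandbox" decision for v1.
async function runCustomTool(handlerSource, input, context) {
  const factory = new Function(
    `${handlerSource}\nreturn typeof handle === "function" ? handle : null;`
  );
  const handle = factory();
  if (!handle) {
    throw new Error("Handler source must declare a function named handle");
  }
  return await handle(input, context);
}
```

The same function can back the custom tool test action by passing the parsed test_input_json as input.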
Custom tool test:
Main non-technical experience.
Header controls (in the composer toolbar below the textarea):
GET /v1/models, showing the current model name with an animated chevron. Selecting a model updates state.selectedModel.
ghost-button chip showing a zap icon and the label Thinking. Activating it enables extended thinking mode for the active chat. Active state uses a purple tint and glow.
For databases, MCP, skills, and custom tools:
Message composer:
Enter while focused in the composer sends the message.
Shift+Enter inserts a line break.
Message rendering:
<citation ...></citation> tags must never be rendered as text.
index attribute, display index + 1.
tool_result payload by parsing the JSON text, matching the inline citation id against nodes[].id, and using the matched node's content field as the excerpt. The matcher should tolerate bracket differences such as 4C40 vs [4C40].
Request behavior:
POST /v1/messages.
ChatBody from chat messages plus chat-selected context/tools.
/v1/models response. If models have not been loaded yet, fall back to default.
system.citations.enabled: true so semantic-search answers can include citation tags. This may require a top-level system object even when no prompt text is configured.
system field so they are applied consistently across the whole request.
Basic request body:
{
  "model": "default",
  "messages": [
    {
      "role": "user",
      "content": "Summarize my documents and cite sources."
    }
  ],
  "system": {
    "text": "You are a support agent. Reply with only a short ticket title.",
    "use_default_prompt": false,
    "citations": {
      "enabled": true
    }
  },
  "tools": [],
  "tool_context": [],
  "mcp_servers": [],
  "stream": false,
  "max_tokens": 4096
}
Tool/context building:
ingested_artifact tool context scoped to the global Context Documents collection. Use artifacts: [] to search all documents in that collection.
tool_context.
mcp_servers.
tools.
If assistant response contains a tool_use block for a custom browser tool:
toolUse.input.
POST /v1/messages with the tool result in the API history.
The entire tool execution cycle (initial response, tool result, and follow-up answer) appears as a single consolidated assistant message bubble in chat. Hidden messages carrying tool roles exist in the API history only and are never rendered in the chat UI.
Follow-up messages should preserve the prior conversation and include the tool result using the API's expected content block shape.
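The cycle described above could be sketched as follows. The tool_result block shape (type, tool_use_id, content, is_error) and the choice to carry it in a user-role message are assumptions modeled on Claude-style APIs; they must be checked against the Fern-generated OpenAPI schema before use. Both function names are hypothetical.

```javascript
// Run the matching browser handler for a tool_use block and wrap the
// outcome in an assumed Claude-style tool_result content block.
async function resolveToolUse(toolUse, handlers, context) {
  try {
    const handler = handlers[toolUse.name];
    if (!handler) throw new Error(`No handler registered for ${toolUse.name}`);
    const result = await handler(toolUse.input, context);
    return { type: "tool_result", tool_use_id: toolUse.id, content: [result] };
  } catch (error) {
    // Handler failures are reported back to the model instead of being thrown.
    return {
      type: "tool_result",
      tool_use_id: toolUse.id,
      is_error: true,
      content: [{ type: "text", text: String(error?.message ?? error) }],
    };
  }
}

// Append the assistant turn and the hidden tool result to the API history.
// The returned array is what the follow-up POST /v1/messages would send.
function appendToolResult(history, assistantMessage, toolResult) {
  return [...history, assistantMessage, { role: "user", content: [toolResult] }];
}
```

The hidden tool-result message stays in the API history only; the chat UI would keep rendering a single assistant bubble.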
If the handler fails:
is_error: true if appropriate.
The app uses hash-based navigation so that reloading the page restores the current view and context tab.
Hash format:
#context/{tab}: Context screen with a specific tab (documents, databases, web, mcp, skills, customTools).
#settings: Settings screen.
#apiDebugger: API Debugger screen.
#chat/{chatId}: Specific chat session by ID.
syncHash() is called at the end of every render() and after context tab changes. restoreFromHash() runs once at startup before the first render and on hashchange for browser back/forward support.
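The hash format could be handled by a pair of pure helpers that syncHash() and restoreFromHash() call into. The helper names, the view object shape, and the fallback to the documents tab for unknown tabs are assumptions made for this sketch.

```javascript
const CONTEXT_TABS = ["documents", "databases", "web", "mcp", "skills", "customTools"];

// Turn a view description into a location hash string.
function formatHash(view) {
  switch (view.screen) {
    case "context": return `#context/${view.tab}`;
    case "settings": return "#settings";
    case "apiDebugger": return "#apiDebugger";
    case "chat": return `#chat/${encodeURIComponent(view.chatId)}`;
    default: return "";
  }
}

// Decode a hash segment, returning it unchanged if it is malformed.
function safeDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// Parse a location hash into a view description, or null if unrecognized.
function parseHash(hash) {
  const [screen, ...rest] = hash.replace(/^#/, "").split("/");
  const arg = rest.join("/");
  if (screen === "context") {
    return { screen, tab: CONTEXT_TABS.includes(arg) ? arg : "documents" };
  }
  if (screen === "settings" || screen === "apiDebugger") return { screen };
  if (screen === "chat" && arg) return { screen, chatId: safeDecode(arg) };
  return null;
}
```

Keeping these helpers free of window access makes them easy to test; only the thin callers would touch location.hash.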
API Debugger is session-level, live-only, and ephemeral.
Do not persist debugger events.
Show a small non-intrusive callout near the top of API Debugger explaining that it is a live trace for the current session and clears on page refresh.
Purpose:
Debugger layout:
Timeline list | Event detail panel
Event model:
Each API event should include:
Secrets must never be displayed in Debugger. Redact Authorization, token-like headers, API keys, and cookies.
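A minimal sketch of header redaction before an event reaches the Debugger follows. The name pattern and the "[redacted]" placeholder are assumptions; the pattern is deliberately broad and may need tuning for a given deployment.

```javascript
// Header names treated as sensitive: exact auth/cookie names, plus anything
// containing "token", "api-key"/"api_key"/"apikey", or "secret".
const SENSITIVE_HEADER =
  /^(authorization|proxy-authorization|cookie|set-cookie)$|token|api[-_]?key|secret/i;

// Return a copy of the headers with sensitive values replaced.
function redactHeaders(headers) {
  const redacted = {};
  for (const [name, value] of Object.entries(headers)) {
    redacted[name] = SENSITIVE_HEADER.test(name) ? "[redacted]" : value;
  }
  return redacted;
}
```

Redacting when the event is recorded, rather than when it is rendered, keeps secrets out of debugger memory altogether.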
API Debugger should show API events from the current page lifetime, including calls made from Chat, Context, and Settings.
On reload, debugger is empty.
Implement a tiny client wrapper:
async function apiFetch(path, options, debugMeta)
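A wrapper with the signature above might be sketched as follows. The createApiFetch factory, the getSettings and onEvent callbacks, the precomputed authHeader setting, and the event fields are all hypothetical; they exist here so the wrapper can be exercised without a browser or a live server.

```javascript
// Factory returning an apiFetch(path, options, debugMeta) function.
// - getSettings(): returns { privateGptBaseUrl, authHeader } at call time,
//   so changes made in Settings apply to the next request.
// - fetchImpl: defaults to the global fetch; injectable for tests.
// - onEvent: receives one debugger event per completed request.
function createApiFetch({ getSettings, fetchImpl = globalThis.fetch, onEvent = () => {} }) {
  return async function apiFetch(path, options = {}, debugMeta = {}) {
    const { privateGptBaseUrl, authHeader } = getSettings();
    // Avoid a double slash when the base URL ends with "/".
    const url = privateGptBaseUrl.replace(/\/+$/, "") + path;
    const headers = { ...(options.headers ?? {}) };
    if (authHeader) headers.Authorization = authHeader;

    const startedAt = Date.now();
    const response = await fetchImpl(url, { ...options, headers });

    // Headers are left out of the event so no secret can reach the Debugger.
    onEvent({
      ...debugMeta,
      method: options.method ?? "GET",
      url,
      status: response.status,
      durationMs: Date.now() - startedAt,
    });
    return response;
  };
}
```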
Responsibilities:
privateGptBaseUrl.
Authorization: Basic <base64(username:password)> from the configured username and password when provided.
Endpoints used from OpenAPI:
GET /v1/models
POST /v1/messages
POST /v1/messages/count_tokens, optional
POST /v1/messages/validate, optional
POST /v1/artifacts/ingest
GET /v1/artifacts/list?collection=<collection>
POST /v1/artifacts/delete
POST /v1/artifacts/content, optional
POST /v1/tools/semantic-search
POST /v1/tools/tabular-data-analysis, optional direct diagnostic
POST /v1/tools/database-query, optional direct diagnostic
POST /v1/tools/web-search, optional direct diagnostic
POST /v1/tools/web-fetch, optional direct diagnostic
Streaming/async endpoints can be deferred:
/v1/messages/async
/v1/messages/async/{message_id}/stream
GET /v1/models and select one for each chat.
/v1/messages.
semantic_search tool and an ingested_artifact tool context scoped to the configured Documents collection.
localStorage state model.
/v1/messages chat.