packages/kilo-docs/pages/contributing/architecture/index.md
This document provides a high-level overview of the Kilo platform architecture to help contributors understand how the different components fit together.
Kilo is an AI coding platform built around a central CLI engine that powers every client surface — the terminal, VS Code, and the cloud. The architecture follows a layered approach where all clients communicate with the CLI over HTTP + SSE, and the CLI connects to AI providers either directly or through Kilo Cloud.
```mermaid
graph LR
    tui["Kilo CLI (TUI)"]
    vscode["VS Code Extension"]
    subgraph cli ["Kilo CLI Engine"]
        provider["Provider Router"]
    end
    subgraph cloud ["Kilo Cloud"]
        gateway["Kilo Gateway"]
        cloudagent["Cloud Agent"]
        bot["Kilo Bot"]
        claw["KiloClaw"]
        gastown["Gas Town"]
        review["Code Review"]
        triage["Auto Triage"]
        appbuilder["App Builder"]
    end
    providers["Inference Providers: Anthropic, OpenAI, Google, OpenRouter + 500 more"]
    tui -->|SDK| cli
    vscode -->|SDK| cli
    cloudagent -->|Sandbox| cli
    provider -- Direct --> providers
    provider -- Gateway --> gateway
    gateway --> providers
    claw --> gateway
    gastown -->|Container| cli
    gastown --> gateway
    bot --> cloudagent
    review --> cloudagent
    triage --> cloudagent
    appbuilder --> cloudagent
```
The CLI (packages/opencode/) is the core engine that all products are built on. It contains the AI agent runtime, tool execution, session management, provider integrations, and an HTTP server. Each client spawns or connects to a kilo serve process and communicates via HTTP + SSE using the @kilocode/sdk.
The CLI can run in several modes:
- `kilo`: Interactive TUI for terminal-based coding
- `kilo run`: Headless single-prompt execution
- `kilo serve`: HTTP server mode for client integrations

Key subsystems inside the CLI:

| Subsystem | Purpose |
|---|---|
| Agent Runtime | Orchestrates AI conversations, tool calls, and multi-step task execution |
| Tools Service | Built-in tools for file editing, shell execution, search, and more |
| MCP Servers | Model Context Protocol support for extending with external tools |
| LSP Client | Language Server Protocol integration for code intelligence |
| Session Manager | Persistent session state, conversation history, and checkpoints |
| Provider Router | Connects to 500+ AI models via direct APIs or Kilo Gateway |
| HTTP Server | REST API + SSE streaming for client communication |
| Config System | Project and global configuration, modes, and permissions |
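
The Provider Router row describes two paths to a model: a direct provider API or Kilo Gateway. The sketch below illustrates one way such a choice could be made. The `RoutingConfig` shape, the `resolveRoute` name, and the rule that a direct key takes precedence are all assumptions for illustration; the actual selection logic is not described in this document.

```typescript
// Hypothetical types: none of these names come from the Kilo codebase.
type Route =
  | { kind: "direct"; provider: string; apiKey: string }
  | { kind: "gateway"; provider: string; token: string }

interface RoutingConfig {
  // API keys the user has configured for individual providers.
  providerKeys: Record<string, string>
  // Token for the hosted gateway, if one is available.
  gatewayToken?: string
}

// Assumed precedence: use a direct connection when a provider key exists,
// otherwise fall back to the gateway, otherwise fail.
function resolveRoute(provider: string, config: RoutingConfig): Route {
  const apiKey = config.providerKeys[provider]
  if (apiKey) return { kind: "direct", provider, apiKey }
  if (config.gatewayToken) return { kind: "gateway", provider, token: config.gatewayToken }
  throw new Error(`No credentials available for provider "${provider}"`)
}
```

Modeling the result as a discriminated union keeps the two paths explicit, so callers have to handle both the direct and the gateway case.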
All clients are thin wrappers over the CLI engine.
The VS Code extension (packages/kilo-vscode/) bundles the CLI binary and spawns kilo serve as a child process. It includes:
The built-in terminal UI ships with the CLI itself — a SolidJS interface rendered in the terminal via OpenTUI.
Kilo Cloud is the hosted platform layer that provides authentication, provider routing, and autonomous agent services. The cloud infrastructure lives in a separate repository.
The gateway (packages/kilo-gateway/ in this repo, plus API routes in the cloud) handles:
Cloud Agent is a Cloudflare Worker within Kilo Cloud that runs the Kilo CLI in isolated sandbox environments. It powers cloud-based AI coding tasks triggered via the web dashboard, webhooks, or automation workflows. It provides a secure API for:
Kilo Bot is the GitHub/GitLab bot that responds to issue comments and PR mentions. It dispatches work to the Cloud Agent, so users can trigger AI coding tasks directly from their repositories.
KiloClaw is a multi-tenant compute platform running on Fly.io, orchestrated by a Cloudflare Worker. Each user gets a dedicated persistent machine running an OpenClaw gateway, coordinated via Durable Objects for state management and self-healing reconciliation.
{% image src="/docs/img/kiloclaw/kiloclaw-architecture.png" alt="KiloClaw infrastructure architecture diagram" width="800" caption="KiloClaw infrastructure architecture" /%}
Code Review is an automated code review service that subscribes to GitHub webhooks, dispatches reviews through the Cloud Agent, and posts feedback directly on pull requests. It supports per-organization concurrency limits and automatic queuing.
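
Per-organization concurrency limits with automatic queuing can be sketched as a small scheduler. This is an illustrative model only: `ReviewScheduler`, `ReviewJob`, and the FIFO ordering are assumptions, not the service's actual implementation.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the service's real API.
interface ReviewJob {
  org: string
  pullRequest: string
}

class ReviewScheduler {
  private readonly limit: number
  private readonly running = new Map<string, number>()
  private readonly queued = new Map<string, ReviewJob[]>()

  constructor(limit: number) {
    this.limit = limit
  }

  // Start the job if the organization has a free slot; otherwise queue it.
  submit(job: ReviewJob): "started" | "queued" {
    const active = this.running.get(job.org) ?? 0
    if (active < this.limit) {
      this.running.set(job.org, active + 1)
      return "started"
    }
    const queue = this.queued.get(job.org) ?? []
    queue.push(job)
    this.queued.set(job.org, queue)
    return "queued"
  }

  // Mark one review as finished. If another job is waiting for the same
  // organization, hand the freed slot to it and return it (the running count
  // is unchanged); otherwise release the slot and return undefined.
  complete(org: string): ReviewJob | undefined {
    const next = this.queued.get(org)?.shift()
    if (next) return next
    const active = this.running.get(org) ?? 0
    this.running.set(org, Math.max(0, active - 1))
    return undefined
  }
}
```

Because limits are tracked per organization, a burst of pull requests in one organization queues behind its own limit without delaying reviews for others.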
Auto Triage is an automated issue triage service that classifies GitHub issues (bug, feature, question), detects duplicates via vector similarity search, and optionally creates fix PRs for high-confidence actionable issues.
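
Duplicate detection by vector similarity can be illustrated with cosine similarity over issue embeddings. The sketch below assumes embeddings are already computed and uses a linear scan with a hypothetical threshold of 0.9; a production service would more likely query a vector index, and the names here are not taken from the real service.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length")
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0
  return dot / Math.sqrt(normA * normB)
}

// Hypothetical record shape for an already-embedded issue.
interface IndexedIssue {
  id: number
  embedding: number[]
}

// Return the most similar existing issue if it reaches the threshold, otherwise undefined.
function findDuplicate(
  candidate: number[],
  existing: IndexedIssue[],
  threshold = 0.9, // assumed cutoff, chosen for illustration only
): IndexedIssue | undefined {
  let best: IndexedIssue | undefined
  let bestScore = threshold
  for (const issue of existing) {
    const score = cosineSimilarity(candidate, issue.embedding)
    if (score >= bestScore) {
      best = issue
      bestScore = score
    }
  }
  return best
}
```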
App Builder is a service that builds and deploys user applications via the Cloud Agent. Users can generate full applications from prompts, with the App Builder orchestrating the Cloud Agent to scaffold, iterate, and deploy the result.
Gas Town is a multi-agent orchestration platform that coordinates autonomous AI coding agents working on real Git repositories. It runs entirely on Cloudflare: a central Durable Object manages all state, while Docker containers on Cloudflare Containers run agent processes via the Kilo CLI.
Key concepts:
Agents operate in a hierarchy:
| Agent | Role |
|---|---|
| Mayor | Persistent conversational coordinator — decomposes tasks and delegates to worker agents |
| Polecat | Worker agent — clones repo worktrees, writes code, commits, pushes, and creates PRs |
| Refinery | Code review agent — reviews polecat branches, runs quality gates, merges or requests rework |
| Triage | Ephemeral agent that resolves ambiguous situations detected by automated patrol checks |
A reconciler loop running every 5 seconds drives all state transitions: dispatching agents, transitioning beads, polling PR status, managing convoys, and recovering from failures.
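
A reconciler of this kind is often written as a pure function that compares observed state with desired state and emits actions, plus a timer that calls it. The sketch below covers only dispatch and failure recovery. The state model is hypothetical: beads are treated here as units of work, and the real Gas Town schema is not shown in this document.

```typescript
// Hypothetical state model for illustration only.
type BeadStatus = "open" | "in_progress" | "done"

interface Bead {
  id: string
  status: BeadStatus
  assignee?: string
}

interface Agent {
  id: string
  busy: boolean
  healthy: boolean
}

interface TownState {
  beads: Bead[]
  agents: Agent[]
}

type Action =
  | { type: "dispatch"; beadId: string; agentId: string }
  | { type: "recover"; beadId: string }

// One reconciliation pass: inspect the current state and return the actions
// needed to move it forward. The function does not mutate its input.
function reconcile(state: TownState): Action[] {
  const actions: Action[] = []
  const agentsById = new Map<string, Agent>()
  for (const agent of state.agents) agentsById.set(agent.id, agent)
  const idle = state.agents.filter((agent) => agent.healthy && !agent.busy)

  for (const bead of state.beads) {
    if (bead.status === "in_progress") {
      const agent = bead.assignee ? agentsById.get(bead.assignee) : undefined
      // Work held by a missing or unhealthy agent needs recovery.
      if (!agent || !agent.healthy) actions.push({ type: "recover", beadId: bead.id })
    } else if (bead.status === "open") {
      // Assign open work to the next idle agent, if any remain.
      const agent = idle.shift()
      if (agent) actions.push({ type: "dispatch", beadId: bead.id, agentId: agent.id })
    }
  }
  return actions
}
```

A driver would call `reconcile` on a 5-second timer and apply each returned action. Keeping the pass pure makes it safe to rerun after a crash, because every tick recomputes the actions from the stored state.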
| Service | Purpose |
|---|---|
| Webhook Agent Ingest | Named webhook endpoints that capture HTTP requests and queue delivery to Cloud Agent |
| AI Attribution | Tracks line-level AI-generated code attribution when users accept or reject edits |
| Session Ingest | Ingests and stores CLI session data for analytics |
| Observability | Telemetry pipelines for monitoring cloud services |
Modes are configurable presets that customize the agent's behavior:
MCP enables extending the agent with external tools:
- mcp.json

Git-based state management for safe exploration:
Git worktree isolation for parallel task execution:
All clients communicate with the CLI via its HTTP + SSE API. The @kilocode/sdk package provides a TypeScript client:
```typescript
import { KiloClient } from "@kilocode/sdk"

const client = new KiloClient({ baseUrl: "http://localhost:3000" })
const session = await client.session.create({ ... })
```
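
For readers unfamiliar with the SSE half of the transport, the sketch below parses a buffered `text/event-stream` payload into events following the standard field rules (`event`, `data`, `id`, and `:` comment lines). It is a simplified illustration, not the SDK's implementation: `parseSse` is a hypothetical helper, it assumes the whole payload is already in memory rather than streamed, and the event name in the usage example is invented.

```typescript
interface SseEvent {
  event: string
  data: string
  id: string | undefined
}

// Parse a complete SSE payload. Events are separated by a blank line;
// multiple data lines within one event are joined with "\n".
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = []
  // Normalize line endings, then split on the blank line that ends each event.
  for (const block of text.replace(/\r\n?/g, "\n").split("\n\n")) {
    let event = "message" // default event type when no event field is present
    let id: string | undefined
    const data: string[] = []
    for (const line of block.split("\n")) {
      if (line === "" || line.startsWith(":")) continue // blank or comment line
      const colon = line.indexOf(":")
      const field = colon === -1 ? line : line.slice(0, colon)
      let value = colon === -1 ? "" : line.slice(colon + 1)
      if (value.startsWith(" ")) value = value.slice(1) // strip one leading space
      if (field === "event") event = value
      else if (field === "data") data.push(value)
      else if (field === "id") id = value
    }
    // Blocks without any data field (for example keep-alive comments) are skipped.
    if (data.length > 0) events.push({ event, data: data.join("\n"), id })
  }
  return events
}

// Usage with an invented event name and payload.
const sample = 'event: session.updated\ndata: {"id":1}\n\n: keep-alive\n\ndata: line one\ndata: line two\n\n'
const parsed = parseSse(sample)
```

Here `parsed` holds two events: the named event with a JSON payload, and a default `message` event whose two data lines are joined with a newline. The keep-alive comment produces no event.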
The CLI uses flat ESM exports inside each module, then re-exports the module as a namespace from an index file when callers need grouped access. Avoid adding new export namespace declarations; top-level exports are easier to tree-shake and work better with Node's type-stripping runtime.
```typescript
// packages/opencode/src/session/session.ts
export const create = fn(CreateSchema, async (input) => {
  // ...
})

export const list = fn(ListSchema, async (input) => {
  // ...
})

// packages/opencode/src/session/index.ts
export * as Session from "./session"
```
Prefer importing the specific export when possible. Use the namespace re-export (Session.create, Session.list) when a caller benefits from grouped module access or when preserving the existing public shape.
The CLI server is Hono-based and publishes an OpenAPI-compatible HTTP + SSE API consumed by @kilocode/sdk. Some route groups are being migrated behind the experimental Effect HttpApi bridge while preserving the generated SDK shape.
- HttpApi: use KILO_EXPERIMENTAL_HTTPAPI only for migration testing; public clients should not depend on the bridge detail.
- Regenerate the SDK in packages/sdk/js/ after server endpoint changes.

Tools follow a consistent pattern with Zod schema validation:
```typescript
export const ReadTool = Tool.define({
  name: "read",
  description: "Read a file",
  parameters: z.object({
    path: z.string(),
  }),
  async execute(params) {
    // ...
  },
})
```
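
To show why the schema sits next to `execute`, here is a dependency-free sketch of the validate-then-execute flow. Everything in it is a stand-in: `defineTool` is a hypothetical helper rather than the real `Tool.define`, and the hand-written `Schema` object only mimics the `parse` behavior of a Zod schema.

```typescript
// A schema here is anything with a parse method that returns typed data or throws.
// This mirrors the shape of a Zod schema without depending on the library.
interface Schema<T> {
  parse(input: unknown): T
}

interface ToolDefinition<P, R> {
  name: string
  description: string
  parameters: Schema<P>
  execute(params: P): Promise<R>
}

// Hypothetical helper: wrap a definition so raw input is validated before execute runs.
function defineTool<P, R>(def: ToolDefinition<P, R>) {
  return {
    name: def.name,
    description: def.description,
    async call(input: unknown): Promise<R> {
      const params = def.parameters.parse(input) // throws on invalid input
      return def.execute(params)
    },
  }
}

// Hand-written stand-in for z.object({ path: z.string() }).
const readParams: Schema<{ path: string }> = {
  parse(input) {
    if (typeof input !== "object" || input === null) throw new Error("Expected an object")
    const path = (input as { path?: unknown }).path
    if (typeof path !== "string") throw new Error("path must be a string")
    return { path }
  },
}

const readTool = defineTool({
  name: "read",
  description: "Read a file",
  parameters: readParams,
  // Placeholder body: a real tool would read the file from disk.
  async execute(params) {
    return `contents of ${params.path}`
  },
})
```

Validating at the boundary means `execute` only ever sees well-typed parameters, which matters when the input comes from a model-generated tool call.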
The project uses:
- tsgo across all packages

| Repository | Contents |
|---|---|
| Kilo-Org/kilocode | CLI engine, VS Code extension, SDK, gateway client, telemetry, docs, UI components |
| Cloud (private) | Web dashboard, Cloud Agent, Kilo Bot, KiloClaw, Gas Town, code review, auto triage, billing, and supporting Cloudflare Workers |