doc/SPEC.md
Target specification for the Paperclip control plane. Living document — updated incrementally during spec interviews.
A Company is a first-order object. One Paperclip instance runs multiple Companies. A Company does not have a standalone "goal" field — its direction is defined by its set of Initiatives (see Task Hierarchy Mapping).
| Field | Type | Notes |
|---|---|---|
| id | uuid | Primary key |
| name | string | Company name |
| createdAt | timestamp | |
| updatedAt | timestamp | |
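
As a rough illustration, the Company fields map onto a TypeScript type like the one below. The `Company` type name and the `newCompany` helper are assumptions made for this sketch, not part of the spec. Note that there is no goal field, matching the text above.

```typescript
// Hypothetical shape mirroring the Company table; not a definitive schema.
interface Company {
  id: string; // uuid, primary key
  name: string; // company name
  createdAt: Date;
  updatedAt: Date;
}

// Illustrative helper: build a Company record whose timestamps start equal.
function newCompany(id: string, name: string, createdAt: Date = new Date()): Company {
  return { id, name, createdAt, updatedAt: createdAt };
}

const acme = newCompany("00000000-0000-4000-8000-000000000001", "Acme");
```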
Every Company has a Board that governs high-impact decisions. The Board is the human oversight layer.
V1: Single human Board. One human operator.
The Board has unrestricted access to the entire system at all times:
The Board is not just an approval gate — it's a live control surface. The human can intervene at any level at any time.
The Board sets Company-level budgets. The CEO can set budgets for Agents below them, and every manager Agent can do the same for their reports. How this cascading budget delegation works in practice is TBD, but the permission structure supports it. The Board can manually override any budget at any level.
Future governance models (not V1):
Every employee is an agent. Agents are the workforce.
Concepts like SOUL.md (identity/mission) and HEARTBEAT.md (loop definition) are not part of the Paperclip protocol. They are adapter-specific configurations. For example, an OpenClaw adapter might use SOUL.md and HEARTBEAT.md files. A Claude Code adapter might use CLAUDE.md. A bare Python script might use command-line args.
Paperclip doesn't prescribe how an agent defines its identity or behavior. It provides the control plane; the adapter defines the agent's inner workings.
Each agent has an adapter type and an adapter-specific configuration blob. The adapter defines what config fields exist.
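
One way to picture "adapter type plus adapter-specific config blob" is a discriminated union, sketched below. The config field names (`command`, `args`, `url`, `method`) and the `describeInvocation` helper are assumptions for illustration only; each real adapter defines its own schema.

```typescript
// Hypothetical config shapes; the field names are assumptions for illustration.
type ProcessConfig = { command: string; args: string[] };
type HttpConfig = { url: string; method: "POST" | "GET" };

// Each agent pairs an adapter type with that adapter's own config blob.
type AgentAdapter =
  | { adapterType: "process"; adapterConfig: ProcessConfig }
  | { adapterType: "http"; adapterConfig: HttpConfig };

// Narrowing on adapterType gives typed access to the adapter-specific fields.
function describeInvocation(agent: AgentAdapter): string {
  switch (agent.adapterType) {
    case "process":
      return [agent.adapterConfig.command, ...agent.adapterConfig.args].join(" ");
    case "http":
      return `${agent.adapterConfig.method} ${agent.adapterConfig.url}`;
  }
}

const worker: AgentAdapter = {
  adapterType: "process",
  adapterConfig: { command: "python", args: ["run_agent.py", "--agent-id", "42"] },
};
```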
At the protocol level, Paperclip tracks:
Each adapter type defines its own config schema. Examples:
A key goal: the entire org's agent configurations are exportable. You can export a company's complete agent setup — every agent, their adapter configs, org structure — as a portable artifact. This enables:
Configurable per agent. Two ends of the spectrum:
The minimum requirement to be a Paperclip agent: be callable. That's it. Paperclip can invoke you via command or webhook. No requirement to report back — Paperclip infers basic status from process liveness when it can.
Beyond the minimum, Paperclip provides progressively richer integration:
Paperclip ships default agents that demonstrate full integration: progress tracking, cost instrumentation, and a Paperclip skill (a Claude Code skill for interacting with the Paperclip API) for task management. These serve as both useful defaults and reference implementations for adapter authors.
Two export modes:
The usual workflow: export a template, create a new company from it, add a couple of initial tasks, go.
Hierarchical reporting structure. CEO at top, reports cascade down.
Full visibility across the org. Every agent can see the entire org chart, all tasks, all agents. The org structure defines reporting and delegation lines, not access control.
Each agent publishes a short description of its responsibilities and capabilities, almost like skills ("when I'm relevant"). This lets other agents discover who can help with what.
Agents can create tasks and assign them to agents outside their reporting line. This is the mechanism for cross-team collaboration. These rules are primarily encoded in the Paperclip SKILL.md, which is recommended for all agents. Paperclip the app enforces the tooling and some light governance, but the cross-team rules below are mainly implemented by agent decisions.
When an agent receives a task from outside their team:
It's any manager's responsibility to understand why their subordinates are blocked and resolve it:
When a task originates from a cross-team request, track the depth as an integer — how many delegation hops from the original requester. This provides visibility into how far work cascades through the org.
Tasks carry a billing code so that token spend during execution can be attributed upstream to the requesting task/agent. When Agent A asks Agent B to do work, the cost of B's work is tracked against A's request. This enables cost attribution across the org.
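
The depth counter and billing code can be sketched together. The `Task` fields and the `delegate` helper below are hypothetical. In particular, carrying the billing code forward unchanged on each hop is an assumption about how attribution would be wired, not something the spec fixes.

```typescript
// Hypothetical task fields; names are assumptions, not the Paperclip schema.
interface Task {
  id: string;
  assigneeId: string;
  requestDepth: number; // delegation hops from the original requester
  billingCode: string; // attributes spend to the originating request
}

// Delegating adds one hop and carries the billing code forward, so
// downstream spend can still be attributed to the original request.
function delegate(from: Task, id: string, assigneeId: string): Task {
  return {
    id,
    assigneeId,
    requestDepth: from.requestDepth + 1,
    billingCode: from.billingCode,
  };
}

const request: Task = { id: "t1", assigneeId: "agent-a", requestDepth: 0, billingCode: "req-t1" };
const hop1 = delegate(request, "t2", "agent-b");
const hop2 = delegate(hop1, "t3", "agent-c");
```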
The heartbeat is a protocol, not a runtime. Paperclip defines how to initiate an agent's cycle. What the agent does with that cycle — how long it runs, whether it's task-scoped or continuous — is entirely up to the agent.
Agent configuration includes an adapter that defines how Paperclip invokes the agent. Built-in adapters include:
| Adapter | Mechanism | Example |
|---|---|---|
| process | Execute a child process | python run_agent.py --agent-id {id} |
| http | Send an HTTP request | POST https://openclaw.example.com/hook/{id} |
| claude_local | Local Claude Code process | Claude Code heartbeat worker |
| codex_local | Local Codex process | Codex CLI heartbeat worker |
| opencode_local | Local OpenCode process | OpenCode heartbeat worker |
| pi_local | Local Pi process | Pi CLI heartbeat worker |
| cursor | Cursor API/CLI bridge | Cursor-integrated heartbeat worker |
| openclaw_gateway | OpenClaw gateway API | Managed OpenClaw agent via gateway |
| hermes_local | Local Hermes process | Hermes agent heartbeat worker |
The process and http adapters ship as generic defaults. Additional built-in adapters cover common local coding runtimes (see list above), and new adapter types can be registered via the plugin system (see Plugin / Extension Architecture).
Every adapter implements three methods:
invoke(agentConfig, context?) → void // Start the agent's cycle
status(agentConfig) → AgentStatus // Is it running? finished? errored?
cancel(agentConfig) → void // Graceful stop signal (for pause/resume)
This is the full adapter contract. invoke starts the agent, status lets Paperclip check on it, cancel enables the board's pause functionality. Everything else (cost reporting, task updates) is optional and flows through the Paperclip REST API.
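
The three-method contract could be expressed as a TypeScript interface roughly as follows. This is a sketch under assumptions: the `AgentStatus` values are inferred from the "running? finished? errored?" comment with `idle` added for agents that were never invoked, `AgentConfig` is reduced to a single hypothetical field, and `InMemoryAdapter` is a toy used only to exercise the interface.

```typescript
// Status values assumed from the spec's comment; "idle" is an addition.
type AgentStatus = "idle" | "running" | "finished" | "errored";

// Hypothetical minimal config; real adapters define their own fields.
interface AgentConfig {
  agentId: string;
}

interface Adapter {
  invoke(agentConfig: AgentConfig, context?: unknown): void; // start the cycle
  status(agentConfig: AgentConfig): AgentStatus; // running? finished? errored?
  cancel(agentConfig: AgentConfig): void; // graceful stop signal
}

// Toy adapter that tracks state in memory instead of starting a process.
class InMemoryAdapter implements Adapter {
  private states: Record<string, AgentStatus> = {};

  invoke(agentConfig: AgentConfig): void {
    this.states[agentConfig.agentId] = "running";
  }

  status(agentConfig: AgentConfig): AgentStatus {
    return this.states[agentConfig.agentId] ?? "idle";
  }

  cancel(agentConfig: AgentConfig): void {
    // Only a running cycle has anything to stop.
    if (this.status(agentConfig) === "running") {
      this.states[agentConfig.agentId] = "finished";
    }
  }
}
```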
When the board (or system) pauses an agent:
This is "graceful signal + stop future heartbeats." The current run gets a chance to land cleanly.
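
A minimal sketch of "graceful signal + stop future heartbeats" is shown below. The `AgentRecord` shape, the `paused` flag, and the ordering (signal first, then set the flag) are assumptions for illustration. The `Cancel` callback stands in for the adapter's cancel method.

```typescript
// Hypothetical agent record; the paused flag is an assumption.
interface AgentRecord {
  id: string;
  paused: boolean;
}

// Minimal stand-in for the adapter's cancel method.
type Cancel = (agentId: string) => void;

// Pause = send the graceful stop signal, then block future heartbeats.
function pauseAgent(agent: AgentRecord, cancel: Cancel): void {
  cancel(agent.id);
  agent.paused = true;
}

// A scheduler would consult the flag before starting a new cycle.
function shouldHeartbeat(agent: AgentRecord): boolean {
  return !agent.paused;
}

const signalled: string[] = [];
const ceo: AgentRecord = { id: "ceo", paused: false };
pauseAgent(ceo, (id) => {
  signalled.push(id);
});
```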
All agent communication flows through the task system.
There is no separate messaging or chat system. Tasks are the communication channel. This keeps all context attached to the work it relates to and creates a natural audit trail.
Full hierarchy: Initiative (company goal) → Projects → Milestones → Issues → Sub-issues. Everything traces back to an initiative, and the "company goal" is just the first/primary initiative.
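
To illustrate "everything traces back to an initiative", the sketch below walks parent links until it reaches one. The `WorkItem` shape, the `kind` values, and the `findInitiative` helper are hypothetical. The sketch assumes the parent links contain no cycles.

```typescript
// Hypothetical node type covering every level of the hierarchy.
type WorkKind = "initiative" | "project" | "milestone" | "issue" | "sub_issue";

interface WorkItem {
  id: string;
  kind: WorkKind;
  parentId: string | null;
}

// Follow parent links upward until an initiative is found.
function findInitiative(items: Record<string, WorkItem>, startId: string): WorkItem | null {
  let current: WorkItem | undefined = items[startId];
  while (current) {
    if (current.kind === "initiative") {
      return current;
    }
    current = current.parentId === null ? undefined : items[current.parentId];
  }
  return null; // orphaned item: no initiative above it
}

const items: Record<string, WorkItem> = {
  i1: { id: "i1", kind: "initiative", parentId: null },
  p1: { id: "p1", kind: "project", parentId: "i1" },
  m1: { id: "m1", kind: "milestone", parentId: "p1" },
  is1: { id: "is1", kind: "issue", parentId: "m1" },
  s1: { id: "s1", kind: "sub_issue", parentId: "is1" },
};
```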
Token/LLM cost budgeting is a core part of Paperclip. External revenue and expense tracking is a future plugin.
Fully-instrumented Agents report token/API usage back to Paperclip. Costs are tracked at every level:
Costs should be denominated in both tokens and dollars.
Billing codes on tasks (see Org Structure) enable cost attribution across teams — when Agent A requests work from Agent B, B's costs roll up to A's request.
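
The roll-up by billing code can be sketched as a simple aggregation over reported cost events, in both tokens and dollars. The `CostEvent` fields and the `rollUpByBillingCode` function are assumptions made for this example, and the figures are invented sample data.

```typescript
// Hypothetical cost event; field names are assumptions for illustration.
interface CostEvent {
  agentId: string;
  billingCode: string;
  tokens: number;
  dollars: number;
}

interface CostTotal {
  tokens: number;
  dollars: number;
}

// Sum spend per billing code, denominated in both tokens and dollars.
function rollUpByBillingCode(events: CostEvent[]): Record<string, CostTotal> {
  const totals: Record<string, CostTotal> = {};
  for (const e of events) {
    const t = totals[e.billingCode] ?? { tokens: 0, dollars: 0 };
    totals[e.billingCode] = { tokens: t.tokens + e.tokens, dollars: t.dollars + e.dollars };
  }
  return totals;
}

// Agents B and C both worked on Agent A's request "req-a1".
const spend = rollUpByBillingCode([
  { agentId: "agent-b", billingCode: "req-a1", tokens: 1200, dollars: 0.25 },
  { agentId: "agent-c", billingCode: "req-a1", tokens: 800, dollars: 0.5 },
  { agentId: "agent-b", billingCode: "req-x9", tokens: 100, dollars: 0.125 },
]);
```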
Three tiers:
Budgets can be set to unlimited (no ceiling).
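
One possible way to model an unlimited budget is a nullable limit, as in the sketch below. Representing "unlimited" as `null` and the `remainingBudget` helper are assumptions. The spec does not say how the ceiling is stored.

```typescript
// Hypothetical representation: null means "unlimited (no ceiling)".
type BudgetLimit = number | null;

// Dollars left before the ceiling is hit; never negative.
function remainingBudget(limitDollars: BudgetLimit, spentDollars: number): number {
  if (limitDollars === null) {
    return Infinity; // unlimited budgets never run out
  }
  return Math.max(0, limitDollars - spentDollars);
}
```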
How a Company goes from "created" to "running":
Paperclip ships default Agent templates:
These are starting points. Users can customize or replace them entirely.
The default agent's loop is config-driven. The adapter config contains the instructions that define what the agent does on each heartbeat cycle. There is no hardcoded standard loop — each agent's config determines its behavior.
This means the default CEO config tells the CEO to review strategy, check on reports, etc. The default engineer config tells the engineer to check assigned tasks, pick the highest priority, and work it. But these are config choices, not protocol requirements.
A skill definition that teaches agents how to interact with Paperclip. Provides:
This skill is adapter-agnostic — it can be loaded into Claude Code, injected into prompts, or used as API documentation for custom agents.
Single-tenant, self-hostable. Not a SaaS. One instance = one operator's companies.
The key constraint: it must be trivial to go from "I'm trying this on my machine" to "my agents are running on remote servers talking to my Paperclip instance."
When a user creates an Agent, Paperclip generates a connection string containing: the server URL, an API key, and instructions for how to authenticate. The Agent is assumed to be capable of figuring out how to call the API with its token/key from there.
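
The spec does not fix a format for the connection string. As one hypothetical encoding, the three pieces could be serialized as JSON. The field names, the bearer-token wording, and the example URL and key below are all assumptions.

```typescript
// Hypothetical connection payload; the spec does not define a format.
interface AgentConnection {
  serverUrl: string;
  apiKey: string;
  instructions: string;
}

function buildConnectionString(serverUrl: string, apiKey: string): string {
  const connection: AgentConnection = {
    serverUrl,
    apiKey,
    // Bearer auth is an assumption here, not something the spec states.
    instructions: "Send the API key as 'Authorization: Bearer <apiKey>' on every request.",
  };
  return JSON.stringify(connection);
}

const connectionString = buildConnectionString("https://paperclip.example.com", "pk_example_123");
const parsed = JSON.parse(connectionString) as AgentConnection;
```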
Flow:
| Layer | Technology |
|---|---|
| Frontend | React + Vite |
| Backend | TypeScript + Express (REST API, not tRPC — need non-TS clients) |
| Database | PostgreSQL (see doc/DATABASE.md for details — PGlite embedded for dev, Docker or hosted Supabase for production) |
| Auth | Better Auth |
Tasks use single assignment (one agent per task) with atomic checkout:
An agent checks out a task by setting it to in_progress (claiming it).

No optimistic locking or CRDTs needed. The single-assignment model + atomic checkout prevents conflicts at the design level.
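
Atomic checkout amounts to a compare-and-set on the task's status: the claim succeeds only if the task is still unclaimed. The in-memory sketch below illustrates the idea. The `todo` status name and the `CheckoutTask` shape are assumptions, and a database-backed version would typically use a single conditional UPDATE rather than a read followed by a write.

```typescript
// Hypothetical task shape and status names; "todo" is an assumption.
type TaskStatus = "todo" | "in_progress" | "done";

interface CheckoutTask {
  id: string;
  status: TaskStatus;
  assigneeId: string | null;
}

// Compare-and-set: claim the task only if nobody has claimed it yet.
// The function is synchronous, so the check and the write cannot interleave.
function checkout(task: CheckoutTask, agentId: string): boolean {
  if (task.status !== "todo" || task.assigneeId !== null) {
    return false; // another agent already holds the task
  }
  task.status = "in_progress";
  task.assigneeId = agentId;
  return true;
}

const ticket: CheckoutTask = { id: "t1", status: "todo", assigneeId: null };
const firstClaim = checkout(ticket, "agent-a");
const secondClaim = checkout(ticket, "agent-b");
```

In this sketch the second claim is rejected and the task stays assigned to the first agent, which is the conflict-free behavior the single-assignment model is meant to guarantee.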
Agents can create tasks assigned to humans. The board member (or any human with access) can complete these tasks through the UI.
When a human completes a task, if the requesting agent's adapter supports pingbacks (e.g. OpenClaw hooks), Paperclip sends a notification to wake that agent. This keeps humans rare but possible participants in the workflow.
The Paperclip SKILL discourages agents from assigning tasks to humans, but sometimes it's unavoidable.
Single unified REST API. The same API serves both the frontend UI and agents. Authentication determines permissions — board auth has full access, agent API keys have scoped access (their own tasks, cost reporting, company context).
No separate "agent API" vs. "board API." Same endpoints, different authorization levels.
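
"Same endpoints, different authorization levels" can be sketched as one permission check shared by every caller. The `Principal` union and the `canUpdateTask` rule below are assumptions. In particular, limiting agents to updating tasks assigned to them is one reading of "scoped access", not a confirmed rule.

```typescript
// Hypothetical principals; the kind names are assumptions.
type Principal = { kind: "board" } | { kind: "agent"; agentId: string };

interface ScopedTask {
  id: string;
  assigneeId: string;
}

// One endpoint, one handler: authorization decides who may update a task.
function canUpdateTask(principal: Principal, task: ScopedTask): boolean {
  if (principal.kind === "board") {
    return true; // board auth has full access
  }
  return task.assigneeId === principal.agentId; // agents: their own tasks only
}

const scopedTask: ScopedTask = { id: "t1", assigneeId: "agent-a" };
```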
Paperclip manages task-linked work artifacts: issue documents (rich-text plans, specs, notes attached to issues) and file attachments. Agents read and write these through the API as part of normal task execution. Full delivery infrastructure (code repos, deployments, production runtime) remains the agent's domain — Paperclip orchestrates the work, not the build pipeline.
When an agent crashes or disappears mid-task, Paperclip does not auto-reassign or auto-release the task. Instead:
Paperclip surfaces stale tasks (in_progress with no recent activity) through dashboards and reporting.

Principle: Paperclip reports problems; it doesn't silently fix them. Automatic recovery hides failures. Good visibility lets the right entity (human or agent) decide what to do.
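
Stale-task detection in this spirit only reports; it never reassigns or releases anything. The sketch below filters for in_progress tasks whose last activity is older than a threshold. The `TrackedTask` fields, the one-hour threshold, and the sample timestamps are assumptions, since the spec does not define "recent".

```typescript
// Hypothetical fields; the spec does not define how activity is recorded.
interface TrackedTask {
  id: string;
  status: string;
  lastActivityAt: Date;
}

// Return in_progress tasks with no activity inside the threshold window.
// The tasks are left untouched; the caller only surfaces them.
function findStaleTasks(tasks: TrackedTask[], checkedAt: Date, thresholdMs: number): TrackedTask[] {
  return tasks.filter(
    (t) => t.status === "in_progress" && checkedAt.getTime() - t.lastActivityAt.getTime() > thresholdMs
  );
}

const stale = findStaleTasks(
  [
    { id: "t1", status: "in_progress", lastActivityAt: new Date("2025-01-01T08:00:00Z") },
    { id: "t2", status: "in_progress", lastActivityAt: new Date("2025-01-01T11:45:00Z") },
    { id: "t3", status: "done", lastActivityAt: new Date("2025-01-01T01:00:00Z") },
  ],
  new Date("2025-01-01T12:00:00Z"),
  60 * 60 * 1000 // one hour, chosen arbitrarily for the example
);
```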
The core Paperclip system must be extensible. Features like knowledge bases, external revenue tracking, and new Agent Adapters should be addable as plugins without modifying core. This means:
The plugin framework has shipped. Plugins can register new adapter types, hook into lifecycle events, and contribute UI components (e.g. global toolbar buttons). A plugin SDK and CLI commands (paperclipai plugin) are available for authoring and installing plugins.
Each is a distinct page/route:
Full loop with one adapter. V1 must demonstrate the complete Paperclip cycle end-to-end, even if narrow.
Anti-goal for core. The knowledge base is not part of the Paperclip core — it will be a plugin. The task system + comments + agent descriptions provide sufficient shared context.
The architecture must support adding a knowledge base plugin later (clean API boundaries, hookable lifecycle events) but the core system explicitly does not include one.
Things Paperclip explicitly does not do: