docs/plans/happy-agent.md
A new standalone CLI tool (happy-agent) in packages/happy-agent that acts as a dedicated client for controlling Happy Coder agents remotely. Unlike happy-cli, which both runs and controls agents, happy-agent only controls them: it creates sessions, sends messages, reads history, monitors state, and stops sessions.
This is a completely separate client from happy-cli. It has its own authentication flow (account auth via QR code, same as device linking in the mobile app), its own credential storage (~/.happy/agent.key), and is written from scratch with no code sharing.
- happy-cli (agent runtime + control), happy-server (Fastify + PostgreSQL + Redis), happy-app (React Native mobile)
- https://api.cluster-fluster.com + Socket.IO at /v1/updates
- (/v1/auth/account/request + /v1/auth/account/response): generates ephemeral keypair, displays QR code (happy:///account?[base64url-publicKey]), user scans with existing Happy mobile app to approve, receives encrypted account secret
- ~/.happy/agent.key (separate from happy-cli's ~/.happy/access.key)
- deriveKey(secret, 'Happy EnCoder', ['content']) → seed → crypto_box_seed_keypair(seed). Per-session random keys are encrypted with the master public key and stored on the server.
- AgentState.controlledByUser indicates if agent is actively processing; requests field tracks pending tool calls

[x] immediately when done

- packages/happy-agent/ directory with package.json (name: happy-agent, type: module, bin: ./bin/happy-agent.mjs)
- tsconfig.json with strict mode, path aliases (@/ → src/), ESM output
- bin/happy-agent.mjs entry point wrapper (mirrors happy-cli pattern: spawns node with --no-warnings)
- src/index.ts as main entry point with argument parsing shell
- package.json workspaces
- axios, socket.io-client, tweetnacl, zod, chalk, commander, qrcode-terminal
- typescript, vitest, pkgroll, tsx
- vitest.config.ts
- yarn install and yarn build work

- src/encryption.ts with encodeBase64, decodeBase64, encodeBase64Url, getRandomBytes functions
- hmac_sha512(key, data) using Node.js createHmac('sha512', ...)
- deriveSecretKeyTreeRoot(seed, usage): HMAC-SHA512 with key = usage + ' Master Seed' (UTF-8), data = seed. Split 64-byte result: key = [0:32], chainCode = [32:64]
- deriveSecretKeyTreeChild(chainCode, index): HMAC-SHA512 with key = chainCode, data = [0x00, ...UTF-8(index)]. Split same way.
- deriveKey(master, usage, path): derives root, then iterates path elements through child derivation
- deriveContentKeyPair(secret): calls deriveKey(secret, 'Happy EnCoder', ['content']) → seed → sha512(seed)[0:32] → tweetnacl.box.keyPair.fromSecretKey() → returns { publicKey, secretKey }
- encryptWithDataKey(data, dataKey): AES-256-GCM: [1-byte version=0][12-byte nonce][ciphertext][16-byte auth tag]
- decryptWithDataKey(bundle, dataKey): reverse of above
- encryptLegacy(data, secret): TweetNaCl secretbox: [24-byte nonce][ciphertext + MAC]
- decryptLegacy(data, secret): reverse of above
- encrypt(key, variant, data) / decrypt(key, variant, data) dispatcher for 'legacy' | 'dataKey' variants
- libsodiumEncryptForPublicKey(data, recipientPublicKey): encrypts data with NaCl box using ephemeral keypair. Bundle: [32-byte ephemeral pubkey][24-byte nonce][ciphertext]
- decryptBoxBundle(bundle, recipientSecretKey): decrypts NaCl box bundle (used for auth response decryption AND per-session key decryption)
- authChallenge(secret): generates signing keypair from secret seed, creates random 32-byte challenge, signs with tweetnacl.sign.detached. Returns { challenge, publicKey, signature } for token refresh via /v1/auth
- 'test seed', usage='test usage', path=['child1','child2']
- E6E55652456F9FE47D6FF46CA3614E85B499F77E7B340FBBB1553307CEDC1E74
- 1011C097D2105D27362B987A631496BBF68B836124D1D072E9D1613C6028CF75
- tweetnacl.sign.detached.verify

- src/config.ts: reads HAPPY_SERVER_URL (default: https://api.cluster-fluster.com), HAPPY_HOME_DIR (default: ~/.happy), derives credential file path as ${happyHomeDir}/agent.key
- src/credentials.ts:
  - Credentials type: { token: string, secret: Uint8Array, contentKeyPair: { publicKey: Uint8Array, secretKey: Uint8Array } }
  - readCredentials(config): parses ~/.happy/agent.key JSON { token, secret }, decodes secret from base64, derives contentKeyPair via deriveContentKeyPair(secret). Returns Credentials or null if file missing.
  - writeCredentials(config, token, secret): writes { token, secret: base64(secret) } to ~/.happy/agent.key
  - clearCredentials(config): deletes ~/.happy/agent.key
  - requireCredentials(config): calls readCredentials, throws with "Run happy-agent auth login first" if null

happy-agent auth)

- src/auth.ts implementing the account auth flow:
  - tweetnacl.box.keyPair.fromSecretKey(randomBytes(32))
  - /v1/auth/account/request with { publicKey: base64(keypair.publicKey) }
  - happy:///account? + base64url(keypair.publicKey)
  - qrcode-terminal
  - /v1/auth/account/request every 1 second with same publicKey
  - state === 'authorized': decrypt response using decryptBoxBundle(decodeBase64(response), keypair.secretKey) to get the account secret (32 bytes)
  - writeCredentials(config, token, secret)
- happy-agent auth login subcommand that runs the flow above
- happy-agent auth logout subcommand that calls clearCredentials()
- happy-agent auth status subcommand that reads credentials and prints auth status (authenticated / not authenticated)
- src/api.ts with functions:
  - listSessions(config, creds): GET /v1/sessions, for each session: resolve encryption key (see key resolution below), decrypt metadata/agentState, return decrypted session list
  - listActiveSessions(config, creds): GET /v2/sessions/active, same decryption logic
  - createSession(config, creds, opts: { tag, metadata }): POST /v1/sessions:
    - libsodiumEncryptForPublicKey(sessionKey, creds.contentKeyPair.publicKey) → prepend version byte [0x00] → base64 for dataEncryptionKey field
    - encryptWithDataKey(metadata, sessionKey)
  - getSessionMessages(config, creds, sessionId): GET /v1/sessions/:id/messages
  - deleteSession(config, creds, sessionId): DELETE /v1/sessions/:id
- With dataEncryptionKey: strip version byte, decryptBoxBundle(encrypted, creds.contentKeyPair.secretKey) → per-session AES key, use 'dataKey' variant
- Without dataEncryptionKey: use creds.secret as key with 'legacy' variant
- Authorization: Bearer <token> header
- src/session.ts: SessionClient class that:
  - serverUrl
  - /v1/updates with { token, clientType: 'session-scoped', sessionId }
  - update events, decrypts messages using session encryption key (AES-256-GCM or legacy depending on variant), emits typed events (message, state-change)
  - sendMessage(text, meta?): encrypts user message with session key and emits message event with { sid, message }
  - getMetadata() / getAgentState(): returns current cached decrypted state
  - waitForIdle(timeoutMs?): watches agentState.controlledByUser and agentState.requests, resolves when agent has no pending requests and controlledByUser !== true
  - sendStop(): emits session-end event
  - close(): disconnects socket

list and status

- src/index.ts using commander with program name happy-agent
- happy-agent list: calls listSessions, displays table: ID (truncated), name/summary, path, status (active/inactive), last active time. With --json outputs raw JSON. With --active filters to active only.
- happy-agent status <session-id>: fetches session via list + filter by ID prefix, connects Socket.IO to get live state, displays: session ID, metadata (path, host, lifecycle state), agent state (idle/busy, pending requests count), last message preview. With --json outputs raw JSON. Disconnects after displaying.
- src/output.ts: helper for human-readable vs JSON formatting based on --json flag

create and send

- happy-agent create --tag <tag> [--path <path>]: creates new session with given tag and metadata (path defaults to cwd, host to hostname). Prints session ID. With --json outputs full session JSON.
- happy-agent send <session-id> <message>: resolves session key, connects Socket.IO, sends user message (encrypted with AES-256-GCM), optionally waits for idle with --wait. Disconnects after. Prints confirmation. With --json outputs message details.

history, stop, and wait

- happy-agent history <session-id>: fetches messages via HTTP, resolves session encryption key (dataKey or legacy), decrypts each message, displays in chronological order with role/timestamp. With --json outputs raw JSON. With --limit <n> limits output.
- happy-agent stop <session-id>: connects Socket.IO, sends session-end event, disconnects. Prints confirmation.
- happy-agent wait <session-id> [--timeout <seconds>]: connects Socket.IO, waits for agent idle state (no pending requests, not controlled by user), prints when idle or times out (default 300s). Exit code 0 on idle, 1 on timeout.
- --json flag works on all applicable commands
- packages/happy-agent/ with usage examples for all commands

happy-agent auth login # Authenticate via QR code (scanned by Happy mobile app)
happy-agent auth logout # Clear stored credentials
happy-agent auth status # Show authentication status
happy-agent list [--active] [--json] # List all sessions
happy-agent status <session-id> [--json] # Get live session state
happy-agent create --tag <tag> [--path <path>] [--json] # Create new session
happy-agent send <session-id> <message> [--wait] [--json] # Send message
happy-agent history <session-id> [--limit <n>] [--json] # Read message history
happy-agent stop <session-id> # Stop a session
happy-agent wait <session-id> [--timeout <s>] # Wait for agent to become idle
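The `wait` command and `SessionClient.waitForIdle` both hinge on one predicate plus a timeout. The sketch below shows one way that could look, using the idle criteria this plan states (`controlledByUser` is not true, no pending `requests`, `lifecycleState` is not `'archived'`). The `SessionSnapshot` type, the `current` callback, and treating `requests` as a keyed record are assumptions made for illustration; only the `state-change` event name and the 300-second default come from the plan.

```typescript
import { EventEmitter } from "node:events";

// Assumed shapes: only the fields named in the plan are modelled.
interface AgentState {
  controlledByUser?: boolean | null;
  requests?: Record<string, unknown> | null; // assumed to be keyed by request id
}

interface SessionSnapshot {
  agentState: AgentState | null;
  lifecycleState?: string;
}

// True when all three idle conditions from the plan hold.
function isIdle(s: SessionSnapshot): boolean {
  const pending = Object.keys(s.agentState?.requests ?? {}).length;
  return (
    s.agentState?.controlledByUser !== true &&
    pending === 0 &&
    s.lifecycleState !== "archived"
  );
}

// Resolves true once the session is idle, or false when the timeout elapses.
function waitForIdle(
  source: EventEmitter,             // emits "state-change" on every update
  current: () => SessionSnapshot,   // hypothetical accessor for cached state
  timeoutMs = 300_000,
): Promise<boolean> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => finish(false), timeoutMs);
    function check(): void {
      if (isIdle(current())) finish(true);
    }
    function finish(idle: boolean): void {
      clearTimeout(timer);
      source.off("state-change", check);
      resolve(idle);
    }
    source.on("state-change", check);
    check(); // the session may already be idle when we start waiting
  });
}
```

A CLI wrapper could then map the result onto the documented exit codes, for example `process.exitCode = (await waitForIdle(client, getState)) ? 0 : 1`.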
happy-agent                             Happy Server                    Happy Mobile App
|                                       |                               |
+-- Generate ephemeral keypair          |                               |
+-- POST /v1/auth/account/request ->    |                               |
|   { publicKey }                       |                               |
|                                       |                               |
+-- Display QR code in terminal         |                               |
|   happy:///account?[base64url-key]    |                               |
|                                       |                               |
|                                       | <-- User scans QR code -------+
|                                       |                               |
|                                       | <-- POST /v1/auth/account/response
|                                       |     { publicKey,              |
|                                       |       response: box.encrypt(  |
|                                       |         accountSecret,        |
|                                       |         ephemeralPubKey) }    |
|                                       |                               |
+-- Poll /v1/auth/account/request ->    |                               |
|   state: 'authorized'                 |                               |
|   token: JWT                          |                               |
|   response: encrypted secret          |                               |
|                                       |                               |
+-- box.open(response, ephemeralSK)     |                               |
|   -> accountSecret (32 bytes)         |                               |
+-- Save { token, secret }              |                               |
|   to ~/.happy/agent.key               |                               |
|                                       |                               |
+-- Derive content keypair:             |                               |
|   deriveKey(secret,                   |                               |
|     'Happy EnCoder', ['content'])     |                               |
|   -> seed -> box keypair              |                               |
|   (publicKey for encrypting           |                               |
|    per-session keys,                  |                               |
|    secretKey for decrypting them)     |                               |
v   Authenticated                       |                               |
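The client side of this flow reduces to two small pieces: building the QR payload and polling until the server reports `'authorized'`. The following is a minimal sketch under stated assumptions. The HTTP call is injected as `requestOnce` so the loop stays independent of axios, the `maxAttempts` cap is hypothetical (the plan does not specify a login timeout), and unpadded base64url output is assumed to be what the mobile app expects.

```typescript
// Poll response as far as the plan describes it; any state other than
// 'authorized' is treated as "keep waiting" (an assumption).
interface AuthPollResult {
  state: string;
  token?: string;
  response?: string; // base64 box bundle containing the account secret
}

// Node's built-in "base64url" encoding emits URL-safe, unpadded output.
function toBase64Url(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString("base64url");
}

// QR payload: happy:///account? followed by the ephemeral public key.
function buildAccountAuthUrl(publicKey: Uint8Array): string {
  return "happy:///account?" + toBase64Url(publicKey);
}

// Repeats the request once per interval until the server authorizes it.
async function pollUntilAuthorized(
  requestOnce: () => Promise<AuthPollResult>, // e.g. POST /v1/auth/account/request
  intervalMs = 1000,
  maxAttempts = 300, // hypothetical cap, not taken from the plan
): Promise<{ token: string; response: string }> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await requestOnce();
    if (result.state === "authorized" && result.token && result.response) {
      return { token: result.token, response: result.response };
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for the mobile app to approve the request");
}
```

The returned `response` would then be base64-decoded and passed to `decryptBoxBundle` with the ephemeral secret key, as the diagram shows.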
~/.happy/agent.key)
{
"token": "jwt-auth-token",
"secret": "base64-encoded-32-byte-account-secret"
}
At load time, the content keypair is derived from the secret:
secret (32 bytes)
-> deriveKey(secret, 'Happy EnCoder', ['content'])
-> seed (32 bytes)
-> sha512(seed)[0:32] -> boxSecretKey
-> tweetnacl.box.keyPair.fromSecretKey(boxSecretKey)
-> { publicKey (32 bytes), secretKey (32 bytes) }
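A sketch of reading and writing the credential file in the JSON format shown above. It is simplified relative to the planned `src/credentials.ts`: the functions take a file path rather than a `config` object, the content keypair derivation is left out, and the `0o600` file mode is an assumption rather than something the plan specifies.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { homedir, tmpdir } from "node:os";
import { dirname, join } from "node:path";

interface StoredCredentials {
  token: string;
  secret: Uint8Array; // 32-byte account secret
}

// Default location: $HAPPY_HOME_DIR/agent.key, falling back to ~/.happy/agent.key.
function credentialPath(): string {
  const home = process.env.HAPPY_HOME_DIR ?? join(homedir(), ".happy");
  return join(home, "agent.key");
}

function writeCredentials(path: string, token: string, secret: Uint8Array): void {
  mkdirSync(dirname(path), { recursive: true });
  const body = JSON.stringify({ token, secret: Buffer.from(secret).toString("base64") }, null, 2);
  writeFileSync(path, body, { mode: 0o600 }); // owner-only permissions: an assumption
}

// Returns null when the file is missing, mirroring readCredentials in the plan.
function readCredentials(path: string): StoredCredentials | null {
  if (!existsSync(path)) return null;
  const parsed = JSON.parse(readFileSync(path, "utf8")) as { token: string; secret: string };
  return { token: parsed.token, secret: new Uint8Array(Buffer.from(parsed.secret, "base64")) };
}

function clearCredentials(path: string): void {
  rmSync(path, { force: true }); // no error if the file is already gone
}

// Round trip in a throwaway directory so the real ~/.happy is untouched.
const dir = mkdtempSync(join(tmpdir(), "happy-agent-"));
const file = join(dir, "agent.key");
writeCredentials(file, "jwt-auth-token", new Uint8Array(32).fill(7));
const loaded = readCredentials(file);
console.log(loaded?.token, loaded?.secret.length);
clearCredentials(file);
```

In the planned module, `readCredentials` would additionally call `deriveContentKeyPair(secret)` before returning.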
HMAC-SHA512 based key tree (matches mobile app implementation):
deriveSecretKeyTreeRoot(seed, usage):
  I = HMAC-SHA512(key = UTF8(usage + ' Master Seed'), data = seed)
  key = I[0:32], chainCode = I[32:64]

deriveSecretKeyTreeChild(chainCode, index):
  data = [0x00, ...UTF8(index)]
  I = HMAC-SHA512(key = chainCode, data = data)
  key = I[0:32], chainCode = I[32:64]

deriveKey(master, usage, path):
  state = deriveSecretKeyTreeRoot(master, usage)
  for each element in path:
    state = deriveSecretKeyTreeChild(state.chainCode, element)
  return state.key
Test vectors:
seed = UTF8('test seed'), usage = 'test usage', path = ['child1', 'child2']
Root key: E6E55652456F9FE47D6FF46CA3614E85B499F77E7B340FBBB1553307CEDC1E74
Final key: 1011C097D2105D27362B987A631496BBF68B836124D1D072E9D1613C6028CF75
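The pseudocode above translates fairly directly to Node's built-in `crypto` module. This is a sketch rather than the shipped implementation: the `KeyTreeState` type and the `splitKey` and `toHex` helpers are names introduced here. The last lines print the root and final keys for the test-vector inputs so they can be compared against the values listed above.

```typescript
import { createHmac } from "node:crypto";

interface KeyTreeState {
  key: Uint8Array;       // first 32 bytes of the HMAC output
  chainCode: Uint8Array; // last 32 bytes of the HMAC output
}

function hmacSha512(key: Uint8Array, data: Uint8Array): Uint8Array {
  return new Uint8Array(createHmac("sha512", key).update(data).digest());
}

// Splits a 64-byte HMAC result into key and chain code.
function splitKey(i: Uint8Array): KeyTreeState {
  return { key: i.slice(0, 32), chainCode: i.slice(32, 64) };
}

function deriveSecretKeyTreeRoot(seed: Uint8Array, usage: string): KeyTreeState {
  const hmacKey = new TextEncoder().encode(usage + " Master Seed");
  return splitKey(hmacSha512(hmacKey, seed));
}

function deriveSecretKeyTreeChild(chainCode: Uint8Array, index: string): KeyTreeState {
  // data = 0x00 separator byte followed by the UTF-8 bytes of the index
  const data = new Uint8Array([0x00, ...new TextEncoder().encode(index)]);
  return splitKey(hmacSha512(chainCode, data));
}

function deriveKey(master: Uint8Array, usage: string, path: string[]): Uint8Array {
  let state = deriveSecretKeyTreeRoot(master, usage);
  for (const element of path) {
    state = deriveSecretKeyTreeChild(state.chainCode, element);
  }
  return state.key;
}

// Print keys for the documented test-vector inputs.
const toHex = (bytes: Uint8Array): string => Buffer.from(bytes).toString("hex").toUpperCase();
const seed = new TextEncoder().encode("test seed");
console.log("root: ", toHex(deriveSecretKeyTreeRoot(seed, "test usage").key));
console.log("final:", toHex(deriveKey(seed, "test usage", ["child1", "child2"])));
```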
For new sessions (created by happy-agent):
- libsodiumEncryptForPublicKey → store as dataEncryptionKey on server

For existing sessions (created by happy-cli or other clients):
- With dataEncryptionKey: strip version byte [0], decryptBoxBundle(encrypted, contentKeyPair.secretKey) → per-session AES key, use AES-256-GCM
- Without dataEncryptionKey: use secret directly as key with legacy TweetNaCl secretbox

AES-256-GCM bundle format: [1-byte version=0][12-byte nonce][ciphertext][16-byte auth tag]
Legacy secretbox bundle format: [24-byte nonce][ciphertext + MAC]
Box encryption bundle format: [32-byte ephemeral pubkey][24-byte nonce][ciphertext]
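To make the AES-256-GCM layout concrete, here is a sketch of `encryptWithDataKey` and `decryptWithDataKey` built on Node's `crypto` module. It works on raw bytes; how the plaintext is serialized beforehand is outside this example, and returning `null` on a malformed or tampered bundle is an assumption about error handling, not something the plan specifies.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const VERSION = 0;    // 1-byte version prefix
const NONCE_LEN = 12; // GCM nonce length
const TAG_LEN = 16;   // GCM authentication tag length

// Produces [version][nonce][ciphertext][auth tag].
function encryptWithDataKey(plaintext: Uint8Array, dataKey: Uint8Array): Uint8Array {
  const nonce = randomBytes(NONCE_LEN);
  const cipher = createCipheriv("aes-256-gcm", dataKey, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const tag = cipher.getAuthTag();
  return new Uint8Array(Buffer.concat([Buffer.from([VERSION]), nonce, ciphertext, tag]));
}

// Returns the plaintext, or null if the bundle is malformed or fails authentication.
function decryptWithDataKey(bundle: Uint8Array, dataKey: Uint8Array): Uint8Array | null {
  if (bundle.length < 1 + NONCE_LEN + TAG_LEN || bundle[0] !== VERSION) return null;
  const nonce = bundle.subarray(1, 1 + NONCE_LEN);
  const ciphertext = bundle.subarray(1 + NONCE_LEN, bundle.length - TAG_LEN);
  const tag = bundle.subarray(bundle.length - TAG_LEN);
  try {
    const decipher = createDecipheriv("aes-256-gcm", dataKey, nonce);
    decipher.setAuthTag(tag);
    return new Uint8Array(Buffer.concat([decipher.update(ciphertext), decipher.final()]));
  } catch {
    return null; // wrong key, wrong key length, or tampered bundle
  }
}
```

A bundle produced this way is always 29 bytes longer than its plaintext (1 + 12 + 16), which gives a quick sanity check when comparing against bundles from other clients.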
Agent is considered idle when ALL of these are true:
- agentState.controlledByUser is not true
- agentState.requests is empty or undefined (no pending tool calls)
- lifecycleState is not 'archived'

- axios: HTTP client
- socket.io-client: WebSocket communication
- tweetnacl: Encryption (box for key exchange, secretbox for legacy, sign for auth challenge)
- zod: Runtime validation
- chalk: Terminal colors
- commander: CLI argument parsing
- qrcode-terminal: QR code display for authentication

Manual verification:
- happy-agent auth login, scan QR with Happy app, verify credentials saved
- wait command with a running agent session
- history command for sessions created by both happy-agent and happy-cli

Distribution:
- happy-agent
- yarn workspace happy-agent build