.agents/skills/agent-testing/references/agent-gateway.md
Captures store + DOM state at 200ms intervals so we can prove or disprove claims like "切回 tab 后消息回到了很早以前". Built for gateway-mode chat but works for any LobeHub streaming session.
Use this when you need to verify the browser↔gateway path end-to-end (gateway mode / Cloud Sandbox / group broadcast over WebSocket). The probe harness below assumes a gateway is already streaming to your browser — this section is how you get one locally.
Why the online gateway can't close the loop locally. The online gateway
(agent-gateway.lobehub.com) verifies the browser's user JWT against the
production app's JWKS. A local dev instance signs that JWT with its own
JWKS_KEY, so the online gateway rejects the WS handshake with
{"type":"auth_failed","reason":"signature verification failed"} → close
1008. The server→gateway push still returns 200 (it uses the static
AGENT_GATEWAY_SERVICE_TOKEN, not JWKS) — so events flow server-side but the
browser never receives them. That's why client / SSE / online-gateway are all
not a real local closed loop. Confirm the failure shape by hooking
WebSocket in the page and reading the first frame after the auth send.
Fix — run the gateway yourself. The gateway worker lives in a sibling
repo: look for agent-gateway/ next to lobehub/ (same parent dir). It's a
Cloudflare Worker (wrangler dev, Durable Objects in local mode) whose
verifyToken checks JWTs against a configurable JWKS_PUBLIC_KEY secret. Point
that secret at your local app's public key and the local gateway trusts your
local JWT → auth_success → full local loop.
# 1. Generate agent-gateway/.dev.vars (public JWK extracted from JWKS_KEY +
# reuse of AGENT_GATEWAY_SERVICE_TOKEN). Source resolves to
# .records/env/gateway.env by default, else .env.local; override with
# JWKS_SOURCE=<env file>. (Managed agent-testing runs have NO .env.local.)
.agents/skills/agent-testing/scripts/agent-gateway/local-gateway-setup.sh
# 2. Start the worker (separate terminal) → http://localhost:8787
cd ../agent-gateway && bun run dev
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8787/health # → 200
# 3. Decisive check — does the local gateway accept the app's JWT?
node .agents/skills/agent-testing/scripts/agent-gateway/local-gateway-probe.mjs
# → RECV: {"type":"auth_success"} ✅ feasible
# 4. Point the APP at the local gateway and RESTART its dev server:
# AGENT_GATEWAY_URL=http://localhost:8787 (client → ws://localhost:8787/ws)
# AGENT_GATEWAY_SERVICE_TOKEN=<unchanged> (matches gateway .dev.vars SERVICE_TOKEN)
# ENABLE_AGENT_GATEWAY=1 (→ serverConfig.enableGatewayMode)
Key facts the setup relies on: JWKS_PUBLIC_KEY is just JWKS_KEY with the
private fields (d,p,q,dp,dq,qi) stripped — same kid, so signatures verify.
The app's gateway URL flows AGENT_GATEWAY_URL →
getServerGlobalConfig → serverConfig.agentGatewayUrl, and the client builds
the WS URL by swapping http(s)→ws(s) and appending /ws?operationId=…, so an
http://localhost:8787 value Just Works. Server-side, the
GatewayStreamNotifier is wrapped whenever AGENT_GATEWAY_URL && AGENT_GATEWAY_SERVICE_TOKEN are set (AgentRuntime/factory.ts), and it
registers each op with the gateway via POST /api/operations/init carrying only
{operationId, userId} — it does not upload a per-op public key, which is
exactly why the gateway must already trust the signing key via
JWKS_PUBLIC_KEY.
scripts/agent-gateway/
| File | Role |
|---|---|
local-gateway-setup.sh | Writes agent-gateway/.dev.vars from the app's .env.local |
local-gateway-probe.mjs | Signs an app JWT, asserts the local gateway returns auth_success |
probe.js | Injects a 200ms sampler + __PROBE_EVENT marker + __switchTab |
probe-dump.js | Stops the sampler and returns {events, samples} as JSON string |
tab-switch.js | Runs N round-trip switches between two tabs, marks each step |
analyze.mjs | Node post-processor: timeline + regression detection |
# 1. Start Electron with CDP
./.agents/skills/agent-testing/scripts/electron-dev.sh start
# 2. Navigate to a chat, switch runtime to Cloud Sandbox (gateway mode)
# 3. Install the probe + helpers
agent-browser --cdp 9222 eval --stdin \
< .agents/skills/agent-testing/scripts/agent-gateway/probe.js
# 4. Send a tool-call message — manually or via type+press
agent-browser --cdp 9222 eval "window.__PROBE_EVENT('SENT')"
# 5. Run the multi-switch driver (auto-picks active tab as BACK and the
# rightmost inactive tab as AWAY — edit ROUND_TRIPS / DWELL_MS in the
# file if you want different timing)
agent-browser --cdp 9222 eval --stdin \
< .agents/skills/agent-testing/scripts/agent-gateway/tab-switch.js
# 6. Wait for streaming to finish, then dump
agent-browser --cdp 9222 eval --stdin \
< .agents/skills/agent-testing/scripts/agent-gateway/probe-dump.js \
> /tmp/probe.json
# 7. Analyze
node .agents/skills/agent-testing/scripts/agent-gateway/analyze.mjs /tmp/probe.json
The analyzer prints three sections: EVENTS, TIMELINE, REGRESSIONS. If REGRESSIONS is non-empty it means content/reasoning/childN dropped on the same topic — the symptom users describe.
chat.messagesMap only stores the top-level assistantGroup shell. The
actual streamed content, reasoning, and tool calls live in
assistantGroup.children: AssistantContentBlock[]. Any probe that only
reads m.content / m.reasoning will see zeros throughout streaming and
miss everything that matters. probe.js walks both levels and sums:
cT total content lengthrT total reasoning lengthtoolT total tool-call countchildN number of content blocksPlus DOM-side signals (domLen, search/crawl indicator counts) so you can
tell store-side regressions apart from render-side regressions.
<scope>_new key with tmp_* ids and no topicId field.
probe.js falls back to those when activeTopicId is null.domLen drops greater than 100 chars to ignore that noise. · <agent name> suffix, so a search like 'LobeHub Growth' matches the
active tab when the active agent happens to be LobeHub Growth — and you
end up clicking the tab you're already on. probe.js uses the stable
data-contextmenu-trigger attribute (a React useId() value that's set
per-tab and survives focus changes) plus data-active="true" to mark
the active one. Helpers exposed:
__listTabs() / __clickTabByKey(key) / __clickTabByIndex(i) /
__activeTabKey().tab-switch.js fires-and-forgets. The IIFE kicks off an async loop
and returns immediately so the agent-browser CLI eval doesn't blow past
its default 25 s timeout. Wait on the SWITCH_LOOP_DONE event marker
before dumping. Re-running while a loop is in flight is refused — the
chaotic data from overlapping runs is not worth debugging.