ruflo/docs/adr/ADR-033-RUVOCAL-WASM-MCP-INTEGRATION.md
- Status: Proposed
- Date: 2026-05-01
- Author: Ruflo Team
- Deciders: Engineering
- Related: ADR-002-WASM-CORE-PACKAGE, ADR-029-HUGGINGFACE-CHAT-UI-CLOUD-RUN, ADR-030-MCP-TOOL-GAP-ANALYSIS, ADR-032-RVF-PRIVATE-MCP-TUNNEL
The local copy of the Ruvocal chat UI at ruflo/src/ruvocal/ is a snapshot fork of the SvelteKit-based HuggingFace chat-ui (v0.20.0). The canonical upstream lives at ruvnet/RuVector/ui/ruvocal and has diverged with substantial new functionality, primarily an in-browser WASM MCP server powered by rvagent-wasm.
A directory-level diff between upstream and local shows:
Net-new in upstream (absent locally):
- src/lib/wasm/: WASM loader, types, IndexedDB persistence, capability tests (84 KB)
- src/lib/components/wasm/GalleryPanel.svelte: UI for browsing/loading WASM templates
- src/lib/components/FoundationBackground.svelte
- src/lib/stores/wasmMcp.ts: Svelte store wrapping WASM MCP server lifecycle
- src/lib/constants/rvagentPresets.ts: preset templates
- src/lib/server/textGeneration/mcp/wasmTools.test.ts
- static/wasm/rvagent_wasm.js + rvagent_wasm_bg.wasm: compiled WASM bundle (~588 KB)
- config/branding.env.example

Modified upstream (incompatible drift in local):
- ChatInput, ChatMessage, ChatWindow, ChatIntroduction, BlockWrapper, TaskGroup, ToolUpdate, FileDropzone
- MCPServerManager, AddServerForm, ServerCard
- lib/server/mcp/clientPool.ts, httpClient.ts, lib/server/router/toolsRoute.ts, lib/server/textGeneration/index.ts, runMcpFlow.ts, toolInvocation.ts, types.ts, utils/toolPrompt.ts
- mcpServers.ts, settings.ts, Settings.ts, Tool.ts, messageUpdates.ts, switchTheme.ts
- +layout.svelte, models/+page.svelte, settings layout/model pages, conversation/[id]/+page.svelte & +server.ts, api/mcp/health/+server.ts, api/mcp/servers/+server.ts, api/v2/user/settings/+server.ts
- app.html, styles/main.css, static/chatui/{favicon,icon,logo}.svg, static/chatui/manifest.json
- Modal.svelte, NavMenu.svelte, RuFloUniverse.svelte, Switch.svelte, WelcomeModal.svelte, Logo.svelte, mcpExamples.ts, .gitignore, rvf.manifest.json

Local-only (must preserve):
- mcp-bridge/index.js: local MCP bridge implementation (absent in upstream)
- src/routes/api/v2/debug/: debug routes used by ruflo
- stub/@reflink/reflink/index.js: reflink stub
- .env: populated local environment
- package-lock.json: local lockfile

package.json is identical between local and upstream, so no dependency changes are required.
We will pull the upstream improvements into ruflo/src/ruvocal/ on a dedicated feature branch (feat/ruvocal-wasm-mcp-integration) using a directory-level overlay strategy rather than a Git merge, because the local snapshot has no shared history with the upstream repository.
The integration is staged in three commits to keep the diff reviewable:
- src/routes/api/v2/debug/ after upstream overlay (upstream lacks it).
- mcp-bridge/index.js after overlay.
- stub/@reflink/reflink/index.js.
- .env and package-lock.json untouched.
- npm install, npm run check, npm run build, then local docker compose + ruflo-browser smoke test.

The package.json overlay is safe because it is byte-identical.
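The overlay rule described above (copy upstream over the local tree, but never overwrite the local-only paths) can be sketched as a pure planning function. This is a minimal illustration, not the script used for the integration: `planOverlay`, `isPreserved`, and the `PRESERVE` list are hypothetical names, and the list simply mirrors the local-only paths named in this ADR.

```typescript
// Hypothetical helper: decides, per relative path, what a directory-level
// overlay should do. All names here are illustrative assumptions.
type OverlayAction = "copy-from-upstream" | "keep-local";

interface OverlayPlan {
  path: string;
  action: OverlayAction;
}

// Local-only paths listed in this ADR; entries ending in "/" match a subtree.
const PRESERVE: readonly string[] = [
  "mcp-bridge/index.js",
  "src/routes/api/v2/debug/",
  "stub/@reflink/reflink/index.js",
  ".env",
  "package-lock.json",
];

function isPreserved(path: string, preserve: readonly string[] = PRESERVE): boolean {
  return preserve.some((p) => (p.endsWith("/") ? path.startsWith(p) : path === p));
}

function planOverlay(
  upstreamFiles: readonly string[],
  localFiles: readonly string[],
  preserve: readonly string[] = PRESERVE,
): OverlayPlan[] {
  const plan = new Map<string, OverlayAction>();
  // Upstream files are copied over the local tree unless they are preserved.
  for (const path of upstreamFiles) {
    if (!isPreserved(path, preserve)) plan.set(path, "copy-from-upstream");
  }
  // Preserved local files always win, whether or not upstream has them.
  for (const path of localFiles) {
    if (isPreserved(path, preserve)) plan.set(path, "keep-local");
  }
  // Local files that are neither preserved nor present upstream are left out
  // of the plan, which means the overlay does not touch them.
  return Array.from(plan.entries())
    .map(([path, action]) => ({ path, action }))
    .sort((a, b) => a.path.localeCompare(b.path));
}
```

Keeping the decision separate from the file copying makes the plan easy to review before anything on disk changes.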
- ruflo/src/chat-ui/ (the thin HF base-image wrapper) in this ADR.

Positive:
- GalleryPanel UX for browsing rvagent templates.
- Tests (wasmTools.test.ts, wasm-capabilities.test.ts) raise the coverage floor.

Negative / risks:
- browser-gated dynamic import, so initial paint is unaffected.
- Local-only files (mcp-bridge/index.js, routes/api/v2/debug/) must be re-applied after each upstream sync; this ADR documents that requirement so future syncs don't drop them.
- Modified upstream files (clientPool, httpClient, toolsRoute, runMcpFlow, etc.) may interact with the local mcp-bridge differently than they do upstream. Smoke test before merge.
- Settings.ts / Tool.ts types could ripple into ruflo packages that import from src/ruvocal. Mitigation: run npm run check before merging.

Rollback: revert the feature branch; no data migrations, no external service changes.
Acceptance criteria for merging the branch:
- npm install succeeds.
- npm run check passes (svelte-check, no new TS errors).
- npm run build produces a working bundle.
- npm run test: wasmTools.test.ts and wasm-capabilities.test.ts pass.
- docker compose up -d brings up MongoDB; npm run dev serves at http://localhost:5173.
- ruflo-browser smoke test: load the home page, open the gallery panel, send a message through a non-WASM model, confirm no console errors.
- mcp-bridge/index.js, routes/api/v2/debug/, stub/@reflink/reflink/index.js, .env, package-lock.json.

Cloud Run deployment is out of scope for this PR but the path is staged in ruflo/src/ruvocal/cloudbuild.yaml. Two infrastructure prerequisites must be satisfied before the first deploy:
- ruvocal-mongodb-url.
- mongo:8 sidecar in the same revision; main container connects to localhost:27017. Requires --container flags on gcloud run deploy.
- ruv-dev Secret Manager per ADR-029: openai-api-key, google-api-key, openrouter-api-key.

Once both are in place:
cd ruflo/src/ruvocal
gcloud builds submit --config=cloudbuild.yaml --project=ruv-dev --region=us-central1
Validation after deploy: npx agent-browser open <run-url> then check console for [WASM MCP] Server initialized successfully · 18 tools.
The thin ruflo/src/chat-ui/Dockerfile wrapper (FROM ghcr.io/huggingface/chat-ui-db:latest) is unsuitable for deploying this integration — it can only patch the upstream HF base image with a few static files; it cannot include compiled WASM source. The full ruvocal Dockerfile build is required.
The Cloud Run pipeline is working end-to-end with the following validations:
| Stage | Result |
|---|---|
| Cloud Build (after DOCKER_BUILDKIT=1 fix) | Succeeds: gcr.io/ruv-dev/ruvocal:v1 pushed |
| Cloud Run deploy (after granting secretmanager.secretAccessor to default SA on ANTHROPIC_API_KEY, GOOGLE_AI_API_KEY, OPENROUTER_API_KEY) | Service ruvocal revision 00007-4hd serving 100% traffic |
| Embedded MongoDB (INCLUDE_DB=true) | Working: mongod starts via entrypoint.sh, /api/v2/conversations, /api/v2/user, /api/v2/feature-flags, /api/v2/public-config, /api/v2/user/settings all return 200 |
| WASM bundle | Reachable: https://ruvocal-875130704813.us-central1.run.app/wasm/rvagent_wasm.js (200, text/javascript), /wasm/rvagent_wasm_bg.wasm (200, application/wasm, 543 KB) |
| Provider API keys via Secret Manager | Mounted at runtime as ANTHROPIC_API_KEY, GOOGLE_API_KEY, OPENROUTER_API_KEY, OPENAI_API_KEY |
| dotenv-cli runtime overrides via DOTENV_LOCAL env var | Working: confirmed by PUBLIC_ORIGIN and OPENAI_BASE_URL taking effect at runtime |
- ruvocal.ruv.io mapped via gcloud beta run domain-mappings create
- CNAME ruvocal → ghs.googlehosted.com., proxied:false (gray cloud) so Google can issue and renew the managed cert directly
- *.run.app URL is always available immediately

The deployed instance uses Gemini 2.5 Flash as default via Google's OpenAI-compatible endpoint:
OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/
OPENAI_API_KEY=<from GOOGLE_AI_API_KEY secret>
TASK_MODEL=gemini-2.5-flash
Earlier attempts with https://router.huggingface.co/v1 returned 401 "Invalid username or password" because the available huggingface-token secret doesn't auth against the user-facing router endpoint, and the OpenRouter API key was incorrectly mapped against the HF base URL. Google's OpenAI-compatible endpoint accepts GOOGLE_AI_API_KEY directly and exposes 56 Gemini variants.
The homepage / returns HTTP 500 in production due to the /api/v2/models and /api/v2/models/refresh routes returning the SvelteKit "Page not found" page wrapped in a 500 status.
Root cause (found after extensive investigation): .gitignore line 16 was models/* (unanchored). With no .gcloudignore present, gcloud builds submit falls back to .gitignore for upload filtering — that pattern matched every models/ directory in the tree, stripping src/routes/api/v2/models/* and src/routes/models/* from the source tarball before Docker even saw it. SvelteKit production builds inside the container therefore had no /api/v2/models route registered, so requests returned 404 which got wrapped as 500 by the layout chain.
Fix: ruflo/src/ruvocal/.gcloudignore (commit e3b74f606). All /api/v2/* routes now serve correctly. Verified via gsutil cp of the build's source tarball that all +server.ts files for the models subtree are present after the fix.
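The tarball verification mentioned above could be automated as a pre-deploy check. The sketch below is a hypothetical helper, not part of the repository: it takes the entry names of a source tarball (however they were listed) and reports which required files are missing. The two required paths in the usage example are assumptions derived from SvelteKit's routing convention for /api/v2/models and /api/v2/models/refresh.

```typescript
// Hypothetical check: which required files are absent from a source tarball?
// Entry names may carry a leading "./", so both sides are normalized first.
function normalizeEntry(entry: string): string {
  return entry.startsWith("./") ? entry.slice(2) : entry;
}

function missingFiles(
  tarballEntries: readonly string[],
  requiredFiles: readonly string[],
): string[] {
  const present = new Set(tarballEntries.map(normalizeEntry));
  return requiredFiles.filter((file) => !present.has(normalizeEntry(file)));
}

// Assumed route files for the two endpoints named in this section.
const REQUIRED_MODEL_ROUTES: readonly string[] = [
  "src/routes/api/v2/models/+server.ts",
  "src/routes/api/v2/models/refresh/+server.ts",
];
```

A non-empty result from `missingFiles(entries, REQUIRED_MODEL_ROUTES)` would indicate that upload filtering stripped route files before the Docker build started.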
This section consolidates changes that landed after the initial PR (#1687) merged:
Gemini 2.5 Flash via the OpenAI-compat endpoint returned Error: 400 status code (no body) on the follow-up call after a tool result (the chat-ui sends the tool message back to the model in the next turn; Gemini's compat layer rejected the structure intermittently). The result was the user-visible "Sorry, something went wrong" after a tool fired.
Switched the deployed default to Claude Sonnet 4.6 via OpenRouter with OPENAI_BASE_URL=https://openrouter.ai/api/v1 and OPENAI_API_KEY mapped from the OPENROUTER_API_KEY Secret Manager secret. Models exposed: Claude Sonnet 4.6 (default), Claude Opus 4.7, Claude Haiku 4.5, Gemini 2.5 Pro, Gemini 2.5 Flash, GPT-4o.
The system prompt explicitly instructs the model to emit multiple tool_calls in a single response when the request implies multiple independent steps. Server logs confirmed toolMsgCount: 4 and toolMsgCount: 6 — multiple tools dispatched in parallel via Promise.all in src/lib/server/textGeneration/mcp/toolInvocation.ts.
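As a rough illustration of the parallel dispatch described above, the sketch below runs a batch of tool calls through `Promise.all`. It is not the code in toolInvocation.ts: the `ToolCall`, `ToolResult`, and `ToolHandler` types and the `runToolCalls` function are hypothetical, and converting failures into result text is a design assumption made here so that one failing tool does not reject the whole batch.

```typescript
// Hypothetical shapes for a tool call and its result.
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  id: string;
  name: string;
  output: string;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

async function runToolCalls(
  calls: readonly ToolCall[],
  handlers: ReadonlyMap<string, ToolHandler>,
): Promise<ToolResult[]> {
  // map() starts every handler before any of them is awaited, and
  // Promise.all preserves the input order in the result array.
  return Promise.all(
    calls.map(async (call): Promise<ToolResult> => {
      const handler = handlers.get(call.name);
      const output = handler
        ? // A failing tool becomes an error string instead of rejecting the batch.
          await handler(call.args).catch((err: unknown) => `error: ${String(err)}`)
        : `error: unknown tool ${call.name}`;
      return { id: call.id, name: call.name, output };
    }),
  );
}
```

Because results come back in call order, each one can be matched to its originating `tool_calls` entry by position or by `id`.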
mcp-bridge on Cloud Run

Deployed in parallel with ruvocal to expose 207 tools across 5 server groups (Core / Intelligence / Agents / Memory / DevTools). Pinned --min-instances=1 --max-instances=1 so backends stay warm; Cloud Run was previously routing requests to cold instances that had not yet finished spawning npx ruflo mcp start and npx ruvector mcp start. Bumped the bridge's MCP initialize RPC timeout from 30s to 120s to handle cold-start boot times.
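A timeout like the one described above is commonly implemented by racing the RPC against a timer. The following is a minimal sketch under that assumption; `withTimeout`, `initializeBackend`, and `INITIALIZE_TIMEOUT_MS` are hypothetical names, and the bridge's real implementation may differ.

```typescript
// Hypothetical constant mirroring the 120s value quoted above.
const INITIALIZE_TIMEOUT_MS = 120_000;

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer); // settle first, then drop the pending timer
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Wraps a backend's initialize call; `initialize` stands in for the real RPC.
function initializeBackend<T>(
  initialize: () => Promise<T>,
  timeoutMs: number = INITIALIZE_TIMEOUT_MS,
): Promise<T> {
  return withTimeout(initialize(), timeoutMs, "MCP initialize");
}
```

Note that the timer only stops the caller from waiting; it does not cancel the underlying process, so a backend that finishes booting late is still usable on the next request.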
Added src/lib/wasm/wasm.worker.ts and src/lib/wasm/workerClient.ts. When the user opts in via ?worker=1 URL param or localStorage.setItem("ruflo:wasm-worker", "true"), callMcp routes through a worker-owned mock WasmMcpServer instead of the in-process module. Default behavior (in-process path) is unchanged.
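The opt-in check described above can be expressed as a small pure function. This sketch is illustrative only: `isWorkerModeEnabled` and `StorageLike` are hypothetical names, and letting the URL parameter take precedence over storage is an assumption, since the ADR only says either one opts in.

```typescript
// Minimal view of the Storage API so the check can run outside a browser.
interface StorageLike {
  getItem(key: string): string | null;
}

const WORKER_STORAGE_KEY = "ruflo:wasm-worker";

function isWorkerModeEnabled(search: string, storage?: StorageLike): boolean {
  // ?worker=1 opts in for this page load.
  if (new URLSearchParams(search).get("worker") === "1") return true;
  try {
    // A stored "true" opts in persistently.
    return storage?.getItem(WORKER_STORAGE_KEY) === "true";
  } catch {
    // Storage access can throw, for example when blocked by browser settings.
    return false;
  }
}
```

In the browser this would be called as `isWorkerModeEnabled(window.location.search, window.localStorage)`; anything other than an explicit opt-in keeps the default in-process path.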
The worker is self-contained — it does not import $app/environment (which is unresolvable in worker context) and ships a minimal read_file / write_file / list_files MCP surface as a placeholder until the real rvagent_wasm.js bundle is wired in.
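The worker's source is not reproduced in this ADR. As a rough idea of what a three-tool placeholder surface could look like, here is an in-memory sketch; the class name `InMemoryFileTools`, the `ToolResponse` shape, and the argument names `path` and `content` are all assumptions made for illustration.

```typescript
type ToolArgs = Record<string, unknown>;

// Hypothetical response shape; the real MCP result format may differ.
interface ToolResponse {
  ok: boolean;
  content?: string;
  files?: string[];
  error?: string;
}

class InMemoryFileTools {
  private readonly files = new Map<string, string>();

  call(name: string, args: ToolArgs): ToolResponse {
    const path = typeof args["path"] === "string" ? (args["path"] as string) : undefined;
    switch (name) {
      case "write_file": {
        const content = args["content"];
        if (path === undefined || typeof content !== "string") {
          return { ok: false, error: "write_file requires string path and content" };
        }
        this.files.set(path, content);
        return { ok: true };
      }
      case "read_file": {
        if (path === undefined) return { ok: false, error: "read_file requires a string path" };
        const content = this.files.get(path);
        return content === undefined
          ? { ok: false, error: `no such file: ${path}` }
          : { ok: true, content };
      }
      case "list_files":
        // Sorted so callers see a stable order.
        return { ok: true, files: Array.from(this.files.keys()).sort() };
      default:
        return { ok: false, error: `unknown tool: ${name}` };
    }
  }
}
```

Nothing here touches `$app/environment` or any other SvelteKit module, which is the property that lets a surface like this run inside a worker.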
Mapped via Cloudflare DNS (CNAME unproxied so Google manages the cert):
- flo.ruv.io (primary)
- ruflo.ruv.io
- ruvocal.ruv.io

- Help modal (src/lib/components/RufloHelpModal.svelte) with all 6 tool groups, quick-start, tips, and resource links.
- Examples (src/lib/constants/{mcpExamples,routerExamples}.ts) replacing HF defaults.
- PUBLIC_APP_NAME=RuFlo brands the title bar, sidebar, welcome modal, and PWA manifest.
- mcp-bridge returns a structured hint object instead of opaque 400 when search is called with an empty query (Error: search requires a non-empty query string).
- cloudbuild.yaml deploy step strips its --set-env-vars / --set-secrets flags so subsequent rebuilds preserve manually configured env (DOTENV_LOCAL with MODELS, OPENAI_BASE_URL, etc.). Initial env config is now set out-of-band via gcloud run services update.
- DOCKER_BUILDKIT=1 env on the cloudbuild docker step (Dockerfile uses COPY --link BuildKit syntax).

| Priority | Item | Notes |
|---|---|---|
| P1 | Make Web Worker mode default | Currently opt-in (?worker=1). Blocked on bringing the worker's mock to parity with the main-thread mock (worker has 3 tools, main has 18) and on importing the real rvagent_wasm.js bundle into worker context. |
| P1 | Persistent MongoDB | Cloud Run filesystem is ephemeral — chat history evicts on cold starts. Options (cleanest first): MongoDB Atlas M0 free tier with MONGODB_URL from Secret Manager; Cloud Run multi-container with mongo:8 sidecar + GCS volume mount; MongoDB on Compute Engine. |
| P1 | Google OAuth login | OPENID_CLIENT_ID="" today — anonymous sessions only. Wanted for admin diagnostics, per-user memory namespaces, and usage caps. Set OPENID_CLIENT_ID, OPENID_CLIENT_SECRET, OPENID_SCOPES. |
| P2 | Parallel tool-call visualization parity with Claude Code task panel | Server-side parallel execution works; UI renders one tool-call card per call but doesn't visually group them as a single "step" with collapsed thumbnails, lane layout, or per-tool durations. UX pass needed on ChatMessage.svelte and ToolUpdate.svelte. |
| P2 | Real rvagent_wasm wired into worker | Static bundle (static/wasm/rvagent_wasm.{js,wasm}) ships at 588 KB. Currently the page bundle uses createMockWasmModule() because the real WASM isn't loaded into app.html. Need a <script type="module"> block (or worker-level import("/wasm/rvagent_wasm.js")) that calls init() and exposes the constructors on the global before loadWasm() runs. |
| P3 | Help-modal copy fix | Quick Start section says "default: Gemini 2.5 Flash". Update to "default: Claude Sonnet 4.6" in src/lib/components/RufloHelpModal.svelte. |
| P3 | LLM router (Omni mode) | LLM_ROUTER_ARCH_BASE_URL is empty so the auto-routing alias model isn't created. Re-enable when an Arch-Router is hosted somewhere reachable; restore LLM_ROUTER_ROUTES_PATH and the related LLM_ROUTER_* env vars. |
- https://github.com/ruvnet/ruvector → ui/ruvocal/
- ruflo/src/ruvocal/
- feat/ruvocal-wasm-mcp-integration (squash-merged to main)
- ruflo/src/ruvocal/cloudbuild.yaml, ruflo/src/ruvocal/mcp-bridge/cloudbuild.yaml
- .gcloudignore (CRITICAL, see resolved root cause): ruflo/src/ruvocal/.gcloudignore
- ruflo/src/ruvocal/src/lib/wasm/{wasm.worker.ts,workerClient.ts}
- ruflo/src/ruvocal/src/lib/components/RufloHelpModal.svelte