plans/mustardscript-large-attachments.md
Generated by swarm planning session on 2026-04-20
For the local-agent flow, replace attachment-byte inlining with an on-disk storage model under .dyad/media/. The model is told in the user message that attachments are available at logical paths (attachments:<filename>), and a new agent tool, execute_sandbox_script, lets it generate short MustardScript (sandboxed JavaScript subset) snippets to read, slice, search, and aggregate file contents — returning only the concise result it actually needs. This solves context-window overflows, prompt cost, and provider latency on large attachments in the tool-capable local-agent path; as a bonus, the same tool can target any file the AI has scoped access to. When the request is not handled through src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts, keep the current behavior: inline the attachment into the user message and do not add a tool loop to src/ipc/handlers/chat_stream_handlers.ts.
Today, every attachment's bytes are inlined directly into the message payload sent to the LLM:
- src/ipc/handlers/chat_stream_handlers.ts:1866–1930 reads attachment content and embeds it into TextPart / ImagePart objects per message.

The pain is most acute for power-user workflows: large error logs, spec PDFs, code dumps, long JSON/CSV exports. The fix is to stop inlining in the local-agent path and let the model ask precise questions (via a sandboxed script) about files that live on disk. Non-local-agent/default chat keeps its existing inline behavior until a separate, explicitly scoped default-chat tool-loop project exists.
- In src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts, every user attachment (text and binary) is copied to .dyad/media/<sha256>.<ext> at send time. No size threshold; the rule is uniform. Text attachments are no longer inlined in this path.
- Do not add tools: { execute_sandbox_script } or any other tool-loop machinery to src/ipc/handlers/chat_stream_handlers.ts. If the local-agent handler is not used, continue inlining the attachment into the user message exactly as the current default-chat path does.
- The outgoing user message gains an attachment-info TextPart listing each attachment as attachments:<sanitizedOriginalFilename> with a terse type/size descriptor. The physical on-disk name (<sha256>.<ext>) is resolved by the host; the model never sees it. This is user-message content, not system-prompt content.
- execute_sandbox_script tool (B). New agent tool wrapping MustardScript with a fixed host-capability set: read_file(path, opts?), list_files(dir), file_stats(path). No write_file, no fetch, no exec, no env. Read-only by design for v1.
- read_file(path, { start?, length?, encoding? }) allows byte-range and head/tail reads so scripts avoid loading whole files.
- Tool result: { value (≤64KB for LLM), truncated, fullOutputPath?, executionMs, instructionsUsed, heapBytesUsed }. Outputs larger than 64KB are additionally written to .dyad/media/script-output-<hash>.txt and the path is surfaced to both the LLM and the UI; the user-visible card can load the full result (up to ~1MB virtualized).
- ask default to respect the existing user mental model there. Opt-out to never in Settings → Chat → Scripts.
- ScriptCard component (mustard-amber accent), label "Script" (no "sandbox"), reuses DyadCard + DyadMcpToolCall expand/collapse. Collapsed by default on success, auto-expanded on error. Header auto-populates from the tool call's description field ("Read last 500 lines of server.log"), falling back to "Ran a script on server.log". Overflow menu on every card: Re-run · Copy script · Copy output · Manage scripts in Settings. Truncated outputs show "LLM saw X of Y" badge.
- Reuse the Pro ToolDefinition interface, do not add a generic registry, and do not wire Vercel AI SDK tools into chat_stream_handlers.ts for this project.
- On unsupported platforms, isEnabled() returns false and the local-agent attachment-info user-message block says "sandbox scripting unavailable on this platform" so the model doesn't attempt it. Attachments still land on disk in the local-agent path. Do not put platform availability in the system prompt.
- Reuse cleanupOldMediaFiles() in src/main.ts for .dyad/media/ attachments (including script-output-*.txt). .dyad is already added to .gitignore via ensureDyadGitignored().
- Settings → Chat → Scripts section: "Open .dyad/media/" button (using the literal path, not a euphemistic label), timeout ceiling configuration (2s default, up to 10s), consent toggle (always-allow ↔ ask ↔ never).
- read_file rejects paths outside ctx.appPath; denies absolute paths, .. escapes, and a denylist covering .env*, .git/, node_modules/, ~/.ssh/, ~/.aws/, ~/.config/, .npmrc, .yarnrc, .pypirc, shell history files, ~/.netrc, *.key, *.pem. Path validation (allowlist + denylist) is the primary file-access guardrail; resource limits and timeouts provide additional containment.
- read_file size cap of 1MB.
- Wrap ExecutionContext calls in try/catch; add process-level uncaughtException and unhandledRejection guards so unexpected sandbox failures are surfaced instead of relying on a non-existent unhandledException event.
- Add /NOTICE at repo root aggregating Apache-2.0 attribution (MustardScript + Playwright + any others); include MustardScript's NOTICE content if shipped in its tarball. Add a CI check for new Apache-2.0 deps.

- ImagePart path. A future pdf_to_text or image_ocr agent tool is the right shape, not pushing bytes into a 16MB VM heap.
- write_file, longer timeout) and deserves its own scoping pass.
- { bytesRead } events). Adds an IPC channel + renderer subscription; M-sized. Ship static states first; revisit if p90 duration exceeds 500ms in telemetry.
- runner.ts should be designed so swapping is a no-op for callers.
- execute_sandbox_script is always fresh. Memoization only within a single tool fan-out if needed.
- execute_sandbox_script is not exposed in default chat in v1. Any default-chat tool support requires a separate scoping pass.

- error.log into chat and ask "group and count unique stack traces" without hitting context limits or paying for 4MB of tokens; the AI writes a script that reads only what it needs.
- deprecatedFn"; the AI's script does the grep, I get the answer.
- .dyad/media/" settings button so I can inspect or share the raw files directly.

Metrics retired by the "local-agent only + no default-chat tool loop" decisions:
New leading indicators:
- read_file / execute_sandbox_script call. Target ≥95% on frontier models. Watch small/local models separately; this is the "did the feature work at all" signal.

Kept from prior framing (reframed):
- context_length_exceeded / provider-specific) on local-agent attachment-bearing chats. Target: -90%.

Instrumentation events to emit for the local-agent path (standard dashboard): attachment.stored, sandbox.script.run, sandbox.script.completed, sandbox.script.timeout, sandbox.script.truncated, sandbox.script.denied, sandbox.tool.unused_with_attachment.
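
As a rough illustration of how these events and the tool-use indicator could be derived from a single turn, here is a minimal sketch. The event names come from the list above; `TurnSummary`, `chatId`, `eventsForTurn`, and `toolUseRate` are hypothetical names introduced only for this example, and counting only `execute_sandbox_script` calls is an assumption.

```typescript
// Event names come from the plan; the payload fields are illustrative assumptions.
type SandboxEventName =
  | "attachment.stored"
  | "sandbox.script.run"
  | "sandbox.script.completed"
  | "sandbox.script.timeout"
  | "sandbox.script.truncated"
  | "sandbox.script.denied"
  | "sandbox.tool.unused_with_attachment";

interface SandboxEvent {
  name: SandboxEventName;
  chatId: string; // hypothetical correlation field
}

// Hypothetical per-turn summary the local-agent handler could assemble.
interface TurnSummary {
  chatId: string;
  attachmentCount: number;
  toolCallNames: string[];
}

// One attachment.stored per attachment, one sandbox.script.run per script call,
// and the "unused" signal when a turn carried attachments but no script call.
function eventsForTurn(turn: TurnSummary): SandboxEvent[] {
  const events: SandboxEvent[] = [];
  for (let i = 0; i < turn.attachmentCount; i++) {
    events.push({ name: "attachment.stored", chatId: turn.chatId });
  }
  const scriptRuns = turn.toolCallNames.filter(
    (toolName) => toolName === "execute_sandbox_script",
  ).length;
  for (let i = 0; i < scriptRuns; i++) {
    events.push({ name: "sandbox.script.run", chatId: turn.chatId });
  }
  if (turn.attachmentCount > 0 && scriptRuns === 0) {
    events.push({ name: "sandbox.tool.unused_with_attachment", chatId: turn.chatId });
  }
  return events;
}

// Share of attachment-bearing turns that used the script tool at least once.
function toolUseRate(turns: TurnSummary[]): number | null {
  const withAttachments = turns.filter((t) => t.attachmentCount > 0);
  if (withAttachments.length === 0) return null; // no data yet
  const used = withAttachments.filter(
    (t) => t.toolCallNames.indexOf("execute_sandbox_script") !== -1,
  ).length;
  return used / withAttachments.length;
}
```

Returning `null` when no attachment turns exist keeps an empty dashboard from showing a misleading 0% or 100%.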
- server.log (80MB) into the composer via existing drag-and-drop or file picker (src/hooks/useAttachments.ts). An attachment chip appears (uniform design, no size/type variant). On the user's first-ever local-agent attach, a dismissible inline tip appears under the composer: "Attachments stay on disk — Dyad reads what it needs when you send."
- In local_agent_handler.ts, the file is copied to .dyad/media/<sha256>.log. The outgoing user message gains an attachment-info TextPart:
Attachments available on disk (use attachments:<name> with read_file / execute_sandbox_script):
- attachments:server.log (80 MB, text/plain)
- execute_sandbox_script with a short MustardScript that tails attachments:server.log, groups by error code, returns the top 5.
- ScriptCard renders inline:
- skimming, sifting, tailing, etc.), "Running script…" label.
- description ("Read last 500 lines of server.log"), stats Read 42KB · 812ms, expandable.
- Re-run and Retry with guidance buttons.
- local_agent_handler.ts, no script tool is exposed and the attachment continues to be inlined into the user message.
- .dyad/media/server.log. Dyad reads what it needs." Default chat keeps existing inline-attachment semantics.

- aria-live="polite" announces "Running script".
- description, stats Read 42KB · 812ms, chevron, keyboard-operable.
- instructionsUsed, heapBytesUsed for power users.
- .dyad/media/script-output-*.txt.
- .dyad/media/.

- <button> with aria-expanded; Enter/Space toggles. Focus ring matches existing DyadCard.
- skimming, sifting, tailing, parsing, digesting, threading).
- aria-label="Script — read server.log — success, 812ms".
- aria-live="polite" announces start and completion.
- prefers-reduced-motion disables scramble-reveal.

Five layered components:
- src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts), replace attachment byte inlining with always-on-disk attachment references. Do not make this change in src/ipc/handlers/chat_stream_handlers.ts; non-local-agent/default chat keeps inlining attachments into the user message. Handle mixed-history (legacy inlined attachments + new local-agent on-disk attachments) cleanly.
- TextPart listing attachments:<name> with type/size. Placement: immediately before the user's text in the same user message, so provider prompt-cache boundaries stay consistent. This block is not part of the system prompt.
- src/ipc/utils/sandbox/ (non-Pro utility, but only wired into local-agent mode for v1). Contains runner.ts (MustardScript wrapper with lazy-init, resource limits, Promise.race timeout, try/catch + uncaughtException / unhandledRejection guard plan), capabilities.ts (read_file / list_files / file_stats host functions with path allowlist + denylist), limits.ts (timeout / heap / instruction budgets).
- execute_sandbox_script tool. Pro-mode definition stays under src/pro/main/ipc/handlers/local_agent/tools/execute_sandbox_script.ts (reuses the Pro ToolDefinition pattern). It is registered only through the local-agent tool system. No sibling default-chat tool, no direct wiring into streamText() in chat_stream_handlers.ts, and no default-chat generic tool-registry infrastructure.

Attachment flow (modify):
- src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts: switch local-agent attachment handling to .dyad/media/ references and the attachment-info user-message block.
- src/ipc/handlers/chat_stream_handlers.ts: preserve existing default-chat inline attachment behavior. Do not add a tool loop or script tool wiring here.
- src/ipc/utils/media_path_utils.ts: add helpers for resolving attachments:<name> ↔ <sha256>.<ext>.
- src/ipc/types/chat.ts: ChatAttachmentSchema unchanged on wire; runtime types track onDiskPath + logicalName.
- src/hooks/useAttachments.ts: frontend stays; chip uniform across types (no badge, tooltip carries the teaching).
- src/main.ts: cleanupOldMediaFiles() continues to operate; ensure script-output-*.txt is also swept.

Sandbox runner (new, shared):
- src/ipc/utils/sandbox/runner.ts
- src/ipc/utils/sandbox/capabilities.ts
- src/ipc/utils/sandbox/limits.ts

Tool system (new):
- src/pro/main/ipc/handlers/local_agent/tools/execute_sandbox_script.ts: Pro definition using the runner.
- src/pro/main/ipc/handlers/local_agent/tool_definitions.ts: register it for Pro agent.
- src/ipc/handlers/chat_stream_handlers.ts.

UI (new / modify):
- src/components/chat/ScriptCard.tsx: new component (reuses DyadCard + DyadMcpToolCall patterns), label "Script", overflow menu, stats strip.
- src/components/chat/AttachmentsList.tsx: uniform chip; tooltip with on-disk path.
- src/components/chat/ChatMessage.tsx (or equivalent): render local-agent script tool-call / tool-result parts using ScriptCard.
- src/components/chat/*: first-run inline strip (anchored to first Script card), first-attach composer tip, small-model fallback banner.
- src/pages/settings/*: Settings → Chat → Scripts section with consent toggle, timeout ceiling, "Open .dyad/media/" button.

Native binary / packaging:
- forge.config.ts: asar-unpack @mustardscript/binding-*/*.node.
- .node files.
- isEnabled: () => isSupportedPlatform() on execute_sandbox_script; attachment-info user-message block communicates unavailability in local-agent mode only.

Licensing:
- /NOTICE at repo root with Apache-2.0 attributions. CI check for new Apache-2.0 deps.

- aiMessagesJson column via tool_call / tool_result parts.
- .dyad/media/ continues to hold attachment files; adds .dyad/media/script-output-<hash>.txt for oversized script returns. .dyad/ already in gitignore.

execute_sandbox_script tool:
// Input
{
script: string; // MustardScript source; max 32 KB
description?: string; // One-line human explanation rendered on the card
}
// Output (stringified JSON as tool result)
{
value: string; // Return value, ≤ 64 KB
truncated: boolean;
fullOutputPath?: string; // `.dyad/media/script-output-<hash>.txt` if truncated
executionMs: number;
instructionsUsed: number;
heapBytesUsed: number;
}
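
To make the 64 KB cap and the spill path concrete, the following is a minimal sketch of how a runner might shape its return value into the output structure above. The helper names (`utf8Prefix`, `shapeToolResult`, `RunStats`) and the `spill` callback are hypothetical; the plan does not say whether the cap is measured in UTF-8 bytes, so that is an assumption here.

```typescript
interface ScriptToolResult {
  value: string;
  truncated: boolean;
  fullOutputPath?: string;
  executionMs: number;
  instructionsUsed: number;
  heapBytesUsed: number;
}

interface RunStats {
  executionMs: number;
  instructionsUsed: number;
  heapBytesUsed: number;
}

const MAX_VALUE_BYTES = 64 * 1024; // LLM-facing cap from the plan

// Longest prefix of `text` whose UTF-8 encoding fits in `maxBytes`, plus the
// UTF-8 size of the whole string. Never splits a surrogate pair.
function utf8Prefix(text: string, maxBytes: number): { head: string; totalBytes: number } {
  let totalBytes = 0;
  let cut = text.length;
  let fits = true;
  for (let i = 0; i < text.length; ) {
    const code = text.charCodeAt(i);
    const next = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
    const isPair = code >= 0xd800 && code <= 0xdbff && next >= 0xdc00 && next <= 0xdfff;
    const size = isPair ? 4 : code <= 0x7f ? 1 : code <= 0x7ff ? 2 : 3;
    if (fits && totalBytes + size > maxBytes) {
      cut = i; // first character that no longer fits
      fits = false;
    }
    totalBytes += size;
    i += isPair ? 2 : 1;
  }
  return { head: text.slice(0, cut), totalBytes };
}

// `spill` is a hypothetical callback that persists the full output
// (for example under .dyad/media/) and returns the path it wrote.
function shapeToolResult(
  raw: string,
  stats: RunStats,
  spill: (full: string) => string,
): ScriptToolResult {
  const { head, totalBytes } = utf8Prefix(raw, MAX_VALUE_BYTES);
  if (totalBytes <= MAX_VALUE_BYTES) {
    return { value: raw, truncated: false, ...stats };
  }
  return { value: head, truncated: true, fullOutputPath: spill(raw), ...stats };
}
```

Passing the spill step as a callback keeps the shaping logic free of filesystem access, which makes the truncation boundary easy to unit-test.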
Host capabilities exposed into the MustardScript context (fixed set, no free-form):
read_file(path: string, opts?: {
start?: number;
length?: number;
encoding?: 'utf8' | 'base64';
}): string;
list_files(dir: string): string[];
file_stats(path: string): {
size: number;
isText: boolean;
mtime: string;
};
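
The path rules that guard read_file can be sketched as a pure function. This is a minimal illustration, assuming paths are validated relative to the app root before any filesystem call; `validateReadPath`, `PathCheck`, and the exact regular expressions are hypothetical, and the shell-history pattern in particular is a guess at file-name shape. It does not resolve symlinks or the attachments: logical prefix, both of which the host would need to handle separately.

```typescript
interface PathCheck {
  ok: boolean;
  relativePath?: string; // set when ok is true
  reason?: string; // set when ok is false
}

// Patterns approximating the plan's denylist. Home-directory entries such as
// ~/.ssh/ are already excluded by the "relative paths only" rule below.
const DENYLIST: RegExp[] = [
  /(^|\/)\.env[^/]*(\/|$)/i, // .env, .env.local, ...
  /(^|\/)\.git(\/|$)/i,
  /(^|\/)node_modules(\/|$)/i,
  /(^|\/)\.(npmrc|yarnrc|pypirc|netrc)$/i,
  /(^|\/)\.[a-z_]*history$/i, // assumed shape of shell history file names
  /\.(key|pem)$/i,
];

function validateReadPath(input: string): PathCheck {
  const unified = input.replace(/\\/g, "/");
  if (unified === "" || unified.indexOf("\0") !== -1) {
    return { ok: false, reason: "empty or malformed path" };
  }
  const first = unified.charAt(0);
  if (first === "/" || first === "~" || /^[A-Za-z]:/.test(unified)) {
    return { ok: false, reason: "absolute paths are not allowed" };
  }
  // Normalize "." and ".." segments; reject anything that climbs above the root.
  const segments: string[] = [];
  for (const segment of unified.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (segments.length === 0) {
        return { ok: false, reason: "path escapes the app root" };
      }
      segments.pop();
      continue;
    }
    segments.push(segment);
  }
  if (segments.length === 0) {
    return { ok: false, reason: "path does not name a file" };
  }
  const relativePath = segments.join("/");
  for (const pattern of DENYLIST) {
    if (pattern.test(relativePath)) {
      return { ok: false, reason: "path is on the denylist" };
    }
  }
  return { ok: true, relativePath };
}
```

Checking the denylist against the normalized path rather than the raw input matters: an input like a/../.git/config only reveals its target after normalization.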
Attachment-info user-message block format (v1, frozen for schema stability):
Attachments available on disk (use attachments:<name> with read_file / execute_sandbox_script):
- attachments:server.log (80 MB, text/plain)
- attachments:spec.txt (4 KB, text/plain)
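
A builder for this block might look like the sketch below. It reproduces the header and line format shown above; the sanitization rule, the 1024-based size units, and the names `StoredAttachment`, `sanitizeName`, `formatSize`, and `buildAttachmentInfo` are assumptions made for illustration. Collisions between two attachments that sanitize to the same name are not handled here.

```typescript
interface StoredAttachment {
  originalName: string;
  sizeBytes: number;
  mimeType: string;
}

const ATTACHMENT_INFO_HEADER =
  "Attachments available on disk (use attachments:<name> with read_file / execute_sandbox_script):";

// Assumed sanitization: keep only the base name and replace anything outside
// [A-Za-z0-9._-] with "_" so the logical path cannot carry separators.
function sanitizeName(originalName: string): string {
  const base = originalName.replace(/\\/g, "/").split("/").pop() || "";
  const cleaned = base.replace(/[^A-Za-z0-9._-]/g, "_");
  return cleaned === "" || cleaned === "." || cleaned === ".." ? "attachment" : cleaned;
}

// Terse size descriptor; 1024-based units are an assumption.
function formatSize(bytes: number): string {
  const KB = 1024;
  const MB = KB * KB;
  if (bytes >= MB) return `${Math.round(bytes / MB)} MB`;
  if (bytes >= KB) return `${Math.round(bytes / KB)} KB`;
  return `${bytes} B`;
}

// Text for the attachment-info TextPart placed in the user message.
function buildAttachmentInfo(attachments: StoredAttachment[]): string {
  const lines = attachments.map(
    (a) =>
      `- attachments:${sanitizeName(a.originalName)} (${formatSize(a.sizeBytes)}, ${a.mimeType})`,
  );
  return [ATTACHMENT_INFO_HEADER, ...lines].join("\n");
}
```

Because the format is frozen for v1, a snapshot-style test against the literal example text is a cheap way to catch accidental drift.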
- mustardscript in Dyad; verify optional binding downloads correctly on mac-arm64, mac-x64, linux-x64, win-x64.
- forge.config.ts asarUnpack for @mustardscript/binding-*/*.node.
- ExecutionContext exceptions do not escape to kill Electron main (try/catch + process guard).

- src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts. No threshold branch. Strip inline-text embedding in this path only.
- src/ipc/handlers/chat_stream_handlers.ts; do not add tool-loop wiring there.
- TextPart builder with stable placement in the user message.
- attachment.stored.

execute_sandbox_script tool, Pro agent first (4–5 days)
- src/ipc/utils/sandbox/ runner + capabilities + limits.
- .env*, .git/, node_modules/, ~/.ssh/, ~/.aws/, ~/.config/, ~/.netrc, *.key, *.pem); traversal test coverage.
- execute_sandbox_script.ts Pro tool definition + registration.
- .dyad/media/script-output-<hash>.txt, with path surfaced in tool result.
- ScriptCard UI with all states (running/success/error/timeout/empty/truncated); overflow menu.
- ask default.
- sandbox.script.{run,completed,timeout,truncated,denied}.

- src/ipc/handlers/chat_stream_handlers.ts and verify no tools: { execute_sandbox_script }, no Vercel tool-loop additions, and no attachment-specific system-prompt branches are introduced.
- local_agent_handler.ts, continue inlining attachment content into the user message.
- aiMessagesJson (reuse existing local-agent tool persistence patterns).

- /NOTICE file with Apache-2.0 attributions; CI check for new Apache-2.0 deps.
- docs/security.md section on the MustardScript threat model and mitigations, including explicit note that the path allowlist is the sole security control under always-allow.
- llama3.1:8b and qwen2.5:7b before GA; otherwise ship with a small-model warning.

Total MVP: ~3 weeks (one engineer), with ~1 week buffer if the native-binary spike surfaces platform issues. Each phase ends with something shippable behind a flag.
- read_file path escape (.., absolute paths, symlinks where OS permits), denylist coverage (.env, .ssh, .aws, keys, pems), size cap, range-read correctness.
- fullOutputPath is written and returned for >64KB results.
- ask default triggers modal; denial is logged.
- e2e-tests/fixtures/attachments/ (large log, CSV, JSON, small text); keep <1MB total.
- llama3.1:8b and qwen2.5:7b before GA.

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| MustardScript native binary fails to load in Electron on some platform | Med | High | Phase 0 spike gates everything else; platform-specific isEnabled fallback; documented interpreter fallback plan |
| In-process sandbox is not a hard security boundary; prompt-injected attachment causes the LLM to write an exfil script | Med | High | Conservative path validation with strict allowlist and denylist including .env/.ssh/.aws/.npmrc/.pypirc/shell history/keys/pems; lazy-init; Script card shows source (transparency); sidecar mode on roadmap; launch-blocker review must focus here |
| Small local models (Ollama 7B-class) can't reliably emit tool calls in local-agent mode | High | Med | Fallback banner when a local-agent attachment turn yields no tool call; QA gate ≥80% on llama3.1:8b + qwen2.5:7b; if below, ship with warning banner |
| Oversized script returns re-create the original context-blowup problem | Med | Med | 64KB LLM cap with truncation signaling; spill to disk for UI viewing; tool description encourages .slice/.filter/.reduce returns |
| Native addon crash takes down Electron main process | Low | High | ExecutionContext calls wrapped in try/catch; process-level uncaughtException and unhandledRejection guards; verified in Phase 0 spike |
| Always-on-disk local-agent attachment handling widens prompt-injection attack surface in that mode | Med | High | Denylist extension (noted above); explicit mention in security doc; no-auto-replay policy prevents re-entrancy |
| Cost delta for managed-model users from added local-agent tool-loop tokens | Low | Med | Forecast input-token delta in Phase 4; track tool-loop latency overhead metric post-launch |
| Prompt-cache regression on small local-agent attachments (uniform on-disk means every attachment pays a tool round-trip in that mode) | Med | Low | Scramble-reveal verbs cover latency emotionally; track tool-loop latency overhead p50/p90; accept as scope given user's preference for mental-model consistency in local-agent mode |
| Mixed-history chats (legacy inlined + new local-agent on-disk) render inconsistently | Low | Med | Explicit mixed-history handling in attachment preparation; E2E test coverage; release-notes call-out |
| Accidental default-chat tool-loop wiring changes behavior in chat_stream_handlers.ts | Med | High | Explicit Phase 3 audit; tests proving default chat still inlines attachments and exposes no script tool |
| Attachment-specific system-prompt changes fragment behavior or prompt caching | Med | High | System-prompt invariance test for attachment vs. non-attachment turns in all affected modes; attachment metadata stays in user-message parts only |
| Apache-2.0 NOTICE obligation overlooked for bundled deps | Low | Low | One-time /NOTICE authoring; CI check on new Apache-2.0 deps |
| Users surprised by MustardScript v0.1.1 alpha status / maintainership | Low | Med | Pin exact version; add a Dyad-CI canary that re-runs MustardScript's own tests on each bump |
| Cold-start cost of native addon delays first paint | Low | Med | Lazy-init module only on first script execution; never at app startup |
| .dyad/media/ grows unbounded across sessions | Med | Low | Reuse cleanupOldMediaFiles(); Settings "Manage attachments" roadmapped for follow-up |
| Settings opt-out rate spikes (users uncomfortable with local-agent scripts) | Low | Med | >2% threshold triggers investigation; first-run inline strip clearly signals how to disable |
| Chat export leaks attachment content users didn't realize was bundled | Low | Med | Explicit "include attachment contents" toggle on export (post-MVP) |
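
The oversized-returns mitigation in the table depends on scripts returning compact aggregates rather than raw file content. The sketch below shows the shape of such a script for the server.log example: read only the tail through a byte range, aggregate, and return a few lines. It is written in TypeScript with the host functions injected as parameters so it can run outside the sandbox. The `E###` error-code format, `topErrorCodes`, and the stubs are hypothetical, and whether MustardScript supports every construct used here has not been verified.

```typescript
type ReadFile = (
  path: string,
  opts?: { start?: number; length?: number; encoding?: "utf8" | "base64" },
) => string;
type FileStats = (path: string) => { size: number; isText: boolean; mtime: string };

// Read at most `tailBytes` from the end of the file, count hypothetical
// error codes (E followed by 3+ digits), and return only the top `topN`.
function topErrorCodes(
  readFile: ReadFile,
  fileStats: FileStats,
  path: string,
  tailBytes: number,
  topN: number,
): string {
  const size = fileStats(path).size;
  const start = Math.max(0, size - tailBytes);
  const lines = readFile(path, { start, length: size - start }).split("\n");
  if (start > 0) lines.shift(); // the first line is probably cut mid-way

  const counts = new Map<string, number>();
  for (const line of lines) {
    const match = /\bE\d{3,}\b/.exec(line);
    if (match) counts.set(match[0], (counts.get(match[0]) || 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : 1))
    .slice(0, topN)
    .map((entry) => `${entry[0]}: ${entry[1]}`)
    .join("\n");
}

// In-memory stand-ins for the host capabilities, for illustration only.
const sampleLog = "INFO boot\nE100 disk full\nE200 timeout\nE100 disk full\n";
const readFileStub: ReadFile = (_path, opts) => {
  const start = opts && opts.start !== undefined ? opts.start : 0;
  const length = opts && opts.length !== undefined ? opts.length : sampleLog.length - start;
  return sampleLog.slice(start, start + length);
};
const fileStatsStub: FileStats = () => ({
  size: sampleLog.length,
  isText: true,
  mtime: "2026-04-20T00:00:00Z",
});

const summary = topErrorCodes(readFileStub, fileStatsStub, "attachments:server.log", 1024, 5);
console.log(summary);
```

The return value here is two short lines regardless of how large the log is, which is the property the 64KB cap is meant to encourage.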
Resolved during implementation (not blocking planning):
- ^0.1 range to require manual bumps.
- skimming, sifting, tailing, parsing, digesting, threading); UX to finalize during Phase 2.

- .dyad/media/ on disk; alias as attachments: in local-agent user-message attachment info only (user). Trade-off: filesystem name and LLM-facing name diverge. Gain: zero rename churn across 15+ files. UI copy uses whichever reads naturally; "Open .dyad/media/" button uses the literal path so power users see the transition consistently.
- chat_stream_handlers.ts (user). If src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts is not used, continue inlining attachments into the user message. Default-chat tool support requires a separate plan.
- ask default (user + Eng refinement). Trade-off: a per-call modal remains. Gain: Pro/local-agent users keep existing behavior they've opted into.
- src/ipc/utils/sandbox/ (Eng). Shared utility location avoids coupling the runner to Pro internals, but v1 wires it only through the local-agent tool system. License-compatible (MustardScript is Apache-2.0).
- .dyad/media/" button uses the literal path (UX). Prevents confusion at the point where the naming split does surface.
- write_file/fetch/exec. Huge return values use UI-side spill, not script-side write. Drastically reduced attack surface.
- read_file(path, {start, length}) (PM, Eng). Scripts can tail/head efficiently within per-call caps.
- pdf_to_text is a separate future tool.
- /NOTICE (Eng).
- ExecutionContext calls wrapped; process-level guard added.

Generated by dyad:swarm-to-plan