
Web Fetch Tool for Local Agent Mode

plans/web-fetch-local-agent.md



Generated by swarm planning session on 2026-02-25

Summary

Add a new web_fetch tool to the local agent that fetches and reads website content when users share URLs for reference. Unlike the existing Pro-only web_crawl tool (which uses Firecrawl for visual cloning with screenshots), web_fetch performs a direct local HTTP fetch from the user's machine, making it available to all users (free + Pro) at zero infrastructure cost.

Problem Statement

When users paste a URL into the Dyad chat (e.g., "Help me integrate this API: https://docs.stripe.com/api"), the agent cannot access the content behind that URL. Users must manually copy-paste page content, breaking their flow. This is especially painful for developers building with APIs, following tutorials, or referencing documentation — the most common use cases for Dyad's target audience. The existing web_crawl tool only activates for "clone/copy/replicate" intent and requires Dyad Pro, leaving a gap for the broader "read this page for context" use case.

Scope

In Scope (MVP)

  • New web_fetch tool that fetches a URL and returns content as markdown
  • Available to all users (free + Pro) — no isDyadPro gate
  • LLM-triggered via standard tool call mechanism (not auto-detected)
  • HTML-to-markdown conversion using turndown + @mozilla/readability for content extraction
  • Content-Type detection: HTML → markdown, JSON → code block, text → as-is, PDF/images → "not supported" message
  • URL scheme validation (http: and https: only; block file:, ftp:, data:, javascript:, blob: schemes)
  • Private/localhost IPs allowed (consent dialog is sufficient protection)
  • Consent-gated with "ask" default
  • Content truncation at 16,000 characters (matching existing MAX_TEXT_SNIPPET_LENGTH)
  • Timeout at 10-15 seconds via AbortController
  • XML streaming preview via <dyad-web-fetch> tag
  • Clear error messages for timeout, 403/blocked, empty content, unsupported content types
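The URL scheme validation bullet above can be sketched as a small allow-list helper; the name isAllowedUrl is illustrative, not an existing Dyad API:

```typescript
// Illustrative sketch of the scheme allow-list described above; the
// function name isAllowedUrl is hypothetical, not an existing Dyad API.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

export function isAllowedUrl(raw: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return false; // malformed URLs are rejected outright
  }
  // Blocks file:, ftp:, data:, javascript:, blob:, and any other scheme.
  return ALLOWED_PROTOCOLS.has(parsed.protocol);
}
```

An allow-list is preferable to a block-list here: any scheme not explicitly permitted is rejected, so exotic schemes need no special handling.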

Out of Scope (Follow-up)

  • Auto-detection of URLs in user input (pre-fetching before LLM runs)
  • JavaScript rendering / headless browser for SPAs
  • Screenshot capture
  • PDF content extraction
  • Caching of fetched pages within a session
  • Batch consent UI for multiple URLs in one message
  • Re-fetch / refresh button on completed cards
  • Link preview in chat input area

User Stories

  • As a developer building an app, I want to paste an API documentation URL and have the agent understand its contents, so that I can say "integrate this API" without manually copying docs.
  • As a user following a tutorial, I want to share a blog post or tutorial URL with the agent, so that it can follow the instructions and implement what the tutorial describes.
  • As a user referencing a design, I want to share a website URL for style reference (without cloning), so that the agent understands the content and direction I'm going for.
  • As a free-tier user, I want basic web fetching to work without a Pro subscription, so that I can reference external content in my workflow.

UX Design

User Flow

  1. User types a message that includes a URL (e.g., "Use the Stripe API docs at https://docs.stripe.com/api to add payments")
  2. The LLM recognizes the URL and determines it needs the page content to fulfill the request
  3. A consent dialog appears: Fetch page content: "https://docs.stripe.com/api"
  4. User approves (accept-once / accept-always / decline)
  5. A <dyad-web-fetch> card appears in the chat showing the URL being fetched with a loading state
  6. Content is fetched, processed through Readability + Turndown, truncated if needed, and returned as the tool result
  7. The card transitions to a completed state showing the page title (extracted by Readability) and URL
  8. The AI continues its response using the fetched content as context

Key States

  • Loading: Card with URL, spinner, "Fetching..." label (use existing DyadStateIndicator pattern)
  • Completed (HTML): Card with page title (extracted by Readability) + URL in muted text, expandable to show markdown preview
  • Completed (JSON): Card with application/json badge + URL, expandable content as code block
  • Completed (text): Card with text/plain badge + URL, content displayed as-is
  • Error — Timeout: "This page couldn't be reached. Check the URL and try again."
  • Error — Blocked (403): "This page blocked the request. You may need to copy-paste its content manually."
  • Error — Empty/JS-only: "This page returned no readable content. It may require JavaScript to render."
  • Warning — Unsupported type: Amber/warning state (not red error): "PDF files cannot be fetched as text. Try copying the relevant content and pasting it into the chat." (Use <dyad-output type="warning">)
  • Truncated: Show note on card: "Content truncated (showing first 16,000 characters)"
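The error states above map naturally onto a small message table. This sketch centralizes the copy in one place; the ErrorKind union and errorMessage function are hypothetical names, not existing Dyad code:

```typescript
// Sketch centralizing the error copy above; ErrorKind and errorMessage
// are hypothetical names, not existing Dyad code.
type ErrorKind = "timeout" | "blocked" | "empty" | "unsupported";

export function errorMessage(kind: ErrorKind): string {
  switch (kind) {
    case "timeout":
      return "This page couldn't be reached. Check the URL and try again.";
    case "blocked":
      return "This page blocked the request. You may need to copy-paste its content manually.";
    case "empty":
      return "This page returned no readable content. It may require JavaScript to render.";
    case "unsupported":
      // Rendered as an amber warning (<dyad-output type="warning">), not a red error.
      return "PDF files cannot be fetched as text. Try copying the relevant content and pasting it into the chat.";
  }
}
```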

Interaction Details

  • Consent preview text: Fetch page content: "https://..." (action-focused, not implementation-detail-focused)
  • Card icon: Use Link from lucide-react (differentiated from Globe for web_search and ScanQrCode for web_crawl)
  • Badge color: Use purple to differentiate from the blue used by web_search and web_crawl
  • Completed card is collapsed by default with page title visible; expandable to show markdown preview
  • When truncation occurs, surface it in the card UI so users understand the AI only saw partial content

Accessibility

  • Consent dialog: keyboard-navigable via standard button focus (existing pattern)
  • Expandable cards: Enter/Space to toggle (existing DyadCard pattern)
  • Screen reader: announce "Web Fetch completed: [page title]" or "Web Fetch failed: [error]"

Technical Design

Architecture

New tool following the established ToolDefinition<T> pattern. Performs a direct HTTP fetch from the Electron main process using Node.js fetch(), processes HTML through @mozilla/readability for content extraction, then converts to markdown via turndown. Returns the markdown string as the tool result. No changes to existing tools or the agent handler.

Dependency pipeline: fetch(url) → linkedom.parseHTML(html) → new Readability(doc).parse() → new TurndownService().turndown(article.content) → truncateText(markdown)

linkedom is required because both @mozilla/readability and turndown need a DOM document, and Electron's main process doesn't have one. linkedom is lightweight (~50KB) and much faster than JSDOM.

Components Affected

  • New file: src/pro/main/ipc/handlers/local_agent/tools/web_fetch.ts — Tool implementation
  • Modified: src/pro/main/ipc/handlers/local_agent/tool_definitions.ts — Import and register webFetchTool in TOOL_DEFINITIONS array
  • Modified: package.json — Add turndown, @types/turndown, linkedom, @mozilla/readability (or defuddle)
  • New file (renderer): DyadWebFetch component for rendering the <dyad-web-fetch> XML tag in chat
  • No changes to: web_crawl.ts, engine_fetch.ts, local_agent_handler.ts, types.ts

Data Model Changes

None. The tool returns a string result via the existing ToolResult type. No schema or storage changes.

API Changes

No external API changes. Internally:

  • New tool web_fetch added to TOOL_DEFINITIONS array
  • New XML tag <dyad-web-fetch> for renderer

Tool Description (Critical)

The tool description guides LLM behavior and is the single biggest factor in feature success:

Fetch and read content from a URL. Works with web pages (returns cleaned markdown) and API endpoints (returns JSON).

### When to Use
Use this tool when the user shares a URL and wants you to reference, understand, or use information from that page. Examples:
- User shares API documentation and asks you to integrate it
- User shares a tutorial or blog post and wants you to follow it
- User shares a web page and asks about its content
- User shares an API endpoint URL and wants you to understand the response

### When NOT to Use
- User wants to CLONE / COPY / REPLICATE / RECREATE a website's visual design — use web_crawl instead
- User mentions a URL in passing without wanting you to read it
- You need to search the web for information (no specific URL) — use web_search instead

### Limitations
- Cannot render JavaScript — some dynamic/SPA pages may return limited content
- Content is truncated to ~16,000 characters for very long pages
- PDF and image files are not supported

Key Implementation Details

```typescript
// web_fetch.ts - Core structure

const webFetchSchema = z.object({
  url: z.string().describe("URL to fetch"),
});

// URL validation: only http: and https: schemes
// No private IP blocking (user decision: allow with consent)
// Timeout: 10-15 seconds via AbortController
// User-Agent: set a reasonable browser-like string

// Content-Type handling:
// text/html → Readability extraction → Turndown markdown → truncate
// application/json → return as ```json code block → truncate
// text/plain, text/markdown → return as-is → truncate
// application/pdf, image/* → return "not supported" message
// other → attempt text extraction, fall back to "not supported"

// Truncation: reuse MAX_TEXT_SNIPPET_LENGTH (16,000 chars) pattern

export const webFetchTool: ToolDefinition<z.infer<typeof webFetchSchema>> = {
  name: "web_fetch",
  description: DESCRIPTION,
  inputSchema: webFetchSchema,
  defaultConsent: "ask",
  // No isEnabled gate — available to all users

  getConsentPreview: (args) => `Fetch page content: "${args.url}"`,

  buildXml: (args, isComplete) => {
    if (!args.url) return undefined;
    let xml = `<dyad-web-fetch url="${escapeXmlContent(args.url)}">`;
    if (isComplete) xml += "</dyad-web-fetch>";
    return xml;
  },

  execute: async (args, ctx) => {
    // 1. Validate URL scheme (http/https only)
    // 2. Fetch with timeout (AbortController, 15s)
    // 3. Check Content-Type header
    // 4. For HTML: parse with Readability, convert with Turndown
    // 5. For JSON: wrap in code block
    // 6. For text: return as-is
    // 7. For unsupported: return clear message
    // 8. Truncate to MAX_TEXT_SNIPPET_LENGTH
    // 9. Return markdown string as tool result
  },
};
```
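The execute() steps commented above can be fleshed out roughly as follows. This is a sketch assuming Node 18+'s global fetch; truncateText, fetchAndRoute, and the User-Agent string are illustrative, and the HTML branch would hand off to the Readability + Turndown pipeline:

```typescript
// Sketch of the execute() steps above, assuming Node 18+'s global fetch.
// truncateText, fetchAndRoute, and the User-Agent string are illustrative;
// the HTML branch would hand off to the Readability + Turndown pipeline.
export const MAX_TEXT_SNIPPET_LENGTH = 16_000;

export function truncateText(text: string): string {
  if (text.length <= MAX_TEXT_SNIPPET_LENGTH) return text;
  return (
    text.slice(0, MAX_TEXT_SNIPPET_LENGTH) +
    "\n\n[Content truncated (showing first 16,000 characters)]"
  );
}

export async function fetchAndRoute(url: string): Promise<string> {
  // AbortSignal.timeout enforces the plan's 15-second limit.
  const response = await fetch(url, {
    signal: AbortSignal.timeout(15_000),
    headers: { "User-Agent": "Mozilla/5.0 (compatible; DyadWebFetch)" },
  });
  if (!response.ok) {
    return `Error: request failed with status ${response.status}`;
  }
  const contentType = response.headers.get("content-type") ?? "";
  if (contentType.includes("application/json")) {
    const fence = "`".repeat(3); // avoids a literal backtick run in this sketch
    return truncateText(`${fence}json\n${await response.text()}\n${fence}`);
  }
  if (contentType.includes("text/plain") || contentType.includes("text/markdown")) {
    return truncateText(await response.text());
  }
  if (contentType.includes("application/pdf") || contentType.startsWith("image/")) {
    return "This content type is not supported. Try copying the relevant content into the chat.";
  }
  // text/html (and unknown types) go through HTML extraction here.
  return truncateText(await response.text());
}
```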

Implementation Plan

Phase 1: Core Tool

  • Add dependencies: turndown, @types/turndown, linkedom, @mozilla/readability (evaluate defuddle as alternative)
  • Create src/pro/main/ipc/handlers/local_agent/tools/web_fetch.ts with:
    • URL scheme validation
    • Fetch with AbortController timeout (15 seconds)
    • Content-Type detection and routing
    • Readability extraction for HTML
    • Turndown markdown conversion
    • JSON/text/unsupported content handling
    • Truncation using existing pattern
    • Proper error messages for common failure modes
  • Register webFetchTool in tool_definitions.ts TOOL_DEFINITIONS array
  • Write tool description with clear when-to-use / when-not-to-use guidance

Phase 2: Renderer Component

  • Create DyadWebFetch component to render <dyad-web-fetch> XML tags
  • Implement loading state (URL + spinner)
  • Implement completed state (page title + URL, expandable markdown preview)
  • Implement error states
  • Show truncation indicator when content was truncated
  • Register in the markdown parser's XML tag handler

Phase 3: Testing

  • Unit tests for URL validation (scheme checking, malformed URLs)
  • Unit tests for Content-Type handling (HTML, JSON, text, PDF, images)
  • Unit tests for HTML-to-markdown conversion (simple pages, complex pages, empty bodies)
  • Unit tests for truncation behavior
  • Unit tests for timeout/error handling (mock fetch failures, non-200 responses)
  • Integration test: verify tool appears in buildAgentToolSet output (no isEnabled gate)
  • Manual E2E testing with real URLs in local agent chat

Testing Strategy

  • Unit test URL scheme validation: verify file://, ftp://, data: are rejected; http:// and https:// are accepted
  • Unit test Content-Type routing: verify HTML → readability+turndown, JSON → code block, text → as-is, PDF → error message
  • Unit test HTML conversion with various inputs: simple pages, pages with scripts/styles, empty bodies, non-UTF-8 encoding
  • Unit test truncation: verify content over 16K chars is truncated with indicator
  • Unit test error handling: mock network failures, timeouts, 403/404 responses, non-200 status codes
  • Integration test: verify webFetchTool is included in tool set for both Pro and non-Pro contexts
  • Manual test: verify consent dialog, loading card, completed card, error states in the actual UI
  • Manual test: verify tool is NOT triggered for clone/replicate intent (web_crawl should be used instead)

Risks & Mitigations

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| JS-rendered SPAs return minimal content | Medium | Medium | Clear tool description noting limitation; LLM can explain to user; Pro users can use web_crawl |
| LLM confuses web_fetch with web_crawl or web_search | Low | Medium | Precise, mutually-exclusive tool descriptions with explicit when/when-not guidance |
| Large HTML pages block Electron main process during conversion | Low | Medium | Truncate raw HTML before processing; move to worker thread in follow-up if needed |
| Content quality varies across sites (paywalls, anti-bot) | Medium | Low | Return clear error messages; user can fall back to manual copy-paste |
| New dependencies (turndown, readability) introduce maintenance burden | Low | Low | Both are mature, stable libraries with large install bases |
| "Accept always" consent enables unbounded fetch loops | Low | Medium | Monitor; consider per-turn fetch limit in follow-up if abuse is observed |

Open Questions

  • Readability vs. Defuddle: Evaluate defuddle as a potential alternative to @mozilla/readability. Defuddle may offer better extraction for modern web pages. Decision can be made during implementation based on testing.
  • DOM library: linkedom is included as the DOM implementation since both @mozilla/readability and turndown require a DOM document and Electron's main process doesn't provide one. linkedom is lightweight (~50KB) and much faster than JSDOM.
  • Multiple URLs per message: When a user pastes 2-5 URLs, the LLM may call web_fetch multiple times. Each triggers a separate consent dialog. If this proves disruptive, consider batch consent UI in a follow-up.
  • Stale content: Fetched content is point-in-time. For long conversations, consider adding timestamps to fetch cards and a re-fetch capability in a follow-up.

Decision Log

| Decision | Reasoning |
| --- | --- |
| New tool (web_fetch) rather than extending web_crawl | Use cases are fundamentally different (read vs. clone). Separate tools = cleaner code, clearer LLM descriptions, independent consent settings. All 3 roles agreed independently. |
| Available to all users (free + Pro) | Local fetch has zero infrastructure cost. Differentiates free tier. Natural upsell to Pro for enhanced crawl+screenshot. |
| LLM-triggered, not auto-detected | Consistent with existing tool architecture. Auto-detection would require new handler-layer logic and might fetch URLs users didn't intend. |
| Allow private/localhost IPs | Dyad runs locally; SSRF is a server-side threat model. Fetching localhost:3000 or internal docs is a legitimate use case. Consent dialog provides sufficient protection. |
| Include @mozilla/readability in v1 | Dramatically better content extraction (strips nav, footer, ads). Small marginal cost (one extra dependency). All roles agreed. |
| Handle Content-Type gracefully | ~15 lines of code prevents confusing failures for JSON, text, PDF URLs. Better UX for minimal effort. |
| Consent default: "ask" | Consistent with web_crawl and web_search. Network requests to arbitrary external URLs warrant explicit approval. |
| Truncation at 16K characters | Matches existing MAX_TEXT_SNIPPET_LENGTH. Prevents context window overflow while providing substantial content. |
| Tool name: web_fetch | Consistent with web_search, web_crawl naming convention. Clear, concise, action-oriented. |

Generated by dyad:swarm-to-plan