Woods Documentation

docs/README.md

Woods is a Ruby gem that extracts structured data from Rails applications for AI-assisted development. Unlike file-level tools, it uses runtime introspection — booting the Rails app and querying ActiveRecord::Base.descendants, Rails.application.routes, reflection APIs — to produce version-accurate representations with inlined concerns, resolved callback chains, and schema-aware associations.
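
The difference between file-level parsing and runtime introspection can be sketched in plain Ruby, without Rails. This is an illustrative analogy only, not Woods' own code: `BaseRecord`, `Auditable`, `Order`, `Invoice`, `descendants_of`, and `snapshot_of` are hypothetical names, with `BaseRecord` standing in for `ActiveRecord::Base` and `Auditable` for a concern. The point is that asking the loaded classes directly reports mixed-in methods that a single source file would not show.

```ruby
# Hypothetical stand-ins: BaseRecord plays ActiveRecord::Base,
# Auditable plays a concern mixed into a model.
module Auditable
  def audit_label
    "audited"
  end
end

class BaseRecord; end

class Order < BaseRecord
  include Auditable

  def total
    0
  end
end

class Invoice < BaseRecord; end

# Enumerate loaded classes that inherit from `base`, sorted by name.
def descendants_of(base)
  ObjectSpace.each_object(Class)
             .reject(&:singleton_class?)
             .select { |klass| klass < base }
             .sort_by(&:name)
end

# Describe a class as the runtime sees it: mixed-in modules and the
# methods they contribute are already resolved.
def snapshot_of(klass, base)
  {
    name: klass.name,
    concerns: klass.ancestors.take_while { |a| a != base }
                   .reject { |a| a.is_a?(Class) }.map(&:name),
    methods: (klass.instance_methods - base.instance_methods).sort
  }
end

units = descendants_of(BaseRecord).map { |klass| snapshot_of(klass, BaseRecord) }
# units.last => { name: "Order", concerns: ["Auditable"], methods: [:audit_label, :total] }
```

A tool that reads only the file defining `Order` would see `total` but not `audit_label`; the runtime view includes both.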

Current State

All major layers are implemented: 34 extractors (including state machines, events, decorators, database views, caching patterns, factories, test mappings, and more), retrieval pipeline (query classification, hybrid search, RRF ranking), storage backends (pgvector, Qdrant, SQLite), embedding providers (OpenAI, Ollama), two MCP servers (27-tool index server + 31-tool console server), AST analysis, flow extraction, temporal snapshots, Notion export, and evaluation harness. Behavioral depth enrichment adds callback side-effect analysis, resolved Rails config introspection (BehavioralProfile), and optional pre-computed request flow maps (FlowPrecomputer).
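
For readers unfamiliar with RRF (Reciprocal Rank Fusion), the following is a minimal sketch of the general technique for merging several ranked result lists, such as vector hits and keyword hits in a hybrid search. It is not taken from Woods: `rrf_fuse`, the constant `k: 60` (a commonly used default), and the sample identifiers are assumptions for illustration, and the gem's actual ranking code may differ.

```ruby
# Reciprocal Rank Fusion: each list contributes 1 / (k + rank) to an
# item's score, where rank starts at 1. Items ranked well in several
# lists rise to the top.
def rrf_fuse(rankings, k: 60)
  scores = Hash.new(0.0)
  rankings.each do |ranked_ids|
    ranked_ids.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end
  # Highest score first; ties broken by identifier for stable output.
  scores.sort_by { |id, score| [-score, id] }.map(&:first)
end

# Hypothetical result lists from two retrieval strategies.
vector_hits  = %w[Order User Invoice]
keyword_hits = %w[Invoice Order Payment]

fused = rrf_fuse([vector_hits, keyword_hits])
# fused => ["Order", "Invoice", "User", "Payment"]
```

Because only ranks are used, the two lists do not need comparable score scales, which is the usual reason to prefer RRF when combining vector and keyword search.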

What's next: see COVERAGE_GAP_ANALYSIS.md for remaining coverage work (HAML/Slim expansion, configuration semantic parsing, Stimulus/Hotwire).

User Guides

| Document | Purpose |
| --- | --- |
| GETTING_STARTED.md | Install, configure, extract, and inspect: end-to-end walkthrough |
| CONFIGURATION_REFERENCE.md | All configuration options with defaults, types, and examples |
| MCP_SERVERS.md | Index server vs console server: full tool catalog, setup for Claude Code / Cursor / Windsurf |
| DOCKER_SETUP.md | Docker-specific guide: split architecture, volume mounts, path translation, MCP config |
| CONSOLE_MCP_SETUP.md | Console MCP server setup: stdio, Docker, HTTP/Rack, SSH bridge, tool tiers, safety model |
| BACKEND_MATRIX.md | Infrastructure selection guide: vector stores, embedding providers, metadata stores, cost modeling |
| MCP_HTTP_TRANSPORT.md | Design and usage for the HTTP/Rack MCP transport (exe/woods-mcp-http) |
| FAQ.md | Frequently asked questions: general, setup, extraction, MCP servers, Docker, storage |
| TROUBLESHOOTING.md | Symptom → cause → fix for extraction, MCP, embedding, storage, Docker, and Notion problems |
| WHY_WOODS.md | What Woods is, why it exists, before/after examples |
| ARCHITECTURE.md | Pipeline stages, ExtractedUnit, dependency graph, retrieval, storage backends, MCP servers |
| EXTRACTOR_REFERENCE.md | Per-extractor documentation: what each of the 34 extractors captures, edge cases, example output |
| MCP_TOOL_COOKBOOK.md | Scenario-based MCP tool examples: question → tool → parameters → expected output |

Reference

| Document | Purpose |
| --- | --- |
| COVERAGE_GAP_ANALYSIS.md | Gap analysis identifying missing extraction coverage and untapped data uses |
| TOKEN_BENCHMARK.md | Token estimation benchmark: tiktoken comparison, divisor calibration |
| USE_CASES_AND_FEATURE_GAPS.md | 37 use cases across 4 categories with implementation status |
| NOTION_INTEGRATION.md | Sync codebase data to Notion databases (Data Models + Columns schemas) |
| self-analysis/ | Woods analyzed by itself: extraction output, quality audit |

Historical design documents from the build phase are in design/ (see design/README.md).

Planned Documentation

| Document | Scope |
| --- | --- |
| RETRIEVAL_GUIDE.md | Query classification, search strategies, RRF ranking, token budget tuning |
| API_REFERENCE.md | Key public classes and interfaces (may generate from YARD) |

Documentation Principles

  • Audience-first — each page targets a specific reader (gem user, contributor, agent)
  • Code is the source of truth — docs explain why and how to use, not implementation details that drift
  • Examples over explanations — show configuration, show output, show usage
  • No duplicating CLAUDE.md: CLAUDE.md is for agents working on the gem; docs/ is for users of the gem