docs/README.md
Woods is a Ruby gem that extracts structured data from Rails applications for AI-assisted development. Unlike file-level tools, it uses runtime introspection — booting the Rails app and querying ActiveRecord::Base.descendants, Rails.application.routes, reflection APIs — to produce version-accurate representations with inlined concerns, resolved callback chains, and schema-aware associations.
All major layers are implemented: 34 extractors (including state machines, events, decorators, database views, caching patterns, factories, test mappings, and more), retrieval pipeline (query classification, hybrid search, RRF ranking), storage backends (pgvector, Qdrant, SQLite), embedding providers (OpenAI, Ollama), two MCP servers (27-tool index server + 31-tool console server), AST analysis, flow extraction, temporal snapshots, Notion export, and evaluation harness. Behavioral depth enrichment adds callback side-effect analysis, resolved Rails config introspection (BehavioralProfile), and optional pre-computed request flow maps (FlowPrecomputer).
What's next: see COVERAGE_GAP_ANALYSIS.md for remaining coverage work (HAML/Slim expansion, configuration semantic parsing, Stimulus/Hotwire).
| Document | Purpose |
|---|---|
| GETTING_STARTED.md | Install, configure, extract, and inspect — end-to-end walkthrough |
| CONFIGURATION_REFERENCE.md | All configuration options with defaults, types, and examples |
| MCP_SERVERS.md | Index server vs console server — full tool catalog, setup for Claude Code / Cursor / Windsurf |
| DOCKER_SETUP.md | Docker-specific guide — split architecture, volume mounts, path translation, MCP config |
| CONSOLE_MCP_SETUP.md | Console MCP server setup — stdio, Docker, HTTP/Rack, SSH bridge, tool tiers, safety model |
| BACKEND_MATRIX.md | Infrastructure selection guide — vector stores, embedding providers, metadata stores, cost modeling |
| MCP_HTTP_TRANSPORT.md | Design and usage for the HTTP/Rack MCP transport (exe/woods-mcp-http) |
| FAQ.md | Frequently asked questions — general, setup, extraction, MCP servers, Docker, storage |
| TROUBLESHOOTING.md | Symptom → cause → fix for extraction, MCP, embedding, storage, Docker, and Notion problems |
| WHY_WOODS.md | What Woods is, why it exists, before/after examples |
| ARCHITECTURE.md | Pipeline stages, ExtractedUnit, dependency graph, retrieval, storage backends, MCP servers |
| EXTRACTOR_REFERENCE.md | Per-extractor documentation — what each of the 34 extractors captures, edge cases, example output |
| MCP_TOOL_COOKBOOK.md | Scenario-based MCP tool examples — question → tool → parameters → expected output |
| Document | Purpose |
|---|---|
| COVERAGE_GAP_ANALYSIS.md | Gap analysis identifying missing extraction coverage and untapped data uses |
| TOKEN_BENCHMARK.md | Token estimation benchmark — tiktoken comparison, divisor calibration |
| USE_CASES_AND_FEATURE_GAPS.md | 37 use cases across 4 categories with implementation status |
| NOTION_INTEGRATION.md | Sync codebase data to Notion databases (Data Models + Columns schemas) |
| self-analysis/ | Woods analyzed by itself — extraction output, quality audit |
Historical design documents from the build phase are in design/ (see design/README.md).
| Document | Scope |
|---|---|
| RETRIEVAL_GUIDE.md | Query classification, search strategies, RRF ranking, token budget tuning |
| API_REFERENCE.md | Key public classes and interfaces (may generate from YARD) |
CLAUDE.md is for agents working on the gem; docs/ is for users of the gem