Configuration Reference

docs/CONFIGURATION_REFERENCE.md


All configuration is done via the `Woods.configure` block, typically in `config/initializers/woods.rb`.

```ruby
Woods.configure do |config|
  config.output_dir = Rails.root.join('tmp/woods')
  config.max_context_tokens = 8000
  # ...
end
```

Common Configuration Patterns

CI-Only Extraction (Subset of Extractors)

```ruby
Woods.configure do |config|
  config.output_dir = Rails.root.join('tmp/woods')

  # In CI, extract only models, controllers, and services for faster builds
  config.extractors = %i[models controllers services] if ENV['CI']
end
```

Docker Extraction with Environment-Based Paths

```ruby
Woods.configure do |config|
  # Inside Docker, /app is the Rails root
  config.output_dir = ENV.fetch('WOODS_OUTPUT_DIR', Rails.root.join('tmp/woods'))
end
```
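
Note that `ENV.fetch` returns a String when the variable is set, while the fallback above is a Pathname. `output_dir` accepts both, but other initializer code may expect one type. The following is a minimal sketch of normalizing the value; `resolve_output_dir` is a hypothetical helper, not part of Woods, and it takes the environment and root as arguments so it can run outside Rails.

```ruby
require 'pathname'

# Hypothetical helper (not part of Woods): returns the output directory as a
# Pathname whether it comes from the environment or from the default.
def resolve_output_dir(env, root)
  # Block form of fetch: the default is only computed when the key is missing.
  Pathname(env.fetch('WOODS_OUTPUT_DIR') { root.join('tmp/woods') })
end

from_env = resolve_output_dir({ 'WOODS_OUTPUT_DIR' => '/data/woods' }, Pathname('/app'))
fallback = resolve_output_dir({}, Pathname('/app'))
```

In an initializer this could be called as `config.output_dir = resolve_output_dir(ENV, Rails.root)`.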

Environment-Conditional Embedding Provider

```ruby
Woods.configure do |config|
  # Use OpenAI in production/CI where the API key is set,
  # fall back to Ollama for local development (free, no API key needed).
  # `present?` treats an empty string as unset, which a bare ENV lookup does not.
  if ENV['OPENAI_API_KEY'].present?
    config.embedding_provider = :openai
    config.embedding_model = 'text-embedding-3-small'
    config.embedding_options = { api_key: ENV['OPENAI_API_KEY'] }
  else
    config.embedding_provider = :ollama
    config.embedding_model = 'nomic-embed-text'
    config.embedding_options = { base_url: ENV.fetch('OLLAMA_URL', 'http://localhost:11434') }
  end
end
```
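
If the provider choice needs a unit test, one option is to move the decision into a small pure function. This is only a sketch: `embedding_settings` is a hypothetical helper, not part of Woods, and it treats a blank `OPENAI_API_KEY` the same as a missing one.

```ruby
# Hypothetical helper (not part of Woods): picks embedding settings from an
# environment-like hash so the choice can be tested without touching ENV.
def embedding_settings(env)
  api_key = env['OPENAI_API_KEY'].to_s.strip

  if api_key.empty?
    # No usable key: fall back to a local Ollama server.
    { provider: :ollama,
      model: 'nomic-embed-text',
      options: { base_url: env.fetch('OLLAMA_URL', 'http://localhost:11434') } }
  else
    { provider: :openai,
      model: 'text-embedding-3-small',
      options: { api_key: api_key } }
  end
end

local = embedding_settings({})
blank = embedding_settings({ 'OPENAI_API_KEY' => '' })
cloud = embedding_settings({ 'OPENAI_API_KEY' => 'sk-test' })
```

Inside `Woods.configure`, the returned hash could then be assigned to `embedding_provider`, `embedding_model`, and `embedding_options` by calling `embedding_settings(ENV)`.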

Core Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `output_dir` | Pathname/String | `Rails.root.join('tmp/woods')` | Directory where extracted data is written |
| `extractors` | `Array<Symbol>` | `[:models, :controllers, :services, ...]` | List of enabled extractors (see Extractors below) |
| `pretty_json` | Boolean | `true` | Format extracted JSON with indentation |
| `max_context_tokens` | Integer | `8000` | Maximum tokens for retrieval context windows |
| `similarity_threshold` | Float | `0.7` | Minimum similarity score (0.0-1.0) for retrieval results |
| `context_format` | Symbol | `:markdown` | Output format for retrieval: `:claude`, `:markdown`, `:plain`, `:json` |
| `include_framework_sources` | Boolean | `true` | Extract Rails and gem source code |
| `concurrent_extraction` | Boolean | `false` | Enable parallel extraction (experimental) |
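
To make `similarity_threshold` and `max_context_tokens` concrete, here is a rough sketch of how a score cutoff and a token budget can combine when assembling context. It is not Woods's retrieval code: the `Result` struct, the `select_context` function, and the sample scores and token counts are all hypothetical.

```ruby
# Hypothetical result record; Woods's internal result type is not shown here.
Result = Struct.new(:id, :score, :tokens)

# Keeps results at or above the threshold, best first, and stops as soon as
# adding the next result would exceed the token budget.
def select_context(results, similarity_threshold: 0.7, max_context_tokens: 8000)
  used = 0
  results
    .select { |r| r.score >= similarity_threshold }
    .sort_by { |r| -r.score }
    .take_while { |r| (used += r.tokens) <= max_context_tokens }
end

results = [
  Result.new('User', 0.91, 5000),
  Result.new('OrdersController', 0.82, 2500),
  Result.new('BillingService', 0.75, 1000),
  Result.new('LegacyMailer', 0.40, 300)
]

selected = select_context(results)
```

With the defaults, `LegacyMailer` falls below the threshold, and `BillingService` is dropped because it would push the total past 8000 tokens. Raising `max_context_tokens` or lowering `similarity_threshold` widens the selection accordingly.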

Embedding Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `embedding_provider` | Symbol | | Embedding backend: `:openai` or `:ollama` |
| `embedding_model` | String | `'text-embedding-3-small'` | Model name for the embedding provider |
| `embedding_options` | Hash | `nil` | Provider-specific options (see below) |

OpenAI Embeddings

```ruby
config.embedding_provider = :openai
config.embedding_model = 'text-embedding-3-small'
config.embedding_options = {
  api_key: ENV['OPENAI_API_KEY'],
  dimensions: 1536
}
```

Ollama Embeddings

```ruby
config.embedding_provider = :ollama
config.embedding_model = 'nomic-embed-text'
config.embedding_options = {
  base_url: 'http://localhost:11434'
}
```

Storage Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `vector_store` | Symbol | | Vector backend: `:in_memory`, `:pgvector`, `:qdrant` |
| `vector_store_options` | Hash | `nil` | Backend-specific connection options |
| `metadata_store` | Symbol | | Metadata backend: `:in_memory`, `:sqlite` |
| `metadata_store_options` | Hash | `nil` | Backend-specific options |
| `graph_store` | Symbol | | Graph backend: `:in_memory` |

pgvector (PostgreSQL)

```ruby
config.vector_store = :pgvector
config.vector_store_options = {
  connection: ActiveRecord::Base.connection,
  dimensions: 1536
}
```

Requires the pgvector extension. Run the generator to create migrations:

```bash
bundle exec rails generate woods:pgvector
bundle exec rails db:migrate
```

Qdrant

```ruby
config.vector_store = :qdrant
config.vector_store_options = {
  url: 'http://localhost:6333',
  collection: 'woods',
  dimensions: 1536
}
```
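
Both vector store examples use `dimensions: 1536`, which matches the default output size of `text-embedding-3-small`. The Ollama models mentioned in this document produce smaller vectors (768 for `nomic-embed-text`, 1024 for `mxbai-embed-large`), so the store would presumably need a matching value when switching models. Whether Woods validates this itself is not stated here; the sketch below is a hypothetical pre-flight check, and `check_dimensions!` is not part of Woods.

```ruby
# Native output sizes of the embedding models mentioned in this document.
MODEL_DIMENSIONS = {
  'text-embedding-3-small' => 1536,
  'nomic-embed-text' => 768,
  'mxbai-embed-large' => 1024
}.freeze

# Hypothetical check (not part of Woods): raises when the vector store is
# sized differently from the embedding model's output.
def check_dimensions!(embedding_model, vector_store_options)
  expected = MODEL_DIMENSIONS.fetch(embedding_model) do
    raise ArgumentError, "unknown embedding model: #{embedding_model}"
  end
  actual = vector_store_options.fetch(:dimensions)
  return true if actual == expected

  raise ArgumentError,
        "#{embedding_model} produces #{expected}-dimensional vectors, " \
        "but the vector store is configured for #{actual}"
end

check_dimensions!('text-embedding-3-small', { dimensions: 1536 })
```

Calling it with `'nomic-embed-text'` and `dimensions: 1536` raises an `ArgumentError` that names both sizes.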

SQLite Metadata

```ruby
config.metadata_store = :sqlite
config.metadata_store_options = {
  database: Rails.root.join('tmp/woods/metadata.sqlite3').to_s
}
```

Presets

For quick setup, use a named preset, which configures storage and embeddings together:

```ruby
# Local development: no external services needed
Woods.configure_with_preset(:local)
# → in_memory vectors, SQLite metadata, in_memory graph, Ollama embeddings

# PostgreSQL: requires pgvector extension and OpenAI API key
Woods.configure_with_preset(:postgresql)
# → pgvector vectors, SQLite metadata, in_memory graph, OpenAI embeddings

# Production: requires Qdrant server and OpenAI API key
Woods.configure_with_preset(:production)
# → Qdrant vectors, SQLite metadata, in_memory graph, OpenAI embeddings
```

Pass a block to override individual preset settings:

```ruby
Woods.configure_with_preset(:local) do |config|
  config.max_context_tokens = 16000
  config.embedding_model = 'mxbai-embed-large'
end
```

Pipeline Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `precompute_flows` | Boolean | `false` | Pre-compute per-action request flow maps during extraction |
| `enable_snapshots` | Boolean | `false` | Enable temporal snapshots (requires migrations 004+005) |

Session Tracer Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `session_tracer_enabled` | Boolean | `false` | Enable session tracing middleware |
| `session_store` | Object | `nil` | Store backend: `FileStore`, `RedisStore`, or `SolidCacheStore` |
| `session_id_proc` | Proc | `nil` | Custom proc to extract session ID from requests |
| `session_exclude_paths` | `Array<String>` | `[]` | Path patterns to exclude from tracing |

```ruby
config.session_tracer_enabled = true
config.session_store = Woods::SessionTracer::FileStore.new(
  Rails.root.join('tmp/session_traces')
)
config.session_exclude_paths = ['/health', '/metrics', '/assets']
```
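
The example above leaves `session_id_proc` at its default. As a sketch of what a custom proc might look like, assuming it is called with the current request and should return a String (the exact calling convention is not documented here), the following uses a stand-in request struct so it runs outside Rails. The `X-Session-Id` header and `session_id` cookie names are hypothetical.

```ruby
# Minimal stand-in for a request object so the sketch runs outside Rails.
FakeRequest = Struct.new(:headers, :cookies)

# Assumption: Woods calls the proc with the request and uses the returned
# String as the session ID. Prefers an explicit header, then a cookie.
session_id_proc = lambda do |request|
  request.headers['X-Session-Id'] || request.cookies['session_id'] || 'anonymous'
end

with_header = session_id_proc.call(FakeRequest.new({ 'X-Session-Id' => 'abc123' }, {}))
with_cookie = session_id_proc.call(FakeRequest.new({}, { 'session_id' => 'cookie-42' }))
with_none   = session_id_proc.call(FakeRequest.new({}, {}))
```

In the initializer, the lambda would be assigned with `config.session_id_proc = session_id_proc`.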

Gem Indexing

Register additional gems to extract source from:

```ruby
config.add_gem 'devise', paths: ['lib/devise/models'], priority: :high
config.add_gem 'pundit', paths: ['lib/pundit'], priority: :medium
config.add_gem 'sidekiq', paths: ['lib/sidekiq/worker', 'lib/sidekiq/job'], priority: :high
```

Priority levels (:low, :medium, :high) affect retrieval ranking when framework source is relevant to a query.

Extractors

The extractors config accepts an array of symbols. Default set:

```ruby
config.extractors = %i[
  models controllers services components view_components
  jobs mailers graphql serializers managers policies validators
  rails_source
]
```

Additional extractors available (not in default set):

| Symbol | Extractor | What it adds |
| --- | --- | --- |
| `:concerns` | ConcernExtractor | ActiveSupport::Concern modules |
| `:routes` | RouteExtractor | Rails routes (auto-included) |
| `:middleware` | MiddlewareExtractor | Rack middleware stack |
| `:i18n` | I18nExtractor | Locale translation files |
| `:pundit_policies` | PunditExtractor | Pundit authorization policies |
| `:configurations` | ConfigurationExtractor | Rails initializers + behavioral profile |
| `:engines` | EngineExtractor | Mounted Rails engines |
| `:view_templates` | ViewTemplateExtractor | ERB view templates |
| `:migrations` | MigrationExtractor | ActiveRecord migrations |
| `:action_cable_channels` | ActionCableExtractor | ActionCable channels |
| `:scheduled_jobs` | ScheduledJobExtractor | Recurring/scheduled jobs |
| `:rake_tasks` | RakeTaskExtractor | Rake task definitions |
| `:state_machines` | StateMachineExtractor | AASM/Statesman state machines |
| `:events` | EventExtractor | Event publish/subscribe patterns |
| `:decorators` | DecoratorExtractor | Decorators, presenters, form objects |
| `:database_views` | DatabaseViewExtractor | SQL views (Scenic) |
| `:caching` | CachingExtractor | Cache usage patterns |
| `:factories` | FactoryExtractor | FactoryBot factory definitions |
| `:test_mappings` | TestMappingExtractor | Test file → subject class mapping |
| `:poros` | PoroExtractor | Plain Ruby objects in app/models |
| `:libs` | LibExtractor | Ruby files in lib/ |
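
Since `extractors` takes a plain array of symbols, enabling some of the additional extractors alongside the defaults comes down to array arithmetic. The sketch below builds the list explicitly rather than assuming what `config.extractors` returns before it is assigned; the constant names are hypothetical.

```ruby
# Default set, copied from the list above.
DEFAULT_EXTRACTORS = %i[
  models controllers services components view_components
  jobs mailers graphql serializers managers policies validators
  rails_source
].freeze

# Extras chosen for illustration; any symbols from the table above would work.
EXTRA_EXTRACTORS = %i[concerns migrations rake_tasks].freeze

# uniq guards against listing a symbol twice if the two sets ever overlap.
extractors = (DEFAULT_EXTRACTORS + EXTRA_EXTRACTORS).uniq
```

The result can then be assigned inside `Woods.configure` with `config.extractors = extractors`.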

Database Compatibility

Storage options work whether your app's primary database is MySQL or PostgreSQL, with two notes:

  • pgvector — PostgreSQL only (requires the pgvector extension)
  • SQLite metadata store — uses a standalone SQLite database file, independent of your app's database

See BACKEND_MATRIX.md for the full compatibility matrix.