Index

.vbw-planning/codebase/INDEX.md

Cross-referenced index of key findings across all mapping documents.

Quick Reference

| Document | Focus | Key Finding |
|---|---|---|
| STACK.md | Technology choices | Rails 8.1.1 engine, Ruby 3.4+, PostgreSQL, Solid Queue, Tailwind 3 |
| DEPENDENCIES.md | Dependency analysis | 14 runtime gems, PG-only, optional deps loaded silently |
| ARCHITECTURE.md | System design | 10 domain modules, event-driven, pluggable scrapers |
| STRUCTURE.md | Directory layout | ~324 Ruby files, 124 tests, 24 migrations |
| CONVENTIONS.md | Code style | Rails omakase, frozen strings, Struct-based results |
| TESTING.md | Test infrastructure | Minitest, parallel, SimpleCov branch coverage, nightly profiling |
| CONCERNS.md | Risks & debt | Large files, PG lock-in, coverage gaps, no default auth |
| PATTERNS.md | Recurring patterns | Service objects, adapter pattern, event callbacks, Turbo Streams |

Key Entry Points

| Purpose | File | Notes |
|---|---|---|
| Gem entry point | lib/source_monitor.rb | 102+ require statements, module definition |
| Engine definition | lib/source_monitor/engine.rb | Initializers, asset registration |
| Configuration DSL | lib/source_monitor/configuration.rb | 12 nested settings classes |
| Routes | config/routes.rb | 24 lines, RESTful resources |
| Main model | app/models/source_monitor/source.rb | Core domain entity |
| Dashboard | app/controllers/source_monitor/dashboard_controller.rb | Landing page |
| Fetch pipeline | lib/source_monitor/fetching/feed_fetcher.rb | Core data ingestion |
| Scrape pipeline | lib/source_monitor/scraping/item_scraper.rb | Content extraction orchestrator |
| Scheduler | lib/source_monitor/scheduler.rb | Periodic fetch scheduling |
| JS entry | app/assets/javascripts/source_monitor/application.js | Stimulus app setup |
| CSS entry | app/assets/stylesheets/source_monitor/application.tailwind.css | Tailwind input |
| Test entry | test/test_helper.rb | Test infrastructure setup |

Data Model Reference

| Model | Table | Key Relationships |
|---|---|---|
| Source | sourcemon_sources | has_many: items, fetch_logs, scrape_logs, health_check_logs, log_entries |
| Item | sourcemon_items | belongs_to: source; has_one: item_content; has_many: scrape_logs, log_entries |
| ItemContent | sourcemon_item_contents | belongs_to: item (separate table for large scraped content) |
| FetchLog | sourcemon_fetch_logs | belongs_to: source; has_one: log_entry (polymorphic) |
| ScrapeLog | sourcemon_scrape_logs | belongs_to: item, source; has_one: log_entry (polymorphic) |
| HealthCheckLog | sourcemon_health_check_logs | belongs_to: source; has_one: log_entry (polymorphic) |
| LogEntry | sourcemon_log_entries | delegated_type: loggable (FetchLog/ScrapeLog/HealthCheckLog) |
| ImportSession | sourcemon_import_sessions | JSONB state for wizard flow |
| ImportHistory | sourcemon_import_histories | Records completed imports |

Job Reference

| Job Class | Queue | Schedule | Purpose |
|---|---|---|---|
| ScheduleFetchesJob | fetch | Recurring | Triggers scheduler to find due sources |
| FetchFeedJob | fetch | On-demand | Fetches one source's feed |
| ScrapeItemJob | scrape | On-demand | Scrapes one item's content |
| SourceHealthCheckJob | fetch | On-demand | Health check for one source |
| ImportSessionHealthCheckJob | fetch | On-demand | Health check during OPML import |
| ImportOpmlJob | fetch | On-demand | Bulk creates sources from OPML |
| LogCleanupJob | fetch | Recurring | Prunes old log entries |
| ItemCleanupJob | fetch | Recurring | Prunes items per retention policy |

Configuration Surface Area

| Section | Key Settings | Defaults |
|---|---|---|
| Queues | fetch_queue_name, scrape_queue_name, concurrency | source_monitor_fetch, source_monitor_scrape, 2 each |
| HTTP | timeout, retries, user agent, proxy, headers | 15s/5s timeout, 4 retries |
| Fetching | adaptive interval params, jitter | 5min-24hr, 1.25x increase, 0.75x decrease |
| Health | window size, thresholds, auto-pause | 20 window, 0.8/0.5/0.2 thresholds |
| Scraping | max_in_flight, max_bulk_batch | 25, 100 |
| Retention | days, max_items, strategy | nil (no auto-cleanup), :destroy |
| Realtime | adapter (solid_cable/redis/async) | solid_cable |
| Authentication | handlers, current_user_method | nil (no auth by default) |
| Models | table_name_prefix, concerns, validations | sourcemon_ |
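
The Fetching defaults (5min-24hr bounds, 1.25x increase, 0.75x decrease) describe a multiplicative back-off. The following is a minimal sketch of how such an interval could be computed, not the engine's actual implementation. The module name, method name, and `changed:` keyword are hypothetical, and the direction of adjustment (poll sooner when a feed changed, later when it did not) is an assumption. Jitter is left out.

```ruby
# frozen_string_literal: true

# Hypothetical helper for illustration only; not part of SourceMonitor's API.
module AdaptiveIntervalSketch
  MIN_INTERVAL    = 5 * 60        # 5 minutes, in seconds (default lower bound)
  MAX_INTERVAL    = 24 * 60 * 60  # 24 hours, in seconds (default upper bound)
  INCREASE_FACTOR = 1.25          # applied when the interval should grow
  DECREASE_FACTOR = 0.75          # applied when the interval should shrink

  # Returns the next fetch interval in seconds.
  # Assumption: a feed that changed is polled sooner, an unchanged feed later.
  def self.next_interval(current_seconds, changed:)
    factor = changed ? DECREASE_FACTOR : INCREASE_FACTOR
    (current_seconds * factor).clamp(MIN_INTERVAL, MAX_INTERVAL)
  end
end
```

With these defaults, a one-hour interval moves to 75 minutes after an unchanged fetch and to 45 minutes after a changed one. The clamp keeps the result inside the 5min-24hr range.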

Critical Cross-Cutting Concerns

  1. PG-only (ARCHITECTURE + CONCERNS): The engine relies on FOR UPDATE SKIP LOCKED and NULLS FIRST/LAST ordering in its SQL, so PostgreSQL is the only supported database.

  2. No default auth (ARCHITECTURE + CONCERNS): Engine mounts without authentication unless host app configures it. Import wizard has a create_guest_user fallback.

  3. Eager loading (STRUCTURE + CONCERNS): All 102+ require statements in lib/source_monitor.rb load at boot time.

  4. Coverage debt (TESTING + CONCERNS): config/coverage_baseline.json lists 2329 lines of known uncovered code, particularly in FeedFetcher, ItemCreator, Configuration, and Dashboard::Queries.

  5. Large files (STRUCTURE + CONCERNS): FeedFetcher (627 lines), Configuration (655 lines), and ImportSessionsController (792 lines) are candidates for extraction.
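
To make the first concern concrete, here is a sketch of the kind of query that combines both PostgreSQL clauses. It is illustrative and not taken from the codebase. The table name `sourcemon_sources` comes from the data model reference, while the `next_fetch_at` column, the method name, and the batch limit are assumptions.

```ruby
# frozen_string_literal: true

# Hypothetical query builder for illustration only; not SourceMonitor's code.
# Assumes a next_fetch_at timestamp column on sourcemon_sources.
def due_sources_sql(limit)
  <<~SQL
    SELECT id
    FROM sourcemon_sources
    WHERE next_fetch_at IS NULL OR next_fetch_at <= NOW()
    ORDER BY next_fetch_at ASC NULLS FIRST
    LIMIT #{Integer(limit)}
    FOR UPDATE SKIP LOCKED
  SQL
end
```

Run inside a transaction, a query of this shape lets several workers claim due rows concurrently. Rows already locked by another worker are skipped instead of blocking, and never-fetched sources (NULL timestamp) sort first.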