Test Infrastructure Research - Phase 06 Audit Findings (T1-T17)

.vbw-planning/milestones/07-rails-audit-and-refactoring/06-test-infrastructure/06-01-RESEARCH.md

Phase 06: Test Infrastructure Research

Current State

Test Infrastructure Overview

Framework & Setup:

  • Minitest (no RSpec or FactoryBot)
  • WebMock + VCR for HTTP stubbing
  • SimpleCov for coverage tracking
  • TestProf for optimized test runs (before_all, sampling)
  • Thread-based parallelism (not process-based due to PG segfault)
  • Rails 8 dummy app at test/dummy/ with Solid Queue and Hotwire

Key Files:

  • test/test_helper.rb - Main test setup (65-128 lines)
  • test/test_prof.rb - TestProf integration (38 lines)
  • test/support/host_app_harness.rb - Host app test generation (395 lines)
  • VCR cassettes: test/vcr_cassettes/source_monitor/ (4 cassettes found)

Test Count by Directory (estimated):

  • Controllers: 6 test files
  • Jobs: 3 test files
  • Lib/: 50+ test files (architecture, configuration, fetching, items, jobs, logs, models, scrapers, scraping, setup, realtime, health, events, etc.)
  • Models: 2 test files
  • System: 6 test files
  • Integration: 4 test files
  • Tasks: 2 test files
  • Mailers: 1 test file
  • Helpers: 1 test file
  • Examples: 4 test files (template tests)
  • Total: ~80 test files

Test Count by Type:

  • 1214 tests total (from project memory)
  • Thread-parallel workers: N (number of processors) or PARALLEL_WORKERS env var
  • Test order: random
  • Per-test setup: SourceMonitor.reset_configuration! + SourceMonitor.config.http.retry_max = 0

Audit Findings Detail

T1: VCR Cassette Maintenance Not Documented

Finding: VCR is configured but usage patterns are not standardized across tests.

Current State:

  • test/test_helper.rb (lines 59-63):
    • Cassette dir: test/vcr_cassettes/
    • Hooked into WebMock
    • ignore_localhost = true
  • Only 4 cassettes found:
    • source_monitor/fetching/rss_success.yml
    • source_monitor/fetching/atom_success.yml
    • source_monitor/fetching/json_success.yml
    • source_monitor/fetching/netflix_medium_rss.yml
  • Zero VCR.use_cassette calls found in test files (no grep matches)

Gap: Cassette strategy is not documented:

  • When to record vs use cassettes
  • How to regenerate stale cassettes
  • Naming convention (only partial pattern exists: source_monitor/<module>/<descriptor>)
  • Maintenance schedule for outdated recordings

Files Affected: test/test_helper.rb, test/vcr_cassettes/
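
A hedged sketch of the replay pattern a cassette guide could standardize, using the real VCR.use_cassette API; the request URL is illustrative and would need to match what rss_success.yml actually recorded:

```ruby
require "test_helper"
require "net/http"

class RssCassetteTest < ActiveSupport::TestCase
  test "replays the recorded RSS response" do
    # Cassette name follows the existing source_monitor/<module>/<descriptor> pattern.
    VCR.use_cassette("source_monitor/fetching/rss_success") do
      response = Net::HTTP.get_response(URI("https://example.com/feed.xml"))
      assert_equal "200", response.code
    end
  end
end
```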


T2: Fixture-Based Tests Tightly Coupled to User Model

Finding: Test helper fixtures :all loads all fixtures, creating implicit dependencies.

Current State:

  • test/test_helper.rb (lines 51-57):
    ```ruby
    if ActiveSupport::TestCase.respond_to?(:fixture_paths=)
      fixtures_root = File.expand_path("fixtures", __dir__)
      ActiveSupport::TestCase.fixture_paths = [ fixtures_root ]
      ActionDispatch::IntegrationTest.fixture_paths = ActiveSupport::TestCase.fixture_paths
      ActiveSupport::TestCase.file_fixture_path = fixtures_root
      ActiveSupport::TestCase.fixtures :all  # <-- loads ALL fixtures
    end
    ```
  • fixtures :all loads every fixture, including users.yml if it exists
  • No explicit fixtures :users found in grep (search returned no matches)
  • But fixtures :all implicitly couples every test to user model if fixtures exist

Gap:

  • Implicit coupling obscures test dependencies
  • Tests that don't need fixtures still load them (performance hit)
  • Unit tests cannot opt out of fixture loading without refactoring

Files Affected: test/test_helper.rb, test/fixtures/ (directory)
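
One possible shape for the fix, shown as a sketch: drop the global fixtures :all and have each class declare only what it uses. The :sources fixture set and route helper below are assumptions:

```ruby
require "test_helper"

class SourcesIndexTest < ActionDispatch::IntegrationTest
  fixtures :sources  # explicit, per-class dependency instead of :all

  test "lists sources" do
    get source_monitor.sources_path
    assert_response :success
  end
end
```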


T3: No Isolated Unit Tests for Extracted Sub-Modules

Finding: Refactored sub-modules lack dedicated unit test files.

Current State:

  • Phase 3/4 extracted FeedFetcher into sub-modules:
    • AdaptiveInterval - no dedicated test file found
    • SourceUpdater - no dedicated test file found
    • EntryProcessor - test file exists: test/lib/source_monitor/fetching/feed_fetcher/entry_processor_test.rb
  • EntryProcessor has 1 test file
  • Other sub-modules tested only as part of larger integration tests

Gap:

  • Unit tests for AdaptiveInterval and SourceUpdater behavior don't exist
  • Forces reliance on integration tests that exercise multiple layers
  • Harder to debug failures (wider blast radius)
  • Difficult to test edge cases in isolation

Files Affected:

  • lib/source_monitor/fetching/feed_fetcher/adaptive_interval.rb (no test file)
  • lib/source_monitor/fetching/feed_fetcher/source_updater.rb (no test file)
  • lib/source_monitor/fetching/feed_fetcher/entry_processor.rb (test exists)
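
The missing tests could start from a skeleton like the one below; AdaptiveInterval's public interface is not documented in this research, so the .call(source:) signature and return value are assumptions for illustration only:

```ruby
require "test_helper"

class AdaptiveIntervalTest < ActiveSupport::TestCase
  test "returns a positive interval for an idle source" do
    source = create_source!(fetch_interval_minutes: 60)

    # Signature assumed for illustration; adjust to the module's real API.
    interval = SourceMonitor::Fetching::FeedFetcher::AdaptiveInterval.call(source: source)

    assert_operator interval, :>, 0
  end
end
```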

T4: Integration Test Coverage Gaps

Finding: Integration tests exist but coverage is incomplete for complex flows.

Current State:

  • Integration tests found: 4 files
    • test/integration/engine_mounting_test.rb
    • test/integration/host_install_flow_test.rb
    • test/integration/release_packaging_test.rb
    • test/integration/navigation_test.rb
  • Cover: mounting, installation, release packaging, navigation
  • Missing: fetch pipeline end-to-end, scrape pipeline end-to-end, job error handling

Gap:

  • No end-to-end fetch tests (HTTP → parsing → item creation → log entry → events)
  • No end-to-end scrape tests (item selection → adapter → persistence → broadcast)
  • Health check integration (source pause, recovery) not tested
  • Retry logic and circuit breaker not tested

Estimated Test Count Needed: 8-12 additional integration tests
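
A sketch of one such test, combining the stub_request and with_inline_jobs patterns already in use; the FeedFetcher.call entry point is an assumption:

```ruby
require "test_helper"

class FetchPipelineIntegrationTest < ActiveSupport::TestCase
  test "fetch creates items and a fetch log from a stubbed feed" do
    source = create_source!(feed_url: "https://example.com/feed.xml")
    stub_request(:get, source.feed_url)
      .to_return(status: 200, body: File.read(file_fixture("feeds/rss_sample.xml")))

    with_inline_jobs do
      SourceMonitor::Fetching::FeedFetcher.call(source: source)  # entry point assumed
    end

    assert_operator source.items.count, :>, 0
    assert_equal 1, SourceMonitor::FetchLog.where(source: source).count
  end
end
```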


T5: System Tests Only Cover Happy Paths

Finding: System tests cover success flows but not error scenarios.

Current State:

  • System test files: 6
    • test/system/dashboard_test.rb
    • test/system/items_test.rb
    • test/system/logs_test.rb
    • test/system/mission_control_test.rb
    • test/system/sources_test.rb
    • test/system/dropdown_fallback_test.rb
  • All tests use inline helpers (defined per-file)
  • Tests create records, visit pages, assert UI renders
  • No error state testing (404, 500, validation errors, timeout, network failure)

Gap:

  • No error flow coverage
  • No timeout/network failure UI handling
  • No invalid state recovery
  • No permission/auth error paths

Example Missing Test: "source deletion confirmation shows error if deletion fails"
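
The Readability stub style already used in items_test.rb (see T7, Style 3) extends naturally to error paths. A sketch, with the button label, route helper, and error copy all assumed:

```ruby
require "application_system_test_case"

class ItemsErrorPathTest < ApplicationSystemTestCase
  test "manual scrape failure surfaces an error message" do
    failing = ->(*) { raise StandardError, "simulated scrape failure" }

    SourceMonitor::Scrapers::Readability.stub(:call, failing) do
      visit source_monitor.items_path  # route helper assumed
      click_button "Manual Scrape"
      assert_text "scrape failed"      # exact copy depends on the UI's error rendering
    end
  end
end
```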


T6: Duplicated Factory Helpers Across 20+ Test Files

Finding: Factory-like helpers are defined locally in each test file instead of centrally.

Current State:

  • Centralized in test/test_helper.rb: only create_source! (lines 107-119)
  • Local definitions found in:
    • test/lib/source_monitor/items/retention_pruner_test.rb:
      • build_source (lines 147-155)
      • create_item (lines 157-165)

Pattern: Each test file that needs item/source/log creation defines its own helper, e.g.:

```ruby
# In retention_pruner_test.rb
def build_source(attributes = {})
  defaults = { name: "Source #{SecureRandom.hex(4)}", ... }
  create_source!(defaults.merge(attributes))
end

def create_item(source:, guid:, published_at:, title:)
  source.items.create!(guid:, url: "...", title:, ...)
end
```

Gap:

  • Duplication across an estimated 20+ test files (no exhaustive count taken)
  • Inconsistent signatures
  • Harder to maintain
  • Scattered knowledge of factory patterns

Files Likely Duplicating:

  • Any test file that uses Item.create!, Source.create!, ScrapeLog.create!, FetchLog.create!, etc.
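
A consolidation sketch extending the existing create_source! pattern; the attribute defaults are illustrative and would need to match the real Item schema:

```ruby
# test/support/factories.rb (hypothetical location)
module SourceMonitor
  module TestFactories
    def create_item!(source:, **attributes)
      defaults = {
        guid: SecureRandom.uuid,
        url: "https://example.com/items/#{SecureRandom.hex(4)}",
        title: "Test Item",
        published_at: Time.current
      }
      source.items.create!(defaults.merge(attributes))
    end
  end
end

ActiveSupport::TestCase.include SourceMonitor::TestFactories
```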

T7: Inconsistent Mocking Approach

Finding: Tests use 3+ different mocking styles without standardization.

Current State:

  • Style 1: Minitest::Mock (via .stub):

    ```ruby
    # test/lib/source_monitor/fetching/advisory_lock_test.rb:40
    ActiveRecord::Base.connection_pool.stub :with_connection, ->(&block) { block.call(fake_connection) } do
      assert_raises(...)
    end
    ```

  • Style 2: Class.new (anonymous class mocking):

    ```ruby
    # test/lib/source_monitor/fetching/advisory_lock_test.rb:30-37
    fake_connection = Class.new do
      def exec_query(sql)
        # mock behavior
      end
    end.new
    ```

  • Style 3: Inline stubs (direct method stubbing):

    ```ruby
    # test/system/items_test.rb:99
    SourceMonitor::Scrapers::Readability.stub(:call, result) do
      # test code
    end
    ```

Gap:

  • No consistency across test types
  • Harder to teach/onboard (3 patterns to learn)
  • Mixing implementation details with test code
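
If Minitest's .stub becomes the single sanctioned pattern, the anonymous-class fake from Style 2 can shrink to a small value object while the seam itself is stubbed. A sketch reworking the advisory lock example:

```ruby
# One fake object, one stubbed seam.
fake_connection = Struct.new(:rows) do
  def exec_query(_sql)
    rows
  end
end.new([])

ActiveRecord::Base.connection_pool.stub(:with_connection, ->(&block) { block.call(fake_connection) }) do
  # exercise the code that acquires the advisory lock
end
```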

T8: WebMock Stub Explosion in Some Tests

Finding: Some tests create large numbers of WebMock stubs inline.

Current State:

  • test/test_helper.rb (line 65): WebMock disables all external HTTP except localhost
  • Stubs are created in test methods, not shared
  • No centralized stub collection
  • Stub naming not standardized

Gap:

  • Tests with multiple HTTP calls create multiple stub_request calls
  • Hard to understand which calls are being stubbed
  • Difficult to debug stub mismatches
  • No reusable stub library for common patterns
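
A reusable stub library could package the common cases behind intent-revealing helpers; the module name and fixture path below are assumptions built on patterns already shown in this document, while stub_request and to_timeout are real WebMock APIs:

```ruby
# test/support/feed_stubs.rb (hypothetical)
module FeedStubs
  def stub_rss_feed(url, body: File.read(file_fixture("feeds/rss_sample.xml")))
    stub_request(:get, url).to_return(
      status: 200,
      headers: { "Content-Type" => "application/rss+xml" },
      body: body
    )
  end

  def stub_feed_timeout(url)
    stub_request(:get, url).to_timeout  # WebMock's built-in timeout simulation
  end
end
```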

T9: System Test Setup/Teardown Not in Base Class

Finding: System tests define setup/teardown per-file instead of in a base class.

Current State:

  • System tests require application_system_test_case.rb, but no such file was found (see T16)
  • Actual system tests define their own setup/teardown inline:
    ```ruby
    # test/system/dashboard_test.rb:9-23
    def setup
      super
      SourceMonitor.reset_configuration!
      SourceMonitor::Jobs::Visibility.reset!
      SourceMonitor::Jobs::Visibility.setup!
      purge_solid_queue_tables
      SourceMonitor::Dashboard::TurboBroadcaster.setup!
    end

    def teardown
      SourceMonitor.reset_configuration!
      SourceMonitor::Jobs::Visibility.reset!
      purge_solid_queue_tables
      super
    end
    ```

  • Helper methods defined in test files:
    • purge_solid_queue_tables (lines 167-184, dashboard_test.rb)
    • seed_queue_activity (lines 140-165, dashboard_test.rb)
    • apply_turbo_stream_messages (lines 186-226, dashboard_test.rb)
    • connect_turbo_cable_stream_sources (not shown but referenced)
    • capture_turbo_stream_broadcasts (not shown but referenced)
    • parse_turbo_streams (lines 229-245, dashboard_test.rb)
    • assert_item_order (test/system/items_test.rb:160-168)

Gap:

  • Setup/teardown logic duplicated across files
  • Shared helper methods not centralized
  • No ApplicationSystemTestCase base class (missing file)
  • Capybara wait configuration not centralized

Files Affected: All system test files (test/system/*.rb)
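
A base class sketch assembled from the per-file setup/teardown shown above; the driver configuration is illustrative:

```ruby
# test/dummy/app/test/application_system_test_case.rb (path proposed in T16)
require "test_helper"

class ApplicationSystemTestCase < ActionDispatch::SystemTestCase
  driven_by :selenium, using: :headless_chrome, screen_size: [1400, 900]

  def setup
    super
    SourceMonitor.reset_configuration!
    SourceMonitor::Jobs::Visibility.reset!
    SourceMonitor::Jobs::Visibility.setup!
    purge_solid_queue_tables
  end

  def teardown
    SourceMonitor.reset_configuration!
    SourceMonitor::Jobs::Visibility.reset!
    purge_solid_queue_tables
    super
  end

  private

  def purge_solid_queue_tables
    # body to be lifted from dashboard_test.rb (lines 167-184) once centralized
  end
end
```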


T10: Insufficient Controller Error Path Testing

Finding: Controller tests focus on happy paths; error scenarios are light.

Current State:

  • Controller tests found: 6 files
    • test/controllers/source_monitor/application_controller_test.rb (55 lines, 4 tests)
    • test/controllers/source_monitor/source_fetches_controller_test.rb (44 lines, 2 tests)
    • test/controllers/source_monitor/health_controller_test.rb
    • test/controllers/source_monitor/logs_controller_test.rb
    • test/controllers/source_monitor/source_bulk_scrapes_controller_test.rb
    • test/controllers/source_monitor/source_health_checks_controller_test.rb
  • Example test (source_fetches_controller_test.rb:14-27):
    ```ruby
    test "queues a fetch and renders turbo stream" do
      source = create_source!(fetch_status: "idle")
      assert_enqueued_jobs 1 do
        post source_monitor.source_fetch_path(source), as: :turbo_stream
      end
      assert_response :success
    end
    ```

  • Error coverage (lines 29-41): roughly one error test per controller

Gap:

  • One error test per controller is insufficient
  • Missing: validation errors, permission denied, record not found, server errors
  • No test for CSRF protection edge cases
  • Toast/flash message error flow not tested
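
Error-path additions could mirror the happy-path test above. In this sketch the route helper comes from the existing test, but the conflicting state and the status codes the controller returns are assumptions:

```ruby
test "returns not found for an unknown source" do
  post source_monitor.source_fetch_path(-1), as: :turbo_stream
  assert_response :not_found
end

test "rejects a fetch while one is already running" do
  source = create_source!(fetch_status: "fetching")  # conflicting state assumed
  post source_monitor.source_fetch_path(source), as: :turbo_stream
  assert_response :unprocessable_entity
end
```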

T11: Time-Dependent Tests Without Explicit Assertions

Finding: Tests use travel_to but assertions don't always validate the time-based behavior.

Current State:

  • test/lib/source_monitor/items/retention_pruner_test.rb uses travel_to:
    ```ruby
    test "removes items older than retention period" do
      travel_to Time.zone.local(2025, 10, 1, 12, 0, 0) do
        create_item(...)  # old item created
      end
      travel_to Time.zone.local(2025, 10, 10, 12, 0, 0) do
        create_item(...)  # recent item created
        result = RetentionPruner.call(source:)
        assert_equal 1, result.removed_by_age  # validates deletion
      end
    end
    ```

  • Tests properly assert outcomes, but time travel setup is verbose
  • travel_back cleanup appears to be present where needed (no leaks found)

Gap:

  • Boilerplate time travel setup
  • Harder to read test intent
  • No test helper to simplify time-based test setup
  • No ensure travel_back enforcer in base class
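
The helper recommended under Priority 4 could be as small as the sketch below; travel_to and travel_back are the standard ActiveSupport::Testing::TimeHelpers calls:

```ruby
# In test_helper.rb: block form that always rewinds the clock, even on raise.
def travel_to_at(time)
  travel_to(time)
  yield
ensure
  travel_back
end

# Usage keeps the intent on one line:
#   travel_to_at(Time.zone.local(2025, 10, 1, 12, 0, 0)) { create_item(...) }
```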

T12: Job Tests Mix Inline and Async Patterns

Finding: Job testing uses both :test and :inline adapters without clear convention.

Current State:

  • Default adapter: :test (test_helper.rb:48)
    ```ruby
    ActiveJob::Base.queue_adapter = :test
    ```

  • Tests use with_inline_jobs when execution is needed:

    ```ruby
    # test/system/items_test.rb:101
    with_inline_jobs do
      click_button "Manual Scrape"
    end
    ```

  • Some tests use assert_enqueued_with for :test adapter
  • Other tests execute jobs inline directly

Gap:

  • No clear convention for when to use each approach
  • The with_inline_jobs helper lives in test_prof.rb, though its general-purpose naming suggests it belongs in test_helper.rb
  • No job test base class or pattern docs
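
For orientation, with_inline_jobs is most likely a small adapter swap along these lines (a sketch, not the actual test_prof.rb implementation):

```ruby
def with_inline_jobs
  previous = ActiveJob::Base.queue_adapter
  ActiveJob::Base.queue_adapter = :inline   # execute jobs as they are enqueued
  yield
ensure
  ActiveJob::Base.queue_adapter = previous  # restore the default :test adapter
end
```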

T13: Counter Cache Tests Lack Atomicity Checks

Finding: Counter cache fields (items_count on Source) are not validated for atomicity.

Current State:

  • Source model has counter_cache: true on has_many :items
  • Tests create items and assert count:
    ruby
    # test/system/items_test.rb:88
    initial_item_count = Item.count
    item = Item.create!(...)
    # Later assertion checks the count
    
  • No tests for concurrency/race conditions
  • No tests for counter cache recovery after corruption

Gap:

  • No atomicity tests (concurrent item creation)
  • No counter cache inconsistency recovery test
  • Missing: reset_items_counter! testing
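
A concurrency check could follow the sketch below; create_item! refers to the hypothetical shared factory from T6, and transactional test wrapping may need to be disabled so both threads' inserts are visible to each other:

```ruby
test "items_count matches the real count after concurrent creates" do
  source = create_source!

  # Two threads insert simultaneously; each checks out its own DB connection.
  2.times.map do
    Thread.new { create_item!(source: source) }
  end.each(&:join)

  assert_equal source.items.count, source.reload.items_count
end
```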

T14: Inconsistent Test Naming Style

Finding: Test names vary in style and specificity.

Current State:

  • Verbose style: "removes items older than the configured retention period"
  • Terse style: "filters items" (hypothetical)
  • Mixed prefixes:
    • Some start with verbs: "respects configured scraper adapters"
    • Some start with nouns: "dashboard displays stats, job metrics, and quick actions"
  • Test names in test_helper (lines 67-92) use comments for setup, not clear test intent

Gap:

  • No naming convention documented
  • Hard to scan test files for related tests
  • Inconsistency makes test discovery harder

T15: No Shared Behavior Test Modules for Concerns

Finding: Concerns (Loggable, etc.) are tested inline; no shared test modules.

Current State:

  • Loggable concern used by FetchLog, ScrapeLog, HealthCheckLog
  • Each model tested separately (implied, no shared spec module)
  • Example: test/lib/source_monitor/logs/entry_sync_test.rb tests log behavior directly

Gap:

  • No shared test for Loggable behavior across 3 models
  • Duplicated tests for concern behavior
  • Harder to maintain concern contracts
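
A shared-examples module could encode the Loggable contract once and be included by each log model's test class; the association assertion below is illustrative:

```ruby
# test/support/shared_loggable_tests.rb (hypothetical)
module SharedLoggableTests
  extend ActiveSupport::Concern

  included do
    test "belongs to a source" do
      assert loggable_model.reflect_on_association(:source)  # contract detail assumed
    end
  end
end

class FetchLogTest < ActiveSupport::TestCase
  include SharedLoggableTests

  private

  def loggable_model = SourceMonitor::FetchLog
end
```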

T16: ApplicationSystemTestCase Missing Wait Configuration

Finding: ApplicationSystemTestCase doesn't exist or lacks Capybara wait config.

Current State:

  • No test/dummy/app/test/application_system_test_case.rb file
  • System tests hardcode waits (wait: 5, wait: 10) and visibility options (visible: :all)
  • Capybara default wait: 2 seconds (Rails default)
  • Tests use explicit waits for async behavior:
    ```ruby
    # test/system/dashboard_test.rb:86
    assert_selector "turbo-cable-stream-source", visible: :all, wait: 5
    ```

Gap:

  • No centralized wait time configuration
  • Arbitrary wait values (5s, 10s) scattered across tests
  • No Capybara screenshot/save_page on failure setup
  • No JavaScript driver logging
  • No Capybara::Minitest integration configuration
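
Inside the base class sketched under T9, the wait configuration could collapse to a couple of settings; default_max_wait_time and save_path are real Capybara options:

```ruby
require "capybara/rails"

Capybara.default_max_wait_time = 5                    # replaces scattered wait: overrides
Capybara.save_path = Rails.root.join("tmp/capybara")  # where failure screenshots/pages land
```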

T17: No Temp File Cleanup in Test Teardown

Finding: Tests that create temp files don't have centralized cleanup.

Current State:

  • test/test_helper.rb includes cleanup for engine tables but not temp files
  • test/support/host_app_harness.rb manages temp directories for integration tests
  • No explicit FileUtils.rm_rf in test teardown blocks
  • System tests may leave temp screenshots/artifacts

Gap:

  • Temp files accumulate in test/tmp/
  • test/dummy/tmp/ grows unbounded
  • No automatic cleanup of failed test artifacts
  • Disk space issues in long test runs
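
A minimal teardown hook, assuming artifacts are scoped under a dedicated directory so the glob cannot touch caches the dummy app needs between runs:

```ruby
# In a shared base class or test_helper.rb; the test_artifacts dir is hypothetical.
teardown do
  FileUtils.rm_rf(Dir.glob(Rails.root.join("tmp", "test_artifacts", "*")))
end
```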

Existing Patterns

Factories (Centralized & Local)

Centralized in test_helper.rb:

```ruby
def create_source!(attributes = {})
  defaults = {
    name: "Test Source",
    feed_url: "https://example.com/feed-#{SecureRandom.hex(4)}.xml",
    website_url: "https://example.com",
    fetch_interval_minutes: 60,
    scraper_adapter: "readability"
  }
  source = SourceMonitor::Source.new(defaults.merge(attributes))
  source.save!(validate: false)
  source
end
```

Local Patterns (per-file):

  • build_source(attributes) - wraps create_source! with defaults
  • create_item(source:, guid:, title:, published_at:) - creates Item records
  • Direct Model.create! calls for other models (FetchLog, ScrapeLog, etc.)

Mocking Patterns

  1. Minitest Stub (.stub):

    ```ruby
    Object.stub(:method, return_value) { code }
    ```

  2. Anonymous Class Mocking:

    ```ruby
    fake = Class.new { def method; ...; end }.new
    ```

  3. Direct Method Stubbing:

    ```ruby
    Class.stub(:method, value) { code }
    ```

WebMock & VCR

  • WebMock blocks all external requests except localhost
  • VCR configured but underutilized (4 cassettes total)
  • Tests typically stub individual requests inline:
    ```ruby
    stub_request(:get, "https://example.com/feed.xml")
      .to_return(status: 200, body: File.read(file_fixture("feeds/rss_sample.xml")))
    ```

Test Setup/Teardown

Standard Setup (test_helper.rb:82-91):

```ruby
setup do
  SourceMonitor.reset_configuration!
  SourceMonitor.config.http.retry_max = 0
end
```

System Test Setup (per-file):

```ruby
def setup
  super
  SourceMonitor.reset_configuration!
  SourceMonitor::Jobs::Visibility.reset!
  SourceMonitor::Jobs::Visibility.setup!
  purge_solid_queue_tables
  # dashboard-specific setup
end
```

Test Isolation

Thread-Based Parallelism:

```ruby
parallelize(workers: worker_count, with: :threads)
```

  • Avoids the PG fork segfault
  • Each test gets a fresh SourceMonitor configuration via the shared setup block

Scoped Assertions:

```ruby
assert_equal 1, SourceMonitor::Item.where(source: source).count  # GOOD: scoped to this test's data
assert_equal 1, SourceMonitor::Item.count                        # BAD: counts rows from parallel tests
```

Recommendations

Priority 1: Foundational Infrastructure (Start Here)

  1. Create ApplicationSystemTestCase base class (T16)

    • File: test/dummy/app/test/application_system_test_case.rb
    • Configure Capybara waits: default 5-10 seconds for async
    • Add screenshot/save_page on failure
    • Centralize helper methods: purge_solid_queue_tables, seed_queue_activity, apply_turbo_stream_messages, assert_item_order
    • Include cleanup hooks for temp files (T17)
  2. Centralize Factory Helpers (T6)

    • Create test/test_helper_factories.rb or module
    • Consolidate: create_source!, build_source, create_item, create_fetch_log, create_scrape_log
    • Include in ActiveSupport::TestCase
    • Document signatures and defaults
    • Audit 20+ test files for duplicates and consolidate
  3. Document VCR Strategy (T1)

    • Create test/VCR_README.md
    • Define naming convention: source_monitor/<domain>/<scenario> (existing partial convention)
    • Document when to record vs use cassettes
    • Add regeneration guide for stale cassettes
    • Implement cassette recording check in CI

Priority 2: Unit Test Gaps (Execute Phase 06)

  1. Add Unit Tests for Sub-Modules (T3)

    • Create test/lib/source_monitor/fetching/feed_fetcher/adaptive_interval_test.rb
      • Test interval calculation logic, edge cases
    • Create test/lib/source_monitor/fetching/feed_fetcher/source_updater_test.rb
      • Test source state mutations, log creation, event firing
  2. Add Integration Tests for Pipelines (T4)

    • Create test/integration/fetch_pipeline_integration_test.rb
      • HTTP fetch → parsing → item creation → log entry → events
    • Create test/integration/scrape_pipeline_integration_test.rb
      • Item selection → adapter → persistence → broadcast

Priority 3: Test Coverage Completeness (Execute Phase 06)

  1. Add Error Path Tests for Controllers (T10)

    • Add 3-5 error tests per controller
    • Cover: 404, 422, 500, permission denied, timeout
    • Use stubs to simulate failures
  2. Add Error Scenario System Tests (T5)

    • "Fetch fails with HTTP timeout" → user sees error toast
    • "Source deletion fails" → shows error, reverts UI
    • "Item scrape times out" → shows pending state, recovers
  3. Standardize Mocking Approach (T7)

    • Adopt single pattern: Minitest stub (most concise)
    • Create mocking guide in CLAUDE.md
    • Refactor Class.new mocks to use stub

Priority 4: Consistency & Maintainability (Execute Phase 06 or later)

  1. Remove Fixture Loading (T2)

    • Replace fixtures :all with explicit fixtures declarations in the test classes that need them
    • OR: Move to factory-only approach (no fixtures)
    • Benchmark performance impact
  2. Add Concern Test Modules (T15)

    • Create test/support/shared_loggable_tests.rb
    • Include in FetchLog, ScrapeLog, HealthCheckLog tests
    • Ensure consistent Loggable behavior contract
  3. Standardize Test Naming (T14)

    • Convention: Start with action verb (imperative mood)
    • Format: "action [condition] [expected outcome]"
    • Example: "removes items older than retention period"
  4. Enforce Time Travel Cleanup (T11)

    • Create travel_to_at(time) { code } helper in test_helper
    • Automatically calls travel_back in ensure
    • Reduces boilerplate, prevents travel leaks
  5. Add Counter Cache Atomicity Tests (T13)

    • Test reset_items_counter! correctness
    • Add concurrency test (2 threads creating items)
    • Validate counter cache matches actual count
  6. Centralize Job Testing Convention (T12)

    • Document when to use :test vs :inline adapter
    • Move with_inline_jobs to test_helper (out of test_prof.rb)
    • Add with_test_adapter helper for explicit test-mode tests
  7. Add Temp File Cleanup Hook (T17)

    • Extend ApplicationSystemTestCase to clean test/tmp/* in teardown
    • Add Rails test log cleanup for failed tests

Effort Estimate

| Task | Effort |
| --- | --- |
| ApplicationSystemTestCase + helpers (T16, T9) | 2-3 hours |
| Centralize factories (T6) | 2-3 hours |
| VCR documentation (T1) | 1 hour |
| Sub-module unit tests (T3) | 3-4 hours |
| Integration tests (T4) | 4-5 hours |
| Controller error tests (T10) | 3-4 hours |
| System test error paths (T5) | 3-4 hours |
| Mocking standardization (T7) | 1-2 hours |
| Remaining consolidations (T2, T11-T15, T17) | 5-6 hours |
| Total | 25-32 hours |

Recommended Phase 06 Scope:

  • Implement ApplicationSystemTestCase + factories (Priority 1) = 4-6 hours
  • Add sub-module unit tests (Priority 2) = 3-4 hours
  • Add core integration tests (Priority 2) = 4-5 hours
  • Add error path tests (Priority 3) = 3-4 hours
  • Subtotal: 14-19 hours (1-2 weeks at 10 hours/week)

Future Phases:

  • Priority 4 cleanup work (9-13 hours) as separate phase if time permits

Key Metrics & Baselines

| Metric | Current | Target |
| --- | --- | --- |
| Test files | ~80 | ~100 (20 new) |
| Total tests | 1214 | 1350+ (136+ new) |
| Unit test coverage (sub-modules) | Partial | Complete |
| Integration test coverage | 4 scenarios | 10+ scenarios |
| System test error paths | 0% | 50%+ |
| Controller error tests | 1/controller | 5+/controller |
| Centralized factories | 1 (create_source!) | 6+ |
| ApplicationSystemTestCase | Missing | Present |
| VCR strategy doc | Missing | Present |