Research: Simplify Source Status

.vbw-planning/milestones/ui-fixes-and-smart-scraping/phases/05-simplify-source-status/.context-lead.md

Phase 05 Context (Compiled)

Goal

Not available

Success Criteria

Not available

Requirements (Not available)

No matching requirements found

Active Decisions

None

Codebase Map Available

Codebase mapping exists in .vbw-planning/codebase/. Key files:

  • ARCHITECTURE.md
  • CONCERNS.md
  • PATTERNS.md
  • DEPENDENCIES.md
  • STRUCTURE.md
  • CONVENTIONS.md
  • TESTING.md
  • STACK.md

Read ARCHITECTURE.md, CONCERNS.md, and STRUCTURE.md first to bootstrap codebase understanding.

Research Findings

Findings

Current Status/Health Fields on Source Model

Operational State:

  • active (boolean, default: true) — Controls whether source is actively fetched
  • fetch_status (string, default: "idle") — Async operation state (values: "idle", "queued", "fetching", "failed")

Health Monitoring:

  • health_status (string, default: "healthy") — Primary health indicator
  • health_status_changed_at (datetime) — Tracks when status last changed
  • rolling_success_rate (decimal, 0-1) — Moving average of fetch success
  • auto_paused_at (datetime) — When auto-pause was triggered
  • auto_paused_until (datetime) — When auto-pause expires
  • health_auto_pause_threshold (decimal, nullable) — Per-source override
  • failure_count (integer, default: 0)
  • last_error (text)
  • last_error_at (datetime)
  • consecutive_fetch_failures (integer, default: 0)

Current Health Status Values (7 total)

  1. "healthy": success rate >= 0.8 (80% or higher)
  2. "warning": success rate >= 0.5 and < 0.8 (50-80%); labeled "Needs Attention" in the UI
  3. "critical": success rate < 0.5 (below 50%); labeled "Failing" in the UI
  4. "declining": 3+ consecutive fetch failures
  5. "improving": 2+ consecutive successes after a failure
  6. "auto_paused": active pause window (auto_paused_until.future?)
  7. "unknown": fallback (rarely used)

Auto-Pause Mechanism

Auto-pause does NOT toggle active. It sets health_status = "auto_paused" and leaves active = true. This is the core confusion: auto-paused sources still match the "Active" filter because the active boolean remains true.

  • Triggered when rolling_success_rate < auto_pause_threshold (default 0.2/20%)
  • Clears when rate recovers to auto_resume_threshold (default 0.6/60%)
  • Manual reset via "Reset to Active Status" button calls SourceHealthReset
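The two thresholds above form a hysteresis band: a source pauses below 20% and only resumes once it recovers to 60%. A minimal sketch of that rule follows, assuming the defaults quoted above. The constant and method names are hypothetical, and the sketch ignores the auto_paused_until time window.

```ruby
# Hypothetical names; the values are the defaults quoted in the findings.
AUTO_PAUSE_THRESHOLD  = 0.2 # pause when the rolling success rate drops below 20%
AUTO_RESUME_THRESHOLD = 0.6 # resume once the rate recovers to 60% or more

# Returns whether the source should be auto-paused after the latest fetch,
# given whether it is paused now and its current rolling success rate.
def next_auto_paused?(currently_paused, rolling_success_rate)
  if currently_paused
    # Stay paused until the rate reaches the resume threshold.
    rolling_success_rate < AUTO_RESUME_THRESHOLD
  else
    # Enter the paused state only when the rate falls below the pause threshold.
    rolling_success_rate < AUTO_PAUSE_THRESHOLD
  end
end
```

With these defaults, a paused source at a 30% success rate stays paused, while an unpaused source at the same rate keeps fetching. The gap between the thresholds prevents rapid flapping between the two states.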

View Usage

Index filters:

  • health_status_eq dropdown: "Healthy", "Warning", "Declining", "Critical" (missing Auto-Paused, Improving)
  • active_eq toggle: "Active" vs "Paused"

Health badge helper maps all 7 values to colored labels with interactive action menus for critical/declining/auto_paused.

Row partial: Shows "Paused" badge (amber) if !source.active?, otherwise shows health badge.
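The row rule and the active filter described above can be reproduced in a few lines to show where the mismatch comes from. This is an illustrative sketch only: SourceRow and row_badge are hypothetical stand-ins, not the actual model or helper.

```ruby
# Hypothetical stand-in for the Source model, limited to the two relevant fields.
SourceRow = Struct.new(:active, :health_status, keyword_init: true)

# Mirrors the row partial rule: the "Paused" badge wins when the source is
# inactive, otherwise the health status is shown.
def row_badge(source)
  source.active ? source.health_status : "Paused"
end

auto_paused     = SourceRow.new(active: true,  health_status: "auto_paused")
manually_paused = SourceRow.new(active: false, health_status: "healthy")

# Equivalent in spirit to filtering the index by active = true.
active_filter = [auto_paused, manually_paused].select(&:active)
```

Here the auto-paused source is kept by the active filter and never receives the "Paused" badge, because both checks look only at the active boolean.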

SourceHealthMonitor Decision Tree

```ruby
if auto_paused_active?(auto_paused_until)
  "auto_paused"
elsif consecutive_failures(logs) >= 3
  "declining"
elsif improving_streak?(logs)
  "improving"
elsif rate >= 0.8
  "healthy"
elsif rate >= 0.5
  "warning"
else
  "critical"
end
```
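The excerpt depends on helpers that are not shown. The sketch below makes it runnable under stated assumptions: logs is modeled as an array of booleans ordered newest first (true means a successful fetch), and the helper bodies are guesses based on the status descriptions above, not the actual SourceHealthMonitor code.

```ruby
# Assumption: the pause window is active when auto_paused_until is in the future.
def auto_paused_active?(auto_paused_until, now = Time.now)
  !auto_paused_until.nil? && auto_paused_until > now
end

# Assumption: logs are newest first; count the leading run of failures.
def consecutive_failures(logs)
  logs.take_while { |success| !success }.size
end

# Assumption: "improving" means 2+ leading successes directly preceded by a failure.
def improving_streak?(logs)
  streak = logs.take_while { |success| success }.size
  streak >= 2 && streak < logs.size # the entry after the streak is a failure
end

# Same branch order as the decision tree above.
def health_status_for(rate:, logs:, auto_paused_until: nil, now: Time.now)
  if auto_paused_active?(auto_paused_until, now)
    "auto_paused"
  elsif consecutive_failures(logs) >= 3
    "declining"
  elsif improving_streak?(logs)
    "improving"
  elsif rate >= 0.8
    "healthy"
  elsif rate >= 0.5
    "warning"
  else
    "critical"
  end
end
```

Branch order matters: a source with a 90% success rate still reports "declining" if its three most recent fetches failed, and "auto_paused" masks every other diagnosis while the pause window is open.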

Relevant Patterns

  • Health transitions triggered after every fetch completion
  • Configurable thresholds in configuration/health_settings.rb
  • No enum/constant centralizing health status values — scattered in code
  • No CHECK constraint on health_status column
  • Import session uses separate "healthy"/"unhealthy" values

Risks

High:

  • Filter queries hardcode status strings in views and controller
  • Health badge helper has explicit case mapping for all 7 values
  • Interactive action menu checks specific status values
  • 9+ test files hardcode status strings

Medium:

  • Monitor logic decision tree embeds status values
  • Import session health check uses different values

Low:

  • No DB constraint on health_status (easy to change values)
  • Documentation not centralized (scattered)

Recommendations

Core insight: The confusion stems from conflating operational state with health diagnosis. auto_paused is treated as a health status but it's really an operational state.

Proposed simplification:

  1. Separate concerns: Auto-pause should control scheduling (operational), not mask health diagnosis
  2. Consolidate health statuses (7 → 4):
    • "healthy" + "warning" → "working"
    • "critical" → "failing"
    • "declining" (keep)
    • "improving" (keep)
    • Remove "auto_paused" from health_status (it's operational)
    • Remove "unknown"
  3. Auto-pause becomes visible in operational state:
    • Show "Paused (Auto)" in UI when auto_paused_until.future?
    • Health status stays at actual value (e.g., source is "failing AND auto-paused")
    • Filter "Active" correctly excludes auto-paused sources
  4. Update filters: Health dropdown shows only: Working, Declining, Improving, Failing
  5. Effort: Medium — 2-3 migrations, 5-6 view files, helper, 9+ test files, monitor logic
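The proposed consolidation and the separate operational label can be sketched as follows. All constant and method names are hypothetical. The findings do not say how existing "auto_paused" or "unknown" rows should be backfilled, so the sketch maps them to nil to mark them as needing recomputation.

```ruby
# Old health_status value => proposed value.
# nil means the value is no longer a health status and must be recomputed.
HEALTH_STATUS_MIGRATION = {
  "healthy"     => "working",
  "warning"     => "working",
  "critical"    => "failing",
  "declining"   => "declining",
  "improving"   => "improving",
  "auto_paused" => nil,
  "unknown"     => nil
}.freeze

# The four statuses that remain after consolidation.
PROPOSED_HEALTH_STATUSES = HEALTH_STATUS_MIGRATION.values.compact.uniq.freeze

# Operational state is derived independently of health, so a source can be
# both "failing" and "Paused (Auto)" at the same time.
def operational_label(active:, auto_paused_until:, now: Time.now)
  return "Paused" unless active
  return "Paused (Auto)" if auto_paused_until && auto_paused_until > now

  "Active"
end
```

Keeping the mapping in one frozen hash also gives views, helpers, and tests a single place to read status values from, which addresses the scattered-strings risk listed above.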