Research: Simplify Source Status

.vbw-planning/milestones/ui-fixes-and-smart-scraping/phases/05-simplify-source-status/05-RESEARCH.md

Findings

Current Status/Health Fields on Source Model

Operational State:

  • active (boolean, default: true) — Controls whether source is actively fetched
  • fetch_status (string, default: "idle") — Async operation state (values: "idle", "queued", "fetching", "failed")

Health Monitoring:

  • health_status (string, default: "healthy") — Primary health indicator
  • health_status_changed_at (datetime) — Tracks when status last changed
  • rolling_success_rate (decimal, 0-1) — Moving average of fetch success
  • auto_paused_at (datetime) — When auto-pause was triggered
  • auto_paused_until (datetime) — When auto-pause expires
  • health_auto_pause_threshold (decimal, nullable) — Per-source override
  • failure_count (integer, default: 0)
  • last_error (text)
  • last_error_at (datetime)
  • consecutive_fetch_failures (integer, default: 0)

Current Health Status Values (7 total)

  1. "healthy": success rate >= 0.8 (80%)
  2. "warning": success rate >= 0.5 and < 0.8 (50% up to, but not including, 80%); shown as "Needs Attention" in the UI
  3. "critical": success rate < 0.5 (50%); shown as "Failing" in the UI
  4. "declining": 3+ consecutive fetch failures
  5. "improving": 2+ consecutive successes after a failure
  6. "auto_paused": active pause window (auto_paused_until.future?)
  7. "unknown": fallback (rarely used)

Auto-Pause Mechanism

Auto-pause does NOT toggle active. Instead, it sets health_status = "auto_paused" while keeping active = true. This is the core confusion: auto-paused sources still match the "Active" filter because the active boolean remains true.

  • Triggered when rolling_success_rate < auto_pause_threshold (default 0.2/20%)
  • Clears when rate recovers to auto_resume_threshold (default 0.6/60%)
  • Manual reset via "Reset to Active Status" button calls SourceHealthReset
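
The two thresholds above form a hysteresis band: a source pauses below one rate and resumes only at a higher one. The following is a minimal sketch of that rule, not the project's implementation. The method name `auto_paused_after_fetch?` and its keyword arguments are hypothetical, and the sketch ignores the auto_paused_until time window.

```ruby
# Default thresholds quoted in the findings above.
AUTO_PAUSE_THRESHOLD  = 0.2
AUTO_RESUME_THRESHOLD = 0.6

# Hypothetical helper (not from the codebase): decides whether a source
# should be auto-paused after a fetch, given its current pause state and
# its rolling success rate.
def auto_paused_after_fetch?(currently_paused:, rate:,
                             pause_below: AUTO_PAUSE_THRESHOLD,
                             resume_at: AUTO_RESUME_THRESHOLD)
  if currently_paused
    # Stay paused until the rate recovers to the resume threshold.
    rate < resume_at
  else
    # Pause only once the rate drops below the pause threshold.
    rate < pause_below
  end
end
```

Under these assumptions, a paused source with a rate of 0.4 stays paused even though 0.4 is above the pause threshold, which is what keeps a borderline source from flapping between states.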

View Usage

Index filters:

  • health_status_eq dropdown: "Healthy", "Warning", "Declining", "Critical" (missing Auto-Paused, Improving)
  • active_eq toggle: "Active" vs "Paused"

Health badge helper maps all 7 values to colored labels with interactive action menus for critical/declining/auto_paused.

Row partial: Shows "Paused" badge (amber) if !source.active?, otherwise shows health badge.
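
To make the mismatch concrete, here is a small plain-Ruby sketch of the filter and row-badge behavior described above. The `Source` struct and the `row_badge` and `filter_active` helpers are hypothetical stand-ins for the real model, partial, and query.

```ruby
# Hypothetical stand-in for the Source model; only the fields that matter here.
Source = Struct.new(:name, :active, :health_status, keyword_init: true)

# Mirrors the row partial: "Paused" only when active is false,
# otherwise the health status is shown.
def row_badge(source)
  source.active ? source.health_status : "Paused"
end

# Mirrors the active_eq filter: matches on the active boolean alone.
def filter_active(sources, active)
  sources.select { |s| s.active == active }
end

sources = [
  Source.new(name: "a", active: true,  health_status: "healthy"),
  Source.new(name: "b", active: true,  health_status: "auto_paused"),
  Source.new(name: "c", active: false, health_status: "healthy")
]

# Source "b" is auto-paused but still lands in the "Active" result set.
active_names = filter_active(sources, true).map(&:name)
```

In this sketch the auto-paused source "b" is returned by the "Active" filter and never receives the "Paused" badge, because both checks look only at the active boolean.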

SourceHealthMonitor Decision Tree

```ruby
if auto_paused_active?(auto_paused_until)
  "auto_paused"
elsif consecutive_failures(logs) >= 3
  "declining"
elsif improving_streak?(logs)
  "improving"
elsif rate >= 0.8
  "healthy"
elsif rate >= 0.5
  "warning"
else
  "critical"
end
```
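
The fragment above depends on helpers that this document does not show. The sketch below wraps the same branch order in a standalone method so it can be run in isolation. The helper bodies are assumptions inferred from the status descriptions earlier in this document, and the log format (an array of booleans, oldest first, true meaning success) and the method name `health_status_for` are hypothetical.

```ruby
# Counts failures at the end of the log (most recent fetches).
def consecutive_failures(logs)
  logs.reverse.take_while { |ok| !ok }.size
end

# True when the log ends with 2+ successes that follow at least one failure.
def improving_streak?(logs)
  streak = logs.reverse.take_while { |ok| ok }.size
  streak >= 2 && streak < logs.size
end

# True while the pause window is still open.
def auto_paused_active?(auto_paused_until, now: Time.now)
  !auto_paused_until.nil? && auto_paused_until > now
end

# Same branch order as the decision tree above.
def health_status_for(logs:, rate:, auto_paused_until: nil, now: Time.now)
  if auto_paused_active?(auto_paused_until, now: now)
    "auto_paused"
  elsif consecutive_failures(logs) >= 3
    "declining"
  elsif improving_streak?(logs)
    "improving"
  elsif rate >= 0.8
    "healthy"
  elsif rate >= 0.5
    "warning"
  else
    "critical"
  end
end
```

Because the branches are evaluated in order, the rate-based statuses apply only when the source is not auto-paused, declining, or improving. For example, `health_status_for(logs: [false, true, true], rate: 0.9)` yields "improving" rather than "healthy" in this sketch.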

Relevant Patterns

  • Health transitions triggered after every fetch completion
  • Configurable thresholds in configuration/health_settings.rb
  • No enum/constant centralizing health status values — scattered in code
  • No CHECK constraint on health_status column
  • Import session uses separate "healthy"/"unhealthy" values
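
Since the status strings are currently scattered, one option is to collect them in a single module. This is only a sketch of that idea. The module name `HealthStatus` and its constants are hypothetical and do not exist in the codebase as described here.

```ruby
# Hypothetical single source of truth for the seven current status values.
module HealthStatus
  HEALTHY     = "healthy"
  WARNING     = "warning"
  CRITICAL    = "critical"
  DECLINING   = "declining"
  IMPROVING   = "improving"
  AUTO_PAUSED = "auto_paused"
  UNKNOWN     = "unknown"

  ALL = [HEALTHY, WARNING, CRITICAL, DECLINING,
         IMPROVING, AUTO_PAUSED, UNKNOWN].freeze

  # Returns true only for one of the known status strings.
  def self.valid?(value)
    ALL.include?(value)
  end
end
```

Views, helpers, and tests could then reference `HealthStatus::DECLINING` instead of a string literal, so a later rename touches one file. The import session's "unhealthy" value would be rejected by `HealthStatus.valid?`, which reflects that it belongs to a separate vocabulary.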

Risks

High:

  • Filter queries hardcode status strings in views and controller
  • Health badge helper has explicit case mapping for all 7 values
  • Interactive action menu checks specific status values
  • 9+ test files hardcode status strings

Medium:

  • Monitor logic decision tree embeds status values
  • Import session health check uses different values

Low:

  • No DB constraint on health_status (easy to change values)
  • Status documentation is scattered rather than centralized

Recommendations

Core insight: The confusion stems from conflating operational state with health diagnosis. auto_paused is treated as a health status but it's really an operational state.

Proposed simplification:

  1. Separate concerns: Auto-pause should control scheduling (operational), not mask health diagnosis
  2. Consolidate health statuses (7 → 4):
    • "healthy" + "warning" → "working"
    • "critical" → "failing"
    • "declining" (keep)
    • "improving" (keep)
    • Remove "auto_paused" from health_status (it's operational)
    • Remove "unknown"
  3. Auto-pause becomes visible in operational state:
    • Show "Paused (Auto)" in UI when auto_paused_until.future?
    • Health status stays at actual value (e.g., source is "failing AND auto-paused")
    • Filter "Active" correctly excludes auto-paused sources
  4. Update filters: Health dropdown shows only: Working, Declining, Improving, Failing
  5. Effort: Medium — 2-3 migrations, 5-6 view files, helper, 9+ test files, monitor logic
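
The proposal can be sketched as two independent pieces: a mapping from old health values to new ones, and a separate operational state derived from active and auto_paused_until. All names below (`LEGACY_TO_SIMPLIFIED`, `simplified_health`, `operational_state`) are hypothetical. The recommendations do not say what "auto_paused" and "unknown" rows should become, so this sketch returns nil for them to flag that their health needs to be recomputed.

```ruby
# Hypothetical mapping for the 7 -> 4 consolidation proposed above.
LEGACY_TO_SIMPLIFIED = {
  "healthy"   => "working",
  "warning"   => "working",
  "critical"  => "failing",
  "declining" => "declining",
  "improving" => "improving"
}.freeze

# Returns the new health value, or nil for "auto_paused" and "unknown",
# which have no direct equivalent and would need recomputation (assumption).
def simplified_health(legacy_status)
  LEGACY_TO_SIMPLIFIED[legacy_status]
end

# Operational state, kept separate from health.
def operational_state(active:, auto_paused_until: nil, now: Time.now)
  return "Paused" unless active
  return "Paused (Auto)" if auto_paused_until && auto_paused_until > now

  "Active"
end
```

With this split, a source can be reported as "failing" and "Paused (Auto)" at the same time, and an "Active" filter built on `operational_state` would exclude auto-paused sources.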