.Context Dev

.vbw-planning/milestones/ui-fixes-and-smart-scraping/phases/03-dashboard-pagination/.context-dev.md

Phase 03 Context

Goal

Not available

Codebase Map Available

Codebase mapping exists in .vbw-planning/codebase/. Key files:

  • ARCHITECTURE.md
  • CONCERNS.md
  • PATTERNS.md
  • DEPENDENCIES.md
  • STRUCTURE.md
  • CONVENTIONS.md
  • TESTING.md
  • STACK.md

Read CONVENTIONS.md, PATTERNS.md, STRUCTURE.md, and DEPENDENCIES.md first to bootstrap codebase understanding.

Changed Files (Delta)

  • .vbw-planning/discovery.json
  • .vbw-planning/STATE.md
  • app/assets/builds/source_monitor/application.css
  • Gemfile.lock
  • test/dummy/Gemfile.lock

Code Slices

.vbw-planning/discovery.json (156 lines, first 30 shown)

{
  "answered": [
    {
      "question": "What matters most in the conventions cleanup?",
      "answer": "All of the above: Model conventions, Controller patterns, Dead code removal",
      "category": "scope",
      "phase": "4",
      "date": "2026-02-10"
    },
    {
      "question": "How should we handle convention violations that would change public API behavior?",
      "answer": "Fix everything -- rename/restructure even if it changes method signatures or route patterns",
      "category": "api-policy",
      "phase": "4",
      "date": "2026-02-10"
    },
    {
      "question": "Favicon discovery strategy?",
      "answer": "Multi-strategy cascade: /favicon.ico -> HTML parsing (full GET, Nokogiri, prefer largest) -> Google Favicon API. Skip DuckDuckGo.",
      "area": "favicon-discovery",
      "phase": "02",
      "date": "2026-02-20"
    },
    {
      "question": "How to handle downloaded favicons before storage?",
      "answer": "Store raw original via Active Storage, define two variants: 32x32 (standard) and 64x64 (retina). SVGs stored as-is AND rasterized to PNG.",
      "area": "image-processing",
      "phase": "02",
      "date": "2026-02-20"
    },

.vbw-planning/STATE.md (25 lines)

# State

**Project:** SourceMonitor
**Milestone:** ui-fixes-and-smart-scraping
**Phase:** 03 (Dashboard Pagination)
**Plans:** 4 planned, 0 complete
**Progress:** 50%
**Status:** Planned

## Decisions

| Decision | Date | Context |
|----------|------|---------|
| Active Storage for favicons | 2026-02-20 | has_one_attached with guard, consistent with ItemContent pattern |
| Smarter scrape limit | 2026-02-20 | Count only running jobs, not queued; keeps safety but removes false bottleneck |
| Browser-like default UA | 2026-02-20 | Simple global fix for bot-blocked feeds like Uber |
| Health check triggers status update | 2026-02-20 | Successful manual health check should transition declining -> improving |
| Toast cap + hover expand | 2026-02-20 | Max 3 visible, +N more badge, hover to see all |

## Todos

- [x] Fix deprecation: `rails/tasks/statistics.rake` removed from Rakefile (2026-02-21)

## Blockers
None

app/assets/builds/source_monitor/application.css (2179 lines, first 30 shown)

*, ::before, ::after {
  --tw-border-spacing-x: 0;
  --tw-border-spacing-y: 0;
  --tw-translate-x: 0;
  --tw-translate-y: 0;
  --tw-rotate: 0;
  --tw-skew-x: 0;
  --tw-skew-y: 0;
  --tw-scale-x: 1;
  --tw-scale-y: 1;
  --tw-pan-x:  ;
  --tw-pan-y:  ;
  --tw-pinch-zoom:  ;
  --tw-scroll-snap-strictness: proximity;
  --tw-gradient-from-position:  ;
  --tw-gradient-via-position:  ;
  --tw-gradient-to-position:  ;
  --tw-ordinal:  ;
  --tw-slashed-zero:  ;
  --tw-numeric-figure:  ;
  --tw-numeric-spacing:  ;
  --tw-numeric-fraction:  ;
  --tw-ring-inset:  ;
  --tw-ring-offset-width: 0px;
  --tw-ring-offset-color: #fff;
  --tw-ring-color: rgb(59 130 246 / 0.5);
  --tw-ring-offset-shadow: 0 0 #0000;
  --tw-ring-shadow: 0 0 #0000;
  --tw-shadow: 0 0 #0000;
  --tw-shadow-colored: 0 0 #0000;

Gemfile.lock (426 lines, first 30 shown)

PATH
  remote: .
  specs:
    source_monitor (0.10.2)
      cssbundling-rails (~> 1.4)
      faraday (~> 2.9)
      faraday-follow_redirects (~> 0.4)
      faraday-gzip (~> 3.0)
      faraday-retry (~> 2.2)
      feedjira (>= 3.2, < 5.0)
      jsbundling-rails (~> 1.3)
      nokolexbor (~> 0.5)
      rails (>= 8.0.3, < 10.0)
      ransack (~> 4.2)
      ruby-readability (~> 0.7)
      solid_cable (>= 3.0, < 4.0)
      solid_queue (>= 0.3, < 3.0)
      turbo-rails (~> 2.0)

GEM
  remote: https://rubygems.org/
  specs:
    action_text-trix (2.1.16)
      railties
    actioncable (8.1.2)
      actionpack (= 8.1.2)
      activesupport (= 8.1.2)
      nio4r (~> 2.0)
      websocket-driver (>= 0.6.1)
      zeitwerk (~> 2.6)

test/dummy/Gemfile.lock (409 lines, first 30 shown)

PATH
  remote: ../..
  specs:
    source_monitor (0.10.2)
      cssbundling-rails (~> 1.4)
      faraday (~> 2.9)
      faraday-follow_redirects (~> 0.4)
      faraday-gzip (~> 3.0)
      faraday-retry (~> 2.2)
      feedjira (>= 3.2, < 5.0)
      jsbundling-rails (~> 1.3)
      nokolexbor (~> 0.5)
      rails (>= 8.0.3, < 10.0)
      ransack (~> 4.2)
      ruby-readability (~> 0.7)
      solid_cable (>= 3.0, < 4.0)
      solid_queue (>= 0.3, < 3.0)
      turbo-rails (~> 2.0)

GEM
  remote: https://rubygems.org/
  specs:
    action_text-trix (2.1.16)
      railties
    actioncable (8.1.2)
      actionpack (= 8.1.2)
      activesupport (= 8.1.2)
      nio4r (~> 2.0)
      websocket-driver (>= 0.6.1)
      zeitwerk (~> 2.6)

Active Plan


phase: "03"
plan: "04"
title: "Health Distribution Badge Counts on Dashboard"
wave: 1
depends_on: []
must_haves:

  • "Health status distribution query added to StatsQuery"
  • "Badge counts rendered on dashboard stats section"
  • "Aggregate stays above the fold"
  • "Tests cover the query and rendering"

Plan 04: Health Distribution Badge Counts on Dashboard

Goal

Add health status distribution counts (Healthy N, Warning N, Declining N, Critical N) to the dashboard stats section. The distribution is computed via a new query in StatsQuery and rendered as inline badge counts below the existing stats cards.

Task 1: Add health status distribution to StatsQuery

What: Extend StatsQuery#call to include a health_distribution hash with counts per health status.

Files to modify:

  • lib/source_monitor/dashboard/queries/stats_query.rb

Implementation details:

  • Add health_distribution key to the returned hash
  • Query: SourceMonitor::Source.active.group(:health_status).count (returns { "healthy" => 42, "warning" => 3, ... })
  • This is a single SQL query: SELECT health_status, COUNT(*) FROM sources WHERE active = true GROUP BY health_status
  • Ensure all known statuses are present in the result (default to 0 for missing): %w[healthy warning declining critical].each_with_object({}) { |s, h| h[s] = raw_counts.fetch(s, 0) }
  • Only count active sources (inactive sources don't have meaningful health status)
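
The zero-filling step can be sketched in plain Ruby. This is a minimal illustration, not the project's implementation: `HEALTH_STATUSES` and `normalize_health_distribution` are hypothetical names, and `raw_counts` stands in for the hash that `SourceMonitor::Source.active.group(:health_status).count` would return.

```ruby
# Hypothetical constant: the four statuses named in the plan.
HEALTH_STATUSES = %w[healthy warning declining critical].freeze

# raw_counts stands in for the grouped count result, e.g.
# { "healthy" => 42, "warning" => 3 }. Statuses missing from it default to 0.
def normalize_health_distribution(raw_counts)
  HEALTH_STATUSES.each_with_object({}) do |status, result|
    result[status] = raw_counts.fetch(status, 0)
  end
end

# "declining" and "critical" are absent from the input and are filled with 0.
distribution = normalize_health_distribution({ "healthy" => 42, "warning" => 3 })
```

Because the result is built from the known status list rather than from the query result, any unexpected key in `raw_counts` (for example a `nil` status) is silently dropped. Whether that is desirable is a design choice the plan does not address.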

Acceptance criteria:

  • stats[:health_distribution] returns { "healthy" => N, "warning" => N, "declining" => N, "critical" => N }
  • Zero-count statuses still appear in the hash with value 0
  • Single additional DB query (GROUP BY), not N+1

Task 2: Render health distribution badges on dashboard

What: Add a row of inline badge counts below the existing stats cards grid.

Files to modify:

  • app/views/source_monitor/dashboard/_stats.html.erb

Implementation details:

  • Below the existing grid of stat cards, add a <div> with inline flex badges
  • Each badge shows the health status label + count, using the same color scheme from HealthBadgeHelper:
    • Healthy: bg-green-100 text-green-700
    • Warning: bg-amber-100 text-amber-700
    • Declining: bg-orange-100 text-orange-700
    • Critical: bg-rose-100 text-rose-700
  • Badge format: <span class="inline-flex items-center gap-1 rounded-full px-3 py-1 text-xs font-semibold [color-classes]">Healthy <span class="font-bold">42</span></span>
  • Only render badges for statuses with count > 0 (don't show "Critical 0" if no critical sources)
  • If no active sources exist, don't render the badge row at all
  • The badge row uses id="source_monitor_dashboard_health_distribution" for Turbo Stream targeting
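
The rendering rules above can be sketched as a plain Ruby function. The real change would live in the ERB partial; this version builds the HTML string directly so it can run on its own. The names `HEALTH_BADGE_CLASSES`, `BADGE_BASE_CLASSES`, and `health_distribution_badges` are hypothetical, and the wrapper's `flex flex-wrap gap-2` classes are an assumption, since the plan only asks for a `<div>` with inline flex badges.

```ruby
require "cgi"

# Color classes copied from the plan; the constant names are hypothetical.
HEALTH_BADGE_CLASSES = {
  "healthy"   => "bg-green-100 text-green-700",
  "warning"   => "bg-amber-100 text-amber-700",
  "declining" => "bg-orange-100 text-orange-700",
  "critical"  => "bg-rose-100 text-rose-700"
}.freeze

BADGE_BASE_CLASSES =
  "inline-flex items-center gap-1 rounded-full px-3 py-1 text-xs font-semibold"

# Returns the badge row HTML, or nil when no status has a positive count
# (which also covers the "no active sources" case).
def health_distribution_badges(distribution)
  visible = HEALTH_BADGE_CLASSES.keys.select { |status| distribution.fetch(status, 0) > 0 }
  return nil if visible.empty?

  badges = visible.map do |status|
    label = CGI.escapeHTML(status.capitalize)
    %(<span class="#{BADGE_BASE_CLASSES} #{HEALTH_BADGE_CLASSES[status]}">) +
      %(#{label} <span class="font-bold">#{distribution[status]}</span></span>)
  end

  # Wrapper classes are an assumption; the id comes from the plan.
  %(<div id="source_monitor_dashboard_health_distribution" class="flex flex-wrap gap-2">) +
    badges.join + "</div>"
end
```

Returning `nil` for the empty case lets the caller skip the row entirely, which matches the rule that the badge row is not rendered when there are no active sources.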

Acceptance criteria:

  • Badges appear below stats cards
  • Only non-zero statuses show badges
  • Colors match existing health status badge convention
  • Visible above the fold on standard viewport

Task 3: Update Dashboard::Queries metrics recording for health distribution

What: Record health distribution metrics via the existing instrumentation pattern.

Files to modify:

  • lib/source_monitor/dashboard/queries.rb

Implementation details:

  • In record_stats_metrics, add gauges for each health status count:
    • SourceMonitor::Metrics.gauge(:dashboard_stats_health_healthy, stats[:health_distribution]["healthy"])
    • Same for warning, declining, critical
  • This follows the existing pattern of recording stat values as gauges
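
As a rough sketch of the gauge recording described above: `FakeMetrics` is a stand-in recorder so the example runs without the engine, and `record_health_distribution_metrics` is a hypothetical helper. In the plan, the equivalent lines are appended to `record_stats_metrics` and call `SourceMonitor::Metrics.gauge`.

```ruby
# Stand-in for SourceMonitor::Metrics so the sketch is self-contained.
module FakeMetrics
  def self.gauges
    @gauges ||= {}
  end

  def self.gauge(name, value)
    gauges[name] = value
  end
end

# Hypothetical helper: records one gauge per health status.
# Gauge names follow the dashboard_stats_health_<status> pattern from the plan.
def record_health_distribution_metrics(stats, metrics: FakeMetrics)
  distribution = stats.fetch(:health_distribution, {})
  %w[healthy warning declining critical].each do |status|
    metrics.gauge(:"dashboard_stats_health_#{status}", distribution.fetch(status, 0))
  end
end

record_health_distribution_metrics({ health_distribution: { "healthy" => 42, "critical" => 1 } })
```

Using `fetch(status, 0)` keeps the gauges defined even if the distribution hash was not zero-filled upstream.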

Acceptance criteria:

  • Health distribution metrics are recorded when stats are computed
  • No new notification events needed (reuses existing stats instrumentation)

Task 4: Write tests for health distribution

What: Test the query and verify the badge rendering.

Files to create:

  • test/lib/source_monitor/dashboard/stats_query_test.rb

Test cases:

  • test "health_distribution counts active sources by health_status" -- Create sources with different health statuses, verify counts
  • test "health_distribution excludes inactive sources" -- Create inactive source with "critical" status, verify it's not counted
  • test "health_distribution includes zero for missing statuses" -- Only healthy sources exist, verify warning/declining/critical are 0
  • test "health_distribution handles no active sources" -- No active sources, verify all counts are 0

Acceptance criteria:

  • All tests pass
  • Tests use create_source! factory with explicit health_status values
  • Tests are isolated (scoped to test-created data)

File Disjointness (Wave 1)

This plan modifies:

  • lib/source_monitor/dashboard/queries/stats_query.rb
  • app/views/source_monitor/dashboard/_stats.html.erb
  • lib/source_monitor/dashboard/queries.rb (metrics recording only -- different methods than Plan 02's upcoming_fetch_schedule method)
  • test/lib/source_monitor/dashboard/stats_query_test.rb (NEW)

Conflict analysis with Plan 02: Both plans modify lib/source_monitor/dashboard/queries.rb. However:

  • Plan 02 modifies the upcoming_fetch_schedule method signature and its cache key
  • Plan 04 modifies the record_stats_metrics private method only
  • These are disjoint code regions within the same file. To be safe, Plan 04 should modify record_stats_metrics by appending lines (not restructuring), making merge trivial.

No overlap with Plan 01 (paginator, shared partial, application_helper) or Plan 03 (sources controller/index).

Research Findings


phase: "03"
type: research
title: "Dashboard Pagination Research"
date: 2026-03-07

Phase 03 Research: Dashboard Pagination

Findings

Dashboard Schedule (Current State)

  • UpcomingFetchSchedule (127 lines) loads ALL active sources into memory, groups by fetch window in Ruby
  • 5 buckets: 0-30 min, 30-60 min, 60-120 min, 120-240 min, 240+ min
  • DashboardController#index calls Dashboard::Queries which caches results per-request
  • Dashboard view renders _fetch_schedule.html.erb partial with grouped data
  • No pagination or limits on schedule data — all active sources loaded
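
To make the in-memory grouping concrete, here is a minimal sketch of bucketing by fetch window. It is not the `UpcomingFetchSchedule` code: `SCHEDULE_BUCKETS`, `bucket_label_for`, and `group_by_fetch_window` are hypothetical names, the half-open boundaries (30 minutes falls in 30-60) are an assumption, and overdue sources are assumed to land in the first bucket.

```ruby
# Bucket boundaries in minutes, taken from the list above.
# Each entry is [label, from_minutes, to_minutes]; nil means open-ended.
SCHEDULE_BUCKETS = [
  ["0-30", 0, 30],
  ["30-60", 30, 60],
  ["60-120", 60, 120],
  ["120-240", 120, 240],
  ["240+", 240, nil]
].freeze

# Returns the label of the bucket containing next_fetch_at, relative to now.
def bucket_label_for(next_fetch_at, now)
  minutes = (next_fetch_at - now) / 60.0
  minutes = 0 if minutes.negative? # assumption: overdue counts as "0-30"

  bucket = SCHEDULE_BUCKETS.find do |_label, from, to|
    minutes >= from && (to.nil? || minutes < to)
  end
  bucket.first
end

# sources is any enumerable of objects responding to #next_fetch_at (non-nil).
def group_by_fetch_window(sources, now)
  sources.group_by { |source| bucket_label_for(source.next_fetch_at, now) }
end
```

Grouping this way requires every active source to be loaded first, which is the cost the research notes call out.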

Existing Paginator

  • Pagination::Paginator (91 lines) at lib/source_monitor/pagination/paginator.rb
  • Result struct: records, page, per_page, has_next_page, has_previous_page
  • Offset-based: skips (page - 1) * per_page records and fetches per_page + 1 to detect a next page
  • Does NOT track total count or total pages — only has_next/has_previous
  • Supports both ActiveRecord::Relation and Array scopes
  • Default: 25/page, max: 100/page
  • Needs modification to add total_count and total_pages for jump-to-page
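
The look-ahead technique and the proposed totals can be sketched against a plain Array. This is an illustration under stated assumptions, not the `Pagination::Paginator` source: `PageResult` and `paginate` are hypothetical names, `include_total` follows the flag suggested in the recommendations, and an empty collection is assumed to report one total page.

```ruby
# Hypothetical result struct: the fields listed above plus optional totals.
PageResult = Struct.new(
  :records, :page, :per_page, :has_next_page, :has_previous_page,
  :total_count, :total_pages,
  keyword_init: true
)

DEFAULT_PER_PAGE = 25
MAX_PER_PAGE = 100

def paginate(items, page: 1, per_page: DEFAULT_PER_PAGE, include_total: false)
  page = [page.to_i, 1].max
  per_page = per_page.to_i
  per_page = DEFAULT_PER_PAGE if per_page < 1
  per_page = [per_page, MAX_PER_PAGE].min

  offset = (page - 1) * per_page
  # Fetch one record more than needed; its presence signals a next page.
  window = items[offset, per_page + 1] || []

  # Totals cost an extra count, so they are computed only on request.
  total_count = include_total ? items.size : nil
  total_pages = include_total ? [(total_count.to_f / per_page).ceil, 1].max : nil

  PageResult.new(
    records: window.first(per_page),
    page: page,
    per_page: per_page,
    has_next_page: window.size > per_page,
    has_previous_page: page > 1,
    total_count: total_count,
    total_pages: total_pages
  )
end

result = paginate((1..60).to_a, page: 3, per_page: 25, include_total: true)
```

Keeping the totals behind a flag leaves existing callers on the cheaper path without the extra count, which is the backward-compatibility goal stated in the research.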

Sources Index Pagination

  • Controller uses Paginator.new(scope: @q.result, page: params[:page], per_page: params[:per_page])
  • View renders prev/next buttons at bottom within source_monitor_sources_table Turbo Frame
  • Preserves search params and per_page across pagination links
  • No page number display, no jump-to-page, no total count shown

Dashboard Stats

  • StatsQuery computes: total sources, active sources, failed sources, total items, fetches today
  • Health status distribution NOT currently computed — needs new query
  • Source model has health_status column with values: healthy, warning, declining, critical

Source Model Scopes

  • Source.active scope exists (where active: true)
  • health_status is a string column — can group by it
  • next_fetch_at is a datetime column — can use range queries for schedule buckets
  • No existing scope for fetch window bucketing

Turbo Frame Usage

  • Sources index wrapped in turbo_frame_tag "source_monitor_sources_table"
  • Dashboard uses Turbo Cable broadcasts for real-time updates
  • Schedule section does NOT use Turbo Frames currently — needs frames for independent pagination

N+1 Prevention

  • Sources index pre-computes avg word counts, activity rates via GROUP BY queries
  • Dashboard schedule currently loads source objects directly — minimal associations

Relevant Patterns

  1. Paginator reuse: Extend existing Paginator with optional total_count / total_pages — backward compatible
  2. AR scope per bucket: Source.active.where(next_fetch_at: now..now+30.minutes) for each window
  3. Turbo Frame per section: Each schedule bucket in its own frame for independent pagination
  4. Health distribution: Source.active.group(:health_status).count — single SQL query
  5. Pagination partial: Extract shared partial from sources index, reuse in dashboard schedule sections
  6. Empty bucket hiding: Conditional render only when bucket scope has records

Risks

  1. Paginator total_count: Adding COUNT(*) adds a second query per paginated section. For 5 schedule sections + 1 sources index = 6 count queries. Mitigate: count query is cheap on indexed columns.
  2. Dashboard query explosion: 5 independent schedule sections × 2 queries each (data + count) = 10 queries vs current 1. Mitigate: all are simple indexed WHERE on next_fetch_at.
  3. Turbo Frame state: Independent pagination per section means multiple page params. Need namespaced params (e.g., schedule_0_30_page=2).
  4. Schedule section ordering: With DB-level pagination, need to ensure consistent ordering within each bucket.
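
One way to handle the namespaced page params from risk 3 is sketched below. The naming scheme is extrapolated from the single example `schedule_0_30_page=2`; `schedule_page_param`, `schedule_page_for`, and the `_plus` suffix for the open-ended bucket are all hypothetical.

```ruby
# Builds a param name such as "schedule_0_30_page" from a bucket label.
# The "+" to "_plus" mapping for the open-ended bucket is an assumption.
def schedule_page_param(bucket_label)
  slug = bucket_label.downcase
                     .gsub("+", "_plus")
                     .gsub(/[^a-z0-9]+/, "_")
                     .gsub(/\A_+|_+\z/, "")
  "schedule_#{slug}_page"
end

# Reads the page for one bucket from a params-like hash with string keys.
# Missing or non-numeric values fall back to page 1.
def schedule_page_for(params, bucket_label)
  page = params[schedule_page_param(bucket_label)].to_i
  page < 1 ? 1 : page
end

params = { "schedule_0_30_page" => "2" }
schedule_page_for(params, "0-30")   # page for the 0-30 minute section
schedule_page_for(params, "30-60")  # no param present, falls back to 1
```

Each section reads only its own key, so paging one bucket leaves the others on their current page as long as links preserve the remaining params.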

Recommendations

  1. Extend Paginator first: Add optional include_total: true flag to compute total pages. Keep backward compatible.
  2. Create schedule bucket scopes: Add class method or scope on Source for each fetch window.
  3. Extract pagination partial: Share between sources index and dashboard schedule.
  4. Add health distribution query: Simple group(:health_status).count to StatsQuery.
  5. Use Turbo Frames: One frame per schedule section for independent lazy pagination.
  6. Test with volume: Create system test with 100+ sources to verify performance.