
.Context Dev

.vbw-planning/milestones/polish-and-reliability/phases/01-backend-fixes/.context-dev.md


Phase 01 Context

Goal

Not available

Codebase Map Available

Codebase mapping exists in .vbw-planning/codebase/. Key files:

  • ARCHITECTURE.md
  • CONCERNS.md
  • PATTERNS.md
  • DEPENDENCIES.md
  • STRUCTURE.md
  • CONVENTIONS.md
  • TESTING.md
  • STACK.md

Read CONVENTIONS.md, PATTERNS.md, STRUCTURE.md, and DEPENDENCIES.md first to bootstrap codebase understanding.

Changed Files (Delta)

  • .gitignore
  • .vbw-planning/config.json
  • .vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-01-SUMMARY.md
  • .vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-01.md
  • .vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-02-SUMMARY.md
  • .vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-02.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-01-SUMMARY.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-01.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-02-SUMMARY.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-02.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-03-SUMMARY.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-03.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-04-SUMMARY.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-04.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-05-SUMMARY.md
  • .vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-05.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/03-VERIFICATION-wave1.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/03-VERIFICATION.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-01-SUMMARY.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-01.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-02-SUMMARY.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-02.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-03-SUMMARY.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-03.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-04-SUMMARY.md
  • .vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-04.md
  • .vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/04-CONTEXT.md
  • .vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-01-SUMMARY.md
  • .vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-01.md
  • .vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-02-SUMMARY.md
  • .vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-02.md
  • .vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-03-SUMMARY.md
  • .vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-03.md
  • .vbw-planning/milestones/default/ROADMAP.md
  • .vbw-planning/milestones/default/STATE.md
  • .vbw-planning/phases/01-aia-certificate-resolution/.context-dev.md
  • .vbw-planning/phases/01-aia-certificate-resolution/PLAN-01-SUMMARY.md
  • .vbw-planning/phases/01-aia-certificate-resolution/PLAN-01.md
  • .vbw-planning/phases/01-aia-certificate-resolution/PLAN-02-SUMMARY.md
  • .vbw-planning/phases/01-aia-certificate-resolution/PLAN-02.md
  • .vbw-planning/phases/01-aia-certificate-resolution/PLAN-03-SUMMARY.md
  • .vbw-planning/phases/01-aia-certificate-resolution/PLAN-03.md
  • .vbw-planning/phases/02-test-performance/.context-dev.md
  • .vbw-planning/phases/02-test-performance/.context-lead.md
  • .vbw-planning/phases/02-test-performance/.context-qa.md
  • .vbw-planning/phases/02-test-performance/02-RESEARCH.md
  • .vbw-planning/phases/02-test-performance/02-VERIFICATION.md
  • .vbw-planning/phases/02-test-performance/PLAN-01-SUMMARY.md
  • .vbw-planning/phases/02-test-performance/PLAN-01.md
  • .vbw-planning/phases/02-test-performance/PLAN-02-SUMMARY.md
  • .vbw-planning/phases/02-test-performance/PLAN-02.md
  • .vbw-planning/phases/02-test-performance/PLAN-03-SUMMARY.md
  • .vbw-planning/phases/02-test-performance/PLAN-03.md
  • .vbw-planning/phases/02-test-performance/PLAN-04-SUMMARY.md
  • .vbw-planning/phases/02-test-performance/PLAN-04.md
  • .vbw-planning/ROADMAP.md
  • .vbw-planning/STATE.md
  • CLAUDE.md

Code Slices

.gitignore (32 lines)

/.bundle/
/doc/
/log/*.log
/pkg/
/tmp/
/node_modules/
/coverage/
/test/dummy/db/*.sqlite3
/test/dummy/db/*.sqlite3-*
/test/dummy/log/*.log*
/test/dummy/storage/
/test/dummy/tmp/
/test/tmp/
/test/lib/tmp/install_generator/config/routes.rb
/app/assets/builds/*
!/app/assets/builds/.keep
!/app/assets/builds/source_monitor/
.vbw-planning/.cost-ledger.json
.vbw-planning/.notification-log.jsonl
.vbw-planning/.session-log.jsonl
.vbw-planning/.hook-errors.log
.vbw-planning/.claude-md-migrated
.vbw-planning/.watchdog-pid
.vbw-planning/.watchdog.log
.vbw-planning/.agent-pids
.vbw-planning/.agent-panes
.vbw-planning/.active-agent
.vbw-planning/.active-agent-count
.vbw-planning/.todo-flat-migrated
/codebase_analysis.md
*.gem
.vbw-worktrees/

.vbw-planning/config.json (44 lines)

{
  "effort": "thorough",
  "autonomy": "standard",
  "auto_commit": true,
  "planning_tracking": "manual",
  "auto_push": "never",
  "verification_tier": "standard",
  "skill_suggestions": true,
  "auto_install_skills": false,
  "discovery_questions": true,
  "context_compiler": true,
  "visual_format": "unicode",
  "max_tasks_per_plan": 5,
  "prefer_teams": "always",
  "branch_per_milestone": false,
  "plain_summary": true,
  "active_profile": "default",
  "custom_profiles": {},
  "model_profile": "quality",
  "model_overrides": {},
  "agent_max_turns": {
    "scout": 15,
    "qa": 25,
    "architect": 30,
    "debugger": 80,
    "lead": 50,
    "dev": 75
  },
  "qa_skip_agents": [
    "docs"
  ],
  "worktree_isolation": "on",
  "token_budgets": false,
  "two_phase_completion": false,
  "metrics": false,
  "smart_routing": false,
  "validation_gates": false,
  "snapshot_resume": false,
  "lease_locks": false,
  "event_recovery": false,
  "monorepo_routing": false,
  "rolling_summary": false,
  "compaction_trigger": 130000
}

.vbw-planning/ROADMAP.md (72 lines, first 30 shown)

# Roadmap

## Milestone: polish-and-reliability

### Phases

1. [ ] **Backend Fixes** -- Fix browser User-Agent default, health check status transitions, and smarter scrape rate limiting
2. [ ] **Favicon Support** -- Automatically save source favicons via Active Storage with background fetch job
3. [ ] **Toast Stacking** -- Cap visible toast notifications with hover-to-expand for bulk operation UX

### Phase Details

#### Phase 1: Backend Fixes

**Goal:** Fix three independent backend issues: bot-blocked feeds due to User-Agent, health check not updating status, and overly aggressive scrape limiting.

**Requirements:**
- REQ-UA-01: Change default User-Agent from "SourceMonitor/VERSION" to a browser-like string
- REQ-HC-01: After a successful manual health check on a declining/critical/warning source, trigger SourceHealthMonitor re-evaluation or directly transition status to "improving"
- REQ-SL-01: Refine max_in_flight_per_source to only count actively-running scrape jobs (not queued ones)

**Success Criteria:**
- [ ] Default UA string resembles a real browser (e.g., Mozilla/5.0 compatible)
- [ ] Successful manual health check on a declining source transitions it to improving
- [ ] Scrape limit counts only actively-running jobs, queued items don't count toward the cap
- [ ] All existing tests pass, new tests cover changed behavior
- [ ] RuboCop zero offenses, Brakeman zero warnings

#### Phase 2: Favicon Support

.vbw-planning/STATE.md (30 lines)

# State

## Current Position

- **Milestone:** polish-and-reliability
- **Phase:** 1 -- Backend Fixes
- **Status:** Planned
- **Progress:** 0%
- **Plans:** 3

## Decisions

| Decision | Date | Context |
|----------|------|---------|
| Active Storage for favicons | 2026-02-20 | has_one_attached with guard, consistent with ItemContent pattern |
| Smarter scrape limit | 2026-02-20 | Count only running jobs, not queued; keeps safety but removes false bottleneck |
| Browser-like default UA | 2026-02-20 | Simple global fix for bot-blocked feeds like Uber |
| Health check triggers status update | 2026-02-20 | Successful manual health check should transition declining -> improving |
| Toast cap + hover expand | 2026-02-20 | Max 3 visible, +N more badge, hover to see all |

## Todos

## Metrics

- **Started:** 2026-02-20
- **Phases:** 3
- **Tests at start:** 1033

## Blockers
None

CLAUDE.md (224 lines, first 30 shown)

# SourceMonitor

**Core value:** Drop-in Rails engine for feed monitoring, content scraping, and operational dashboards.

## Active Context

**Milestone:** polish-and-reliability
**Phase:** 1 -- Backend Fixes (pending planning)
**Last shipped:** aia-ssl-fix (2026-02-20) -- 2 phases, 7 plans, 8 commits
**Previous:** upgrade-assurance (2026-02-13), generator-enhancements (2026-02-12)

## Key Decisions

- Keep PostgreSQL-only for now
- Keep host-app auth model
- Ruby autoload for lib/ modules (not Zeitwerk)
- PG parallel fork segfault resolved: switched to thread-based parallelism in aia-ssl-fix milestone

## Installed Skills

- agent-browser (global)
- flowdeck (global)
- ralph-tui-create-json (global)
- ralph-tui-prd (global)
- vastai (global)
- find-skills (global)

## Learned Patterns

- Sub-module extraction: create `module/submodule.rb` with `require_relative`, lazy accessors, forwarding methods for backward compat

Active Plan


phase: 1
plan: 1
title: "HTTP Client Hardening"
wave: 1
depends_on: []
must_haves:

  • DEFAULT_USER_AGENT changed to "Mozilla/5.0 (compatible; SourceMonitor/VERSION)" in http.rb
  • HTTPSettings#default_user_agent returns same polite-bot string
  • Accept header prepends text/html
  • Accept-Language and DNT headers added to default_headers
  • FeedFetcher#request_headers sends Referer from source.website_url
  • All existing http_test.rb assertions updated for new header values
  • New tests for Accept-Language, DNT, and Referer headers
  • bin/rails test passes, bin/rubocop zero offenses

Plan 01: HTTP Client Hardening

Objective

Update default HTTP headers to reduce bot-blocking: browser-like User-Agent, broader Accept, Accept-Language, DNT, and per-source Referer header.

Context

  • @lib/source_monitor/http.rb -- DEFAULT_USER_AGENT (line 17), default_headers (lines 89-97)
  • @lib/source_monitor/configuration/http_settings.rb -- default_user_agent (lines 44-46)
  • @lib/source_monitor/fetching/feed_fetcher.rb -- request_headers (lines 104-111)
  • @test/lib/source_monitor/http_test.rb -- header assertions (lines 92-97, 111)

REQ-UA-01: Change default User-Agent from "SourceMonitor/VERSION" to a browser-like string.

Tasks

Task 1: Update User-Agent default

Files: lib/source_monitor/http.rb, lib/source_monitor/configuration/http_settings.rb

  1. In http.rb line 17, change DEFAULT_USER_AGENT to "Mozilla/5.0 (compatible; SourceMonitor/#{SourceMonitor::VERSION})"
  2. In http_settings.rb line 45, update default_user_agent to return the same string: "Mozilla/5.0 (compatible; SourceMonitor/#{SourceMonitor::VERSION})"
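
A minimal sketch of the two changes in Task 1, runnable outside the engine. The `SourceMonitor::HTTP` and `SourceMonitor::Configuration::HTTPSettings` nesting is assumed from the file paths, and the `"0.0.0"` version is a placeholder, not the real gem version.

```ruby
module SourceMonitor
  # Placeholder version so the sketch runs on its own; the real value comes from the gem.
  VERSION = "0.0.0"

  module HTTP
    # Browser-like prefix plus an identifiable product token.
    DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; SourceMonitor/#{SourceMonitor::VERSION})".freeze
  end

  module Configuration
    class HTTPSettings
      # Returns the same string as the HTTP constant.
      def default_user_agent
        SourceMonitor::HTTP::DEFAULT_USER_AGENT
      end
    end
  end
end

puts SourceMonitor::Configuration::HTTPSettings.new.default_user_agent
# prints "Mozilla/5.0 (compatible; SourceMonitor/0.0.0)"
```

The plan writes the literal in both files. Reusing the constant, as done here, is only one way to keep the two values in sync and depends on load order in the real code.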

Task 2: Add Accept-Language, DNT, and broader Accept

Files: lib/source_monitor/http.rb

In default_headers method (lines 89-97):

  1. Change Accept value to: "text/html, application/rss+xml, application/atom+xml, application/json;q=0.9, text/xml;q=0.8"
  2. Add "Accept-Language" => "en-US,en;q=0.9"
  3. Add "DNT" => "1"
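
The header set described in Task 2 could look roughly like the following. This is a sketch under assumptions: the `sketch_default_headers` name and its single argument are hypothetical, and the `Accept-Encoding` value is a guess, since the text only says that header exists.

```ruby
# Hypothetical stand-in for default_headers in http.rb.
def sketch_default_headers(user_agent)
  {
    "User-Agent" => user_agent,
    "Accept" => "text/html, application/rss+xml, application/atom+xml, " \
                "application/json;q=0.9, text/xml;q=0.8",
    "Accept-Encoding" => "gzip, deflate", # assumed value, not taken from the source
    "Accept-Language" => "en-US,en;q=0.9",
    "DNT" => "1"
  }
end

defaults = sketch_default_headers("Mozilla/5.0 (compatible; SourceMonitor/0.0.0)")

# Caller-supplied headers are merged last, so they win over the defaults.
merged = defaults.merge("Accept" => "application/json")

puts merged["Accept"]          # prints "application/json"
puts merged["Accept-Language"] # prints "en-US,en;q=0.9"
```

Because `Hash#merge` prefers values from its argument, adding new default keys does not change the override behavior for callers that already pass their own headers.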

Task 3: Add Referer header in FeedFetcher

Files: lib/source_monitor/fetching/feed_fetcher.rb

In request_headers method (lines 104-111), after transforming custom_headers:

  • Add headers["Referer"] ||= source.website_url if source.website_url.present? (||= keeps any Referer already supplied by custom_headers)
  • Place this before the conditional cache headers; a per-source custom_headers value must still override the default Referer
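
One way to satisfy both the Referer rule and the override requirement is sketched below in plain Ruby. `SketchSource` and `sketch_request_headers` are hypothetical stand-ins, string-keying the custom headers is an assumption about the "transform" step, and the blank check replaces ActiveSupport's `present?` so the example runs without Rails.

```ruby
# Hypothetical stand-in for a source record; only the fields used here.
SketchSource = Struct.new(:website_url, :custom_headers, keyword_init: true)

def sketch_request_headers(source)
  # Start from the per-source custom headers (assumed to be normalized to string keys).
  headers = (source.custom_headers || {}).transform_keys(&:to_s)

  # Add Referer only when a website_url exists and no custom Referer was given.
  url = source.website_url.to_s.strip
  headers["Referer"] ||= url unless url.empty?

  headers
end

with_site = SketchSource.new(website_url: "https://example.com", custom_headers: {})
no_site   = SketchSource.new(website_url: nil, custom_headers: {})
custom    = SketchSource.new(website_url: "https://example.com",
                             custom_headers: { "Referer" => "https://override.example" })

puts sketch_request_headers(with_site)["Referer"]      # prints "https://example.com"
puts sketch_request_headers(no_site).key?("Referer")   # prints "false"
puts sketch_request_headers(custom)["Referer"]         # prints "https://override.example"
```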

Task 4: Update existing tests and add new tests

Files: test/lib/source_monitor/http_test.rb, test/lib/source_monitor/fetching/feed_fetcher_test.rb

In http_test.rb:

  1. Update "allows overriding headers while preserving defaults" (line 96) -- Assert new Accept value with text/html prefix
  2. Add test: "includes Accept-Language and DNT in default headers" -- assert en-US,en;q=0.9 and "1"
  3. Add test: "default user agent is browser-like" -- assert includes Mozilla/5.0 and SourceMonitor/

In feed_fetcher_test.rb (or create new test section):

  1. Add test: "request_headers includes Referer from source website_url" -- create source with website_url, verify Referer in headers
  2. Add test: "request_headers omits Referer when website_url is blank" -- create source without website_url, verify no Referer
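
The new tests listed above might take roughly this shape. This is a sketch only: it uses plain Minitest method names rather than the engine's own test helpers, and `SKETCH_DEFAULT_HEADERS` and `referer_headers_for` are hypothetical stand-ins for the real `default_headers` and `FeedFetcher#request_headers`.

```ruby
require "minitest/autorun"

# Hypothetical stand-ins so the tests run outside the engine.
SKETCH_DEFAULT_HEADERS = {
  "User-Agent" => "Mozilla/5.0 (compatible; SourceMonitor/0.0.0)",
  "Accept-Language" => "en-US,en;q=0.9",
  "DNT" => "1"
}.freeze

def referer_headers_for(website_url, custom_headers = {})
  headers = custom_headers.dup
  headers["Referer"] ||= website_url unless website_url.to_s.empty?
  headers
end

class HeaderHardeningSketchTest < Minitest::Test
  def test_includes_accept_language_and_dnt_in_default_headers
    assert_equal "en-US,en;q=0.9", SKETCH_DEFAULT_HEADERS["Accept-Language"]
    assert_equal "1", SKETCH_DEFAULT_HEADERS["DNT"]
  end

  def test_default_user_agent_is_browser_like
    assert_includes SKETCH_DEFAULT_HEADERS["User-Agent"], "Mozilla/5.0"
    assert_includes SKETCH_DEFAULT_HEADERS["User-Agent"], "SourceMonitor/"
  end

  def test_request_headers_includes_referer_from_website_url
    assert_equal "https://example.com", referer_headers_for("https://example.com")["Referer"]
  end

  def test_request_headers_omits_referer_when_website_url_is_blank
    refute referer_headers_for(nil).key?("Referer")
    refute referer_headers_for("").key?("Referer")
  end
end
```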

Files

| Action | Path |
|--------|------|
| MODIFY | lib/source_monitor/http.rb |
| MODIFY | lib/source_monitor/configuration/http_settings.rb |
| MODIFY | lib/source_monitor/fetching/feed_fetcher.rb |
| MODIFY | test/lib/source_monitor/http_test.rb |
| MODIFY | test/lib/source_monitor/fetching/feed_fetcher_test.rb |

Verification

bash
bin/rails test test/lib/source_monitor/http_test.rb test/lib/source_monitor/fetching/feed_fetcher_test.rb
bin/rubocop lib/source_monitor/http.rb lib/source_monitor/configuration/http_settings.rb lib/source_monitor/fetching/feed_fetcher.rb

Success Criteria

  • Default UA contains Mozilla/5.0 and SourceMonitor/
  • Accept header starts with text/html
  • Accept-Language and DNT present in every default request
  • Referer sent when source has website_url, omitted when blank
  • Per-source custom_headers still override all defaults
  • All tests pass, zero RuboCop offenses

Research Findings

Phase 1: Backend Fixes — Research

Item 1: HTTP Client Hardening

Key Files

  • lib/source_monitor/http.rb -- DEFAULT_USER_AGENT, default_headers, client, configure_request
  • lib/source_monitor/configuration/http_settings.rb -- HTTPSettings with user_agent, headers accessors
  • lib/source_monitor/fetching/feed_fetcher.rb -- request_headers (per-source custom_headers merge)
  • test/lib/source_monitor/http_test.rb -- header merge/override tests

Current Behavior

  • DEFAULT_USER_AGENT = "SourceMonitor/#{SourceMonitor::VERSION}" (line 17 of http.rb)
  • default_headers builds: User-Agent, Accept (RSS-only), Accept-Encoding (lines 89-97)
  • configure_request merges: default_headers(settings).merge(headers) (line 60) — passed headers override defaults
  • FeedFetcher#request_headers starts from source.custom_headers, adds If-None-Match/If-Modified-Since
  • HTTPSettings#default_user_agent returns same "SourceMonitor/#{VERSION}" string
  • user_agent accessor supports callables (lambda/proc) via resolve_callable
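
The callable support mentioned in the last bullet can be illustrated with a small sketch. The body of `resolve_callable` here is an assumption about how such a helper typically works (call the value if it responds to `call`, otherwise return it as-is), not a copy of the engine's code.

```ruby
# Hypothetical version of resolve_callable: lambdas and procs are invoked,
# plain values are returned unchanged.
def resolve_callable(value)
  value.respond_to?(:call) ? value.call : value
end

static_agent  = "Mozilla/5.0 (compatible; SourceMonitor/0.0.0)"
dynamic_agent = -> { "Mozilla/5.0 (compatible; MyApp/#{1 + 1}.0)" }

puts resolve_callable(static_agent)  # prints "Mozilla/5.0 (compatible; SourceMonitor/0.0.0)"
puts resolve_callable(dynamic_agent) # prints "Mozilla/5.0 (compatible; MyApp/2.0)"
```

A host app that sets `user_agent` to a lambda would therefore be unaffected by the new default, since the default only applies when no override is configured.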

Change Plan

  • Change HTTPSettings#default_user_agent to return polite bot string: "Mozilla/5.0 (compatible; SourceMonitor/#{VERSION})"
  • Add Accept-Language: en-US,en;q=0.9 and DNT: 1 to default_headers
  • Add Referer header in FeedFetcher#request_headers using source.website_url
  • Update default_headers Accept to prepend text/html: "text/html, application/rss+xml, ..."
  • Update tests asserting on DEFAULT_USER_AGENT and Accept values

Item 2: Health Check Status Transition

Key Files

  • app/jobs/source_monitor/source_health_check_job.rb -- perform, broadcast_outcome
  • lib/source_monitor/health/source_health_check.rb -- call, create_log, successful_status?
  • lib/source_monitor/health/source_health_monitor.rb -- call, determine_status, improving_streak?
  • lib/source_monitor/health.rb -- setup!, register_fetch_callback (after_fetch_completed wiring)
  • app/jobs/source_monitor/fetch_feed_job.rb -- perform(source_id, force: true) to enqueue full fetch

Current Behavior

  • SourceHealthCheckJob#perform does HTTP GET, creates HealthCheckLog, broadcasts toast — never updates health_status
  • SourceHealthMonitor runs via after_fetch_completed callback — computes rolling success rate from fetch_logs
  • health_status values: healthy, warning, critical, improving, declining, auto_paused (free string, no enum)
  • improving_streak? requires 2+ consecutive successes with prior failure in window
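
To make the improving_streak? rule concrete, here is a sketch of one reading of it. The input shape is an assumption: an array of booleans for the fetch results in the window, most recent first. The real method works on fetch_logs and may differ in detail.

```ruby
# results: most recent first; true = successful fetch, false = failed fetch.
def sketch_improving_streak?(results, min_streak: 2)
  # Count consecutive successes at the head of the window.
  streak = results.take_while { |ok| ok }.size

  # Need enough successes AND at least one earlier failure in the same window.
  streak >= min_streak && results.drop(streak).include?(false)
end

puts sketch_improving_streak?([true, true, false, true]) # prints "true"
puts sketch_improving_streak?([true, true, true])        # prints "false"
puts sketch_improving_streak?([true, false, false])      # prints "false"
```

Under this reading, a single successful health check cannot produce an improving status by itself, which is why the change plan routes through a real fetch rather than setting the status directly.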
  • FetchFeedJob.perform_later(source.id, force: true) enqueues immediate full fetch bypassing should_run?

Change Plan

  • After successful health check on degraded source, enqueue FetchFeedJob.perform_later(source.id, force: true)
  • Degraded = health_status in %w[declining critical warning]
  • Full fetch creates real fetch_log → triggers SourceHealthMonitor → natural status transition
  • Add check in SourceHealthCheckJob#perform after broadcast_outcome
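
The change plan above can be sketched without Rails as follows. `HealthSource`, `FakeFetchFeedJob`, and `trigger_fetch_if_degraded` are hypothetical names used only for illustration; the fake job just records calls and mirrors the `perform_later(source_id, force: true)` call shape quoted in the text.

```ruby
DEGRADED_STATUSES = %w[declining critical warning].freeze

# Hypothetical stand-in for a source record.
HealthSource = Struct.new(:id, :health_status, keyword_init: true)

# Hypothetical stand-in for FetchFeedJob; records enqueued calls instead of queuing.
class FakeFetchFeedJob
  def self.enqueued
    @enqueued ||= []
  end

  def self.perform_later(source_id, force: false)
    enqueued << { source_id: source_id, force: force }
  end
end

# Enqueue a forced fetch only when the check succeeded on a degraded source.
def trigger_fetch_if_degraded(source, check_successful, job: FakeFetchFeedJob)
  return false unless check_successful
  return false unless DEGRADED_STATUSES.include?(source.health_status.to_s)

  job.perform_later(source.id, force: true)
  true
end

trigger_fetch_if_degraded(HealthSource.new(id: 1, health_status: "declining"), true)  # enqueued
trigger_fetch_if_degraded(HealthSource.new(id: 2, health_status: "healthy"), true)    # skipped
trigger_fetch_if_degraded(HealthSource.new(id: 3, health_status: "critical"), false)  # skipped

puts FakeFetchFeedJob.enqueued.size # prints "1"
```

Leaving the status untouched and letting the forced fetch feed SourceHealthMonitor keeps a single code path responsible for health_status transitions.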

Item 3: Scrape Rate Limiting

Key Files

  • lib/source_monitor/configuration/scraping_settings.rb -- DEFAULT_MAX_IN_FLIGHT = 25, reset!
  • lib/source_monitor/scraping/enqueuer.rb -- rate_limit_exhausted? (lines 108-114)
  • lib/source_monitor/scraping/state.rb -- IN_FLIGHT_STATUSES = %w[pending processing]
  • lib/source_monitor/scraping/bulk_source_scraper.rb -- loop with break on rate_limited (lines 75-124)
  • lib/source_monitor/scraping/bulk_result_presenter.rb -- message builder (lines 42-75)
  • test/lib/source_monitor/scraping/bulk_source_scraper_test.rb -- rate limit test

Current Behavior

  • DEFAULT_MAX_IN_FLIGHT = 25 — counts pending + processing items
  • When limit hit: enqueuer returns :rate_limited, bulk scraper breaks loop, presenter shows message
  • Setting to nil already disables the limit
  • normalize_numeric ensures only positive integers or nil
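
The current limit check, including the nil case, can be sketched as below. The method bodies are assumptions written for illustration: the real `normalize_numeric` may coerce more input types, and whether the comparison is `>=` or `>` is not stated in the text. The `counted:` keyword is hypothetical and only shows how counting running jobs alone (as REQ-SL-01 describes) would change the result.

```ruby
IN_FLIGHT_STATUSES = %w[pending processing].freeze

# Keep only positive integers; anything else becomes nil, which disables the limit.
def sketch_normalize_numeric(value)
  value.is_a?(Integer) && value.positive? ? value : nil
end

# item_statuses: scrape statuses of one source's items.
def sketch_rate_limit_exhausted?(item_statuses, max_in_flight, counted: IN_FLIGHT_STATUSES)
  limit = sketch_normalize_numeric(max_in_flight)
  return false if limit.nil? # nil means no cap

  in_flight = item_statuses.count { |status| counted.include?(status) }
  in_flight >= limit
end

statuses = %w[pending pending processing success]

puts sketch_rate_limit_exhausted?(statuses, 3)                           # prints "true"
puts sketch_rate_limit_exhausted?(statuses, 25)                          # prints "false"
puts sketch_rate_limit_exhausted?(statuses, nil)                         # prints "false"
puts sketch_rate_limit_exhausted?(statuses, 3, counted: %w[processing])  # prints "false"
```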

Change Plan

  • Change DEFAULT_MAX_IN_FLIGHT to nil (was 25), so no limit applies by default
  • Config option remains: users can set their own limit
  • Rate limit check, presenter, tests all already handle nil correctly
  • Update tests that depend on default being 25