.vbw-planning/milestones/07-rails-audit-and-refactoring/02-model-layer-hardening/03-PLAN.md
Extract Source.scrape_candidates raw SQL subquery into a dedicated Query Object, following the existing StatsQuery pattern and eliminating inline raw SQL from the Source model.
@app/models/source_monitor/source.rb -- scrape_candidates class method (lines 64-80) contains raw SQL subquery with string interpolation
@lib/source_monitor/dashboard/queries/stats_query.rb -- existing query object pattern to follow
def scrape_candidates(threshold: SourceMonitor.config.scraping.scrape_recommendation_threshold)
  threshold_value = threshold.to_i
  return none if threshold_value <= 0

  active
    .where(scraping_enabled: false)
    .where("#{table_name}.id IN (SELECT i.source_id FROM #{Item.table_name} i ...)")
end
Files: lib/source_monitor/queries/scrape_candidates_query.rb (new file)
Create query object following project conventions:
# frozen_string_literal: true

module SourceMonitor
  module Queries
    class ScrapeCandidatesQuery
      def initialize(threshold: SourceMonitor.config.scraping.scrape_recommendation_threshold)
        @threshold = threshold.to_i
      end

      def call
        return SourceMonitor::Source.none if @threshold <= 0

        SourceMonitor::Source.active
          .where(scraping_enabled: false)
          .where(id: source_ids_below_threshold)
      end

      private

      def source_ids_below_threshold
        SourceMonitor::Item
          .joins(:item_content)
          .where.not(SourceMonitor::ItemContent.table_name => { feed_word_count: nil })
          .group(:source_id)
          .having("AVG(#{SourceMonitor::ItemContent.table_name}.feed_word_count) < ?", @threshold)
          .select(:source_id)
      end
    end
  end
end
Key improvements:
- Uses the ActiveRecord query interface (joins, group, having, select) rather than a raw SQL string
- Uses a ? bind placeholder for threshold instead of string interpolation
- where(id: subquery) generates WHERE id IN (SELECT ...) via ActiveRecord

Tests: New test file
Acceptance: Returns same results as original implementation
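To make the "same results" acceptance criterion concrete, here is a plain-Ruby model of the rows the query is expected to select. It is only a sketch of the selection logic, not the ActiveRecord implementation: SourceRow, ItemRow, scrape_candidate_ids, and the sample data are hypothetical stand-ins, and it assumes the active scope simply means an active flag set to true.

```ruby
# In-memory model of the selection logic (hypothetical names, no database).
# ItemRow folds the item and its joined item_content row into one record.
SourceRow = Struct.new(:id, :active, :scraping_enabled, keyword_init: true)
ItemRow = Struct.new(:source_id, :feed_word_count, keyword_init: true)

def scrape_candidate_ids(sources, items, threshold)
  threshold_value = threshold.to_i
  return [] if threshold_value <= 0 # mirrors the `none` guard

  # WHERE feed_word_count IS NOT NULL ... GROUP BY source_id
  rows_by_source = items
    .reject { |item| item.feed_word_count.nil? }
    .group_by(&:source_id)

  # HAVING AVG(feed_word_count) < threshold
  low_average_ids = rows_by_source.filter_map do |source_id, rows|
    average = rows.sum(&:feed_word_count).to_f / rows.size
    source_id if average < threshold_value
  end

  # active, scraping disabled, and id IN (subquery)
  sources
    .select { |s| s.active && !s.scraping_enabled && low_average_ids.include?(s.id) }
    .map(&:id)
end

sources = [
  SourceRow.new(id: 1, active: true, scraping_enabled: false),
  SourceRow.new(id: 2, active: true, scraping_enabled: false),
  SourceRow.new(id: 3, active: true, scraping_enabled: true),
  SourceRow.new(id: 4, active: false, scraping_enabled: false)
]

items = [
  ItemRow.new(source_id: 1, feed_word_count: 50),
  ItemRow.new(source_id: 1, feed_word_count: 150),
  ItemRow.new(source_id: 1, feed_word_count: nil), # ignored by the average
  ItemRow.new(source_id: 2, feed_word_count: 900),
  ItemRow.new(source_id: 2, feed_word_count: 1100),
  ItemRow.new(source_id: 3, feed_word_count: 10),
  ItemRow.new(source_id: 4, feed_word_count: 10)
]

default_result = scrape_candidate_ids(sources, items, 200)
wide_result = scrape_candidate_ids(sources, items, 2000)
zero_result = scrape_candidate_ids(sources, items, 0)
```

With this sample data, only source 1 qualifies at a threshold of 200. Sources 3 and 4 never qualify because scraping is already enabled or the source is inactive, whatever their averages.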
Files: lib/source_monitor.rb, lib/source_monitor/queries.rb (new file)
Create module file lib/source_monitor/queries.rb:
# frozen_string_literal: true

module SourceMonitor
  module Queries
    autoload :ScrapeCandidatesQuery, "source_monitor/queries/scrape_candidates_query"
  end
end
Add autoload in lib/source_monitor.rb (in the autoload section):
autoload :Queries, "source_monitor/queries"
Acceptance: SourceMonitor::Queries::ScrapeCandidatesQuery is loadable
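The two-level autoload chain can be checked in isolation with a small sketch. DemoMonitor and ExampleQuery are hypothetical stand-ins for SourceMonitor and ScrapeCandidatesQuery, and the files are written to a temporary directory so the example runs without the engine.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in layout, written to a temp dir so the sketch is self-contained.
DEMO_ROOT = Dir.mktmpdir("autoload_demo")
FileUtils.mkdir_p(File.join(DEMO_ROOT, "demo_monitor", "queries"))

File.write(File.join(DEMO_ROOT, "demo_monitor", "queries.rb"), <<~RUBY)
  module DemoMonitor
    module Queries
      autoload :ExampleQuery, "demo_monitor/queries/example_query"
    end
  end
RUBY

File.write(File.join(DEMO_ROOT, "demo_monitor", "queries", "example_query.rb"), <<~RUBY)
  module DemoMonitor
    module Queries
      class ExampleQuery
        def call
          :loaded
        end
      end
    end
  end
RUBY

# autoload paths are resolved against $LOAD_PATH, like require.
$LOAD_PATH.unshift(DEMO_ROOT)

module DemoMonitor
  autoload :Queries, "demo_monitor/queries"
end

# Registered but not yet loaded: autoload? returns the registered path.
registered_before = DemoMonitor.autoload?(:Queries)

# First reference triggers both autoloads in turn.
result = DemoMonitor::Queries::ExampleQuery.new.call

# Once the constant is loaded, autoload? returns nil.
registered_after = DemoMonitor.autoload?(:Queries)
```

The same check against the real constant would be a one-line reference to SourceMonitor::Queries::ScrapeCandidatesQuery in a console or test.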
Files: app/models/source_monitor/source.rb
Replace the existing scrape_candidates method (lines 64-80) with:
def scrape_candidates(threshold: SourceMonitor.config.scraping.scrape_recommendation_threshold)
  SourceMonitor::Queries::ScrapeCandidatesQuery.new(threshold:).call
end
This keeps the same public API -- existing callers don't need to change.
Tests: Existing tests should pass unchanged
Acceptance: Source.scrape_candidates returns identical results
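The delegation can be sketched in plain Ruby to show why callers are unaffected. DemoConfig, DemoQuery, and DemoSource are hypothetical stand-ins for SourceMonitor.config, the query object, and the Source model, and the return values are placeholders rather than relations. The sketch spells out threshold: threshold; the threshold: shorthand used in the plan requires Ruby 3.1 or later.

```ruby
# Hypothetical stand-in for SourceMonitor.config.scraping.
module DemoConfig
  class << self
    attr_accessor :threshold
  end
  self.threshold = 200
end

# Hypothetical stand-in for the query object.
class DemoQuery
  def initialize(threshold: DemoConfig.threshold)
    @threshold = threshold.to_i
  end

  def call
    # Placeholder results; the real object returns ActiveRecord relations.
    @threshold <= 0 ? :none : [:candidates_below, @threshold]
  end
end

# Hypothetical stand-in for the model.
class DemoSource
  # Same keyword signature as before the refactor; the body only delegates.
  def self.scrape_candidates(threshold: DemoConfig.threshold)
    DemoQuery.new(threshold: threshold).call
  end
end

with_default = DemoSource.scrape_candidates
with_override = DemoSource.scrape_candidates(threshold: "50")
with_zero = DemoSource.scrape_candidates(threshold: 0)

# Keyword defaults are evaluated on each call, so a later config change is picked up.
DemoConfig.threshold = 300
after_config_change = DemoSource.scrape_candidates
```

Keeping the default on the model method as well as on the query object means both entry points behave the same when called without arguments.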
Files: test/lib/source_monitor/queries/scrape_candidates_query_test.rb (new file)
Tests:
- test "returns sources with avg feed word count below threshold" -- create source with items having low word count ItemContent, query with threshold above avg, assert source included
- test "excludes sources above threshold" -- create source with items having high word count, query with low threshold, assert source excluded
- test "excludes sources with scraping enabled" -- create source with scraping_enabled: true and low word count, assert excluded
- test "excludes inactive sources" -- create inactive source with low word count, assert excluded
- test "returns none for zero or negative threshold" -- assert empty relation

Use create_source! factory + create items with ItemContent records.
Acceptance: All 5 tests pass
| Action | Path |
|---|---|
| CREATE | lib/source_monitor/queries/scrape_candidates_query.rb |
| CREATE | lib/source_monitor/queries.rb |
| MODIFY | lib/source_monitor.rb |
| MODIFY | app/models/source_monitor/source.rb |
| CREATE | test/lib/source_monitor/queries/scrape_candidates_query_test.rb |
bin/rails test test/lib/source_monitor/queries/scrape_candidates_query_test.rb
bin/rails test test/models/source_monitor/source_test.rb
bin/rubocop lib/source_monitor/queries/ app/models/source_monitor/source.rb