.vbw-planning/milestones/ui-fixes-and-smart-scraping/phases/04-smart-scrape-recommendations/04-PLAN-04.md
Build a "test scrape" feature that picks one recent item from a source, scrapes it on-demand, and displays a comparison page showing feed word count vs scraped word count. This lets users validate whether enabling scraping would improve content quality before committing.
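The "pick one recent item" step can be sketched in plain Ruby without a database. This is only an illustration: the `Item` struct is a hypothetical in-memory stand-in for the real model, and it assumes every item has a `published_at` value.

```ruby
# Hypothetical in-memory stand-in for an item row; not the real model.
Item = Struct.new(:title, :published_at, :feed_word_count, keyword_init: true)

# Pick the most recently published item that has a feed word count.
# Returns nil when no item qualifies. Assumes published_at is never nil.
def pick_test_item(items)
  items
    .reject { |item| item.feed_word_count.nil? }
    .max_by(&:published_at)
end

items = [
  Item.new(title: "Old", published_at: Time.utc(2024, 1, 1), feed_word_count: 120),
  Item.new(title: "Newest, no count", published_at: Time.utc(2024, 3, 1), feed_word_count: nil),
  Item.new(title: "Recent", published_at: Time.utc(2024, 2, 1), feed_word_count: 80)
]

puts pick_test_item(items).title
```

The newest item is skipped because it has no feed word count, so there would be nothing to compare the scraped result against.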
Files:
app/controllers/source_monitor/source_scrape_tests_controller.rb
Description: Create a new controller following the CRUD-everything pattern. This is a singular nested resource under sources (one test at a time):
module SourceMonitor
  class SourceScrapeTestsController < ApplicationController
    before_action :set_source

    # POST /sources/:source_id/scrape_test
    def create
      item = pick_test_item
      unless item
        handle_no_item
        return
      end

      result = SourceMonitor::Scraping::ItemScraper.new(item: item, source: @source).call

      # Reload before reading so the values below pick up anything the scraper wrote.
      item.reload
      @test_result = {
        item: item,
        scrape_result: result,
        feed_word_count: item.item_content&.feed_word_count,
        scraped_word_count: item.item_content&.scraped_word_count,
        feed_content_preview: item.content.to_s.truncate(500),
        scraped_content_preview: item.item_content&.scraped_content.to_s.truncate(500),
        improvement: compute_improvement(item)
      }

      respond_to do |format|
        format.turbo_stream do
          # The partial's root element carries the same DOM id, so the target is replaced in place.
          render turbo_stream: turbo_stream.replace(
            "scrape_test_result_#{@source.id}",
            partial: "source_monitor/source_scrape_tests/result",
            locals: { source: @source, test_result: @test_result }
          )
        end
        format.html { render :show }
      end
    end

    private

    def set_source
      @source = Source.find(params[:source_id])
    end

    # Most recently published item that has a feed word count to compare against.
    def pick_test_item
      @source.items
        .joins(:item_content)
        .where.not(sourcemon_item_contents: { feed_word_count: nil })
        .order(published_at: :desc)
        .first
    end

    def handle_no_item
      respond_to do |format|
        format.turbo_stream do
          responder = SourceMonitor::TurboStreams::StreamResponder.new
          responder.toast(message: "No items with feed content available for test scrape.", level: :warning)
          render turbo_stream: responder.render(view_context)
        end
        format.html do
          redirect_to source_monitor.source_path(@source), alert: "No items available for test scrape."
        end
      end
    end

    # Percentage change from feed to scraped word count; 0 when the feed count is zero or missing.
    def compute_improvement(item)
      feed = item.item_content&.feed_word_count.to_i
      scraped = item.item_content&.scraped_word_count.to_i
      return 0 if feed.zero?

      ((scraped - feed).to_f / feed * 100).round(1)
    end
  end
end
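The improvement figure is a plain percentage change, which is easier to check in isolation. Below is a minimal standalone sketch of the same arithmetic; it takes the two counts directly rather than an item, which is a simplification for illustration and not the controller's signature.

```ruby
# Percentage change from feed word count to scraped word count.
# nil counts are treated as 0; a zero feed count yields 0 to avoid dividing by zero.
def compute_improvement(feed_word_count, scraped_word_count)
  feed = feed_word_count.to_i
  scraped = scraped_word_count.to_i
  return 0 if feed.zero?

  ((scraped - feed).to_f / feed * 100).round(1)
end

puts compute_improvement(200, 850) # 325.0
puts compute_improvement(400, 300) # -25.0
puts compute_improvement(nil, 500) # 0
puts compute_improvement(100, nil) # -100.0
```

Note the last case: when a scrape produces no scraped word count, the result reads as a 100% loss rather than "no data", so the view may want to check scrape success before showing the percentage.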
Tests:
test/controllers/source_monitor/source_scrape_tests_controller_test.rb:
Files:
config/routes.rb
Description: Add singular nested resource under sources (CRUD pattern):
resources :sources do
  # ... existing nested resources ...
  resource :scrape_test, only: :create, controller: "source_scrape_tests"
end
Tests:
POST /sources/:source_id/scrape_test routes to source_scrape_tests#create
Files:
app/views/source_monitor/source_scrape_tests/_result.html.erb
Description: Create a partial that displays the comparison between feed and scraped content:
<div id="scrape_test_result_<%= source.id %>" class="rounded-lg border border-slate-200 bg-white shadow-sm">
  <div class="border-b border-slate-200 px-5 py-4">
    <h3 class="text-lg font-medium">Scrape Test Result</h3>
    <p class="mt-1 text-xs text-slate-500">
      Tested item: "<%= test_result[:item].title.to_s.truncate(60) %>"
    </p>
  </div>
  <div class="px-5 py-4">
    <dl class="grid grid-cols-2 gap-6">
      <div>
        <dt class="text-xs font-medium uppercase tracking-wide text-slate-500">Feed Word Count</dt>
        <dd class="mt-1 text-2xl font-semibold text-slate-900"><%= test_result[:feed_word_count] || "N/A" %></dd>
      </div>
      <div>
        <dt class="text-xs font-medium uppercase tracking-wide text-slate-500">Scraped Word Count</dt>
        <dd class="mt-1 text-2xl font-semibold text-slate-900"><%= test_result[:scraped_word_count] || "N/A" %></dd>
      </div>
    </dl>
    <% if test_result[:improvement] && test_result[:improvement] != 0 %>
      <div class="mt-4">
        <% color = test_result[:improvement] > 0 ? "text-green-600" : "text-amber-600" %>
        <span class="text-sm font-medium <%= color %>">
          <%= test_result[:improvement] > 0 ? "+" : "" %><%= test_result[:improvement] %>% word count change
        </span>
      </div>
    <% end %>
    <% if test_result[:scrape_result]&.success? %>
      <div class="mt-4 rounded-md bg-green-50 px-3 py-2 text-sm text-green-700">
        Scrape successful. Compare the word counts above to decide whether enabling scraping would capture more content.
      </div>
    <% else %>
      <div class="mt-4 rounded-md bg-amber-50 px-3 py-2 text-sm text-amber-700">
        Scrape had issues: <%= test_result[:scrape_result]&.message || "Unknown error" %>
      </div>
    <% end %>
  </div>
</div>
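The partial's conditional for the word count change can be expressed as a small pure function, which makes the branches easy to unit-test. This is a sketch only: `improvement_badge` is a hypothetical helper name, not something the plan defines.

```ruby
# Hypothetical helper mirroring the partial's conditional.
# Returns nil when there is nothing to show (nil or zero change).
def improvement_badge(improvement)
  return nil if improvement.nil? || improvement.zero?

  positive = improvement > 0
  {
    css_class: positive ? "text-green-600" : "text-amber-600",
    label: "#{positive ? '+' : ''}#{improvement}% word count change"
  }
end

puts improvement_badge(325.0)[:label] # +325.0% word count change
puts improvement_badge(-25.0)[:label] # -25.0% word count change
puts improvement_badge(0).inspect     # nil
```

Negative values already carry their own minus sign, so only the positive branch needs an explicit prefix.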
Tests:
Files:
app/views/source_monitor/sources/_details.html.erb OR app/views/source_monitor/sources/show.html.erb
Description:
Add a "Test Scrape" button on the source show page, visible only when the source has scraping disabled. The button triggers POST /sources/:source_id/scrape_test via Turbo:
<% unless source.scraping_enabled? %>
  <div id="scrape_test_result_<%= source.id %>" class="mt-4">
    <%= button_to "Test Scrape",
          source_monitor.source_scrape_test_path(source),
          method: :post,
          class: "inline-flex items-center rounded-md border border-violet-200 bg-violet-50 px-3 py-1.5 text-sm font-medium text-violet-700 hover:bg-violet-100",
          data: { turbo_stream: true },
          title: "Scrape a recent item to compare feed vs scraped word count" %>
  </div>
<% end %>
The Turbo Stream response from the controller replaces this div with the result partial.
Tests: