third_party/readability/test_extension/README.md
This is a Chromium extension that serves as a development and testing tool for Mozilla's Readability.js library.
It provides a suite of tools for developers and testers to analyze how Mozilla's Readability.js library processes web pages. This provides insight into the distillation algorithm.
Clone Page (Clone): Captures the DOM of the active webpage, processes
it into a self-contained HTML document, and opens it in a new tab. This
provides a static, offline-ready, and refreshable version of the page for
analysis.
Check Readerable (Readerable?): Performs a quick, preliminary check to
see if a page is likely to be distillable, based on Mozilla's
isProbablyReaderable() heuristic.
Distill Page (Distill): Replaces the content of the current tab with
the distilled article. This is useful for debugging layout transitions, as
it keeps the DevTools instance and any mobile simulation active. On a
cloned page, this happens in-place, and refreshing the tab reverts to the
original cloned state.
Distill Page (New Tab) (Distill New): Opens the distilled content in a
new tab. This is ideal for side-by-side comparisons with the original page
or for testing on dynamic pages where reloading is undesirable.
Extract Metadata (Metadata): Displays a table of the metadata
extracted by Readability.js (e.g., title, byline, excerpt) in a new tab.
Boxify DOM (Boxify): On a cloned page, this adds a black outline to
every single element. This effect can be undone by refreshing the page.
Visualize Readability (Visualize): On a cloned page, this feature
provides a visual debugging tool to see exactly how Readability processed
the page.
Context-Aware Menu: The extension's popup menu is dynamically generated based on the current page, showing only the relevant actions.
Get Required Files
This extension does not bundle the Readability.js library or the core
Chromium dom_distiller components. Before loading the extension, you must
make the necessary files available inside the extension's directory. From
the third_party/readability/test_extension/ directory, run the following
commands to create symbolic links:
# Link to (modded) Readability.js files
ln -s ../modded_src/Readability.js .
ln -s ../modded_src/Readability-readerable.js .
# Link to dom_distiller core files
ln -s ../../../components/dom_distiller/core/css/distilledpage_common.css .
ln -s ../../../components/dom_distiller/core/css/distilledpage_new.css .
ln -s ../../../components/dom_distiller/core/javascript/dom_distiller_viewer_main.js .
ln -s ../../../components/dom_distiller/core/javascript/content_processing.js .
ln -s ../../../components/dom_distiller/core/javascript/font_size_slider.js .
ln -s ../../../components/dom_distiller/core/javascript/image_classifier.js .
ln -s ../../../components/dom_distiller/core/javascript/list_classifier.js .
ln -s ../../../components/dom_distiller/core/javascript/pinch_handler.js .
ln -s ../../../components/dom_distiller/core/javascript/settings_dialog.js .
Load the Extension in Chrome
chrome://extensions.third_party/readability/test_extension directory.Clone: Opens a self-contained, static version of the page in a new
tab.Readerable?: Checks if the page is likely to be distillable and
shows the result in the popup.Distill: Replaces the current page's content with the distilled
article.Distill New: Opens the distilled content in a new tab.Metadata: Opens a new tab with a table of the extracted metadata.Readerable?: Checks if the page is likely to be distillable and
shows the result in the popup.Distill: Replaces the cloned page's content with the distilled
article in-place.Distill New: Opens the distilled content in a new tab.Metadata: Opens a new tab with a table of the extracted metadata.Boxify: Draws an outline around every element on the page.Visualize: Highlights the elements on the page to show which were
kept and which were discarded by the Readability algorithm.manifest.json: The extension's manifest file.background.js: The service worker that orchestrates script injection and
the creation of new pages.popup.html / popup.js: Defines the UI and logic for the extension's
dynamically generated popup menu.extractor.js: Injected to capture a webpage's DOM.distiller.js: A lightweight script that invokes the shared
article_processor.js.metadata_extractor.js: Injected to extract metadata.readability_checker.js: Injected to run the isProbablyReaderable()
check.cloned.html / cloned.js: The page and script for displaying a cloned
webpage and handling its specific actions, including in-place distillation
and visualization.viewer.html / viewer.js: The page and script for displaying a
distilled article.metadata.html / metadata.js / metadata.css: The page, script, and
styling for displaying the extracted metadata table.article_processor.js: A shared script that centralizes the main invocation
of Readability.parse() and the rendering of the final article HTML.article_renderer.js: A shared script used by both cloned.js and
viewer.js to render a distilled article, ensuring consistent output and
avoiding code duplication.metadata_processor.js: A shared script that centralizes the extraction of
metadata from a document.Readability.js: The core library being tested.Readability-readerable.js: The standalone script for the
isProbablyReaderable() check.distilledpage_common.css / distilledpage_new.css: Core CSS files for
styling the distilled content view.dom_distiller_viewer_main.js and related files: Core JavaScript modules
from Chromium that handle rendering and user interaction (e.g., font size,
themes, pinch-to-zoom) with the distilled content.