docs/ai-intelligence.mdx
The World Brief is generated by a 4-tier provider chain that prioritizes local compute, falls back through cloud APIs, and degrades to browser-side inference as a last resort:
┌─────────────────────────────────────────────────────────────────┐
│ Summarization Request │
│ (headlines deduplicated by Jaccard similarity > 0.6) │
└───────────────────────┬─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────┐ timeout/error
│ Tier 1: Ollama / LM Studio │──────────────┐
│ Local endpoint, no cloud │ │
│ Auto-discovered model │ │
└─────────────────────────────────┘ │
▼
┌─────────────────────────────┐ timeout/error
│ Tier 2: Groq │──────────────┐
│ Llama 3.1 8B, temp 0.3 │ │
│ Fast cloud inference │ │
└─────────────────────────────┘ │
▼
┌─────────────────────────────┐ timeout/error
│ Tier 3: OpenRouter │──────────────┐
│ Multi-model fallback │ │
└─────────────────────────────┘ │
▼
┌──────────────────────────┐
│ Tier 4: Browser T5 │
│ Transformers.js (ONNX) │
│ No network required │
└──────────────────────────┘
All three API tiers (Ollama, Groq, OpenRouter) share a common handler factory (_summarize-handler.js) that provides identical behavior:
- Result caching: responses are cached in Redis under the key `summary:v3:{mode}:{variant}:{lang}:{hash}`, so the same headlines viewed by 1,000 concurrent users trigger exactly one LLM call. Cache TTL is 24 hours.
- Graceful fallback: when a tier reports `{fallback: true}` (missing API key or endpoint unreachable), the chain silently advances to the next tier.
- Progress reporting: progress callbacks update the UI to show which provider is being attempted.

The Ollama tier communicates via the OpenAI-compatible /v1/chat/completions endpoint, making it compatible with any local inference server that implements this standard (Ollama, LM Studio, llama.cpp server, vLLM, etc.).
Clicking any country on the map opens a full-page intelligence dossier — a single-screen synthesis of all intelligence modules for that country. The brief is organized into a two-column layout:
Left column: citation markers [1]–[8] that scroll to the corresponding news source when clicked.

Right column:
Headline relevance filtering: each country has an alias map (e.g., US → ["united states", "american", "washington", "pentagon", "biden", "trump"]). Headlines are filtered using a negative-match algorithm — if another country's alias appears earlier in the headline title than the target country's alias, the headline is excluded. This prevents cross-contamination (e.g., a headline about Venezuela mentioning "Washington sanctions" appearing in the US brief).
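A minimal sketch of this negative-match filter. The alias map is abbreviated to two countries, and `ALIASES`, `earliestAliasIndex`, and `isRelevant` are illustrative names, not the app's real identifiers:

```javascript
// Abbreviated alias map; the real registry covers many more countries.
const ALIASES = {
  US: ["united states", "american", "washington", "pentagon"],
  VE: ["venezuela", "caracas", "maduro"],
};

// Earliest index at which any alias of `country` appears in `title`, or -1.
function earliestAliasIndex(title, country) {
  const lower = title.toLowerCase();
  let best = -1;
  for (const alias of ALIASES[country]) {
    const i = lower.indexOf(alias);
    if (i !== -1 && (best === -1 || i < best)) best = i;
  }
  return best;
}

// Keep a headline for `target` only if no other country's alias
// appears earlier in the title than the target's own alias.
function isRelevant(title, target) {
  const own = earliestAliasIndex(title, target);
  if (own === -1) return false;
  for (const other of Object.keys(ALIASES)) {
    if (other === target) continue;
    const i = earliestAliasIndex(title, other);
    if (i !== -1 && i < own) return false;
  }
  return true;
}
```

Here `isRelevant("Venezuela hit by Washington sanctions", "US")` returns false because the Venezuela alias appears first, which is exactly the cross-contamination case described above.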
Export options: briefs are exportable as JSON (structured data with all scores, signals, and headlines), CSV (flattened tabular format), or PNG image. A print button triggers the browser's native print dialog for PDF export.
Map clicks resolve to countries using a local geometry service rather than relying on network reverse-geocoding (Nominatim). The system loads a GeoJSON file containing polygon boundaries for ~200 countries and builds an indexed spatial lookup:
- Bounding-box prefilter: each country is indexed by its bounding box ([minLon, minLat, maxLon, maxLat]). Points outside the bbox are rejected without polygon intersection testing.
- Point-in-polygon test: surviving candidates get a full polygon test that returns true when the point lies inside the outer ring, and polygon holes are subtracted (a point inside an outer ring but also inside a hole is excluded).

This approach provides sub-millisecond country detection entirely in the browser, with no network latency. The geometry data is preloaded at app startup and cached for the session. For countries not in the GeoJSON (rare), the system falls back to hardcoded rectangular bounding boxes, and finally to network reverse-geocoding as a last resort.
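A runnable sketch of the lookup, assuming a standard ray-casting point-in-polygon test (the text does not name the exact algorithm) and an illustrative `countries` array in place of the real GeoJSON index:

```javascript
// Ray-casting test for one ring: count crossings of a horizontal
// ray extending east from the point; odd count means inside.
function inRing(lon, lat, ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses = (yi > lat) !== (yj > lat) &&
      lon < ((xj - xi) * (lat - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Inside the outer ring (rings[0]) and in none of the holes (rings[1..]).
function inPolygon(lon, lat, rings) {
  if (!inRing(lon, lat, rings[0])) return false;
  return !rings.slice(1).some((hole) => inRing(lon, lat, hole));
}

function lookupCountry(lon, lat, countries) {
  for (const c of countries) {
    const [minLon, minLat, maxLon, maxLat] = c.bbox;
    // Cheap bbox rejection before the polygon test.
    if (lon < minLon || lon > maxLon || lat < minLat || lat > maxLat) continue;
    if (c.polygons.some((rings) => inPolygon(lon, lat, rings))) return c.id;
  }
  return null;
}
```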
The Deduction Panel is an interactive AI geopolitical analysis tool that produces near-term timeline forecasts grounded in live intelligence data.
Request pipeline:
1. Context building: buildNewsContext() pulls the 15 most recent NewsItem titles from the live feed and prepends them as structured context ("Recent News:\n- Headline (Source)"), ensuring the LLM always has current situational awareness.
2. RPC dispatch: the query goes to the deductSituation RPC endpoint, which calls callLlm() with a provider fallback chain (Groq, OpenRouter, or any OpenAI-compatible endpoint via the LLM_API_URL/LLM_API_KEY/LLM_MODEL env vars) and a system prompt instructing it to act as a "senior geopolitical intelligence analyst and forecaster".
3. Sanitization: model think tags are stripped as defense-in-depth.
4. Caching: results are cached under deduct:situation:v1:{hash(query|geoContext)}, so identical queries serve instantly from cache.

Cross-panel integration: any panel can dispatch a wm:deduct-context custom DOM event with { query, geoContext, autoSubmit }, which pre-fills the Deduction Panel and optionally auto-submits. This enables contextual forecasting from any part of the dashboard: clicking "Analyze" on a theater posture card can automatically trigger a regional deduction. A 5-second cooldown prevents rapid re-submission.
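The event contract can be sketched as follows. The handler internals and panel state are assumptions, and a bare EventTarget stands in for `document` so the sketch runs outside a browser:

```javascript
// Minimal CustomEvent stand-in for runtimes that lack the browser global.
const CustomEvt = globalThis.CustomEvent ??
  class extends Event {
    constructor(type, opts) { super(type); this.detail = opts?.detail; }
  };

const bus = new EventTarget(); // stands in for `document`

let panelState = null;
let lastSubmit = -Infinity;
const COOLDOWN_MS = 5000; // 5-second re-submission cooldown

bus.addEventListener("wm:deduct-context", (e) => {
  const { query, geoContext, autoSubmit } = e.detail;
  panelState = { query, geoContext }; // pre-fill the Deduction Panel
  if (autoSubmit && Date.now() - lastSubmit >= COOLDOWN_MS) {
    lastSubmit = Date.now();
    // ...fire the deductSituation RPC from here...
  }
});

// Any panel can then request a contextual forecast:
bus.dispatchEvent(new CustomEvt("wm:deduct-context", {
  detail: { query: "Assess escalation risk", geoContext: "Eastern Mediterranean", autoSubmit: true },
}));
```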
The panel is lazy-loaded (import()) to exclude DOMPurify from the main bundle unless the panel is actually accessed, keeping the web bundle lean.
The Headline Memory system provides browser-local Retrieval-Augmented Generation — a persistent semantic index of news headlines that runs entirely on the user's device.
Ingestion pipeline:
RSS Feed Parse → isHeadlineMemoryEnabled()? → ML Worker (Web Worker)
│
┌─────────┴──────────┐
│ ONNX Embeddings │
│ all-MiniLM-L6-v2 │
│ 384-dim float32 │
└─────────┬──────────┘
│
┌─────────┴──────────┐
│ IndexedDB Store │
│ 5,000 vector cap │
│ LRU by ingestAt │
└────────────────────┘
- Embedding: headlines are embedded with `pooling: 'mean', normalize: true` and deduplicated by content hash.
- Eviction: the store caps at 5,000 vectors; once full, the oldest records by ingestedAt are evicted.
- Search: queries are embedded using the same model, then a full cursor scan computes cosine similarity against all stored vectors. Results are ranked by score, capped at topK (1–20), and filtered by minScore (0–1). Multiple query strings can be searched simultaneously (up to 5), with the max score per record across all queries used for ranking.
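Because the stored vectors are L2-normalized, cosine similarity reduces to a dot product. A sketch of the ranking step, where the `records` array stands in for the IndexedDB cursor scan:

```javascript
// Dot product of two equal-length vectors; equals cosine similarity
// when both vectors are L2-normalized.
function dot(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

// Rank records by the max score across all query vectors, then
// filter by minScore and cap at topK, per the semantics above.
function search(queryVecs, records, { topK = 5, minScore = 0.3 } = {}) {
  return records
    .map((r) => ({
      ...r,
      score: Math.max(...queryVecs.map((q) => dot(q, r.vector))),
    }))
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```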
Opt-in mechanism: The setting defaults to false (stored as wm-headline-memory in localStorage). Enabling it triggers mlWorker.init() → loadModel('embeddings'). Disabling it unloads the model and optionally terminates the entire worker if no other ML features are active. The ai-flow-changed CustomEvent propagates toggle changes to all interested components.
Every news item passes through a three-stage classification pipeline:
Keyword classifier (instant, source: 'keyword') — pattern-matches against ~120 threat keywords organized by severity tier (critical → high → medium → low → info) and 14 event categories (conflict, protest, disaster, diplomatic, economic, terrorism, cyber, health, environmental, military, crime, infrastructure, tech, general). Keywords use word-boundary regex matching to prevent false positives (e.g., "war" won't match "award"). Each match returns a severity level, category, and confidence score. Variant-specific keyword sets ensure the tech variant doesn't flag "sanctions" in non-geopolitical contexts.
Browser-side ML (async, source: 'ml') — Transformers.js runs NER, sentiment analysis, and topic classification directly in the browser with no server dependency. Provides a second classification opinion without any API call.
LLM classifier (batched async, source: 'llm') — headlines are collected into a batch queue and fired as parallel classifyEvent RPCs via the sebuf proto client. Each RPC calls the configured LLM provider (Groq Llama 3.1 8B at temperature 0, or Ollama for local inference). Results are cached in Redis (24h TTL) keyed by headline hash. When 500-series errors occur, the LLM classifier automatically pauses its queue to avoid wasting API quota, resuming after an exponential backoff delay. When the LLM result arrives, it overrides the keyword result only if its confidence is higher.
This hybrid approach means the UI is never blocked waiting for AI — users see keyword results instantly, with ML and LLM refinements arriving within seconds and persisting for all subsequent visitors. Each classification carries its source tag (keyword, ml, or llm) so downstream consumers can weight confidence accordingly.
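Stage 1 matching and the stage 3 merge rule can be sketched as follows; the keyword table and confidence values are toy examples, not the real ~120-entry set:

```javascript
// Toy keyword table; word-boundary regexes stop "war" matching "award".
const KEYWORDS = [
  { pattern: /\bwar\b/i, severity: "critical", category: "conflict", confidence: 0.6 },
  { pattern: /\bprotest(s)?\b/i, severity: "medium", category: "protest", confidence: 0.5 },
];

// Stage 1: instant keyword classification.
function classifyKeyword(headline) {
  for (const k of KEYWORDS) {
    if (k.pattern.test(headline)) {
      return { severity: k.severity, category: k.category, confidence: k.confidence, source: "keyword" };
    }
  }
  return { severity: "info", category: "general", confidence: 0.2, source: "keyword" };
}

// Stage 3 merge: the LLM result replaces the keyword result
// only when it is more confident.
function mergeLlmResult(keywordResult, llmResult) {
  return llmResult.confidence > keywordResult.confidence ? llmResult : keywordResult;
}
```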
The dashboard monitors five independent alert origins and fuses them into a unified breaking news stream with layered deduplication, cooldowns, and source quality gating:
| Origin | Trigger | Example |
|---|---|---|
| RSS alert | News item with isAlert: true and threat level critical/high | Reuters flash: missile strike confirmed |
| Keyword spike | Trending keyword exceeds spike threshold | "nuclear" surges across 8+ feeds in 2 hours |
| Hotspot escalation | Hotspot escalation score exceeds critical threshold | Taiwan Strait tension crosses 80/100 |
| Military surge | Theater posture assessment detects strike packaging | Tanker + AWACS + fighters co-present in MENA |
| OREF siren | Israel Home Front Command issues incoming rocket/missile alert | Rocket barrage detected in northern Israel |
Anti-noise safeguards:
- Confirmation gate: alerts require a non-keyword classification (threat.source !== 'keyword') to fire; Tier 1–2 sources bypass this gate.
- Severity modes: critical-only (only critical severity fires) and critical-and-high (both critical and high severities).

When an alert passes all gates, the system dispatches a wm:breaking-news CustomEvent on document, which the Breaking News Banner consumes to display a persistent top-of-screen notification. Optional browser Notification API popups and an audio chime are available as user settings. Clicking the banner scrolls to the RSS panel that sourced the alert and applies a 1.5-second flash highlight animation.
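A sketch of the gating predicate implied above; the field names (sourceTier, threat.level) are assumptions consistent with the text:

```javascript
// Returns true when an alert candidate clears both anti-noise gates.
function passesGates(item, mode = "critical-only") {
  // Severity mode gate.
  const severityOk = mode === "critical-and-high"
    ? item.threat.level === "critical" || item.threat.level === "high"
    : item.threat.level === "critical";
  if (!severityOk) return false;
  // Tier 1-2 sources fire directly; lower tiers need a non-keyword
  // (ML- or LLM-confirmed) classification.
  return item.sourceTier <= 2 || item.threat.source !== "keyword";
}
```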
The dashboard runs a full ML pipeline in the browser via Transformers.js, with no server dependency for core intelligence. This is automatically disabled on mobile devices to conserve memory.
| Capability | Model | Use |
|---|---|---|
| Text embeddings | sentence-similarity | Semantic clustering of news headlines |
| Sequence classification | threat-classifier | Threat severity and category detection |
| Summarization | T5-small | Last-resort fallback when Ollama, Groq, and OpenRouter are all unavailable |
| Named Entity Recognition | NER pipeline | Country, organization, and leader extraction |
Hybrid clustering combines fast Jaccard similarity (n-gram overlap, threshold 0.4) with ML-refined semantic similarity (cosine similarity, threshold 0.78). Jaccard runs instantly on every refresh; semantic refinement runs when the ML worker is loaded and merges clusters that are textually different but semantically identical (e.g., "NATO expands missile shield" and "Alliance deploys new air defense systems").
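The fast Jaccard pass can be sketched as follows, assuming word-level bigram shingles (the exact n-gram size is not stated in the text):

```javascript
// Build a set of word-level n-gram shingles from a headline.
function ngrams(text, n = 2) {
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  const grams = new Set();
  for (let i = 0; i <= words.length - n; i++) {
    grams.add(words.slice(i, i + n).join(" "));
  }
  return grams;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  const A = ngrams(a), B = ngrams(b);
  if (A.size === 0 && B.size === 0) return 0;
  let inter = 0;
  for (const g of A) if (B.has(g)) inter++;
  return inter / (A.size + B.size - inter);
}

// Fast pass: cluster together above the 0.4 threshold from the text.
function sameCluster(a, b, threshold = 0.4) {
  return jaccard(a, b) >= threshold;
}
```

Note that the two example headlines above ("NATO expands missile shield" vs. "Alliance deploys new air defense systems") score 0.0 under Jaccard, which is exactly why the semantic refinement pass is needed to merge them.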
News velocity is tracked per cluster — when multiple Tier 1–2 sources converge on the same story within a short window, the cluster is flagged as a breaking alert with sourcesPerHour as the velocity metric.
For offline resilience and reduced API costs, the system includes browser-based ML capabilities using ONNX Runtime Web.
| Model | Task | Size | Use Case |
|---|---|---|---|
| T5-small | Text summarization | ~60MB | Offline briefing generation |
| DistilBERT | Sentiment analysis | ~67MB | News tone classification |
Browser ML serves as the final fallback when cloud APIs are unavailable:
User requests summary
↓
1. Try Groq API (fast, free tier)
↓ (rate limited or error)
2. Try OpenRouter API (fallback provider)
↓ (unavailable)
3. Use Browser T5 (offline, always available)
Models are loaded on-demand to minimize initial page load. All ML inference runs in a dedicated Web Worker, keeping the main thread free for UI rendering.
Browser ML has constraints compared to cloud models:
| Aspect | Cloud (Llama 3.3) | Browser (T5) |
|---|---|---|
| Context window | 128K tokens | 512 tokens |
| Output quality | High | Moderate |
| Inference speed | 2-3 seconds | 5-10 seconds |
| Offline support | No | Yes |
Browser summarization is intentionally limited to 6 headlines × 80 characters to stay within model constraints.
The Insights Panel provides AI-powered analysis of the current news landscape, transforming raw headlines into actionable intelligence briefings.
Every 2 minutes (with rate limiting), the system generates a concise situation brief using a multi-provider fallback chain:
| Priority | Provider | Model | Latency | Use Case |
|---|---|---|---|---|
| 1 | Groq | Llama 3.3 70B | ~2s | Primary provider (fast inference) |
| 2 | OpenRouter | Llama 3.3 70B | ~3s | Fallback when Groq rate-limited |
| 3 | Browser | T5 (ONNX) | ~5s | Offline fallback (local ML) |
Caching Strategy: Redis server-side caching prevents redundant API calls. When the same headline set has been summarized recently, the cached result is returned immediately.
The AI receives enriched context about focal points—entities that appear in both news coverage AND map signals. This enables intelligence-grade analysis:
[INTELLIGENCE SYNTHESIS]
FOCAL POINTS (entities across news + signals):
- IRAN [CRITICAL]: 12 news mentions + 5 map signals (military_flight, protest, internet_outage)
KEY: "Iran protests continue..." | SIGNALS: military activity, outage detected
- TAIWAN [ELEVATED]: 8 news mentions + 3 map signals (military_vessel, military_flight)
KEY: "Taiwan tensions rise..." | SIGNALS: naval vessels detected
Not all news is equally important. Headlines are scored to identify the most significant stories for the briefing:
- Score boosters (high weight)
- Geopolitical multipliers
- Score reducers (demoted)
This ensures military conflicts and humanitarian crises surface above routine business news.
Headlines are analyzed for overall sentiment distribution:
| Sentiment | Detection Method | Display |
|---|---|---|
| Negative | Crisis, conflict, death keywords | Red percentage |
| Positive | Agreement, growth, peace keywords | Green percentage |
| Neutral | Neither detected | Gray percentage |
The overall sentiment balance provides a quick read on whether the news cycle is trending toward escalation or de-escalation.
Fast-moving stories are flagged when the same topic appears in multiple recent headlines.
The Focal Point Detector is the intelligence synthesis layer that correlates news entities with map signals to identify "main characters" driving current events.
Without synthesis, intelligence streams operate in silos. The detector correlates them in four steps:

1. Entity Extraction: extract countries, companies, and organizations from all news clusters using the entity registry (66 entities with aliases)
2. Signal Aggregation: collect all map signals (military flights, protests, outages, vessels) and group by country
3. Cross-Reference: match news entities with signal countries
4. Score & Rank: calculate focal scores based on correlation strength
FocalScore = NewsScore + SignalScore + CorrelationBonus
NewsScore (0-40):
base = min(20, mentionCount × 4)
velocity = min(10, newsVelocity × 2)
confidence = avgConfidence × 10
SignalScore (0-40):
types = signalTypes.count × 10
count = min(15, signalCount × 3)
severity = highSeverityCount × 5
CorrelationBonus (0-20):
+10 if entity appears in BOTH news AND signals
+5 if news keywords match signal types (e.g., "military" + military_flight)
+5 if related entities also have signals
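The formula above transcribes directly to code; the field names are assumptions matching the text, and `urgency` implements the threshold tiers that follow:

```javascript
// FocalScore = NewsScore + SignalScore + CorrelationBonus
function focalScore(news, signals, corr) {
  const newsScore =
    Math.min(20, news.mentionCount * 4) +   // base
    Math.min(10, news.velocity * 2) +       // velocity
    news.avgConfidence * 10;                // confidence

  const signalScore =
    signals.typeCount * 10 +                    // types
    Math.min(15, signals.count * 3) +           // count
    signals.highSeverityCount * 5;              // severity

  let bonus = 0;
  if (corr.inBoth) bonus += 10;                 // entity in news AND signals
  if (corr.keywordsMatchSignals) bonus += 5;    // e.g. "military" + military_flight
  if (corr.relatedEntitiesHaveSignals) bonus += 5;

  return newsScore + signalScore + bonus;
}

// Urgency tiers per the table: Critical > 70 or 3+ signal types, etc.
function urgency(score, typeCount) {
  if (score > 70 || typeCount >= 3) return "critical";
  if (score > 50 || typeCount >= 2) return "elevated";
  return "watch";
}
```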
| Urgency | Criteria | Visual |
|---|---|---|
| Critical | Score > 70 OR 3+ signal types | Red badge |
| Elevated | Score > 50 OR 2+ signal types | Orange badge |
| Watch | Default | Yellow badge |
Focal points display icons indicating which signal types are active:
| Icon | Signal Type | Meaning |
|---|---|---|
| ✈️ | military_flight | Military aircraft detected nearby |
| ⚓ | military_vessel | Naval vessels in waters |
| 📢 | protest | Civil unrest events |
| 🌐 | internet_outage | Network disruption |
| 🚢 | ais_disruption | Shipping anomaly |
A focal point for IRAN might show:
Focal point urgency levels feed into the Country Instability Index: