optional-skills/research/osint-investigation/references/sources/icij-offshore.md
The International Consortium of Investigative Journalists (ICIJ) publishes a combined database of offshore entities from the Panama Papers, Paradise Papers, Pandora Papers, Bahamas Leaks, and Offshore Leaks. It covers more than 800,000 offshore entities along with their officers, intermediaries, and addresses.
https://offshoreleaks-data.icij.org/offshoreleaks/csv/full-oldb.LATEST.zip (~70 MB ZIP, refreshed periodically)

The https://offshoreleaks.icij.org/reconcile endpoint now returns 404; ICIJ has removed it. The bulk ZIP is the remaining stable access path. The skill's fetch_icij_offshore.py caches the ZIP locally (default ~/.cache/hermes-osint/icij/, refreshed after 30 days) and searches it offline.

Key fields emitted by fetch_icij_offshore.py:
| Column | Type | Description |
|---|---|---|
| node_id | int | ICIJ canonical node ID |
| name | str | Entity / officer / intermediary name |
| node_type | str | entity / officer / intermediary / address |
| country_codes | str | Semicolon-separated ISO codes |
| countries | str | Country names |
| jurisdiction | str | Offshore jurisdiction (BVI, Panama, etc.) |
| incorporation_date | str | YYYY-MM-DD |
| inactivation_date | str | YYYY-MM-DD (if struck) |
| source | str | Panama Papers / Paradise Papers / Pandora Papers / etc. |
| entity_url | str | Link to ICIJ page |
| connections | str | Semicolon-separated node IDs of related entities |
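
The column types above suggest a small amount of post-processing before analysis. The following is a minimal sketch, assuming the script's output CSV uses exactly these column names as headers; `IcijRow`, `split_multi`, and `parse_row` are hypothetical helpers, and the sample row is made up for illustration.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class IcijRow:
    node_id: int
    name: str
    node_type: str
    country_codes: List[str]
    incorporation_date: Optional[date]
    connections: List[int]


def split_multi(value: str) -> List[str]:
    """Split a semicolon-separated cell, dropping empty pieces."""
    return [part.strip() for part in value.split(";") if part.strip()]


def parse_row(raw: dict) -> IcijRow:
    """Convert one CSV row (all strings) into typed fields."""
    inc = (raw.get("incorporation_date") or "").strip()
    return IcijRow(
        node_id=int(raw["node_id"]),
        name=raw["name"],
        node_type=raw["node_type"],
        country_codes=split_multi(raw.get("country_codes") or ""),
        incorporation_date=date.fromisoformat(inc) if inc else None,
        connections=[int(x) for x in split_multi(raw.get("connections") or "")],
    )


# Made-up sample data standing in for data/icij.csv.
SAMPLE = (
    "node_id,name,node_type,country_codes,incorporation_date,connections\n"
    "101,EXAMPLE CORP,entity,VGB;PAN,2005-03-14,202;303\n"
)
rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

Keeping dates optional reflects that the date columns may be empty for some node types; that is an assumption about the data, not something the script documents.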
- name (public companies with offshore arms)
- name (federal contractors with offshore structure)
- name (sanctioned entities using offshore vehicles)

Join key: normalized entity/officer name. node_id is canonical for cross-referencing within ICIJ. Connections graph traversal is in-script (BFS over connections).
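
The join key and traversal described above can be sketched as follows. This is an illustration only, not the script's actual code: `normalize_name` and `bfs_connections` are hypothetical names, the normalization rules (uppercase, strip accents and punctuation) are an assumption about what "normalized" means, and the graph is made-up data.

```python
import re
import unicodedata
from collections import deque
from typing import Dict, List


def normalize_name(name: str) -> str:
    """Uppercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^A-Za-z0-9 ]+", " ", ascii_only)
    return " ".join(cleaned.upper().split())


def bfs_connections(
    graph: Dict[int, List[int]], start: int, max_depth: int
) -> Dict[int, int]:
    """Return {node_id: hop distance} for nodes within max_depth of start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, []):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return depth


# Hypothetical graph: node_id -> IDs parsed from the connections column.
graph = {1: [2, 3], 2: [4], 3: [], 4: [5]}
reachable = bfs_connections(graph, start=1, max_depth=2)
```

A depth limit is worth having in practice because intermediaries can connect to very large numbers of entities, so an unbounded traversal may pull in much of the database.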
inactivation_date

Path: scripts/fetch_icij_offshore.py
```bash
# Search by entity name (case-insensitive substring across the bulk DB)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
    --out data/icij.csv

# Search by officer (individual person)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --officer "SMITH JOHN" \
    --out data/icij.csv

# Search by jurisdiction (filter on cached results)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --officer "SMITH" \
    --jurisdiction "BRITISH VIRGIN ISLANDS" --out data/icij_bvi.csv

# Force a fresh download (default refresh window is 30 days)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
    --force-refresh --out data/icij.csv
```
First call downloads the ~70 MB ZIP under ~/.cache/hermes-osint/icij/
(or $HERMES_OSINT_CACHE/icij/). Subsequent calls reuse the cache for 30 days.
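
The cache behavior described above could be implemented roughly like this. It is a sketch under stated assumptions rather than the script's real logic: `cache_dir` and `needs_refresh` are hypothetical names, and using the file's modification time to measure age is an assumption.

```python
import os
import time
from pathlib import Path
from typing import Mapping, Optional

REFRESH_SECONDS = 30 * 24 * 60 * 60  # 30-day refresh window


def cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the ICIJ cache directory, honoring HERMES_OSINT_CACHE if set."""
    env = os.environ if env is None else env
    base = env.get("HERMES_OSINT_CACHE")
    root = Path(base) if base else Path.home() / ".cache" / "hermes-osint"
    return root / "icij"


def needs_refresh(
    zip_path: Path, force: bool = False, now: Optional[float] = None
) -> bool:
    """True if the ZIP is missing, older than the window, or force is set."""
    if force or not zip_path.exists():
        return True
    now = time.time() if now is None else now
    # Assumption: file modification time stands in for download time.
    return (now - zip_path.stat().st_mtime) > REFRESH_SECONDS
```

Here `force=True` corresponds to passing --force-refresh on the command line.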
Legacy endpoint (now returns 404): https://offshoreleaks.icij.org/reconcile