Back to Hermes Agent

ICIJ Offshore Leaks Database

optional-skills/research/osint-investigation/references/sources/icij-offshore.md

2026.6.54.3 KB
Original Source

ICIJ Offshore Leaks Database

1. Summary

The International Consortium of Investigative Journalists (ICIJ) publishes a combined database of offshore entities from the Panama Papers, Paradise Papers, Pandora Papers, Bahamas Leaks, and Offshore Leaks. ~800,000+ offshore entities with their officers, intermediaries, and addresses.

2. Access Methods

  • Bulk download (primary): https://offshoreleaks-data.icij.org/offshoreleaks/csv/full-oldb.LATEST.zip (~70 MB ZIP, refreshed periodically)
  • Search UI (human): https://offshoreleaks.icij.org/
  • Auth: None
  • Note: The previous Open Refine reconciliation endpoint at /reconcile now returns 404. ICIJ has removed it. The bulk ZIP is the remaining stable access path. The skill's fetch_icij_offshore.py caches the ZIP locally (default ~/.cache/hermes-osint/icij/, refreshes after 30 days) and searches it offline.

3. Data Schema

Key fields emitted by fetch_icij_offshore.py:

ColumnTypeDescription
node_idintICIJ canonical node ID
namestrEntity / officer / intermediary name
node_typestrentity / officer / intermediary / address
country_codesstrSemicolon-separated ISO codes
countriesstrCountry names
jurisdictionstrOffshore jurisdiction (BVI, Panama, etc.)
incorporation_datestrYYYY-MM-DD
inactivation_datestrYYYY-MM-DD (if struck)
sourcestrPanama Papers / Paradise Papers / Pandora Papers / etc.
entity_urlstrLink to ICIJ page
connectionsstrSemicolon-separated node IDs of related entities

4. Coverage

  • Worldwide offshore entity records
  • Earliest records: 1970s (Bahamas Leaks). Most data 1990–2018.
  • NOT updated in real-time — new leaks added when ICIJ publishes them
  • ~810,000 offshore entities + ~750,000 officers + ~150,000 intermediaries

5. Cross-Reference Potential

  • SEC EDGARname (public companies with offshore arms)
  • USAspendingname (federal contractors with offshore structure)
  • OFAC SDNname (sanctioned entities using offshore vehicles)

Join key: normalized entity/officer name. node_id is canonical for cross- referencing within ICIJ. Connections graph traversal is in-script (BFS over connections).

6. Data Quality

  • Offshore entity names sometimes appear in multiple leaks with slight variations
  • Officers may be nominees (front persons), not beneficial owners
  • Some entries have minimal info (just a name + jurisdiction)
  • The connections graph is incomplete — some relationships are documented in source materials but not in the structured database
  • Inactive/struck-off entities are still included with inactivation_date

7. Acquisition Script

Path: scripts/fetch_icij_offshore.py

bash
# Search by entity name (case-insensitive substring across the bulk DB)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
    --out data/icij.csv

# Search by officer (individual person)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --officer "SMITH JOHN" \
    --out data/icij.csv

# Search by jurisdiction (filter on cached results)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --officer "SMITH" \
    --jurisdiction "BRITISH VIRGIN ISLANDS" --out data/icij_bvi.csv

# Force a fresh download (default refresh window is 30 days)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
    --force-refresh --out data/icij.csv

First call downloads the ~70 MB ZIP under ~/.cache/hermes-osint/icij/ (or $HERMES_OSINT_CACHE/icij/). Subsequent calls reuse the cache for 30 days.

  • Public record as published by ICIJ under explicit publication
  • No copyright on the underlying facts (entity names, jurisdictions)
  • ICIJ asks for attribution if used in derivative reporting
  • Ethical note: Presence in this database does NOT imply wrongdoing. Many offshore structures are legal. The database is a research tool, not a list of criminals.

9. References