optional-skills/research/osint-investigation/references/sources/gdelt.md
GDELT (Global Database of Events, Language, and Tone) monitors world news in 100+ languages with full-text indexing. Updated every 15 minutes. ~2015 → present, ~1B+ articles indexed. Free anonymous access.
GDELT is broader than Google News (more international, more long-tail sources) and is indexed by tone/sentiment, themes (GKG theme codes such as ECON_BANKRUPTCY), people, and organizations.
- https://api.gdeltproject.org/api/v2/doc/doc
- https://api.gdeltproject.org/api/v2/events/events

The fetch script automatically retries after a 6-second sleep when a 429 is received.
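
The retry behavior described above could look roughly like the following. This is a minimal sketch, not the actual code in fetch_gdelt.py: the function name `fetch_with_retry`, the `MAX_ATTEMPTS` limit, and the injectable `opener`/`sleep` parameters are assumptions made for illustration. Only the 6-second sleep on a 429 comes from the text.

```python
import time
import urllib.error
import urllib.request

RETRY_SLEEP_SECONDS = 6  # from the description above
MAX_ATTEMPTS = 3         # assumption: the real script's retry limit is not documented here


def fetch_with_retry(url, opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the response body, sleeping and retrying when the server answers 429."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Only rate-limit responses are retried; every other error propagates.
            if err.code != 429 or attempt == MAX_ATTEMPTS:
                raise
            sleep(RETRY_SLEEP_SECONDS)
```

Passing `opener` and `sleep` as parameters keeps the sketch testable without network access or real delays.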
Key fields emitted by fetch_gdelt.py:
| Column | Type | Description |
|---|---|---|
| title | str | Article title |
| url | str | Article URL |
| seen_date | str | When GDELT first saw the article (UTC) |
| domain | str | Publisher domain |
| language | str | Source language |
| source_country | str | 2-letter country code |
| tone | str | GDELT-computed tone score (negative = negative coverage) |
| social_image | str | Open Graph image URL when available |
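
Because every column, including tone, is typed as a string, downstream code has to parse the tone score itself before filtering or sorting on it. The sketch below shows one way to do that. The helper name `load_articles`, the added `tone_value` key, and the sample rows are hypothetical and only illustrate the column names from the table.

```python
import csv
import io


def load_articles(handle):
    """Read fetcher output rows and add a numeric `tone_value` (None if unparsable)."""
    rows = []
    for row in csv.DictReader(handle):
        try:
            row["tone_value"] = float(row.get("tone") or "")
        except ValueError:
            # Empty or malformed tone strings are kept as None rather than dropped.
            row["tone_value"] = None
        rows.append(row)
    return rows


# Illustrative in-memory data; real input would be the file written via --out.
sample = "title,url,tone\nA,http://example.com/a,-3.5\nB,http://example.com/b,\n"
articles = load_articles(io.StringIO(sample))
negative = [a for a in articles if a["tone_value"] is not None and a["tone_value"] < 0]
```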

title / url (news context for any subject)

Join key: entity name appearing in article title or full-text. GDELT also extracts named entities into a separate stream (GKG) not exposed by this fetcher; query GDELT directly for entity-level filtering.
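
A join on entity name against the title column can be approximated with a case-insensitive whole-phrase match. This is a sketch under the assumption that rows are dictionaries keyed by the column names above. The function name `titles_mentioning` is hypothetical, and title matching will miss articles that mention the entity only in the body text.

```python
import re


def titles_mentioning(entity, articles):
    """Return articles whose title contains `entity` as a whole phrase, ignoring case."""
    # The lookarounds stop "Apple" from matching inside "Pineapple".
    pattern = re.compile(r"(?<!\w)" + re.escape(entity) + r"(?!\w)", re.IGNORECASE)
    return [a for a in articles if pattern.search(a.get("title", ""))]
```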
Path: scripts/fetch_gdelt.py
```bash
# Recent news mentioning an entity
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Nous Research" \
  --timespan 6m --out data/gdelt.csv

# Phrase-exact (use double quotes inside single quotes for the shell)
python3 SKILL_DIR/scripts/fetch_gdelt.py --query '"Dillon Rolnick"' \
  --timespan 1y --out data/gdelt.csv

# Filter to a country / language
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Microsoft" \
  --source-country US --source-lang English --out data/gdelt.csv

# Date range
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Microsoft" \
  --start 2024-01-01 --end 2024-12-31 --out data/gdelt.csv
```
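
For orientation, the GDELT DOC API expresses date bounds as `YYYYMMDDHHMMSS` timestamps. How fetch_gdelt.py translates `--start` and `--end` is not documented here, so the sketch below is only one plausible conversion. The function name `to_gdelt_range` and the choice to treat the end date as inclusive through 23:59:59 are assumptions.

```python
from datetime import date, datetime, time


def to_gdelt_range(start, end):
    """Convert ISO dates (YYYY-MM-DD) to a (start, end) pair of YYYYMMDDHHMMSS strings."""
    start_dt = datetime.combine(date.fromisoformat(start), time.min)
    # Assumption: the end date is inclusive, so it extends to the last second of that day.
    end_dt = datetime.combine(date.fromisoformat(end), time(23, 59, 59))
    fmt = "%Y%m%d%H%M%S"
    return start_dt.strftime(fmt), end_dt.strftime(fmt)
```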
GDELT supports its own query operators: phrase quoting, AND/OR/NOT,
sourcecountry:US, theme:ECON_BANKRUPTCY, tone<-5, etc.
See https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/ for syntax.
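
To show how these operators combine into a single request, here is a minimal sketch that assembles a DOC API URL by hand. The helper `build_doc_url` is hypothetical and is not part of fetch_gdelt.py. The `mode`, `format`, and `timespan` parameter names reflect the DOC 2.0 API as commonly documented and should be checked against the page linked above.

```python
from urllib.parse import urlencode

DOC_API = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_doc_url(phrase, source_country=None, max_tone=None, timespan="1w"):
    """Combine a quoted phrase with optional operators into one DOC API URL."""
    terms = ['"{}"'.format(phrase)]  # phrase quoting for an exact match
    if source_country:
        terms.append("sourcecountry:{}".format(source_country))
    if max_tone is not None:
        terms.append("tone<{}".format(max_tone))
    params = {
        "query": " ".join(terms),  # space-separated terms are combined by the API
        "mode": "artlist",         # assumption: article-list output mode
        "format": "json",
        "timespan": timespan,
    }
    # urlencode percent-escapes quotes, colons, and "<" so the query survives transport.
    return DOC_API + "?" + urlencode(params)
```

For example, `build_doc_url("Nous Research", source_country="US", max_tone=-5)` encodes the query `"Nous Research" sourcecountry:US tone<-5`.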