.. _use-case-scientists:
Academic identity has a unique anchor that social-media identity does not: the ORCID (Open Researcher and Contributor ID). Every ORCID is verified, globally unique, and tied to a confirmed email — a single confirmed ORCID match between two platforms means the same human, with effectively zero false-positive rate.
Maigret uses ORCID as a first-class identifier: extract it once from a username-keyed profile (iNaturalist, GitHub bio, lab homepage), then pivot into the academic graph (ORCID, OpenAlex, arXiv, DBLP, Scholia) — a chain of HTTP calls that turns an anonymous handle into a CV with employer, education, publication list, citation count, co-author graph, and topical area.
Most OSINT work on academic targets stalls at the same bottleneck: people use one alias on a citizen-science site, a different alias on Twitter, a third in a forum signature, and only their real name on PubMed. Connecting the personas by hand means trawling Google Scholar, ResearchGate, university web pages, and old conference programmes. ORCID short-circuits this entirely — one verified identifier resolves all of them.
Common scenarios:
kueda. iNaturalist's public API returns their ORCID, and one further request reveals their real name, employer (California Academy of Sciences), education (UC Berkeley), and 700+ citable observations.

Maigret currently checks the following ORCID-keyed platforms (use ``--id-type orcid`` when starting from a bare ORCID, or rely on the recursive chain when starting from a username):
.. list-table::
   :header-rows: 1
   :widths: 18 42 40
Step 1: Find a profile that publishes ORCID
The most reliable bridges from a username to an ORCID are platforms whose users want their academic identity discoverable. In practice:

- ``api.inaturalist.org/v1/users/{username}`` returns the ORCID directly in the ``orcid`` field. Maigret extracts it automatically.
- ``/users/{u}/social_accounts``. A pattern like ``orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])`` matches both ``http://orcid.org/...`` and ``orcid.org/...`` variants.
- Use ``--parse <URL>`` to scrape an arbitrary page for known identifiers, which is useful when the target's footprint lives at ``faculty.example.edu/~jdoe``.

Step 2: Let the recursive search run
.. code-block:: console

   maigret <username> --tags science -v

Recursive search is enabled by default. As soon as Maigret extracts an ``orcid`` field from any found profile, it queues an ``--id-type orcid`` second wave automatically. No manual chaining needed.
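
The extract-then-queue behaviour described above can be illustrated with a small sketch. This is not Maigret's internal code: ``extract_orcids``, ``run_waves``, and ``fake_lookup`` are hypothetical names introduced here, and ``fake_lookup`` stands in for real site checks so the example runs offline. It reuses the ``orcid.org/...`` pattern from Step 1 and a breadth-first queue of ``(id_type, value)`` pairs.

.. code-block:: python

   import re
   from collections import deque

   # Pattern quoted in Step 1; matches both URL and bare-domain variants.
   ORCID_IN_TEXT = re.compile(r"orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])")


   def extract_orcids(text):
       """Return ORCIDs found in free text, in order, without duplicates."""
       return list(dict.fromkeys(ORCID_IN_TEXT.findall(text)))


   def run_waves(start, lookup):
       """Breadth-first walk over (id_type, value) pairs.

       `lookup` is a hypothetical callable standing in for a site check:
       it takes (id_type, value) and returns the text of profiles found.
       """
       queue = deque([start])
       seen = {start}
       order = []
       while queue:
           item = queue.popleft()
           order.append(item)
           for text in lookup(*item):
               for orcid in extract_orcids(text):
                   follow_up = ("orcid", orcid)
                   if follow_up not in seen:  # never queue the same ID twice
                       seen.add(follow_up)
                       queue.append(follow_up)
       return order


   def fake_lookup(id_type, value):
       """Offline stand-in: the username wave finds one bio with an ORCID."""
       if id_type == "username":
           return ["Bio: see https://orcid.org/0000-0002-1825-0097"]
       return []

Calling ``run_waves(("username", "jdoe"), fake_lookup)`` visits the username first and then the ORCID found in the bio, which mirrors the two waves described above. The ``seen`` set is what keeps a chain from looping when two profiles link to each other.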
Step 3: Start from a bare ORCID
If you already have the ORCID (e.g. from a paper's author footer, a grant database, or a Wikidata entry):
.. code-block:: console

   maigret 0000-0002-9322-3515 --id-type orcid -a

This bypasses the username-bridge step entirely.
The ORCID record itself contains the link back to ordinary social platforms:

- ``researcher-urls``: a list of self-declared external links. Typical entries are a lab homepage, a Twitter/Bluesky/Mastodon profile, or a personal blog. These are entered by the researcher, so they count as self-confirmed.
- ``external-identifiers``: IDs from sibling academic systems (Scopus Author ID, ResearcherID/Publons, Loop profile). Each unlocks another extraction pipeline.
- ``last_known_institutions``: current employer with country, ROR ID, and OpenAlex institution ID; useful for pivoting into the institution's directory.
- ``<url>`` tags: DBLP records often embed direct links to Google Scholar, ResearchGate, and personal pages.
- ``wikidata_qid``: once you have a Wikidata QID, the SPARQL endpoint unlocks the broadest external ID set in any single OSINT system, including VIAF, LCCN, GND, Twitter handle, Mastodon handle, IMDb, GitHub, ORCID, and ~200 others.
- ``history.verified-primary-email``, an ORCID-confirmed flag that means the address was email-validated at registration. Use this as a strong pivot into email-keyed lookups (Holehe, HIBP, Gravatar, etc.).
- ``public_email`` field: relevant for scientists who self-host code on a CERN/research-institute GitLab. GitHub's REST API does not expose email by default.
- ``raw_author_names`` against the ORCID ``other-names``. Discrepancies often expose pre-marriage names, transliterations from non-Latin scripts, or co-author misattributions worth flagging.
- ``--id-type orcid`` must match ``^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$``. The trailing character may legitimately be ``X`` (the ISO 7064 MOD 11-2 check digit); strip ``https://orcid.org/`` prefixes before passing the bare ID.
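
The format rule above is easy to check before invoking Maigret. Below is a minimal sketch, not part of Maigret itself: the function names are hypothetical, and the check-digit verification (ISO 7064 MOD 11-2, the scheme ORCID uses for its final character) is an extra step beyond the pattern match quoted above.

.. code-block:: python

   import re

   # Format rule quoted above; ASCII digits only.
   ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", re.ASCII)
   # Optional scheme plus the orcid.org host, to be stripped before matching.
   URL_PREFIX_RE = re.compile(r"^(?:https?://)?orcid\.org/", re.IGNORECASE)


   def normalize_orcid(raw):
       """Strip surrounding whitespace and an optional orcid.org URL prefix."""
       return URL_PREFIX_RE.sub("", raw.strip())


   def checksum_ok(orcid):
       """Verify the ISO 7064 MOD 11-2 check character of a well-formed ORCID."""
       digits = orcid.replace("-", "")
       total = 0
       for ch in digits[:-1]:  # the first 15 digits feed the checksum
           total = (total + int(ch)) * 2
       result = (12 - total % 11) % 11
       expected = "X" if result == 10 else str(result)
       return digits[-1] == expected


   def is_valid_orcid(raw):
       """True if `raw` is a bare or URL-form ORCID with a correct check digit."""
       orcid = normalize_orcid(raw)
       return ORCID_RE.fullmatch(orcid) is not None and checksum_ok(orcid)

For example, ``normalize_orcid("https://orcid.org/0000-0002-9322-3515")`` yields the bare ID used in Step 3, and ``is_valid_orcid`` accepts it. A mistyped final digit fails the checksum even though it still matches the pattern, which catches transcription errors before any network request is made.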