The `snowplow` module ingests metadata from Snowplow into DataHub. It is intended for production ingestion workflows; module-specific capabilities are documented below.
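The Snowplow source extracts metadata from Snowplow's behavioral data platform, including schema metadata (event and entity schemas) and, when enabled, event specifications, tracking scenarios, and tracking plans.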
Snowplow is an open-source behavioral data platform that collects, validates, and models event-level data. This connector supports both deployments accessed through the BDP Console API and open-source Snowplow with Iglu.
Before running ingestion, ensure network connectivity to the source, valid authentication credentials, and read permissions for metadata APIs required by this module.
`https://console.snowplowanalytics.com/{org-id}/...`

The connector requires read-only access to the following BDP Console API endpoints:
To extract basic schema metadata:
- `read:data-structures` - Read access to data structures (event and entity schemas)
- `read:organizations` - Access to organization information

| Capability | Required Permissions | Configuration |
|---|---|---|
| Schema Metadata | `read:data-structures` | Enabled by default |
| Event Specifications | `read:event-specs` | `extract_event_specifications: true` |
| Tracking Scenarios | `read:tracking-scenarios` | `extract_tracking_scenarios: true` |
| Tracking Plans | `read:data-products` | `extract_tracking_plans: true` |
Test your API credentials and permissions:

    # Get JWT token
    curl -X POST \
      -H "X-API-Key-ID: <API_KEY_ID>" \
      -H "X-API-Key: <API_KEY>" \
      https://console.snowplowanalytics.com/api/msc/v1/organizations/<ORG_ID>/credentials/v3/token

    # List data structures
    curl -H "Authorization: Bearer <JWT>" \
      https://console.snowplowanalytics.com/api/msc/v1/organizations/<ORG_ID>/data-structures/v1
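
The two manual requests above can be chained into one check so the JWT does not have to be copied by hand. The following is a minimal sketch, not part of the connector: the function names are hypothetical, and it assumes the token response is JSON carrying the JWT in an `accessToken` field. Adjust `extract_token` if your response differs.

```shell
#!/usr/bin/env bash
# Sketch: verify BDP Console API credentials in one pass.
set -eu

BASE_URL="https://console.snowplowanalytics.com/api/msc/v1"

# Build the token endpoint URL for an organization ID.
token_url() {
  printf '%s/organizations/%s/credentials/v3/token\n' "$BASE_URL" "$1"
}

# Build the data-structures listing URL for an organization ID.
data_structures_url() {
  printf '%s/organizations/%s/data-structures/v1\n' "$BASE_URL" "$1"
}

# Read a JSON token response on stdin and print the token value.
# ASSUMPTION: the response carries the JWT in an "accessToken" field.
extract_token() {
  sed -n 's/.*"accessToken"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: request a token, then list data structures with it.
# Usage: check_credentials <ORG_ID> <API_KEY_ID> <API_KEY>
check_credentials() {
  local org_id="$1" key_id="$2" key="$3" jwt
  # Same request as the manual token command above.
  jwt="$(curl --silent --show-error --fail -X POST \
    -H "X-API-Key-ID: ${key_id}" \
    -H "X-API-Key: ${key}" \
    "$(token_url "$org_id")" | extract_token)"
  if [ -z "$jwt" ]; then
    echo "No token found in the response" >&2
    return 1
  fi
  # A successful listing suggests the key can read data structures.
  curl --silent --show-error --fail \
    -H "Authorization: Bearer ${jwt}" \
    "$(data_structures_url "$org_id")"
}
```

After sourcing the script, run `check_credentials "<ORG_ID>" "<API_KEY_ID>" "<API_KEY>"`. A non-zero exit status points to a failed request or a missing token, which usually means the key or its permissions need review.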
For open-source Snowplow with Iglu:
See the recipe files for complete configuration examples: