metadata-ingestion/docs/sources/glue/README.md
AWS Glue is a serverless data integration service that provides a central metadata catalog (the Glue Data Catalog) alongside ETL jobs. Learn more in the official Glue documentation.
The DataHub integration for Glue covers core metadata entities such as datasets/tables/views, schema fields, and containers. Depending on module capabilities, it can also capture features such as lineage, usage, profiling, ownership, tags, and stateful deletion detection.
:::tip
If you also have files in S3 that you'd like to ingest, we recommend you use Glue's built-in data catalog. See here for a quick guide on how to set up a crawler on Glue and ingest the outputs with DataHub.
:::
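
Once a crawler has populated the Glue Data Catalog, ingesting it with DataHub comes down to a recipe with a `glue` source and a sink. The following is a minimal sketch, not a complete configuration: `build_glue_recipe` is a hypothetical helper, and the region and server values are placeholders you would replace with your own.

```python
import json


def build_glue_recipe(aws_region: str, datahub_server: str) -> dict:
    """Build a minimal DataHub recipe dict for the Glue source.

    Hypothetical helper for illustration. Only the region and the sink
    address are set; credentials are assumed to come from the usual AWS
    credential chain (environment variables, profile, or instance role).
    """
    return {
        "source": {
            "type": "glue",
            "config": {"aws_region": aws_region},
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": datahub_server},
        },
    }


if __name__ == "__main__":
    # Placeholder values; adjust to your AWS region and DataHub instance.
    recipe = build_glue_recipe("us-west-2", "http://localhost:8080")
    print(json.dumps(recipe, indent=2))
```

The same structure can be written as a YAML recipe file and run with the DataHub CLI. Assuming the Glue plugin of the `acryl-datahub` package is installed, the dict can also be run programmatically through `datahub.ingestion.run.pipeline.Pipeline.create(recipe)`.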
| Source Concept | DataHub Concept | Notes |
|---|---|---|
| "glue" | Data Platform | |
| Glue Database | Container | Subtype Database |
| Glue Table | Dataset | Subtype Table |
| Glue Job | Data Flow | |
| Glue Job Transform | Data Job | |
| Glue Job Data source | Dataset | |
| Glue Job Data sink | Dataset | |
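
To make the mapping concrete, the sketch below restates it as a lookup table and shows what the resulting DataHub URNs could look like. It is illustrative only: the helper names are hypothetical, and the `<database>.<table>` dataset naming and the default `PROD` environment are assumptions that may differ depending on your source configuration.

```python
GLUE_PLATFORM_URN = "urn:li:dataPlatform:glue"

# Glue concept -> DataHub concept, as listed in the mapping table.
CONCEPT_MAPPING = {
    "Glue Database": "Container",
    "Glue Table": "Dataset",
    "Glue Job": "Data Flow",
    "Glue Job Transform": "Data Job",
    "Glue Job Data source": "Dataset",
    "Glue Job Data sink": "Dataset",
}


def glue_table_urn(database: str, table: str, env: str = "PROD") -> str:
    """Dataset URN for a Glue table (assumes '<database>.<table>' naming)."""
    return f"urn:li:dataset:({GLUE_PLATFORM_URN},{database}.{table},{env})"


def glue_job_urn(job_name: str, env: str = "PROD") -> str:
    """Data Flow URN for a Glue job (assumes the job name is the flow id)."""
    return f"urn:li:dataFlow:(glue,{job_name},{env})"


if __name__ == "__main__":
    # Hypothetical database, table, and job names.
    print(CONCEPT_MAPPING["Glue Table"], glue_table_urn("sales", "orders"))
    print(CONCEPT_MAPPING["Glue Job"], glue_job_urn("nightly_etl"))
```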