docs/managed-datahub/welcome-acryl.md
Welcome to DataHub Cloud! We at DataHub are on a mission to make data reliable by bringing clarity to the who, what, when, & how of your data ecosystem. We're thrilled to be on this journey with you, and we cannot wait to see what we build together!
Close communication is not only welcomed, but highly encouraged. For all questions, concerns, & feedback, please reach out to us directly at [email protected].
Before you go further, you'll need to have a DataHub instance provisioned. The DataHub integrations team will provide you with the following once it has been deployed:
Once you have these, you're ready to go.
:::info
If you wish to have a private connection to your DataHub instance, DataHub Cloud supports AWS PrivateLink to complete this connection to your existing AWS account. Please see more details here.
:::
DataHub Cloud currently supports the following means to log into a DataHub instance:
.well-known/openid-configuration. Sometimes, identity providers will not explicitly include this URL in their setup guides, though this endpoint will exist as per the OIDC specification. For more info see here.

The callback URL to register in your Identity Provider will be
https://your-acryl-domain.acryl.io/callback/oidc
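
As a rough illustration of how these two URLs are formed, the sketch below derives the discovery URL from an issuer and the callback URL from a DataHub domain. The helper names and the `idp.example.com` issuer are hypothetical and used only for illustration; substitute the values from your own Identity Provider and DataHub instance.

```python
from urllib.parse import urlsplit

# Path suffix defined by the OIDC discovery convention.
WELL_KNOWN_SUFFIX = "/.well-known/openid-configuration"


def discovery_url(issuer: str) -> str:
    """Build the discovery document URL for an issuer (hypothetical helper)."""
    # Drop any trailing slash on the issuer before appending the suffix.
    return issuer.rstrip("/") + WELL_KNOWN_SUFFIX


def callback_url(datahub_domain: str) -> str:
    """Build the callback URL to register with the Identity Provider."""
    return f"https://{datahub_domain}/callback/oidc"


def is_discovery_url(url: str) -> bool:
    """Check that a URL points at the well-known discovery path."""
    return urlsplit(url).path.endswith(WELL_KNOWN_SUFFIX)


# Placeholder values, not real endpoints.
print(discovery_url("https://idp.example.com/"))
print(callback_url("your-acryl-domain.acryl.io"))
```

If your Identity Provider's setup guide does not list the discovery URL, appending the suffix to the issuer URL in this way is usually a reasonable first thing to try.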
Note that we do not yet support LDAP or SAML authentication. Please let us know if either of these integrations would be useful for your organization.
DataHub Cloud is first and foremost a metadata Search & Discovery product. As such, the two most important parts of the experience are
DataHub Cloud employs a push-based metadata ingestion model. In practice, this means running a DataHub-provided agent inside your organization's infrastructure and pushing the collected metadata out to your DataHub instance in the cloud. One benefit of this approach is that metadata can be aggregated across any number of distributed sources, regardless of form or location.
This approach comes with another benefit: security. By managing your own instance of the agent, you can keep secrets and credentials within your walled garden. There is no need to upload secrets & keys into a third-party cloud tool.
To push metadata into DataHub, DataHub Cloud provides an ingestion framework written in Python. Typically, push jobs are run on a schedule at an interval of your choosing. For our step-by-step guide on ingestion, click here.
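
To make the push model more concrete, here is a minimal sketch of assembling an ingestion recipe (a source plus a sink) in Python. It assumes a Postgres source and the `datahub-rest` sink; the `build_recipe` helper, the server URL, the connection details, and the environment variable names are all illustrative placeholders rather than values from your deployment. Credentials are written as `${VAR}` references so the secrets stay in your own environment rather than in the recipe itself.

```python
import json


def build_recipe(source_type, source_config, server, token_env_var="DATAHUB_TOKEN"):
    """Assemble a recipe dict: where to read metadata from and where to push it.

    Hypothetical helper for illustration. The token is stored as a ${VAR}
    reference instead of a literal value, so no secret is embedded here.
    """
    return {
        "source": {"type": source_type, "config": dict(source_config)},
        "sink": {
            "type": "datahub-rest",
            "config": {
                "server": server,
                "token": "${" + token_env_var + "}",
            },
        },
    }


# Placeholder values; replace with your own source and instance details.
recipe = build_recipe(
    source_type="postgres",
    source_config={
        "host_port": "localhost:5432",
        "database": "analytics",
        "username": "${PG_USER}",
        "password": "${PG_PASSWORD}",
    },
    server="https://your-acryl-domain.acryl.io/gms",  # assumed endpoint form
)

print(json.dumps(recipe, indent=2))
```

A recipe like this would typically be saved as a YAML file and run on a schedule with the `datahub ingest -c <recipe file>` command from inside your own infrastructure; consult the ingestion guide for the exact options your sources need.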
There are two primary ways to find metadata: search and browse. Both can be accessed via the DataHub home page.
By default, we provide rich search capabilities across your ingested metadata. This includes the ability to search by tags, descriptions, column names, column descriptions, and more using the global search bar found on the home page.