docs/quick-ingestion-guides/redshift/setup.md
To configure ingestion from Redshift, you'll need a User configured with the proper permission sets.
This setup guide will walk you through the steps you'll need to take in your Amazon Redshift cluster.
CREATE USER datahub WITH PASSWORD 'Datahub1234';
datahub user. For most users, the recommended set below will be sufficient:For a typical provisioned cluster with default settings:
-- Core system access (required for lineage and usage statistics)
ALTER USER datahub WITH SYSLOG ACCESS UNRESTRICTED;
-- Core metadata extraction (always required)
GRANT SELECT ON pg_catalog.svv_redshift_databases TO datahub;
GRANT SELECT ON pg_catalog.svv_redshift_schemas TO datahub;
GRANT SELECT ON pg_catalog.svv_external_schemas TO datahub;
GRANT SELECT ON pg_catalog.svv_table_info TO datahub;
GRANT SELECT ON pg_catalog.svv_external_tables TO datahub;
GRANT SELECT ON pg_catalog.svv_external_columns TO datahub;
GRANT SELECT ON pg_catalog.pg_class_info TO datahub;
-- Essential pg_catalog tables for table discovery
GRANT SELECT ON pg_catalog.pg_class TO datahub;
GRANT SELECT ON pg_catalog.pg_namespace TO datahub;
GRANT SELECT ON pg_catalog.pg_description TO datahub;
GRANT SELECT ON pg_catalog.pg_database TO datahub;
GRANT SELECT ON pg_catalog.pg_attribute TO datahub;
GRANT SELECT ON pg_catalog.pg_attrdef TO datahub;
-- Datashare lineage (enabled by default)
GRANT SELECT ON pg_catalog.svv_datashares TO datahub;
-- Provisioned cluster materialized views
GRANT SELECT ON pg_catalog.stv_mv_info TO datahub;
For Serverless Workgroups:
-- Use these instead of stv_mv_info (from Provisioned section above)
GRANT SELECT ON pg_catalog.svv_user_info TO datahub;
GRANT SELECT ON pg_catalog.svv_mv_info TO datahub;
For Shared Databases (Datashare Consumers):
-- Required when is_shared_database = True
GRANT SELECT ON pg_catalog.svv_redshift_tables TO datahub;
GRANT SELECT ON pg_catalog.svv_redshift_columns TO datahub;
Important: The above permissions only provide access to metadata. For data profiling, classification, or any feature that reads actual table data, you need:
-- Schema access (required to access tables within schemas)
GRANT USAGE ON SCHEMA public TO datahub;
GRANT USAGE ON SCHEMA your_schema_name TO datahub;
-- Table data access (required for profiling and classification)
GRANT SELECT ON ALL TABLES IN SCHEMA public TO datahub;
GRANT SELECT ON ALL TABLES IN SCHEMA your_schema_name TO datahub;
-- For production environments (future tables/views):
-- IMPORTANT: Default privileges only apply to objects created by the user who runs this command
-- Option 1: If you (as admin) will create all future tables/views:
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
ALTER DEFAULT PRIVILEGES IN SCHEMA your_schema_name GRANT SELECT ON TABLES TO datahub;
ALTER DEFAULT PRIVILEGES IN SCHEMA your_schema_name GRANT SELECT ON VIEWS TO datahub;
-- Option 2: If other users will create tables/views, run this for each user:
-- ALTER DEFAULT PRIVILEGES FOR ROLE other_user_name IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
-- ALTER DEFAULT PRIVILEGES FOR ROLE other_user_name IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
-- Option 3: For all future users (requires superuser):
-- ALTER DEFAULT PRIVILEGES FOR ALL ROLES IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
-- ALTER DEFAULT PRIVILEGES FOR ALL ROLES IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
:::caution Data Access vs Metadata Access
The permissions are split into two categories:
Default privileges only apply to objects created by the user who ran the ALTER DEFAULT PRIVILEGES command. If multiple users create tables in your schemas, you need to:
FOR ROLE other_user_name for each user who creates objects, ORFOR ALL ROLES (requires superuser privileges)Common gotcha: If User A runs ALTER DEFAULT PRIVILEGES and User B creates a table, DataHub won't have access to User B's table unless you used Option 2 or 3 above.
Alternative approach: Instead of default privileges, consider using a scheduled job to periodically grant access to new tables:
-- Run this periodically to catch new tables
GRANT SELECT ON ALL TABLES IN SCHEMA your_schema_name TO datahub;
:::
Once you've confirmed all of the above in Redshift, it's time to move on to configure the actual ingestion source within the DataHub UI.