metadata-ingestion/docs/sources/starrocks/starrocks_post.md
Stateful ingestion is supported and tracks previously ingested entities. When enabled, it can automatically soft-delete entities that are no longer present in StarRocks. To use it, set a `pipeline_name` and enable stateful ingestion:
```yaml
pipeline_name: starrocks_ingestion
source:
  type: starrocks
  config:
    stateful_ingestion:
      enabled: true
```
The connector discovers all catalogs by default, including external catalogs (Hive, Iceberg, Hudi, Delta Lake). You can control this with `include_external_catalogs` and `catalog_pattern`.
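As a rough sketch of how these two options might be combined, the recipe below keeps external catalogs enabled but narrows which ones are ingested. It assumes `catalog_pattern` accepts the `allow`/`deny` regex lists used by other DataHub pattern options. The catalog names other than `default_catalog` (the StarRocks internal catalog) are hypothetical, so adjust them to your environment.

```yaml
source:
  type: starrocks
  config:
    # Keep discovery of external catalogs enabled.
    include_external_catalogs: true
    # Assumption: catalog_pattern takes allow/deny regex lists,
    # like other DataHub *_pattern options.
    catalog_pattern:
      allow:
        - "^default_catalog$" # StarRocks internal catalog
        - "^hive_.*"          # hypothetical external catalog names
      deny:
        - ".*_staging$"       # hypothetical catalogs to skip
```

Setting `include_external_catalogs` to `false` would presumably limit ingestion to the internal catalog, in which case the pattern only needs to cover `default_catalog`.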
If tables from external catalogs are missing, verify that the ingestion user has USAGE privileges on those catalogs and SELECT on the tables. Check the ingestion logs for warnings about specific databases or tables that failed during reflection.
StarRocks supports complex types (e.g., `ARRAY(JSON())`) that may not map directly to DataHub types. These columns are still ingested, but their parsed field type is not populated.