metadata-integration/java/docs/sdk-v2/entities-overview.md
SDK V2 provides high-level entity classes that represent DataHub metadata entities. This guide covers common patterns across all entity types.
SDK V2 supports all major DataHub entity types including Dataset, Chart, Dashboard, Container, DataFlow, DataJob, MLModel, and MLModelGroup. Each entity type has a dedicated guide with comprehensive examples and API documentation.
See the Entity Guides section in the sidebar for the complete list of available entities and their documentation.
All entities follow a consistent lifecycle:
// 1. Construction
Dataset dataset = Dataset.builder()
.platform("snowflake")
.name("my_table")
.build();
// 2. Metadata Addition
dataset.addTag("pii")
.addOwner("urn:li:corpuser:john", OwnershipType.TECHNICAL_OWNER);
// 3. Persistence
client.entities().upsert(dataset);
// 4. Loading
Dataset loaded = client.entities().get(datasetUrn);
Each entity type has a strongly-typed URN class:
// Dataset
DatasetUrn datasetUrn = dataset.getDatasetUrn();
// Chart
ChartUrn chartUrn = chart.getChartUrn();
// Dashboard
DashboardUrn dashboardUrn = dashboard.getDashboardUrn();
// DataJob
DataJobUrn dataJobUrn = dataJob.getDataJobUrn();
// Generic access
Urn genericUrn = entity.getUrn();
URN Construction:
// Built automatically from builder
Dataset dataset = Dataset.builder()
.platform("snowflake")
.name("db.schema.table")
.env("PROD")
.build();
// URN: urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.table,PROD)
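To make the URN layout in the comment above concrete, here is a minimal standalone sketch of how the three builder inputs map onto the final string. `DatasetUrnSketch` and `datasetUrn` are hypothetical names used only for illustration; they are not part of the SDK, which builds the URN for you.

```java
// Hypothetical helper, not part of the SDK: it only mirrors the URN layout shown above.
final class DatasetUrnSketch {
    private DatasetUrnSketch() {}

    /** Composes a dataset URN string from platform, name, and environment. */
    static String datasetUrn(String platform, String name, String env) {
        return String.format(
            "urn:li:dataset:(urn:li:dataPlatform:%s,%s,%s)", platform, name, env);
    }

    public static void main(String[] args) {
        // prints urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.table,PROD)
        System.out.println(datasetUrn("snowflake", "db.schema.table", "PROD"));
    }
}
```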
All mutation methods return this for method chaining:
dataset.addTag("pii")
.addTag("analytics")
.addOwner("urn:li:corpuser:owner1", OwnershipType.TECHNICAL_OWNER)
.addCustomProperty("team", "data-eng")
.setDomain("urn:li:domain:Finance");
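The chaining pattern itself can be sketched without the SDK. The class below is a hypothetical stand-in for an entity (the name `FluentEntitySketch` and its fields are assumptions), showing only that each mutator returns the same instance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an entity class; only the chaining pattern is the point.
final class FluentEntitySketch {
    private final List<String> tags = new ArrayList<>();
    private String domain;

    FluentEntitySketch addTag(String tag) {
        tags.add(tag);
        return this; // returning the same instance lets the next call chain
    }

    FluentEntitySketch setDomain(String domainUrn) {
        this.domain = domainUrn;
        return this;
    }

    List<String> tags() {
        return Collections.unmodifiableList(tags);
    }

    String domain() {
        return domain;
    }

    public static void main(String[] args) {
        FluentEntitySketch entity = new FluentEntitySketch()
            .addTag("pii")
            .addTag("analytics")
            .setDomain("urn:li:domain:Finance");
        System.out.println(entity.tags() + " " + entity.domain());
    }
}
```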
Built with the builder:
Dataset dataset = Dataset.builder()
.platform("snowflake")
.name("my_table")
.description("My description")
.build();
// aspectCache populated with builder aspects
// pendingPatches empty
Loaded from the server:
DatasetUrn urn = new DatasetUrn("snowflake", "my_table", "PROD");
Dataset dataset = client.entities().get(urn);
// aspectCache populated with server aspects
// Aspects have timestamps for freshness tracking
// Entity bound to client enables lazy loading
Bound to a client for lazy loading:
Dataset dataset = client.entities().reference(urn);
dataset.bindToClient(client, mode);
String desc = dataset.getDescription(); // Fetches on first access
Entities cache aspects locally with TTL-based freshness:
// Cached aspects with 60-second default TTL
dataset.setCacheTtlMs(120000); // 2 minutes
// Check cache status
Map<String, RecordTemplate> aspects = dataset.getAllAspects();
When bound to a client, aspects are fetched on-demand:
Dataset dataset = client.entities().get(urn);
// aspectCache may not have all aspects
String description = dataset.getDescription();
// Triggers lazy fetch if not cached or expired
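The interaction between TTL freshness and lazy fetching can be modeled in a few lines. This is a simplified sketch, not the SDK's implementation: `AspectCacheSketch`, the string-valued aspects, the injected clock, and the `fetcher` function standing in for a server call are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical model of a TTL cache with lazy loading; all names are illustrative.
final class AspectCacheSketch {
    private static final class Entry {
        final String value;
        final long fetchedAtMs;

        Entry(String value, long fetchedAtMs) {
            this.value = value;
            this.fetchedAtMs = fetchedAtMs;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Function<String, String> fetcher; // stands in for a server round trip
    private final LongSupplier clock;               // injectable so tests control time
    private long ttlMs = 60_000;                    // 60-second default, as described above
    int fetchCount = 0;

    AspectCacheSketch(Function<String, String> fetcher, LongSupplier clock) {
        this.fetcher = fetcher;
        this.clock = clock;
    }

    void setCacheTtlMs(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    /** Returns the cached value, fetching first if it is missing or older than the TTL. */
    String get(String aspectName) {
        long now = clock.getAsLong();
        Entry entry = cache.get(aspectName);
        if (entry == null || now - entry.fetchedAtMs >= ttlMs) {
            entry = new Entry(fetcher.apply(aspectName), now);
            cache.put(aspectName, entry);
            fetchCount++;
        }
        return entry.value;
    }

    public static void main(String[] args) {
        long[] now = {0};
        AspectCacheSketch cache = new AspectCacheSketch(name -> name + "-value", () -> now[0]);
        cache.get("datasetProperties"); // miss: fetches
        cache.get("datasetProperties"); // hit: served from cache
        now[0] = 60_000;                // default TTL has elapsed
        cache.get("datasetProperties"); // expired: fetches again
        System.out.println(cache.fetchCount); // prints 2
    }
}
```

Injecting the clock keeps the expiry logic testable without sleeping; a real client would typically read the system clock.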
Mutations accumulate as patches until save:
dataset.addTag("tag1"); // Creates patch
dataset.addTag("tag2"); // Creates patch
dataset.addOwner("user", OwnershipType.TECHNICAL_OWNER); // Creates patch
// Check pending patches
boolean hasPending = dataset.hasPendingPatches();
List<MetadataChangeProposal> patches = dataset.getPendingPatches();
// Emit all patches
client.entities().update(dataset);
// Patches cleared after emission
dataset.clearPendingPatches();
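The accumulate-then-emit flow above can be pictured as a simple queue. In this sketch, patches are plain strings rather than `MetadataChangeProposal` objects, and `PendingPatchesSketch` and `drain` are hypothetical names, so treat it as a mental model rather than the SDK's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: patches are plain strings here, not MetadataChangeProposal objects.
final class PendingPatchesSketch {
    private final List<String> pendingPatches = new ArrayList<>();

    PendingPatchesSketch addTag(String tag) {
        pendingPatches.add("ADD tag " + tag); // record the change instead of sending it
        return this;
    }

    boolean hasPendingPatches() {
        return !pendingPatches.isEmpty();
    }

    /** Returns the queued patches in insertion order and empties the queue. */
    List<String> drain() {
        List<String> emitted = new ArrayList<>(pendingPatches);
        pendingPatches.clear();
        return emitted;
    }

    public static void main(String[] args) {
        PendingPatchesSketch entity = new PendingPatchesSketch().addTag("tag1").addTag("tag2");
        System.out.println(entity.hasPendingPatches()); // prints true
        System.out.println(entity.drain().size());      // prints 2
        System.out.println(entity.hasPendingPatches()); // prints false
    }
}
```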
The upsert() method emits everything accumulated on the entity:
- Cached aspects (from the builder)
- Full aspect replacements (from set*() methods)
- Patches (from add*/remove* methods)
What gets sent depends on how the entity was created and what operations you performed:
Builder with patches:
Dataset dataset = Dataset.builder()
.platform("snowflake")
.name("my_table")
.description("Description")
.build();
dataset.addTag("pii"); // Creates patch
client.entities().upsert(dataset); // Sends: cached aspects + tag patch
Patches only (loaded entity):
Dataset dataset = client.entities().get(urn);
dataset.addTag("pii"); // Creates patch
client.entities().upsert(dataset); // Sends: tag patch only
Full aspect replacement:
Dataset dataset = client.entities().get(urn);
dataset.setDescription("New description"); // Creates full aspect MCP
client.entities().upsert(dataset); // Sends: complete description aspect
Combined operations:
Dataset dataset = Dataset.builder()
.platform("snowflake")
.name("my_table")
.build();
dataset.setDescription("Description"); // Full aspect MCP
dataset.addTag("pii"); // Patch
dataset.addOwner("user", OwnershipType.TECHNICAL_OWNER); // Patch
client.entities().upsert(dataset); // Sends: cached aspects + description aspect + 2 patches
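The four scenarios above follow one rule, which the sketch below models with strings in place of real MCPs. `UpsertPlanSketch` and its methods are hypothetical, and the emission order it uses (cached aspects, then full aspects, then patches) is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of what upsert() sends; strings stand in for real MCPs.
final class UpsertPlanSketch {
    private final List<String> builderAspects = new ArrayList<>(); // empty for loaded entities
    private final List<String> fullAspects = new ArrayList<>();    // queued by set*() calls
    private final List<String> patches = new ArrayList<>();        // queued by add*/remove* calls

    /** Models an entity created through a builder, which carries its own aspects. */
    static UpsertPlanSketch fromBuilder(String... aspects) {
        UpsertPlanSketch sketch = new UpsertPlanSketch();
        sketch.builderAspects.addAll(Arrays.asList(aspects));
        return sketch;
    }

    /** Models an entity loaded from the server, which starts with nothing to send. */
    static UpsertPlanSketch loaded() {
        return new UpsertPlanSketch();
    }

    UpsertPlanSketch set(String aspect) {
        fullAspects.add("aspect:" + aspect);
        return this;
    }

    UpsertPlanSketch add(String change) {
        patches.add("patch:" + change);
        return this;
    }

    /** Everything a single upsert would emit; the ordering here is an assumption. */
    List<String> plan() {
        List<String> out = new ArrayList<>();
        for (String aspect : builderAspects) {
            out.add("cached:" + aspect);
        }
        out.addAll(fullAspects);
        out.addAll(patches);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fromBuilder("datasetProperties").set("description").add("tag").plan());
        System.out.println(loaded().add("tag").plan());
    }
}
```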
Entities respect the client's operation mode:
// SDK mode client
DataHubClientV2 sdkClient = DataHubClientV2.builder()
.server("http://localhost:8080")
.operationMode(OperationMode.SDK) // Default
.build();
dataset.setDescription("User description");
sdkClient.entities().upsert(dataset);
// Writes to editableDatasetProperties
// INGESTION mode client
DataHubClientV2 ingestionClient = DataHubClientV2.builder()
.server("http://localhost:8080")
.operationMode(OperationMode.INGESTION)
.build();
dataset.setDescription("Ingested description");
ingestionClient.entities().upsert(dataset);
// Writes to datasetProperties
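The routing described in the comments above amounts to a lookup from operation mode to target aspect. The sketch below is a hypothetical mirror of that rule for dataset descriptions only; the `Mode` enum and `descriptionAspectFor` are illustrative names, and the real SDK performs this selection internally.

```java
// Hypothetical mirror of the mode-to-aspect routing described above.
final class OperationModeSketch {
    enum Mode { SDK, INGESTION }

    /** Returns the aspect name a dataset description is written to in the given mode. */
    static String descriptionAspectFor(Mode mode) {
        switch (mode) {
            case SDK:
                return "editableDatasetProperties"; // user-authored edits
            case INGESTION:
                return "datasetProperties";         // values from the source system
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(descriptionAspectFor(Mode.SDK));       // prints editableDatasetProperties
        System.out.println(descriptionAspectFor(Mode.INGESTION)); // prints datasetProperties
    }
}
```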
// Add tags
entity.addTag("pii");
entity.addTag("urn:li:tag:analytics");
// Remove tags
entity.removeTag("pii");
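The tag calls above accept both a bare name and a full URN. One plausible way to support both, sketched here as an assumption rather than the SDK's actual logic, is to prefix bare names with `urn:li:tag:`. `TagUrnSketch` and `toTagUrn` are hypothetical names.

```java
// Hypothetical normalizer; the SDK's real handling of bare tag names may differ.
final class TagUrnSketch {
    private static final String PREFIX = "urn:li:tag:";

    private TagUrnSketch() {}

    /** Leaves full tag URNs untouched and prefixes bare names. */
    static String toTagUrn(String tag) {
        return tag.startsWith(PREFIX) ? tag : PREFIX + tag;
    }

    public static void main(String[] args) {
        System.out.println(toTagUrn("pii"));                  // prints urn:li:tag:pii
        System.out.println(toTagUrn("urn:li:tag:analytics")); // prints urn:li:tag:analytics
    }
}
```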
import com.linkedin.common.OwnershipType;
// Add owners
entity.addOwner("urn:li:corpuser:john", OwnershipType.TECHNICAL_OWNER);
entity.addOwner("urn:li:corpuser:jane", OwnershipType.DATA_STEWARD);
// Remove owners
entity.removeOwner("urn:li:corpuser:john");
// Add terms
entity.addTerm("urn:li:glossaryTerm:CustomerData");
entity.addTerm("urn:li:glossaryTerm:PII");
// Remove terms
entity.removeTerm("urn:li:glossaryTerm:CustomerData");
// Set domain
entity.setDomain("urn:li:domain:Marketing");
// Remove domain
entity.removeDomain();
// Add custom properties
entity.addCustomProperty("team", "data-engineering");
entity.addCustomProperty("retention", "90_days");
// Remove custom properties
entity.removeCustomProperty("retention");
Handle exceptions gracefully:
try {
client.entities().upsert(dataset);
} catch (IOException e) {
// Network or serialization errors
log.error("I/O error: {}", e.getMessage());
} catch (ExecutionException e) {
// Server-side errors
log.error("Server error: {}", e.getCause().getMessage());
} catch (InterruptedException e) {
// Operation cancelled
Thread.currentThread().interrupt();
log.error("Operation interrupted");
}
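The `ExecutionException` branch above unwraps `getCause()` because the useful failure is nested inside it. The standalone sketch below reproduces that with a plain `CompletableFuture`; it assumes a `Future`-style response, and `UpsertErrorSketch` and `await` are hypothetical names, not SDK API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical helper showing how a failed asynchronous call surfaces its cause.
final class UpsertErrorSketch {
    private UpsertErrorSketch() {}

    /** Waits on a future and maps each outcome to a short label. */
    static String await(Future<String> response) {
        try {
            return "ok: " + response.get();
        } catch (ExecutionException e) {
            // The original failure is wrapped; fall back to the wrapper if there is no cause.
            Throwable cause = e.getCause() != null ? e.getCause() : e;
            return "server error: " + cause.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag for callers
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("HTTP 500"));
        System.out.println(await(failed));                                   // prints server error: HTTP 500
        System.out.println(await(CompletableFuture.completedFuture("200"))); // prints ok: 200
    }
}
```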
Each entity type has a dedicated guide with detailed, entity-specific information.
See the Entity Guides section in the sidebar for the complete list of available entity documentation.
Example guides include Dataset Entity, Chart Entity, Dashboard Entity, and DataJob Entity.