src/backend/base/langflow/api/v1/mappers/deployments/RULES.md
This document captures the architecture and contract rules for the deployments implementation.
The goal is strict separation of concerns between:
- Routes (`src/backend/base/langflow/api/v1`)
- Mappers (`src/backend/base/langflow/api/v1/mappers/deployments`)
- Adapters (`src/backend/base/langflow/services/adapters/deployment/*`)
- lfx adapters (`src/lfx/src/lfx/services/adapters/deployment/*`)

`deployments.py` must not import or branch on provider-specific payload models, constants, slot names, or parser logic.
Allowed in routes:
Not allowed in routes:
The mapper translates API payloads to Adapters and the Langflow DB:
- Translating `provider_data` into adapter-layer input models (e.g. `VerifyCredentials`, `AdapterDeploymentCreate`). The adapter then makes the actual provider SDK/network calls.
- Assembling DB-layer models (e.g. `resolve_provider_account_create` returns a `DeploymentProviderAccount` model and `resolve_provider_account_update` returns the full update diff dict).

Mapper responsibility includes:
- Preparing `provider_data` for DB storage

Mapper responsibility excludes:
Adapter responsibility includes:
Adapter should not:
Do not expose flow_version_id as an adapter contract requirement.
Use adapter-neutral correlation:
- `source_ref: str`

Langflow can choose to populate `source_ref` with serialized flow-version IDs, but this is an implementation decision at the mapper/API boundary.
Do not rely on ordering between:
- `flow_version_ids`
- `snapshot_ids`

Instead require explicit create-time bindings:
- `{ source_ref, snapshot_id }`

Mapper reconciliation outputs must have explicit schemas, not ad-hoc dict/list assumptions.
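As one illustration of such explicit schemas, the create-time binding pair could be modeled roughly like this (a sketch only; field names beyond `source_ref` and `snapshot_id` are assumptions, not the actual contract):

```python
from pydantic import BaseModel


class CreateSnapshotBinding(BaseModel):
    """One explicit create-time binding: adapter-neutral ref -> provider snapshot."""

    source_ref: str
    snapshot_id: str


class CreateSnapshotBindings(BaseModel):
    """Typed container so consumers never index into an ad-hoc list of dicts."""

    bindings: list[CreateSnapshotBinding]


result = CreateSnapshotBindings(
    bindings=[CreateSnapshotBinding(source_ref="fv:1", snapshot_id="snap-a")]
)
```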
Baseline contracts:
- `CreateFlowArtifactProviderData`
- `CreateSnapshotBinding`
- `CreateSnapshotBindings`
- `CreatedSnapshotIds`
- `FlowVersionPatch`

Public mapper methods must clearly indicate:
Generic provider data typing belongs with payload taxonomy definitions (payloads.py) when tied to payload slots.
If a generic is used for slot shape constraints, pair it with a BaseModel-bound generic for slot declarations.
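A minimal sketch of that pairing, assuming a pydantic-based slot system (class and variable names here are hypothetical):

```python
from typing import Generic, TypeVar

from pydantic import BaseModel

# Bound generic: a slot declaration can only reference a validated schema model.
SlotModelT = TypeVar("SlotModelT", bound=BaseModel)


class SlotDeclaration(Generic[SlotModelT]):
    """Pairs a slot name with the BaseModel subclass that constrains its shape."""

    def __init__(self, name: str, model: type[SlotModelT]) -> None:
        self.name = name
        self.model = model


class FlowArtifactData(BaseModel):
    artifact_id: str


flow_artifact_slot = SlotDeclaration("flow_artifact", FlowArtifactData)
```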
Mapper contract models are the last stop before route orchestration consumes data.
Rules:
Validation of provider payload/result shape should happen through configured payload slots in the appropriate layer:
Do not bypass slot validation with plain dict assumptions where a slot exists.
Slot names should be concise and aligned with existing naming conventions.
Example rule applied:
- `flow_artifact` over verbose suffixes like `flow_artifact_provider_data`

Provider-specific create/update reconciliation data should be emitted through dedicated result slots:
- `deployment_create_result`
- `deployment_update_result`

Do not stash custom reconciliation fields in untyped generic dicts without slot-backed schemas.
When a payload/result shape is already represented by a known slot model or explicit schema model:
- Reuse the registry model (e.g. `DeploymentPayloadSchemas`) as the source of truth instead of duplicating free-standing slot constants.

For required provider reconciliation/result payloads:
When a mapper/adapter boundary payload is required by the active contract:
When a provider defines a canonical DeploymentPayloadSchemas registry object shared by adapter + mapper:
- Treat the `payloads.py` module as the canonical ownership location for provider payload/result contract models and the registry instance.
- Expose the registry instance as `PAYLOAD_SCHEMAS`.
- Import and reference it as `PAYLOAD_SCHEMAS` (no aliasing).

`PayloadSlot.parse` is the canonical boundary for provider payload/result parsing.
Rules:
- `raw=None` must fail fast via `AdapterPayloadMissingError` (no silent defaults).
- For non-`None` input, use direct `adapter_model.model_validate(raw)` as the parse path.
- Accept whatever `model_validate` supports for the model contract (for example dict or model-like input), while keeping missing-payload behavior explicit.

Provider mappers sit at the API boundary and must avoid coupling to adapter-owned model symbols when slot contracts already exist.
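A hedged sketch of those parse rules: the `PayloadSlot` and `AdapterPayloadMissingError` names come from this document, but the internals shown are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Generic, TypeVar

from pydantic import BaseModel

ModelT = TypeVar("ModelT", bound=BaseModel)


class AdapterPayloadMissingError(RuntimeError):
    """Raised when a required slot payload is absent."""


@dataclass(frozen=True)
class PayloadSlot(Generic[ModelT]):
    name: str
    adapter_model: type[ModelT]

    def parse(self, raw: Any) -> ModelT:
        if raw is None:
            # Fail fast: missing payloads never become silent defaults.
            raise AdapterPayloadMissingError(f"payload missing for slot {self.name!r}")
        # Direct parse path: accept whatever model_validate supports.
        return self.adapter_model.model_validate(raw)


class DeploymentCreateResultData(BaseModel):
    resource_key: str


slot = PayloadSlot("deployment_create_result", DeploymentCreateResultData)
```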
Rules:
- Use API-owned models (e.g. `WatsonxApi*`) for API input shaping/branching/output.
- Do not import adapter-owned result models (e.g. `Watsonx*ResultData`).
- Parse provider payloads/results through `PAYLOAD_SCHEMAS` slots directly.

Rules:
- Avoid overly broad contract types such as `DeploymentUpdateResult`, `ExecutionCreateResult`, `ExecutionStatusResult` (or similar) at mapper boundaries.
- Avoid `...[AdapterPayload]` in mapper signatures unless the boundary truly guarantees dict-only inputs.

Keep method naming families distinct by purpose:
- `resolve_*` for API input resolution/translation
- `shape_*` for outbound API shaping
- `util_*` for reconciliation/util extraction helpers

Avoid overlapping verbs that blur intent (resolve vs reconcile, etc.).
Utility methods used by route orchestration (snapshot bindings, created snapshot IDs, flow-version patch extraction, flow artifact provider data construction) should use the utility naming family consistently.
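The three naming families can be sketched as a skeleton (an illustration only; method bodies and signatures are not the real interface):

```python
class ExampleDeploymentMapper:
    """Illustrates the three naming families; not the actual base class."""

    def resolve_deployment_create(self, payload: dict) -> dict:
        # resolve_*: translate inbound API payloads into adapter/DB inputs.
        return {"name": payload["name"]}

    def shape_deployment_response(self, db_fields: dict, provider_data: dict) -> dict:
        # shape_*: assemble outbound API responses.
        return {**db_fields, "provider_data": provider_data}

    def util_created_snapshot_ids(self, create_result: dict) -> list[str]:
        # util_*: reconciliation/extraction helpers consumed by routes.
        return [b["snapshot_id"] for b in create_result.get("bindings", [])]
```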
Do not co-locate registry infrastructure with base mapper behavior when it hurts contract readability.
Keep:
- `base.py`
- `registry.py`
- `contracts.py`

Examples:
These must be implemented as mapper overrides, not route conditionals.
When adapter service classes become heavy, move private create/update helpers into focused helper modules (for example core/create.py) to keep service orchestration lean and readable.
If a schema is part of externally consumed adapter payload/result shape, define it in payload/schema modules (payloads.py or schema.py), not as ad-hoc local classes in deep internal tool modules.
If a helper return type is consumed by adapter service orchestration (for example typed create/update apply results used by service.py), treat it as a public boundary contract even if it is not directly serialized over HTTP.
Rules:
- Keep heavy private helpers in dedicated helper modules (for example `*_helpers.py`).
- Provider payload/result contract models belong in `payloads.py`.
- Adapter-neutral/shared schemas belong in `schema.py`.

The mapper is the single component that understands a provider's credential shape and cross-field update rules. The API schema, the DB model, and the route are all intentionally unaware of provider-specific credential semantics.
Credential flow (API → DB):
- The API schema accepts `provider_data: dict[str, Any]`. It does not validate the dict's contents.
- The mapper's `resolve_credentials(provider_data=...)` validates and extracts credential DB fields (e.g. `{"api_key": "..."}` for WXO today). Mapper create/update assemblers own how these fields are applied.
- The DB stores credentials in typed fields (currently `api_key: str`). If a future provider requires a different storage layout (multiple columns, a serialised JSON blob, etc.), only the mapper and CRUD layer need to evolve — the route and schema remain unchanged.

Create assembly (API → DB):
- `resolve_provider_account_create(payload=..., user_id=...)` assembles the complete provider-account create model for CRUD, including provider URL, tenant/account identifiers, and credential fields.
- Routes call only `resolve_provider_account_create(...)`.
- Routes do not inspect `provider_data`; they delegate create assembly to the mapper.

Update assembly (API → DB):
- `resolve_provider_account_update(payload=..., existing_account=...)` assembles the complete update kwargs dict. Only fields present in `payload.model_fields_set` are included so the CRUD layer receives a minimal diff.
- Provider-specific mappers call `super()` for the common fields and only add their own cross-field rules.

DB model validator:
- The `DeploymentProviderAccount` model has a `model_validator` that calls `validate_tenant_url_consistency()`. This catches inconsistent tenant/URL pairs regardless of entry point — even if a future code path bypasses the mapper.
- Tenant/URL helpers live in `deployment_provider_account/utils.py` as the single source of truth. Both the model validator and the WXO create-path mapper logic delegate to the same `extract_tenant_from_url()` function.
- Tenant and URL values arrive via `provider_data`; mappers extract and normalize them before persistence.

Routes may:
Routes must not:
Provide and use a public mapper getter (get_deployment_mapper(provider_key)) to preserve a clear and symmetric contract alongside adapter getter usage.
Create-time attachment mapping must use:
- `source_ref -> flow_version_id`
- `source_ref -> snapshot_id`

with strict checks:
- Unknown or unmatched `source_ref` => error

Flow-version attachment add/remove operations should be represented via explicit patch semantics (`FlowVersionPatch`) and validated for no overlap.
Provider mappers may enforce provider-specific constraints for where patch operations are expressed (for example inside provider operations payload).
Missing required bindings or malformed provider result contracts should fail fast with explicit error messages.
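These strictness rules can be sketched as one reconciliation helper (name and signature are hypothetical; the checks are the ones stated above):

```python
def reconcile_create_bindings(
    flow_versions_by_ref: dict[str, str],  # source_ref -> flow_version_id
    snapshot_bindings: list[dict],         # [{"source_ref": ..., "snapshot_id": ...}]
) -> dict[str, tuple[str, str]]:
    """Map each source_ref to its (flow_version_id, snapshot_id), failing fast."""
    resolved: dict[str, tuple[str, str]] = {}
    for binding in snapshot_bindings:
        ref = binding["source_ref"]
        if ref in resolved:
            raise ValueError(f"duplicate source_ref in bindings: {ref!r}")
        if ref not in flow_versions_by_ref:
            raise ValueError(f"unknown source_ref in provider result: {ref!r}")
        resolved[ref] = (flow_versions_by_ref[ref], binding["snapshot_id"])
    missing = set(flow_versions_by_ref) - set(resolved)
    if missing:
        raise ValueError(f"missing snapshot bindings for source_refs: {sorted(missing)}")
    return resolved
```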
Required coverage areas:
When mapper methods return contract objects, assert against model types and fields, not raw loosely typed dict/list assumptions.
When method naming families change for semantic clarity, update tests immediately to prevent drift in contract language.
Use this checklist before merge:
- Create-time attachments use explicit `source_ref` bindings (not positional mapping)
- `PayloadSlot.parse` keeps `None` fail-fast + direct `model_validate` path (no ad-hoc pre-parse bypass)
- `payloads.py` is the canonical owner of provider payload/result contracts and the payload-schema registry
- Method naming families are used consistently (`resolve_*`, `shape_*`, `util_*`)
- Provider-specific behavior lives in overrides that call `super()`, not as base-class conditionals
- Credentials flow through `resolve_credentials`, not route-level assumptions about `provider_data` contents

Use this workflow for any deployments change that alters payload shape, reconciliation semantics, or ownership boundaries.
Identify the change category first:
The category determines whether compatibility bridges are required.
Apply changes in this order:
This keeps every boundary explicit while code is in transition.
For each contract evolution, explicitly choose one mode:
Default policy:
Rules:
When introducing new public contract types during refactors:
- Define them in `payloads.py` (provider-specific) or `schema.py` (adapter-neutral/shared)
- Use provider-prefixed names (e.g. `Watsonx...`) to prevent ambiguity

Any contract evolution PR must include tests that prove:
Following this process keeps layering explicit, reduces leakage during migrations, and makes cleanup predictable.
Both create and update follow a provider-first strategy: the provider is called first, then the Langflow DB is updated and committed. If the DB commit fails, the route issues a best-effort compensating call to the provider.
- On create, the compensating call is `adapter.delete()` to remove the provider resource. Secondary resources (snapshots, configs) are intentionally not cascade-deleted because they may be shared across deployments; they remain as orphaned provider-side resources.
- On update, the route builds a rollback payload via `resolve_rollback_update()`, then issues `adapter.update()`. If the mapper returns `None` (no rollback possible for this provider), provider state may diverge until it is independently detected (e.g., lazily synced in a read path).

Provider-first write is used uniformly. For create, the provider assigns the resource ID and snapshot IDs that the DB needs to store, so calling the provider first is the natural fit. For update, provider-first could be replaced with DB-first since the resource_key already exists, but provider-first was chosen for simplicity: both strategies need the pre-update state for rollback, but with provider-first that state already lives in the DB — the mapper can query flow_version_deployment_attachment rows at rollback time — whereas DB-first would require explicitly capturing name, description, and every removed attachment's provider_snapshot_id into memory before mutating.

Provider-first also avoids the two-commit flow that DB-first requires (one commit before the provider call, a second to fill in provider_snapshot_id values from the response), and eliminates the consistency window where other readers could observe DB state the provider hasn't processed yet. Since both compensating actions are best-effort either way, the simpler single-commit flow is preferred.
Rollback calls are always best-effort and wrapped in their own exception handling. A failed rollback must never mask the original commit error.
The mapper is responsible for constructing provider-specific rollback payloads from current DB state. The mapper reads the flow_version_deployment_attachment table to determine the pre-update attachment state and builds an adapter update payload that would restore the provider to that state.
The base mapper returns None (no generic rollback). Provider mappers override resolve_rollback_update when they can construct meaningful reverse operations.
Read-path synchronization operates at two levels and is an independent mechanism from write-path rollback. Synchronization detects and reconciles stale DB data caused by provider-side deletions — it does not undo or compensate for failed write-path operations.
- Checks whether `provider_snapshot_id` values in `flow_version_deployment_attachment` still exist in the provider. Stale attachment rows are removed to keep `attached_count` accurate.

Rollback and synchronization address different consistency problems:
When rollback is unavailable or fails, provider state may diverge from the DB. Synchronization operates from DB rows outward (checking whether each row's resource still exists in the provider), so it can only detect stale DB rows for deleted provider resources. It cannot detect orphaned provider resources that were never recorded in the DB (e.g. a failed create rollback), nor can it detect that an existing provider resource's state diverged after a failed update rollback.
Routes that perform write-path rollback must call session.commit() explicitly after staging all DB writes, rather than relying on session_scope() auto-commit. This allows the route to catch commit failures and issue compensating provider calls before re-raising.
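A sketch of that route-level pattern; the `session`/`adapter`/`mapper` objects are stand-ins, and only the ordering and error handling reflect the rules above:

```python
def apply_update_with_compensation(session, adapter, mapper, update_payload, attachment_rows):
    """Provider-first update with explicit commit and best-effort rollback."""
    adapter.update(update_payload)          # 1. provider first
    try:
        session.commit()                    # 2. explicit commit (no auto-commit)
    except Exception:
        rollback = mapper.resolve_rollback_update(attachment_rows)
        if rollback is not None:
            try:
                adapter.update(rollback)    # 3. best-effort compensation
            except Exception:
                pass                        # never mask the original commit error
        raise                               # 4. re-raise the commit failure
```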
API response schemas must keep a clear ownership boundary between Langflow-managed data and provider-managed data.
- Langflow-managed fields appear at the top level of the response (e.g. `deployment_id`, `name`, `created_at`).
- Provider-managed values are carried in the `provider_data` dict.

This prevents future collisions — for example, if Langflow starts persisting its own execution records, a top-level `execution_id` would be ambiguous against the provider's opaque run identifier.
Langflow-owned — fields derived from the Langflow database or assigned by Langflow logic:
- `deployment_id` (DB UUID), `id`, `name`, `description`, `type`
- `created_at`, `updated_at` (DB timestamps)
- `resource_key` — provider-originated but stored and indexed by Langflow, so treated as Langflow-owned once persisted.

Provider-owned — values returned by the external provider that Langflow passes through without persisting or interpreting:
- `execution_id` (the provider's opaque run identifier)
- `agent_id`, `status`, `result`, `started_at`, `completed_at`, `failed_at`, `cancelled_at`, `last_error`

When adding a new field to an execution or deployment response:
- Is it a pass-through provider value? → it goes into `provider_data`.
- Is it persisted and indexed by Langflow (like `resource_key`)? → top level is acceptable.

If data is not persisted in the Langflow DB and comes directly from the provider, it must go into `provider_data`.
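Applied to response shaping, the rule could look like this (an illustrative sketch; field names beyond those listed above are assumptions):

```python
def shape_execution_response(db_fields: dict, provider_result: dict) -> dict:
    """Langflow-owned fields at the top level; provider values pass through."""
    return {
        # Langflow-owned: persisted or assigned by Langflow.
        "deployment_id": db_fields["deployment_id"],
        "created_at": db_fields["created_at"],
        # Provider-owned: never promoted to the top level.
        "provider_data": {
            "execution_id": provider_result.get("execution_id"),
            "status": provider_result.get("status"),
        },
    }
```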
Rules:
- Unpersisted provider-returned values go into `provider_data` without exception.