docs/agents/dxf/README.md
This note is a navigation map for DXF (Distributed eXecution Framework). Use it to quickly find:
pkg/dxf/pkg/dxf/pkg/dxf/framework/pkg/dxf/importinto/pkg/ddl/pkg/executor/ and pkg/executor/importer/pkg/session/ and pkg/domain/doc.go), read that first.
pkg/dxf/framework/doc.go.pkg/dxf/framework/proto/: task type/step constants, task-subtask models,
and state machine enums.pkg/dxf/framework/handle/: task submission/control APIs (submit, wait,
pause/resume, cancel, modify).pkg/dxf/framework/storage/: task manager and dist-task table persistence.pkg/dxf/framework/scheduler/: owner-side scheduler manager and task
scheduler extension interfaces.pkg/dxf/framework/taskexecutor/: node-side executor manager and task
executor extension interfaces.pkg/dxf/framework/planner/: logical-plan to physical-plan/task creation
helpers used by DXF apps.pkg/dxf/operator/: operator/pipeline utilities reused by DXF apps.Runtime bootstrap and loops:
pkg/session/session.go: registers IMPORT INTO DXF scheduler/executor in
bootstrap (proto.ImportInto).pkg/ddl/ddl.go: registers DDL backfill DXF scheduler/executor and cleanup
hooks (proto.Backfill).pkg/domain/domain.go: starts executor manager on all nodes and starts/stops
scheduler manager based on DDL owner role.IMPORT INTO integration:
pkg/dxf/importinto/job.go: DXF task submission + job/task lifecycle bridge.pkg/dxf/importinto/planner.go: IMPORT INTO logical/physical planning by DXF
step.pkg/dxf/importinto/scheduler.go: IMPORT INTO scheduler extension.pkg/dxf/importinto/task_executor.go and
pkg/dxf/importinto/subtask_executor.go: executor + step executors.pkg/executor/import_into.go and pkg/executor/importer/import.go: SQL
executor/planning and user-facing behavior around IMPORT INTO.DDL backfill integration:
pkg/ddl/index.go: add-index flow that prepares/uses distributed backfill.pkg/ddl/backfilling_dist_scheduler.go: backfill scheduler extension.pkg/ddl/backfilling_dist_executor.go: backfill task executor extension.pkg/ddl/backfilling_clean_s3.go: backfill cleanup hook implementation.Framework internals:
pkg/dxf/framework/proto/task.go: task rank/order (priority, create_time,
id), slot-related task fields, and runtime slot calculation.pkg/dxf/framework/proto/step.go: task-type step definitions and step-order
contracts (includes compatibility notes on step constants).pkg/dxf/framework/scheduler/scheduler_manager.go: owner-side task admission,
slot reservation, cleanup/historical transfer loops.pkg/dxf/framework/taskexecutor/manager.go: node-side executor start/stop,
preemption handling, and meta recovery loop.pkg/dxf/framework/scheduler/slots.go and
pkg/dxf/framework/taskexecutor/slot.go: slot/stripe reservation logic and
preemption decisions.Learning/edge-case references:
docs/agents/import-into/README.md: IMPORT INTO conflict-resolution and
cleanup troubleshooting map.pkg/dxf/example/: minimal DXF app skeleton (scheduler + executor extension
wiring) useful when adding a new task type.framework/storage.Domain starts executor-manager loop on every TiDB node.pkg/dxf/framework/proto/ first.
step.go explicitly forbids changing
existing constant values).scheduler.Extension) and task executor extension
(taskexecutor.Extension) for the new task type.scheduler.RegisterSchedulerFactory(...)taskexecutor.RegisterTaskType(...)scheduler.RegisterSchedulerCleanUpFactory(...)GetNextStep deterministic from task base state (avoid relying on mutable
task meta there), and keep subtask generation stable when using batch switch APIs.(priority, create_time, id) tuple.RequiredSlots is the reservation baseline, while runtime execution may use a
lower slot count through ExtraParams.MaxRuntimeSlots + TargetSteps."background" nodes when present; otherwise it falls
back to empty-scope nodes.MaxConcurrentTask limit it switches to no-resource states only (for fast handling
of pausing/cancelling/reverting/modifying tasks).pkg/dxf/framework/scheduler/state_transform.go and
pkg/dxf/framework/scheduler/scheduler.go.pkg/dxf/framework/scheduler/slots.go,
pkg/dxf/framework/taskexecutor/slot.go, and
pkg/dxf/framework/taskexecutor/manager.go.pkg/domain/domain.go, pkg/dxf/framework/handle/handle.go, and
pkg/dxf/framework/storage/task_table.go."How does task state/lifecycle work?"
framework/proto -> framework/storage -> framework/scheduler ->
framework/taskexecutor."Where is submit/pause/resume/cancel implemented?"
framework/handle first, then follow call sites in pkg/dxf/importinto/
and pkg/ddl/."How does IMPORT INTO use DXF?"
pkg/dxf/importinto/ first, then pkg/executor/import_into.go and
pkg/executor/importer/."How does DDL backfill use DXF?"
pkg/ddl/index.go -> pkg/ddl/backfilling_dist_scheduler.go ->
pkg/ddl/backfilling_dist_executor.go, and cross-check
docs/agents/ddl/README.md.rg --line-number --glob '*.go' 'RegisterSchedulerFactory|RegisterSchedulerCleanUpFactory|RegisterTaskType' pkg/session pkg/ddl pkg/dxfrg --line-number --glob '*.go' 'SubmitTask|WaitTask|CancelTask|PauseTask|ResumeTask|ModifyTaskByID' pkg/dxf pkg/executor pkg/ddlrg --line-number --glob '*.go' 'proto.ImportInto|importinto' pkg/dxf pkg/executor pkg/sessionrg --line-number --glob '*.go' 'proto.Backfill|backfill' pkg/ddlpkg/dxf/framework/integrationtests/ and sibling framework
package tests.pkg/dxf/importinto/.pkg/ddl/backfilling_*_test.go and related
distributed-backfill tests under pkg/ddl/.