docs/agents/ddl/02-job-lifecycle.md
This doc focuses on the persistent “job” as the unit of DDL execution.
DDL jobs are persisted as model.Job (pkg/meta/model/job.go).
JobVersion1: legacy, args stored as an untyped array (pre-v8.4.0).
JobVersion2: typed args structs (from v8.4.0).
pkg/meta/model/job.go (type JobVersion, JobVersion1, JobVersion2, GetJobVerInUse).
Practical implication for developers: when adding/changing job arguments, check whether the job type is already migrated to v2; ensure decoding/encoding is compatible and test both versions if needed.
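To make the compatibility concern concrete, here is a minimal sketch of encoding and decoding one set of arguments in both layouts. It assumes JSON as the storage format; `RenameArgs`, `EncodeArgs`, and `DecodeArgs` are hypothetical names invented for illustration, and the local `JobVersion` type is a stand-in rather than the real `model.JobVersion`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobVersion is a local stand-in for the job version concept; it is not
// the real model.JobVersion type.
type JobVersion int

const (
	JobVersion1 JobVersion = 1 // args stored as an untyped positional array
	JobVersion2 JobVersion = 2 // args stored as a typed struct
)

// RenameArgs is a hypothetical typed args struct used only for illustration.
type RenameArgs struct {
	OldName string `json:"old_name"`
	NewID   int64  `json:"new_id"`
}

// EncodeArgs serializes args in the layout that matches the job version.
func EncodeArgs(ver JobVersion, a RenameArgs) ([]byte, error) {
	switch ver {
	case JobVersion1:
		return json.Marshal([]any{a.OldName, a.NewID})
	case JobVersion2:
		return json.Marshal(a)
	default:
		return nil, fmt.Errorf("unknown job version %d", ver)
	}
}

// DecodeArgs reads either layout back into the typed struct.
func DecodeArgs(ver JobVersion, raw []byte) (RenameArgs, error) {
	var a RenameArgs
	switch ver {
	case JobVersion1:
		// Positional layout: decode each slot separately so a wrong
		// arity or a type mismatch is reported as an error.
		var slots []json.RawMessage
		if err := json.Unmarshal(raw, &slots); err != nil {
			return a, err
		}
		if len(slots) != 2 {
			return a, fmt.Errorf("want 2 args, got %d", len(slots))
		}
		if err := json.Unmarshal(slots[0], &a.OldName); err != nil {
			return a, err
		}
		if err := json.Unmarshal(slots[1], &a.NewID); err != nil {
			return a, err
		}
		return a, nil
	case JobVersion2:
		err := json.Unmarshal(raw, &a)
		return a, err
	default:
		return a, fmt.Errorf("unknown job version %d", ver)
	}
}

func main() {
	in := RenameArgs{OldName: "t1", NewID: 42}
	for _, ver := range []JobVersion{JobVersion1, JobVersion2} {
		raw, err := EncodeArgs(ver, in)
		if err != nil {
			panic(err)
		}
		out, err := DecodeArgs(ver, raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("v%d raw=%s roundtrip_ok=%v\n", ver, raw, out == in)
	}
}
```

A round-trip check like the loop in `main` is the kind of test worth running against both versions whenever an argument is added or reordered.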
There are two related but distinct “state” dimensions:
Job state (conceptually): whether the job is running/rolling back/done, etc.
Implemented by model.JobState and helper methods like job.IsRunning() / job.IsRollingback().
Schema state (online DDL state machine): how visible the schema change is to SQL.
For many DDLs, the state sequence follows (simplified):
none → delete only → write only → reorg (reorganization / backfill) → public
The submitter-side wait loop assumes this simplified sequence (see the comment in pkg/ddl/executor.go:DoDDLJobWrapper).
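The simplified sequence can be sketched as a tiny state machine that only ever advances one step at a time. The `SchemaState` type, its constants, and the `Next` and `Walk` helpers below are illustrative names defined locally; they are not the real TiDB schema state definitions.

```go
package main

import "fmt"

// SchemaState is a local illustrative type, not the real TiDB schema state type.
type SchemaState int

const (
	StateNone SchemaState = iota
	StateDeleteOnly
	StateWriteOnly
	StateReorg
	StatePublic
)

var stateNames = [...]string{"none", "delete only", "write only", "reorg", "public"}

func (s SchemaState) String() string {
	if s < StateNone || s > StatePublic {
		return "unknown"
	}
	return stateNames[s]
}

// Next returns the state after s in the simplified sequence.
// ok is false once the change is fully public, or when s is invalid.
func Next(s SchemaState) (next SchemaState, ok bool) {
	if s < StateNone || s >= StatePublic {
		return s, false
	}
	return s + 1, true
}

// Walk lists every state visited from none to public, one step at a time.
func Walk() []SchemaState {
	path := []SchemaState{StateNone}
	for s, ok := Next(StateNone); ok; s, ok = Next(s) {
		path = append(path, s)
	}
	return path
}

func main() {
	fmt.Println(Walk())
}
```

The point of the single-step `Next` is that no state is ever skipped, which matches the idea that every node must observe each transition before the following one begins.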
Why schema states exist: to keep DML and queries safe while the schema is changing, by gradually changing allowed operations and ensuring all nodes see each transition before the next.
pkg/ddl/job_scheduler.go:OnBecomeOwner).
Developer implication: each job step must be:
After schema changes, TiDB relies on a global schema version + per-job synchronization to coordinate all nodes.
Key pieces:
pkg/ddl/job_worker.go:updateGlobalVersionAndWaitSynced
pkg/ddl/schema_version.go:waitVersionSynced
pkg/ddl/schemaver/syncer.go (Syncer.WaitVersionSynced)
At a high level:
OwnerUpdateGlobalVersion).
UpdateSelfVersion).
WaitVersionSynced).
When MDL is disabled, some paths fall back to lease-based waiting (see waitVersionSyncedWithoutMDL in pkg/ddl/schema_version.go).
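The coordination pattern behind these three calls can be sketched in memory: the owner publishes a new global version, each node reports the version it has loaded, and the owner blocks until every node has caught up. This is only a sketch under stated assumptions. The `registry` type, its method signatures, and the polling interval are hypothetical and do not mirror the real `Syncer` interface or its storage backend.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// registry is a hypothetical in-memory stand-in for the shared store
// that holds the global schema version and each node's loaded version.
type registry struct {
	mu     sync.Mutex
	global int64
	self   map[string]int64 // node ID -> schema version the node has loaded
}

func newRegistry(nodes ...string) *registry {
	r := &registry{self: make(map[string]int64)}
	for _, n := range nodes {
		r.self[n] = 0
	}
	return r
}

// ownerUpdateGlobalVersion publishes the version all nodes must reach.
func (r *registry) ownerUpdateGlobalVersion(v int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.global = v
}

// updateSelfVersion records that a node has loaded version v.
func (r *registry) updateSelfVersion(node string, v int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.self[node] = v
}

// allSynced reports whether every known node has reached the global version.
func (r *registry) allSynced() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, got := range r.self {
		if got < r.global {
			return false
		}
	}
	return true
}

// waitVersionSynced polls until all nodes are synced or ctx ends.
func (r *registry) waitVersionSynced(ctx context.Context, every time.Duration) error {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		if r.allSynced() {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// runDemo plays one schema version bump across two simulated nodes.
func runDemo() error {
	r := newRegistry("tidb-1", "tidb-2")
	r.ownerUpdateGlobalVersion(1)
	for i, node := range []string{"tidb-1", "tidb-2"} {
		go func(node string, delay time.Duration) {
			time.Sleep(delay) // simulate schema reload latency
			r.updateSelfVersion(node, 1)
		}(node, time.Duration(i+1)*5*time.Millisecond)
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return r.waitVersionSynced(ctx, time.Millisecond)
}

func main() {
	fmt.Println("wait finished, err =", runDemo())
}
```

If the wait were skipped, the owner could move to the next schema state while a slow node is still operating on the previous one, which is the hazard the synchronization step exists to prevent.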
DDL job persistence is backed by system tables under mysql.* (job queue, history, MDL info, …).
The storage access layer is abstracted by pkg/ddl/systable/manager.go (type Manager), used by scheduler/worker to read job and MDL information.
Tip: if you need to change how jobs are stored/loaded, start from systable.Manager and track call sites from scheduler/worker.
The submitting session waits in pkg/ddl/executor.go:DoDDLJobWrapper via:
ddlJobDoneChMap (wired in pkg/ddl/ddl.go:NewDDL).
This dual approach is important:
WaitVersionSynced skipped incorrectly → nodes observe incompatible states.
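For the submitter-side wait, one plausible reading of the dual approach is a notification channel for the fast path combined with periodic polling as a fallback. The sketch below assumes exactly that. `waitJobDone`, the `isDone` callback, and the intervals are hypothetical and are not taken from `DoDDLJobWrapper`.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// waitJobDone blocks until isDone reports true. It re-checks after a
// notification on done or after each poll tick, whichever comes first.
func waitJobDone(ctx context.Context, done <-chan struct{}, isDone func() bool, every time.Duration) error {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		if isDone() {
			return nil
		}
		select {
		case <-done:
			// A nil channel blocks forever, so a closed channel
			// cannot make this loop spin.
			done = nil
		case <-ticker.C:
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// runDemo exercises both paths: a delivered notification and a lost one.
func runDemo() error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	// Case 1: the notification arrives, and the poll interval is far too
	// long to matter.
	var finished atomic.Bool
	done := make(chan struct{})
	go func() {
		time.Sleep(5 * time.Millisecond)
		finished.Store(true)
		close(done)
	}()
	if err := waitJobDone(ctx, done, finished.Load, time.Hour); err != nil {
		return fmt.Errorf("notify path: %w", err)
	}

	// Case 2: no notification is ever sent, so only polling can
	// observe completion.
	var finished2 atomic.Bool
	go func() {
		time.Sleep(5 * time.Millisecond)
		finished2.Store(true)
	}()
	if err := waitJobDone(ctx, nil, finished2.Load, time.Millisecond); err != nil {
		return fmt.Errorf("poll path: %w", err)
	}
	return nil
}

func main() {
	fmt.Println("both paths finished, err =", runDemo())
}
```

The notification keeps latency low in the common case, while the ticker bounds how long a submitter can stay blocked if a notification is missed.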