internal-docs/ast-mutation/implementation.md
Rolldown threads per-AST-node metadata between compiler passes via side tables. The cross-pass identity key is now oxc's post-semantic NodeId, while Span remains source-location metadata for diagnostics, comments, source maps, and generated replacement spans.
The public oxc type is oxc::semantic::NodeId. It is the implementation behind the node_id() / set_node_id() accessors on AST nodes after semantic analysis; there is not a separate public AstNodeId type in the version Rolldown currently uses.
Rolldown's bundling pipeline has three stages that interact with the AST:
1. ScanStage::scan parses each module, runs Rolldown's pre-scan AST tweaks, then rebuilds semantic/scoping information. This final rebuild is what assigns every node, including the nodes the tweaks created, its NodeId, so the subsequent read-only walk via AstScanner sees stable ids while populating EcmaView side tables.
2. LinkStage::link performs cross-module work such as symbol binding, export resolution, tree shaking, and cross-module optimization. It still does not mutate the AST, but it can derive additional side tables from scan-time records.
3. ScopeHoistingFinalizer, driven from GenerateStage::generate, is the main stage that mutates the AST in place. It visits interesting nodes, calls node_id(), and queries the side tables to decide what to rewrite.

Between passes, Rolldown does not hold direct references to AST nodes. Lifetimes and parallel cross-module work make that impractical. The durable identity for a node within one module AST is therefore its NodeId.
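The scan-then-finalize handoff can be sketched with standard-library types only. Everything in this block is a simplified stand-in rather than Rolldown's or oxc's actual code: `NodeId` is a local newtype, `CallExpr`, `SideTable`, `scan`, and `finalize` are hypothetical, and the `__require` rewrite target is an illustrative name.

```rust
use std::collections::HashMap;

/// Local stand-in for oxc's NodeId (the real type lives in oxc).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NodeId(pub u32);

/// Hypothetical, heavily simplified call-expression node.
pub struct CallExpr {
    pub node_id: NodeId,
    pub callee: &'static str,
}

/// Hypothetical side table: NodeId of a recognized `require()` call
/// mapped to the index of its import record.
#[derive(Default)]
pub struct SideTable {
    pub require_calls: HashMap<NodeId, usize>,
}

/// Scan: a read-only walk that records metadata keyed by NodeId.
/// The returned table holds no references into the AST.
pub fn scan(ast: &[CallExpr]) -> SideTable {
    let mut table = SideTable::default();
    for call in ast {
        if call.callee == "require" {
            let record_idx = table.require_calls.len();
            table.require_calls.insert(call.node_id, record_idx);
        }
    }
    table
}

/// Finalize: a mutating walk that reads the id off the current node
/// and queries the table to decide whether to rewrite it.
/// Returns the number of rewritten calls.
pub fn finalize(ast: &mut [CallExpr], table: &SideTable) -> usize {
    let mut rewritten = 0;
    for call in ast.iter_mut() {
        if table.require_calls.contains_key(&call.node_id) {
            call.callee = "__require"; // illustrative rewrite target
            rewritten += 1;
        }
    }
    rewritten
}
```

Because the table owns only ids and indices, the borrow of the AST taken by `scan` ends before `finalize` takes a mutable one, which mirrors why ids rather than node references survive between passes.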
The shared invariant across passes:
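- The producer pass records each entry under the NodeId of the AST node being recorded.
- The consumer pass reads node_id() from the current node and queries the table.
- Tables shared across modules additionally key by ModuleIdx.

Important constraints: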
- NodeId is only unique within a single AST. Any table that combines records from multiple modules must key by (ModuleIdx, NodeId).
- NodeId is meaningful only after semantic analysis has assigned ids. Rolldown's normal scan path is post-semantic, so scan-created records are valid.
- Nodes synthesized after semantic analysis carry NodeId::DUMMY unless ids are assigned later. Do not insert cross-pass side-table records for synthetic DUMMY nodes.
- NodeId::DUMMY equals NodeId::ROOT (both are 0, the Program node's id). DUMMY probes from synthesized nodes miss only because no side table records a Program-level entry, so never add a Program-keyed entry to a per-module NodeId table.

Two paths finalize a clone of the scanned AST, produced by EcmaAst::clone_with_another_arena into a fresh allocator, and they satisfy the "same post-semantic AST" guarantee through different mechanisms:
- The first path (NormalizedScanStageOutput::make_copy, ScanStageCache::create_output) hands its clones to the link stage and ScopeHoistingFinalizer, which reuse scan-time scoping and never re-run semantic analysis. The clone itself must therefore carry the scan-time ids. This is why clone_with_another_arena uses oxc's clone_in_with_semantic_ids rather than plain clone_in, which would reset every id to NodeId::DUMMY and silently break every lookup.
- The HMR paths in crates/rolldown/src/hmr/hmr_stage.rs clone and then immediately run EcmaAst::make_semantic on the clone, which re-stamps every NodeId; the ids the clone preserved are overwritten before any lookup. Lookups still hit because SemanticBuilder numbers nodes purely by traversal order, so an unmutated clone of the same tree shape re-derives exactly the scan-time ids. Two invariants keep this true: nothing may mutate the clone before make_semantic runs, and oxc's numbering must remain a pure function of tree shape (true as of oxc 0.135; builder options such as with_cfg / with_enum_eval do not affect numbering). Breaking either shifts ids silently: the indexing lookups (module.imports[&…]) panic, and the .get() lookups silently skip rewrites.

The main cross-pass side tables keyed by NodeId are:
- EcmaView::imports - import declarations, export-from declarations, dynamic import() expressions, and recognized require() call expressions.
- EcmaView::dummy_record_set - require identifier references that need the runtime helper rewrite.
- EcmaView::new_url_references - new URL('...', import.meta.url) nodes mapped to asset import records.
- EcmaView::this_expr_replace_map - top-level this expressions that should become exports or undefined.
- MemberExprRef::node_id and LinkingMetadata::resolved_member_expr_refs - namespace/member-expression resolution from scan through link to finalization.
- DynamicImportExprInfo::node_id records the dynamic import() node within its own module; EntryPoint::related_stmt_infos then carries (ModuleIdx, …, NodeId, …) tuples so a dynamic-import entry can be traced back across the module graph.
- … NodeId, only consumed within the same module's traversal) and a graph-wide set of unreachable dynamic imports keyed by (ModuleIdx, NodeId) because it aggregates records from every module.

This means finalizer-generated nodes that keep the default NodeId::DUMMY do not accidentally match scan-time records. Span no longer needs to double as the key for these rewrite decisions.
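The earlier claim that re-running semantic analysis on an unmutated clone re-derives the scan-time ids follows from numbering being a function of tree shape alone. The toy below illustrates that property under stated assumptions: it uses a pre-order counter starting at 0 as a stand-in for oxc's traversal order, and `Node`, `number`, `clone_resetting_ids`, and `ids` are hypothetical helpers, not oxc or Rolldown APIs.

```rust
/// Stand-in for NodeId::DUMMY; 0 is also the id the root receives.
pub const DUMMY: u32 = 0;

/// Toy tree node; real oxc nodes are far richer.
#[derive(Debug)]
pub struct Node {
    pub id: u32,
    pub children: Vec<Node>,
}

impl Node {
    /// New nodes start with the dummy id, like freshly built AST nodes.
    pub fn new(children: Vec<Node>) -> Self {
        Node { id: DUMMY, children }
    }
}

/// Assign ids in pre-order, starting at 0 for the root.
/// The result depends only on the shape of the tree.
pub fn number(root: &mut Node) {
    fn walk(node: &mut Node, next: &mut u32) {
        node.id = *next;
        *next += 1;
        for child in &mut node.children {
            walk(child, next);
        }
    }
    let mut next = 0;
    walk(root, &mut next);
}

/// Deep clone that resets every id, mimicking a clone that does not
/// preserve semantic ids.
pub fn clone_resetting_ids(node: &Node) -> Node {
    Node {
        id: DUMMY,
        children: node.children.iter().map(clone_resetting_ids).collect(),
    }
}

/// Collect ids in pre-order so two trees can be compared.
pub fn ids(node: &Node) -> Vec<u32> {
    let mut out = vec![node.id];
    for child in &node.children {
        out.extend(ids(child));
    }
    out
}
```

Numbering an id-less clone of the same shape reproduces the original ids, while inserting a single node before numbering shifts the id of every node that follows it in traversal order. That shift is the silent failure mode the two invariants guard against.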
Span remains the right representation for source positions. It is still used for:
- importer_span for diagnostics that need to point at the full import site.

For import records, the module-request span belongs to ImportRecordStateInit: dependency
resolution diagnostics still need to underline the original specifier, but the span is not
carried into ImportRecordStateResolved. Resolved records keep importer_span because later
passes, such as TLA import-chain diagnostics, need a location for the resolved import edge.
For member expressions, NodeId is the cross-pass lookup key, but spans remain necessary as
source locations: MemberExprRef::span points diagnostics at the original expression, and the
finalizer applies the current member expression span to generated replacements so source-map and
diagnostic locations stay tied to the rewritten source range.
Do not add a cross-pass node side table keyed only by Span. If a later pass needs to identify the same AST node, prefer NodeId; if records from more than one module can share a table, include ModuleIdx.
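A small sketch of why a table shared by several modules needs the composite key. The `ModuleIdx` and `NodeId` newtypes here are local stand-ins for the real types, and the two helper functions and the record notes are hypothetical.

```rust
use std::collections::HashMap;

/// Local stand-ins for Rolldown's ModuleIdx and oxc's NodeId.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ModuleIdx(pub u32);
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NodeId(pub u32);

/// A record produced while scanning one module.
pub type Record = (ModuleIdx, NodeId, &'static str);

/// Graph-wide table keyed only by NodeId: records from different
/// modules that happen to share an id overwrite each other.
pub fn table_by_node_id(records: &[Record]) -> HashMap<NodeId, &'static str> {
    records.iter().map(|&(_, node, note)| (node, note)).collect()
}

/// Graph-wide table keyed by (ModuleIdx, NodeId): every record survives.
pub fn table_by_module_and_node(
    records: &[Record],
) -> HashMap<(ModuleIdx, NodeId), &'static str> {
    records
        .iter()
        .map(|&(module, node, note)| ((module, node), note))
        .collect()
}
```

Two modules with a node numbered 5 produce one entry in the first table and two in the second, since each module's numbering starts over from the same root id.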
Oxc Address is still acceptable for scratch state inside one live AST traversal, where producer and consumer operate before the traversal returns and no data survives as cross-pass metadata. The current example is:
- PreProcessor's statement_stack / statement_replace_map in crates/rolldown/src/utils/tweak_ast_for_scanning.rs.

PreProcessor specifically cannot use NodeId: it runs before the final semantic rebuild (recreate_scoping in crates/rolldown/src/utils/pre_process_ecma_ast.rs), so node ids are not yet assigned to the nodes it creates or moves. Address is the only stable per-node identity available at that point, and it is safe because the table never outlives the traversal.
Do not store Address in module metadata, entry metadata, or link-stage tables that outlive the traversal that produced it. In the post-semantic scanner, prefer NodeId even for same-traversal node identity checks when the compared nodes already have semantic IDs.
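The scratch-state pattern can be illustrated with a plain memory address standing in for oxc's Address. This is a sketch under assumptions, not PreProcessor's actual logic: `Stmt`, `address_of`, and `rewrite_debugger_statements` are hypothetical, and replacing `debugger;` with an empty statement is only an example rewrite.

```rust
use std::collections::HashMap;

/// Toy statement; a stand-in for an arena-allocated AST statement.
pub struct Stmt {
    pub text: String,
}

/// Address-based identity: the node's location in memory. It is only
/// meaningful while the node is neither moved nor freed.
fn address_of(stmt: &Stmt) -> usize {
    stmt as *const Stmt as usize
}

/// One traversal in two phases. Phase 1 records replacements keyed by
/// address; phase 2 applies them. The scratch map is a local variable,
/// so it is dropped before the function returns and never becomes
/// cross-pass metadata. Returns the number of replaced statements.
pub fn rewrite_debugger_statements(stmts: &mut [Stmt]) -> usize {
    let mut replace_map: HashMap<usize, String> = HashMap::new();
    for stmt in stmts.iter() {
        if stmt.text == "debugger;" {
            replace_map.insert(address_of(stmt), ";".to_string());
        }
    }

    let mut applied = 0;
    for stmt in stmts.iter_mut() {
        if let Some(replacement) = replace_map.remove(&address_of(&*stmt)) {
            stmt.text = replacement;
            applied += 1;
        }
    }
    applied
}
```

The addresses stay valid here only because the slice is not reallocated between the two phases. Storing the same keys in longer-lived metadata would leave them pointing at whatever occupies that memory later.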
PreProcessor does not rewrite spans for identity anymore. Pairwise span uniqueness does not back any identity table after the NodeId migration, so ordinary duplicate spans are left alone, and nodes created during pre-scan rewrites can keep the reserved synthetic span (SPAN, 0..0).
Later passes must not use span.is_unspanned() to decide whether a scanner-visible node has a cross-pass record. For example, finalizing a require() call now relies on EcmaView::imports.get(call_expr.node_id()): pre-scan-created calls have semantic NodeIds and can hit, while finalizer-created calls keep NodeId::DUMMY and miss.
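The difference between the two signals can be shown with local stand-ins. `NodeId`, `Span`, `SPAN`, and `CallExpr` below are simplified models written for this sketch, and `has_import_record` is a hypothetical helper that plays the role of the lookup into the imports table.

```rust
use std::collections::HashMap;

/// Local stand-in for oxc's NodeId; DUMMY is 0, as described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NodeId(pub u32);
impl NodeId {
    pub const DUMMY: NodeId = NodeId(0);
}

/// Local stand-in for a source span.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub start: u32,
    pub end: u32,
}
/// The reserved synthetic span, 0..0.
pub const SPAN: Span = Span { start: 0, end: 0 };
impl Span {
    pub fn is_unspanned(self) -> bool {
        self == SPAN
    }
}

/// Hypothetical, simplified call-expression node.
pub struct CallExpr {
    pub node_id: NodeId,
    pub span: Span,
}

/// Identity check by NodeId, independent of the node's span.
pub fn has_import_record(imports: &HashMap<NodeId, usize>, call: &CallExpr) -> bool {
    imports.contains_key(&call.node_id)
}
```

A call created during pre-scan tweaks and a call created by the finalizer can both carry the synthetic span, so `is_unspanned()` returns the same answer for them. Only the NodeId lookup separates the one that has a scan-time record from the one that does not.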
The practical rule is simple: treat Span as location, NodeId as same-AST node identity, and (ModuleIdx, NodeId) as cross-module node identity.
SPAN / dummy-NodeId discipline for synthesized nodes is shared with that doc