Assembler and MacroAssembler Architecture

This document explains V8's low-level code generation abstractions: the Assembler and MacroAssembler. These are used by all compilers (Sparkplug, Maglev, TurboFan) and hand-written builtins to emit machine code.

The Layered Approach

V8 uses a multi-layered approach to code generation to balance performance, platform independence, and ease of use.

1. `AssemblerBase` and `Assembler`

At the lowest level is the Assembler (defined per architecture, e.g., src/codegen/x64/assembler-x64.h).

AssemblerBase: Provides common functionality like managing the instruction buffer, relocation information (reloc info), and code comments.
Assembler: Implements the actual instruction encoding for a specific architecture. For example, on x64, calling movq(rax, rbx) directly emits the corresponding bytes into the buffer.
Role: It is a 1:1 mapping to machine instructions. It does not perform any abstraction or simplification.

2. `MacroAssembler`

The MacroAssembler (also defined per architecture, e.g., src/codegen/x64/macro-assembler-x64.h) inherits from Assembler (via MacroAssemblerBase and often via a shared intermediate class like SharedMacroAssembler for related architectures).

MacroAssemblerBase: Contains platform-independent macro-assembler functionality (e.g., loading constants, root-relative addressing).
Role: It provides higher-level "macro" instructions and common idioms that are composed of multiple raw instructions.
Abstractions:
- Runtime Calls: CallRuntime(Runtime::k...) handles the complexity of setting up arguments and calling a C++ runtime function.
- Frame Setup: Methods to push/pop standard frame headers.
- Pseudo-instructions: Instructions that might not exist directly on the hardware but are common (e.g., Push on architectures that don't have a dedicated push instruction).

How They Are Used

In Compilers

Compilers like Maglev use the MacroAssembler in their GenerateCode methods.

cpp

void Int32AddWithOverflow::GenerateCode(MaglevAssembler* masm, const ProcessingState& state) {
  Register left = ToRegister(LeftInput());
  Register right = ToRegister(RightInput());
  __ addl(left, right); // Raw assembler call via MacroAssembler
  __ EmitEagerDeoptIf(overflow, DeoptimizeReason::kOverflow, this); // MacroAssembler helper
}

(Note: MaglevAssembler inherits from MacroAssembler and adds Maglev-specific helpers).

In Builtins

Hand-written assembler builtins are written directly using the MacroAssembler.

The Instruction Buffer and Relocation

The Assembler writes bytes into a growable buffer. Because code often contains references to objects or labels that are not yet fixed in memory, V8 uses Relocation Information (RelocInfo).

When the assembler emits a reference to a HeapObject or another code chunk, it records the position and the type of reference.
When the code is finalized and moved to its permanent location in the heap, V8 "relocates" these references, patching the code with the actual memory addresses.

Assembler and MacroAssembler Architecture

Assembler and MacroAssembler Architecture

The Layered Approach

1. AssemblerBase and Assembler

2. MacroAssembler

How They Are Used

In Compilers

In Builtins

The Instruction Buffer and Relocation

See Also

1. `AssemblerBase` and `Assembler`

2. `MacroAssembler`