RISC-V IOMMU v1.0.1 contract

packages/chip/docs/arch/iommu.md

The Eliza phone-class SoC implements the ratified RISC-V IOMMU v1.0.1 specification for per-device DMA isolation across the NPU command queue, GPU contexts, DMA channels, display planes, and camera ISP pipelines. The implementation lives in rtl/iommu/ and is verified by cocotb tests under verify/cocotb/iommu/.

Scope

The IOMMU sits between bus masters and the AXI4 system fabric (see docs/arch/interconnect.md and rtl/interconnect/axi4/). Ownership of each concern is split as follows:

| Concern | Owner |
| --- | --- |
| Per-device translation (DDT walk) | IOMMU |
| Per-process translation (PDT walk, PASID) | IOMMU |
| Two-stage translation (Sv39 + G-stage / Sv48 + G-stage) | IOMMU |
| Fault recording into a memory-resident ring | IOMMU |
| Page-request interface for SVA | IOMMU |
| ATS request/response with PCIe-style devices | IOMMU |
| Snoop coherency | Cache agent (separate domain) |
| QoS arbitration | AXI4 interconnect + DRAM controller |
| Boot-time enable / Linux device-tree binding | Software |

Feature subset

The 2028 SoC implements the v1.0.1 mandatory feature set plus the optional features required by the upstream Linux RISC-V IOMMU driver (merged for v6.10+):

| Feature | Status | Notes |
| --- | --- | --- |
| Sv39 first-stage | required | matches MMU Sv39 mode |
| Sv48 first-stage | required | matches MMU Sv48 mode |
| Sv57 first-stage | optional | advertised in CAPABILITIES |
| Sv39x4 G-stage | required | virtualization (H-extension hosts) |
| Sv48x4 G-stage | required | virtualization |
| PASID (PD20, 20-bit) | required | NPU command-queue contexts |
| ATS | required | PCIe-style devices using disco/peripheral RC |
| PRI / page-request interface | required | enables shared virtual memory |
| MSI translation (IGS=2) | required | matches Linux v6.x driver |
| T2GPA | optional | translates GPA into HPA on ATS |
| DDT 1-level / 2-level / 3-level walk | required | scales to 24-bit DID |
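
The way a 1-, 2-, or 3-level DDT walk scales to a 24-bit DID can be sketched by splitting the device ID into per-level indices. The snippet below is an illustrative sketch, not the RTL's logic: the function name `ddt_indices` is hypothetical, and the field widths (6 or 7 leaf bits depending on device-context format, then 9 bits, then the remainder) are recalled from the v1.0 spec and should be checked against it before being relied on.

```python
def ddt_indices(did: int, extended_format: bool = True) -> tuple[int, int, int]:
    """Split a 24-bit device ID into (DDI[2], DDI[1], DDI[0]).

    Assumption: extended-format device contexts (64 bytes, used when MSI
    translation is present) give a 6-bit leaf index; base-format contexts
    (32 bytes) give a 7-bit leaf index. Non-leaf levels index 512 entries.
    """
    if not 0 <= did < (1 << 24):
        raise ValueError("device ID must fit in 24 bits")
    leaf_bits = 6 if extended_format else 7
    ddi0 = did & ((1 << leaf_bits) - 1)      # index into the leaf DDT page
    ddi1 = (did >> leaf_bits) & 0x1FF        # index into the second level
    ddi2 = did >> (leaf_bits + 9)            # index into the root level
    return ddi2, ddi1, ddi0
```

Under these assumptions a DID that only needs `DDI[0]` fits a 1-level table, while a full 24-bit DID needs all three levels.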

Register map

The IOMMU exposes a 4 KiB MMIO aperture beginning at iommu_base (defined by the device tree node). Register offsets follow the v1.0.1 spec exactly:

| Offset | Name | Width | Description |
| --- | --- | --- | --- |
| 0x000 | CAPABILITIES | 64 | Feature advertisement; read-only |
| 0x008 | FCTL | 32 | Global control: WSI/BE/GXL bits |
| 0x010 | DDTP | 64 | Device-directory pointer + mode |
| 0x018 | CQB | 64 | Command-queue base + log size |
| 0x020 | CQH | 32 | CQ head (IOMMU-updated) |
| 0x024 | CQT | 32 | CQ tail (driver-updated) |
| 0x028 | FQB | 64 | Fault-queue base + log size |
| 0x030 | FQH | 32 | FQ head (driver-updated) |
| 0x034 | FQT | 32 | FQ tail (IOMMU-updated) |
| 0x038 | PQB | 64 | Page-request-queue base |
| 0x040 | PQH | 32 | PQ head |
| 0x044 | PQT | 32 | PQ tail |
| 0x048 | CQCSR | 32 | CQ control/status |
| 0x04C | FQCSR | 32 | FQ control/status |
| 0x050 | PQCSR | 32 | PQ control/status |
| 0x054 | IPSR | 32 | Interrupt pending status (W1C) |
| 0x258 | TR_REQ_IOVA | 64 | Debug-translation request IOVA |
| 0x260 | TR_REQ_CTL | 64 | Debug-translation request control |
| 0x268 | TR_RESPONSE | 64 | Debug-translation response |
| 0x2F8 | ICVEC | 64 | Interrupt-cause vector configuration |
| 0x300 | MSI_CFG_TBL | | MSI configuration table base |

The implementation also exposes a simple, non-architectural device-ID allowlist beginning at offset 0x800 for early bring-up verification. No real driver uses this range, so it does not affect upstream Linux compatibility; the allowlist will be replaced when the full DDT walker lands.
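
For testbench use, the register map above can be captured as a small lookup table. This is a sketch only: the names `REGS`, `DdtpMode`, `make_ddtp`, and `check_no_overlap` are hypothetical and not taken from the repository. The DDTP field layout assumed here (mode in bits 3:0, root PPN in bits 53:10) and the mode encodings are recalled from the v1.0 spec and should be confirmed against it.

```python
from enum import IntEnum

# name -> (offset, width in bits); None where the table gives no width.
REGS = {
    "CAPABILITIES": (0x000, 64), "FCTL": (0x008, 32), "DDTP": (0x010, 64),
    "CQB": (0x018, 64), "CQH": (0x020, 32), "CQT": (0x024, 32),
    "FQB": (0x028, 64), "FQH": (0x030, 32), "FQT": (0x034, 32),
    "PQB": (0x038, 64), "PQH": (0x040, 32), "PQT": (0x044, 32),
    "CQCSR": (0x048, 32), "FQCSR": (0x04C, 32), "PQCSR": (0x050, 32),
    "IPSR": (0x054, 32), "TR_REQ_IOVA": (0x258, 64),
    "TR_REQ_CTL": (0x260, 64), "TR_RESPONSE": (0x268, 64),
    "ICVEC": (0x2F8, 64), "MSI_CFG_TBL": (0x300, None),
}


class DdtpMode(IntEnum):
    """Assumed DDTP.iommu_mode encodings."""
    OFF = 0
    BARE = 1
    LVL1 = 2
    LVL2 = 3
    LVL3 = 4


def make_ddtp(root_phys_addr: int, mode: DdtpMode) -> int:
    """Compose a DDTP value from a 4 KiB-aligned root table address."""
    if root_phys_addr & 0xFFF:
        raise ValueError("DDT root must be 4 KiB aligned")
    return ((root_phys_addr >> 12) << 10) | int(mode)


def check_no_overlap(regs: dict) -> bool:
    """Return True if no register with a known width runs into the next one."""
    ordered = sorted(regs.values(), key=lambda r: r[0])
    for (off, width), (next_off, _) in zip(ordered, ordered[1:]):
        if width is not None and off + width // 8 > next_off:
            return False
    return True
```

A test could then write `make_ddtp(root, DdtpMode.BARE)` at `iommu_base + REGS["DDTP"][0]` to put the block into passthrough before enabling translation.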

Fault record format

Each fault queue entry is 32 bytes laid out as four 64-bit words:

| Word | Bits | Field |
| --- | --- | --- |
| 0 | [11:0] | CAUSE (page-fault, permission, DDT-walk, …) |
| 0 | [17:12] | TTYP (transaction type) |
| 0 | [18] | PRIV |
| 0 | [20] | PV (PASID valid) |
| 0 | [40:21] | PID (PASID) |
| 0 | [63:41] | DID (24-bit, low 23 bits here; spec carries the rest in word 1) |
| 1 | [3:0] | iotval-present flags |
| 1 | [63:4] | reserved |
| 2 | [63:0] | iotval (IOVA or fault address) |
| 3 | [63:0] | iotval2 (guest-physical address for G-stage faults) |
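
Decoding word 0 of a fault record is a matter of slicing the bit ranges listed in the table above. The sketch below follows that table as written; `FaultHeader` and `decode_word0` are hypothetical names, bit 19 is ignored because the table assigns no field to it, and only the 23 DID bits carried in word 0 are returned.

```python
from typing import NamedTuple


class FaultHeader(NamedTuple):
    cause: int
    ttyp: int
    priv: int
    pv: int
    pid: int
    did_low: int  # the DID bits held in word 0, per the table above


def _bits(value: int, hi: int, lo: int) -> int:
    """Extract value[hi:lo], inclusive on both ends."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_word0(word0: int) -> FaultHeader:
    """Slice word 0 of a fault record into its documented fields."""
    return FaultHeader(
        cause=_bits(word0, 11, 0),
        ttyp=_bits(word0, 17, 12),
        priv=_bits(word0, 18, 18),
        pv=_bits(word0, 20, 20),
        pid=_bits(word0, 40, 21),
        did_low=_bits(word0, 63, 41),
    )
```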

CAUSE encodings used most frequently in verification:

| Value | Mnemonic | Meaning |
| --- | --- | --- |
| 1 | INSN_ACCESS_FAULT | instruction fetch access fault |
| 5 | LOAD_ACCESS_FAULT | data read access fault |
| 7 | STORE_ACCESS_FAULT | data write access fault |
| 12 | INSN_PAGE_FAULT | first-stage instruction page fault |
| 13 | LOAD_PAGE_FAULT | first-stage load page fault |
| 15 | STORE_PAGE_FAULT | first-stage store page fault |
| 21 | LOAD_GUEST_PAGE_FAULT | G-stage load fault |
| 23 | STORE_GUEST_PAGE_FAULT | G-stage store fault |
| 256 | ALL_INBOUND_DISALLOWED | IOMMU is OFF and rejected the request |
| 258 | DDT_ENTRY_NOT_VALID | translation requested without a valid DC |
| 259 | DDT_ENTRY_MISCONFIGURED | DC field combination invalid |
| 260 | TRANSACTION_TYPE_DISALLOWED | DC blocks this TTYP |

TTYP encodings used in verification:

| Value | Meaning |
| --- | --- |
| 1 | untranslated read (data) |
| 2 | untranslated write or AMO |
| 3 | untranslated read for instruction fetch |
| 4 | translated read |
| 5 | translated write or AMO |
| 6 | translated read for instruction fetch |
| 7 | PCIe ATS translation request |
| 8 | PCIe message request |
| 9 | page request from device |
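
In a cocotb checker, the CAUSE and TTYP values listed above read better as named constants than as raw integers. The following is a minimal sketch that transcribes the two tables from this document; the names `Cause`, `TTYP_MEANING`, and `describe_fault` are hypothetical helpers, not identifiers from the test suite.

```python
from enum import IntEnum


class Cause(IntEnum):
    """CAUSE values transcribed from the table in this document."""
    INSN_ACCESS_FAULT = 1
    LOAD_ACCESS_FAULT = 5
    STORE_ACCESS_FAULT = 7
    INSN_PAGE_FAULT = 12
    LOAD_PAGE_FAULT = 13
    STORE_PAGE_FAULT = 15
    LOAD_GUEST_PAGE_FAULT = 21
    STORE_GUEST_PAGE_FAULT = 23
    ALL_INBOUND_DISALLOWED = 256
    DDT_ENTRY_NOT_VALID = 258
    DDT_ENTRY_MISCONFIGURED = 259
    TRANSACTION_TYPE_DISALLOWED = 260


# TTYP values transcribed from the table in this document.
TTYP_MEANING = {
    1: "untranslated read (data)",
    2: "untranslated write or AMO",
    3: "untranslated read for instruction fetch",
    4: "translated read",
    5: "translated write or AMO",
    6: "translated read for instruction fetch",
    7: "PCIe ATS translation request",
    8: "PCIe message request",
    9: "page request from device",
}


def describe_fault(cause: int, ttyp: int) -> str:
    """Render a fault as text; unknown codes are reported numerically."""
    try:
        cause_name = Cause(cause).name
    except ValueError:
        cause_name = f"CAUSE_{cause}"
    return f"{cause_name} on {TTYP_MEANING.get(ttyp, f'TTYP {ttyp}')}"
```

Falling back to the numeric code for unlisted values keeps the helper usable when a test hits a cause that verification does not yet exercise.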

ATS support

The implementation advertises CAPABILITIES.ATS = 1. PCIe-style devices issue ATS translation requests through the upstream AXI4 master with AxUSER bit 7 asserted; the IOMMU replies with an ATS completion that carries the translated address plus the global/exec/privilege bits. ATS is required for the Android NN HAL on RISC-V because the kernel RISC-V IOMMU driver expects ATS-capable devices to opt into pre-resolved translation.

Page-request interface (PRI)

When a device issues a transaction whose translation faults but the underlying page may be made present by the OS (shared virtual memory), the IOMMU emits a page-request record into the page-request queue (PQB/PQT). The Linux driver responds by populating the page tables and issuing an IOTINVAL.VMA command to flush the IOMMU's TLB. This loop matches the upstream PCIe PRI protocol implemented by the kernel.
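
The fault, page-in, invalidate, retry loop described above can be illustrated with a toy software model. This is a sketch for intuition only: `ToyIommu`, `service_page_requests`, and the allocator callback are hypothetical, the page-request queue is a plain Python list rather than a memory-resident ring, and no real driver or RTL interface is modelled.

```python
PAGE_SHIFT = 12


class ToyIommu:
    """Toy model: a page table, a TLB, and a page-request queue."""

    def __init__(self):
        self.page_table = {}     # IOVA page number -> physical page number
        self.tlb = {}            # cached translations
        self.page_requests = []  # stand-in for the page-request queue

    def translate(self, iova):
        """Return the physical address, or None after queueing a page request."""
        vpn = iova >> PAGE_SHIFT
        offset = iova & ((1 << PAGE_SHIFT) - 1)
        if vpn not in self.tlb:
            if vpn not in self.page_table:
                self.page_requests.append(vpn)  # recoverable fault
                return None
            self.tlb[vpn] = self.page_table[vpn]
        return (self.tlb[vpn] << PAGE_SHIFT) | offset

    def iotinval_vma(self, vpn):
        """Drop any cached translation for one page."""
        self.tlb.pop(vpn, None)


def service_page_requests(iommu, allocate_page):
    """Driver side: map each requested page, then invalidate the TLB entry."""
    while iommu.page_requests:
        vpn = iommu.page_requests.pop(0)
        iommu.page_table[vpn] = allocate_page(vpn)
        iommu.iotinval_vma(vpn)
```

In this model the first access to an unmapped IOVA returns `None` and queues a request; after `service_page_requests` runs, the retried access translates successfully.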

Kernel-driver expectations

Linux requires the following bindings (kernel v6.10+):

  • Device-tree node iommu@<base> with compatible riscv,iommu and the base/size pair.
  • riscv,fcfg property listing the optional features the IOMMU advertises so that the driver does not poll for unsupported bits.
  • Per-master iommus = <&iommu, did> references that bind a bus master to a device-id (DID) and allow the kernel to manage its DC.
  • MSI-fixed interrupts property; the IOMMU drives a wired interrupt when FCTL.WSI is set or routes via the IMSIC otherwise.

The Android dma-buf v2 mapping ABI (see docs/arch/dma-buf-v2.md) relies on iommu_attach_device and iommu_map_sgtable with the same DID. Closed BSPs that map dma-bufs without going through iommu_map are blocked by the unauthorized-IOVA tests, because the IOMMU fails closed on IOVAs it has not been told to map.

Verification surface

| Test | Location | Coverage |
| --- | --- | --- |
| test_riscv_iommu.py::capabilities_register_advertises_v1_features | verify/cocotb/iommu/ | CAPABILITIES register bits |
| test_riscv_iommu.py::bare_mode_passes_traffic_with_no_fault | verify/cocotb/iommu/ | DDTP=BARE identity passthrough |
| test_riscv_iommu.py::translate_mode_blocks_unknown_devid_with_fault | verify/cocotb/iommu/ | unauthorized devid raises CAUSE 258 |
| test_riscv_iommu.py::translate_mode_allows_known_devid | verify/cocotb/iommu/ | authorized devid completes |
| test_riscv_iommu.py::pasid_isolation_via_allowlist_revoke | verify/cocotb/iommu/ | revoking a DID re-faults |

Authoritative behavioural reference: the RTL is cross-checked against the riscv-non-isa/riscv-iommu upstream model whose pinned revision is recorded in verify/cocotb/iommu/refmodel/riscv-iommu.manifest.yaml. The cloned tree itself lives under verify/external/ (gitignored); the manifest survives in tracked storage so the pin is never lost.

Evidence gate

The fail-closed evidence gate for this block is docs/evidence/memory/iommu-evidence-gate.yaml. Promoting any phone-class IOMMU claim requires:

  1. Passing every cocotb test listed above.
  2. A pinned reference-model revision recorded in the tracked manifest verify/cocotb/iommu/refmodel/riscv-iommu.manifest.yaml (the clone under verify/external/ is gitignored).
  3. A fault-injection report at docs/evidence/memory/iommu_fault_injection_report.json produced by scripts/check_iommu_evidence.py.
  4. ATS round-trip evidence (TR_REQ_IOVA → TR_RESPONSE) with a Linux v6.10+ kernel boot transcript.
  5. PASID-context-switch evidence proving that two simultaneously active masters with different PASIDs see isolated translations.

The gate also blocks tapeout-readiness claims until every entry under docs/evidence/memory/uma-dram-evidence-gate.yaml::blocked_real_claims has cleared.