packages/chip/docs/board/fpga/platform-selection.md
- Status: decision recorded
- Owner: board/fpga
- Date: 2026-05-17
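The project commits to a two-stage FPGA prototyping strategy: stage 1 uses the ECP5 platform for e1-demo and small regression bring-up, and stage 2 moves the Rocket+Gemmini SoC to the VCU118 / F1 flow.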
The two stages are sequential, not exclusive. The ECP5 platform stays in the lab for e1-demo and small regression bring-up even after the VCU118 / F1 flow comes online for the Rocket+Gemmini SoC.
The e1-demo MMIO chip and the Rocket+Gemmini SoC have resource budgets that differ by roughly two orders of magnitude. Forcing both onto a single platform either over-pays for e1-demo (waiting on a VCU118 just to blink an LED) or under-provisions for Rocket+Gemmini (Rocket alone barely fits an ECP5-85F; Gemmini does not fit at all). Splitting the platform decision lets e1-demo bring-up run today on cheap, fully open silicon while the heavier SoC stays on a path with realistic capacity headroom.
Estimates below are order-of-magnitude. They are sourced from public Chipyard FPGA reports (Rocket small-config) and Gemmini paper datapoints (16x16 systolic array, INT8) plus typical DDR controller overhead.
| Design | LUT (k) | FF (k) | BRAM (Mb) | DSP | Off-chip DRAM | Fits ECP5-85F | Fits Zynq-7020 | Fits VCU118 (XCVU9P) |
|---|---|---|---|---|---|---|---|---|
| e1-demo MMIO | < 10 | < 8 | < 0.5 | 0 | none (BRAM) | yes | yes | overkill |
| Rocket small (1 core) | 35 | 20 | 4 | 10 | 256 MB DDR | tight, no DDR | tight, DDR3 via PS only | yes |
| Rocket + Gemmini 16x16 | ~150 | ~100 | ~20 | ~200 | >= 1 GB DDR4 | no | no | yes |
ECP5-85F has ~84 k LUT4, ~3.7 Mb BRAM, and no hard DDR4 PHY, which puts it below the Rocket+Gemmini line on both logic and memory bandwidth. Zynq-7020 has ~53 k LUT, 4.9 Mb BRAM, and 220 DSP slices, with DDR3 only via the PS side; Gemmini's ~150 k LUT, ~200 DSP MAC array needs roughly three times its logic capacity and would consume nearly all of its DSP slices. HAPS-class emulators (HAPS-100/200) fit easily but cost six figures per seat and are not justified before tape-out planning starts.
VCU118 (XCVU9P) has ~1.18 M LUT, ~75.9 Mb of BRAM plus 270 Mb of URAM, 6840 DSP slices, and on-board DDR4 memory. Rocket+Gemmini lands at roughly 10-15 % LUT utilization (~150 k of ~1.18 M LUT) with room for L2, DMA, and a debug bridge.
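The fit columns in the table can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: it uses the order-of-magnitude LUT and BRAM figures quoted in this section, treats the "< 10" and "< 0.5" entries as upper bounds, and ignores DSP and off-chip DRAM. The `Budget` class, the `fits` helper, and the 80 % utilization ceiling are assumptions introduced for this example, not project tooling or a project rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """LUT (thousands) and on-chip RAM (Mb) for a design or a device."""
    name: str
    lut_k: float
    bram_mb: float


# Design estimates copied from the table in this section.
E1_DEMO = Budget("e1-demo MMIO", 10, 0.5)          # "< 10" / "< 0.5" as upper bounds
ROCKET_SMALL = Budget("Rocket small", 35, 4)
ROCKET_GEMMINI = Budget("Rocket + Gemmini 16x16", 150, 20)

# Device capacities as quoted in the prose of this section.
ECP5_85F = Budget("ECP5-85F", 84, 3.7)
ZYNQ_7020 = Budget("Zynq-7020", 53, 4.9)
VCU118 = Budget("VCU118 (XCVU9P)", 1180, 75)       # conservative: ~75 Mb of on-chip RAM


def utilization(design: Budget, device: Budget) -> dict:
    """Return the fraction of each device resource the design would consume."""
    return {
        "lut": design.lut_k / device.lut_k,
        "bram": design.bram_mb / device.bram_mb,
    }


def fits(design: Budget, device: Budget, ceiling: float = 0.8) -> bool:
    """True if every resource stays at or under the (assumed) utilization ceiling."""
    return all(u <= ceiling for u in utilization(design, device).values())


if __name__ == "__main__":
    for design in (E1_DEMO, ROCKET_SMALL, ROCKET_GEMMINI):
        for device in (ECP5_85F, ZYNQ_7020, VCU118):
            u = utilization(design, device)
            print(f"{design.name:24s} on {device.name:16s} "
                  f"LUT {u['lut']:6.1%}  BRAM {u['bram']:6.1%}  "
                  f"fits={fits(design, device)}")
```

Under these assumptions the entries marked "tight" in the table come out over the ceiling on BRAM rather than on LUTs, which is one way to read why Rocket small is marginal on the two smaller devices.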
The two stage-2 options are not mutually exclusive. The recommended order is metasim -> VCU118 (if hardware is available) -> F1 for long-running benchmark sweeps.
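The recommended order can be written down as a tiny helper, for example when a bring-up script needs to decide which targets to schedule. This is a minimal sketch: the function name `stage2_sequence` and the `vcu118_available` flag are hypothetical and only encode the ordering stated above.

```python
def stage2_sequence(vcu118_available: bool) -> list[str]:
    """Return the stage-2 targets in the recommended order.

    metasim always comes first, VCU118 is included only when the
    hardware is available, and F1 is last (long-running benchmark sweeps).
    """
    steps = ["metasim"]
    if vcu118_available:
        steps.append("VCU118")
    steps.append("F1")
    return steps


if __name__ == "__main__":
    print(" -> ".join(stage2_sequence(vcu118_available=True)))
    print(" -> ".join(stage2_sequence(vcu118_available=False)))
```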
- `board/fpga/package/wifi_external_module_adapter.yaml`
- `LFE5U-85F-6BG381C`
- `--85k --package CABGA381`
- `clk_25mhz`
- `openFPGALoader -b ulx3s`
- `unassigned`, so the LPF remains a scaffold even though it has concrete preliminary package sites.

Related documents:

- `docs/board/fpga/README.md`
- `docs/rtl/open_rtl_prototype_path.md`
- `docs/project/board-package-pd-fpga-critical-gap-audit.md`
- `docs/generators/chipyard/README.md`
- `docs/toolchain/headless-cli-audit.md`