packages/chip/docs/architecture-optimization/compute-silicon.md
| Priority | Optimization | Release boundary |
|---|---|---|
| P0 | Replace the tiny CPU contract with a Chipyard Rocket or CVA6 integration path. | No Android boot claim until BSP logs and boot transcripts exist. |
| P0 | Keep DMA as the first shared-memory performance primitive and prove ordering, backpressure, and error handling. | No coherent DMA claim until memory-system verification exists. |
| P1 | Increase memory bandwidth before adding wider accelerators. | No benchmark claim from simulator wall-clock time. |
| P1 | Apply the modeled CPU+NPU no-throttle operating point in `soc-optimized-operating-point.yaml`: 1.4 W CPU/AP active budget, 1.2 W NPU active budget, 44 dense INT8 TOPS modeled base, and 208 GB/s sustained memory target, with 5% guardbands on memory, TOPS, and power. | No design claim unless `make soc-optimization` still matches the work order and real target evidence replaces the model. |
| P1 | Add NPU operator coverage only when the unsupported-op count and CPU fallback percentage are reported with it. | No AI throughput claim without real calibrated runs. |
| P2 | Explore cache, scratchpad, quantization, compression, and tiling for performance per watt. | No size or power win claim without synthesis and power evidence. |
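
The guardband arithmetic in the modeled operating-point row can be sketched as follows. This is a minimal illustration, not the contents of the YAML file: the class and function names are hypothetical, and it assumes each 5% guardband derates the modeled figure downward (less claimable throughput and bandwidth, less usable power budget). The real file may define the guardband direction differently.

```python
from dataclasses import dataclass

GUARDBAND = 0.05  # 5% guardband quoted in the table row


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical container for the modeled figures in the table."""
    cpu_active_w: float     # CPU/AP active power budget, watts
    npu_active_w: float     # NPU active power budget, watts
    dense_int8_tops: float  # modeled dense INT8 throughput, TOPS
    mem_bw_gbps: float      # sustained memory bandwidth target, GB/s


# Modeled base values copied from the P1 row.
MODELED = OperatingPoint(1.4, 1.2, 44.0, 208.0)


def derate(value: float, guardband: float = GUARDBAND) -> float:
    """Reduce a modeled figure by the guardband fraction."""
    return value * (1.0 - guardband)


def guardbanded(point: OperatingPoint) -> OperatingPoint:
    # ASSUMPTION: every figure is derated downward by the same fraction.
    return OperatingPoint(
        cpu_active_w=derate(point.cpu_active_w),
        npu_active_w=derate(point.npu_active_w),
        dense_int8_tops=derate(point.dense_int8_tops),
        mem_bw_gbps=derate(point.mem_bw_gbps),
    )


def gbps_per_tops(point: OperatingPoint) -> float:
    """Memory bandwidth available per TOPS of modeled compute."""
    return point.mem_bw_gbps / point.dense_int8_tops


if __name__ == "__main__":
    print(guardbanded(MODELED))
    print(f"{gbps_per_tops(MODELED):.2f} GB/s per TOPS")
```

The bandwidth-per-TOPS ratio is included because the table places memory bandwidth ahead of wider accelerators; under this sketch's uniform derating the ratio is unchanged by the guardband. These remain modeled numbers and do not substitute for the target evidence the release boundary requires.
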

Scale-up work must keep the RTL contract, software header, cocotb evidence, and benchmark metadata in sync.
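
The two metrics named in the NPU operator-coverage row of the table can be computed along these lines. This is a hedged sketch: the function name, the input shape (a flat list of operator type names, one per graph node), and the choice to measure CPU fallback by node count rather than by runtime are all assumptions, not something this document specifies.

```python
def coverage_report(graph_ops, npu_supported):
    """Summarize NPU coverage for one model graph.

    graph_ops:     list of operator type names, one entry per graph node
    npu_supported: set of operator type names the NPU can execute
    """
    total = len(graph_ops)
    # Nodes whose operator type the NPU cannot run fall back to the CPU.
    fallback_nodes = [op for op in graph_ops if op not in npu_supported]
    unsupported_types = sorted(set(fallback_nodes))
    # ASSUMPTION: fallback percentage is by node count, not by time or FLOPs.
    fallback_pct = 100.0 * len(fallback_nodes) / total if total else 0.0
    return {
        "total_nodes": total,
        "unsupported_op_count": len(unsupported_types),
        "unsupported_ops": unsupported_types,
        "cpu_fallback_pct": fallback_pct,
    }


if __name__ == "__main__":
    ops = ["conv2d", "relu", "conv2d", "softmax", "gather"]
    print(coverage_report(ops, npu_supported={"conv2d", "relu"}))
```

Reporting both numbers matters because a single unsupported operator type can account for many nodes; a node-count percentage is still only a proxy and does not replace the calibrated runs that the release boundary calls for.
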