src/platforms/arm/nrf52/LOOP_NRF52.md
Status: Phase 3 Complete - Dual-SPI and Quad-SPI fully integrated with bit-level transposition
This document provides the detailed implementation plan for adding Dual-SPI (2-lane), Quad-SPI (4-lane), and Octal-SPI (8-lane) support to Nordic nRF52 series microcontrollers using the SPI Proxy + Bus Manager architecture.
┌─────────────────────────────────────────────────────────────┐
│ LED Controller (APA102, SK9822, etc.)                       │
└───────────────────┬─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────────┐
│ SPIDeviceProxy<DATA_PIN, CLOCK_PIN, SPI_CLOCK_DIVIDER>      │
│ - Registers with SPIBusManager                              │
│ - Routes to Single-SPI or Multi-lane SPI                    │
│ - Buffers data for multi-lane transmission                  │
└───────────────────┬─────────────────────────────────────────┘
                    │
      ┌─────────────┴────────────────┐
      ▼                              ▼
┌──────────────────┐      ┌─────────────────────────┐
│ NRF52SPIOutput   │      │ SPIBusManager           │
│ (Single-lane)    │      │ - Detects conflicts     │
│                  │      │ - Promotes to multi     │
└──────────────────┘      │ - Coordinates DMA       │
                          └──────────┬──────────────┘
                                     │
                   ┌─────────────────┼─────────────────┐
                   ▼                 ▼                 ▼
              ┌─────────┐       ┌─────────┐       ┌─────────┐
              │ SpiHw2  │       │ SpiHw4  │       │ SpiHw8  │
              │ (Dual)  │       │ (Quad)  │       │ (Octal) │
              └─────────┘       └─────────┘       └─────────┘
                   │                 │                 │
                   └─────────────────┴─────────────────┘
                                     ▼
                  ┌──────────────────────────────────────┐
                  │ NRF52 Hardware (SPIM + GPIOTE + PPI) │
                  └──────────────────────────────────────┘
Created spi_device_proxy.h
- src/platforms/arm/nrf52/spi_device_proxy.h
- init(), select(), release()
- writeByte(), writeWord()
- writeBytesValue(), writeBytes(), writeBytes<D>()
- writeBit<BIT>() (single-SPI only)
- finalizeTransmission() (new method for multi-lane flush)

Architecture Decisions
- init() call

Unlike ESP32/RP2040, which have native multi-lane SPI hardware, nRF52 requires a creative approach using Nordic's peripheral interconnect system:
SPIM (SPI Master)
GPIOTE (GPIO Tasks and Events)
PPI (Programmable Peripheral Interconnect)
TIMER
Challenge: NRF52 SPIM peripherals don't have hardware support for multi-lane SPI. We need to synchronize multiple SPIM instances to output data in parallel.
Solution: Use TIMER + PPI + GPIOTE to create a synchronized clock signal, then coordinate SPIM data transmission:
TIMER (generates clock events)
  │
  └─[PPI]→ GPIOTE Task (toggle clock pin)
  │
  └─[PPI]→ SPIM0 START Task (lane 0 data)
  │
  └─[PPI]→ SPIM1 START Task (lane 1 data)
  │
  └─[PPI]→ SPIM2 START Task (lane 2 data)
  │
  └─[PPI]→ SPIM3 START Task (lane 3 data, nRF52840 only)
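The fan-out above can be modeled on a host machine to estimate how many PPI channels one timer event consumes. The sketch below is a plain C++ model, not driver code: it touches no nRF registers, and the names `Task`, `PpiChannel`, `wireFanOut`, and `fireEvent` are hypothetical. It assumes each PPI channel can drive one task plus one fork task from the same event, so five tasks (clock toggle plus four SPIM starts) would need three channels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model only; these names are illustrative and not part of FastLED.
enum class Task {
    None,
    GpioteToggleClock,
    Spim0Start,
    Spim1Start,
    Spim2Start,
    Spim3Start
};

// Assumption: one PPI channel routes a single event to a task and a fork task.
struct PpiChannel {
    Task task;
    Task fork;
};

// Pack the requested tasks two per channel, all driven by the same TIMER event.
std::vector<PpiChannel> wireFanOut(const std::vector<Task>& tasks) {
    std::vector<PpiChannel> channels;
    for (std::size_t i = 0; i < tasks.size(); i += 2) {
        PpiChannel ch{tasks[i], Task::None};
        if (i + 1 < tasks.size()) {
            ch.fork = tasks[i + 1];
        }
        channels.push_back(ch);
    }
    return channels;
}

// Simulate one TIMER compare event: every wired task fires in the same step.
std::vector<Task> fireEvent(const std::vector<PpiChannel>& channels) {
    std::vector<Task> fired;
    for (const PpiChannel& ch : channels) {
        assert(ch.task != Task::None);  // a wired channel always has a primary task
        fired.push_back(ch.task);
        if (ch.fork != Task::None) {
            fired.push_back(ch.fork);
        }
    }
    return fired;
}
```

Under these assumptions a Quad-SPI bus uses three of the PPI channels for the start trigger alone, which is worth keeping in mind when budgeting channels against other peripherals.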
Key Constraints:
File: src/platforms/arm/nrf52/spi_hw_2_nrf52.h ✅ CREATED
File: src/platforms/arm/nrf52/spi_hw_2_nrf52.cpp ✅ CREATED
Class: SPIDualNRF52 (implements SpiHw2 interface)
Configuration:
Methods to Implement:
class SpiHw2NRF52 : public SpiHw2 {
public:
    // Lifecycle: set up and tear down the hardware used by this bus.
    bool begin(const Config& config) override;
    void end() override;

    // Start transmitting a buffer, then wait for it (up to timeout_ms).
    bool transmit(fl::span<const uint8_t> buffer) override;
    bool waitComplete(uint32_t timeout_ms = UINT32_MAX) override;

    // State and identification queries.
    bool isBusy() const override;
    bool isInitialized() const override;
    int getBusId() const override;
    const char* getName() const override;

private:
    // Hardware synchronization helpers (TIMER + PPI + GPIOTE).
    void configureTimer();
    void configurePPI();
    void configureGPIOTE();
    void startTransmission();
};
Key Implementation Details:
File: src/platforms/arm/nrf52/spi_hw_4_nrf52.h ✅ CREATED
File: src/platforms/arm/nrf52/spi_hw_4_nrf52.cpp ✅ CREATED
Class: SPIQuadNRF52 (implements SpiHw4 interface)
Configuration:
Methods to Implement:
class SpiHw4NRF52 : public SpiHw4 {
public:
    // Lifecycle: set up and tear down the hardware used by this bus.
    bool begin(const Config& config) override;
    void end() override;

    // Start transmitting a buffer, then wait for it (up to timeout_ms).
    bool transmit(fl::span<const uint8_t> buffer) override;
    bool waitComplete(uint32_t timeout_ms = UINT32_MAX) override;

    // State and identification queries.
    bool isBusy() const override;
    bool isInitialized() const override;
    int getBusId() const override;
    const char* getName() const override;

private:
    // Hardware synchronization helpers (TIMER + PPI + GPIOTE).
    void configureTimer();
    void configurePPI();
    void configureGPIOTE();
    void startTransmission();
};
Key Implementation Details:
Note: Octal-SPI (8-lane) is NOT feasible on nRF52 due to hardware constraints:
Recommendation: Skip SpiHw8 implementation for nRF52. Document in platform limitations.
File: src/platforms/arm/nrf52/spi_hw_2_nrf52.cpp (and spi_hw_4_nrf52.cpp)
Implement the createInstances() factory for each interface:
// In spi_hw_2_nrf52.cpp
namespace fl {

fl::vector<SpiHw2*> SpiHw2::createInstances() {
    static SpiHw2NRF52 instance0;  // Dual-SPI using SPIM0/1
    fl::vector<SpiHw2*> instances;
    instances.push_back(&instance0);
    return instances;
}

}  // namespace fl

// In spi_hw_4_nrf52.cpp
namespace fl {

fl::vector<SpiHw4*> SpiHw4::createInstances() {
    fl::vector<SpiHw4*> instances;
#if defined(NRF52840) || defined(NRF52833)
    // nRF52840/833 has SPIM3 (32 MHz capable)
    static SpiHw4NRF52 instance0;  // Quad-SPI using SPIM0/1/2/3
    instances.push_back(&instance0);
#endif
    // nRF52832 only has SPIM0/1/2, so Quad-SPI would be limited to 3 lanes
    // (SPIBusManager will fall back to Dual-SPI or Single-SPI)
    return instances;
}

}  // namespace fl
File: src/platforms/shared/spi_manager.h ✅ UPDATED
Updated getMaxSupportedSPIType() to detect nRF52 - COMPLETE
Platform detection now includes:
File: src/platforms/shared/spi_manager.h ✅ UPDATED
Added nRF52 hardware includes - COMPLETE
Includes added:
- platforms/shared/spi_hw_2.h for Dual-SPI support

Files: Various chipset headers (e.g., src/chipsets/apa102.h, src/chipsets/sk9822.h)
Update SPI output type selection to use proxy on nRF52:
// In chipsets that use SPI (APA102, SK9822, etc.)
#if defined(ESP32) || defined(ESP32S2) || defined(ESP32S3) || defined(ESP32C3) || defined(ESP32P4)
#include "platforms/esp/32/spi_device_proxy.h"
using SPIOutput = fl::SPIDeviceProxy<DATA_PIN, CLOCK_PIN, SPI_SPEED>;
#elif defined(__IMXRT1062__) && defined(ARM_HARDWARE_SPI)
#include "platforms/arm/teensy/teensy4_common/spi_device_proxy.h"
using SPIOutput = fl::SPIDeviceProxy<DATA_PIN, CLOCK_PIN, SPI_SPEED, SPIObject, SPI_INDEX>;
#elif defined(NRF52) || defined(NRF52832) || defined(NRF52840) || defined(NRF52833)
#include "platforms/arm/nrf52/spi_device_proxy.h"
using SPIOutput = fl::SPIDeviceProxy<DATA_PIN, CLOCK_PIN, SPI_CLOCK_DIVIDER>;
#else
// Standard single-SPI fallback
using SPIOutput = StandardSPIOutput<DATA_PIN, CLOCK_PIN>;
#endif
Create unit tests for nRF52 parallel SPI:
File: tests/test_nrf52_parallel_spi.cpp
Test Cases:
Required Hardware:
Test Scenarios:
Metrics:
Expected Results:
Problem: SPIM peripherals don't share a clock signal in hardware.
Solution: Use TIMER + PPI to generate synchronized START events for all SPIM instances. The TIMER compare event triggers all SPIM.START tasks simultaneously via PPI.
Problem: EasyDMA requires buffers in RAM, not flash or stack.
Solution:
- SPIBusInfo::lane_buffers

Problem: GPIOTE only has 8 channels, shared with other FastLED features.
Solution:
Problem: PPI has 20 channels, can run out with multiple peripherals.
Solution:
Problem: Can't do full 4-lane Quad-SPI on nRF52832.
Solution:
| Platform | Max Lanes | Max Clock | DMA | Performance |
|---|---|---|---|---|
| ESP32 | 8 | 80 MHz | ✅ | ★★★★★ |
| RP2040 | 8 | 62.5 MHz | ✅ | ★★★★★ |
| Teensy 4 | 4 | 30 MHz | ✅ | ★★★★ |
| nRF52840 | 4 | 32 MHz | ✅ | ★★★★ |
| nRF52832 | 2 | 8 MHz | ✅ | ★★ |
Verdict: nRF52840 is competitive with Teensy 4.x for Quad-SPI. nRF52832 is more limited, but Dual-SPI still offers up to a 2× throughput improvement over single-lane SPI.
Created SPIDualNRF52 class (spi_hw_2_nrf52.h and spi_hw_2_nrf52.cpp)
- SpiHw2 interface for NRF52 platform
- createInstances()

Updated Platform Detection
- SPIBusManager::getMaxSupportedSPIType()
- spi_hw_2.h header for nRF52 platforms

Compilation Verified
- adafruit_feather_nrf52840_sense board

Hardware Synchronization Not Implemented
- configureTimer(), configurePPI(), configureGPIOTE() are placeholders

Data Interleaving is Simplified
Timeout Support Missing
- waitComplete() doesn't honor timeout_ms parameter

Resource Management Incomplete
Implemented Hardware Synchronization
- startTransmission() method using TIMER trigger

Added Timeout Support
- waitComplete() using iteration-based timing

GPIOTE Analysis
Integration and Testing
- begin() method
- transmit() to use synchronized transmission

Data Interleaving Still Simplified
Timeout Mechanism is Approximate
Hardware Not Tested
Performance Not Measured
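To make the "approximate timeout" limitation concrete, here is a minimal host-runnable sketch of an iteration-based wait loop. It is not the driver's actual `waitComplete()`: the function name `waitCompleteSketch`, the `isBusy` callback, and the constant `kIterationsPerMs` are assumptions made for illustration. The constant in particular is an uncalibrated guess, which is why this style of timeout drifts with CPU clock, compiler optimization, and interrupt load.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Assumed calibration constant: poll-loop iterations per millisecond.
// A real value would have to be measured on the target at a known clock speed.
constexpr uint32_t kIterationsPerMs = 1000;

// Polls isBusy() until it reports idle or the iteration budget is exhausted.
// Returns true if the transfer completed, false on timeout.
// UINT32_MAX is treated as "wait forever", mirroring the default argument
// in the interface shown earlier in this document.
bool waitCompleteSketch(const std::function<bool()>& isBusy, uint32_t timeout_ms) {
    const bool waitForever = (timeout_ms == UINT32_MAX);
    uint64_t budget = static_cast<uint64_t>(timeout_ms) * kIterationsPerMs;

    while (isBusy()) {
        if (!waitForever) {
            if (budget == 0) {
                return false;  // budget spent while still busy
            }
            --budget;
        }
    }
    return true;
}
```

A timer- or cycle-counter-based deadline would likely track wall-clock time more closely, at the cost of claiming another hardware resource.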
Implement Quad-SPI Driver (SpiHw4)
- spi_hw_4_nrf52.h and spi_hw_4_nrf52.cpp

Improve Data Transposition
Hardware Testing
Update Chipset Controllers
- src/platforms/shared/spi_manager.h - Bus manager interface
- src/platforms/shared/spi_hw_2.h - Dual-SPI interface
- src/platforms/shared/spi_hw_4.h - Quad-SPI interface
- src/platforms/esp/32/spi_device_proxy.h - ESP32 proxy reference
- src/platforms/arm/teensy/teensy4_common/spi_device_proxy.h - Teensy proxy reference

Created SPIQuadNRF52 class (spi_hw_4_nrf52.h and spi_hw_4_nrf52.cpp)
- SpiHw4 interface for NRF52840/52833 platforms
- createInstances()

Implemented Hardware Synchronization for Quad-SPI
- startTransmission() method using TIMER trigger
- waitComplete() with 4 SPIM checks

Updated Platform Detection and Integration
- spi_hw_4.h include for nRF52840/52833 in SPIBusManager
- promoteToMultiSPI() to support nRF52840/52833 quad-SPI
- waitComplete() to handle nRF52840 quad-SPI controllers
- releaseBusHardware() to cleanup nRF52840 quad-SPI

DMA Buffer Management for Quad-SPI
- cleanup()

Compilation Verification
- adafruit_feather_nrf52840_sense board

Data Interleaving Still Simplified
Timeout Mechanism is Approximate
Hardware Not Tested
Performance Not Measured
Resource Allocation Hardcoded
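One possible way to move away from hardcoded channel numbers is a small pool that hands out free indices on request. The sketch below is a host-runnable illustration under that assumption; `ChannelPoolSketch` is a hypothetical class, not part of FastLED or any Nordic SDK, and it tracks only bookkeeping without touching hardware. The same pool shape could be instantiated once for the 20 PPI channels and once for the 8 GPIOTE channels mentioned earlier.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bitmask pool for a fixed number of hardware channels.
class ChannelPoolSketch {
public:
    explicit ChannelPoolSketch(uint8_t count) : mCount(count) {
        assert(count <= 32);  // the mask is 32 bits wide
    }

    // Returns the lowest free channel index, or -1 if all are in use.
    int acquire() {
        for (uint8_t i = 0; i < mCount; ++i) {
            const uint32_t bit = 1u << i;
            if ((mUsed & bit) == 0) {
                mUsed |= bit;
                return i;
            }
        }
        return -1;
    }

    // Marks a channel as free again; out-of-range indices are ignored.
    void release(int channel) {
        if (channel >= 0 && channel < mCount) {
            mUsed &= ~(1u << channel);
        }
    }

private:
    uint8_t mCount;
    uint32_t mUsed = 0;
};
```

A driver's begin() could then fail cleanly when acquire() returns -1 instead of silently colliding with another peripheral, and end() could return its channels to the pool.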
Hardware Testing
Improve Data Transposition
Update Chipset Controllers (if needed)
Performance Benchmarking
Integrated SPITransposer for Bit-Level Interleaving
- #include "platforms/shared/spi_transposer.h" to SPIBusManager
- SPIBusManager::transmit() (was TODO)
- SPIBusManager::finalizeTransmission()
- SPITransposer::transpose2() for proper bit-level interleaving

Completed Dual-SPI Promotion Logic
- promoteToMultiSPI() for DUAL_SPI (was returning false with "not implemented")
- SpiHw2::getAll()

Updated SPIBusManager Integration
- waitComplete() method
- releaseBusHardware()
- finalizeTransmission() to handle both DUAL_SPI and QUAD_SPI

Verified Compilation
- adafruit_feather_nrf52840_sense

Removed Byte-Level Interleaving from Hardware Drivers
- transmit()

Data Flow for Dual-SPI:
LED Controller → SPIBusManager::transmit() → lane_buffers[0/1]
                              ↓
              SPIBusManager::finalizeTransmission()
                              ↓
SPITransposer::transpose2() → interleaved_buffer (2× size, bit-level)
                              ↓
      SpiHw2::transmit(interleaved_buffer) → Hardware (SPIM0/1)
Key Insight: Hardware drivers receive PRE-TRANSPOSED data, not raw lane data. The transposition happens in SPIBusManager, not in the hardware driver.
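To illustrate what "bit-level" and "2× size" mean here, the following host-runnable sketch interleaves two equal-length lane buffers into one output buffer. It is an illustration only: `transpose2Sketch` is a hypothetical function, and the bit layout it uses (MSB first, lane 1 in the upper bit of each pair, lane 0 in the lower bit) is an assumption that may not match what `SPITransposer::transpose2()` actually produces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleaves two lanes bit by bit. Each input byte pair yields two output
// bytes, so the result is twice the length of one lane.
// Assumed layout: for every source bit (MSB first) emit the pair
// [lane1 bit, lane0 bit]. The real transposer may order bits differently.
std::vector<uint8_t> transpose2Sketch(const std::vector<uint8_t>& lane0,
                                      const std::vector<uint8_t>& lane1) {
    assert(lane0.size() == lane1.size());  // lanes must already be padded to equal length

    std::vector<uint8_t> out;
    out.reserve(lane0.size() * 2);

    for (std::size_t i = 0; i < lane0.size(); ++i) {
        uint32_t word = 0;
        for (int bit = 7; bit >= 0; --bit) {
            const uint32_t b0 = (lane0[i] >> bit) & 1u;
            const uint32_t b1 = (lane1[i] >> bit) & 1u;
            word = (word << 2) | (b1 << 1) | b0;
        }
        out.push_back(static_cast<uint8_t>(word >> 8));    // upper four bit pairs
        out.push_back(static_cast<uint8_t>(word & 0xFFu)); // lower four bit pairs
    }
    return out;
}
```

With this layout, lane 0 holding 0xFF and lane 1 holding 0x00 produces the bytes 0x55 0x55, since every bit pair is 01. Keeping this step in the bus manager means a hardware driver only ever sees one flat buffer to hand to DMA.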
Hardware Testing (requires physical hardware)
Optional Code Cleanup
- transmit() to just set up DMA pointers

Performance Benchmarking (after hardware testing)
Last Updated: 2025-10-16 (Iteration 6)
Status: Phase 3 Complete - Dual-SPI and Quad-SPI fully integrated with bit-level transposition ✅
Next Phase: Hardware Testing and Performance Benchmarking 🔨