src/platforms/arm/stm32/HARDWARE_RESOURCES.md
This document defines the hardware resource allocation strategy for STM32 parallel SPI implementations. It covers Timer, DMA, and GPIO resource requirements, allocation strategies, and platform-specific considerations for the STM32 families discussed below (F1, F4, L4, H7, G4, and U5).
| Mode | Timer | DMA Channels | GPIO Pins | Notes |
|---|---|---|---|---|
| Single-SPI | 0 | 0 | 2 (CLK, MOSI) | Uses hardware SPI peripheral or software bitbang |
| Dual-SPI | 1 | 2 | 3 (CLK, D0, D1) | 2x throughput vs single-lane |
| Quad-SPI | 1 | 4 | 5 (CLK, D0-D3) | 4x throughput vs single-lane |
| Octal-SPI | 1 | 8 | 9 (CLK, D0-D7) | 8x throughput vs single-lane |
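The per-mode requirements in the table can be summed to check whether a planned set of buses fits a given part. The following is a minimal sketch, not driver code: `SpiMode`, `BusResources`, `resourcesFor`, `totalFor`, and `fitsDmaBudget` are hypothetical names, and the sketch assumes no pins or timers are shared between buses.

```cpp
#include <cstdint>

// Hypothetical helper types; the names are illustrative, not part of FastLED.
enum class SpiMode : uint8_t { Single = 1, Dual = 2, Quad = 4, Octal = 8 };

struct BusResources {
    int timers;
    int dma_channels;
    int gpio_pins;
};

// Per-bus requirements, taken directly from the table above:
// N data lanes need 1 timer, N DMA channels, and N + 1 pins (clock + data).
BusResources resourcesFor(SpiMode mode) {
    if (mode == SpiMode::Single) {
        return BusResources{0, 0, 2};
    }
    const int lanes = static_cast<int>(mode);
    return BusResources{1, lanes, lanes + 1};
}

// Sum the requirements of several buses (assumes nothing is shared).
BusResources totalFor(const SpiMode* modes, int count) {
    BusResources total{0, 0, 0};
    for (int i = 0; i < count; ++i) {
        const BusResources r = resourcesFor(modes[i]);
        total.timers += r.timers;
        total.dma_channels += r.dma_channels;
        total.gpio_pins += r.gpio_pins;
    }
    return total;
}

// True if the requested buses fit within the available DMA channels/streams.
bool fitsDmaBudget(const SpiMode* modes, int count, int dma_available) {
    return totalFor(modes, count).dma_channels <= dma_available;
}
```

For instance, two Octal buses add up to 2 timers, 16 DMA channels, and 18 GPIO pins, which exactly fills a part with 16 DMA streams.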
Example 1: Dual + Quad + Single
Example 2: Two Octal Buses
Example 3: Four Dual Buses
Priority Order:
Avoid:
Mode: PWM Generation Mode 1
Register Settings:
// Pseudo-code for Timer setup
TIM_HandleTypeDef htim = {0};
htim.Instance = TIMx; // TIM2, TIM3, TIM4, etc.
htim.Init.Prescaler = 0; // No prescaling for maximum frequency
htim.Init.CounterMode = TIM_COUNTERMODE_UP;
htim.Init.Period = (Timer_Clock / desired_frequency) - 1;
htim.Init.ClockDivision = TIM_CLOCKDIVISION_DIV1;
htim.Init.AutoReloadPreload = TIM_AUTORELOAD_PRELOAD_ENABLE;
HAL_TIM_PWM_Init(&htim); // Apply the time-base settings
// PWM Channel Configuration
TIM_OC_InitTypeDef sConfigOC = {0};
sConfigOC.OCMode = TIM_OCMODE_PWM1;
sConfigOC.Pulse = (htim.Init.Period + 1) / 2; // 50% duty cycle: half of the (Period + 1) counter ticks
sConfigOC.OCPolarity = TIM_OCPOLARITY_HIGH;
sConfigOC.OCFastMode = TIM_OCFAST_DISABLE;
HAL_TIM_PWM_ConfigChannel(&htim, &sConfigOC, TIM_CHANNEL_1); // Channel must match the clock pin
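Because the period expression uses integer division, the clock that actually comes out can differ from the requested one. The sketch below works through that arithmetic on the host; `TimerTiming` and `computeTiming` are hypothetical names, and it assumes a prescaler of 0 and a target of at most half the timer clock.

```cpp
#include <cstdint>

struct TimerTiming {
    uint32_t period;     // value for htim.Init.Period (auto-reload register)
    uint32_t pulse;      // value for sConfigOC.Pulse (compare register)
    uint32_t actual_hz;  // clock frequency actually produced
};

// Assumes Prescaler = 0 and desired_hz <= timer_clock_hz / 2.
TimerTiming computeTiming(uint32_t timer_clock_hz, uint32_t desired_hz) {
    // Integer division truncates, so the real frequency can be higher
    // than the requested one.
    const uint32_t ticks = timer_clock_hz / desired_hz;  // counter ticks per clock cycle
    TimerTiming t;
    t.period = ticks - 1;                 // counter runs 0..period
    t.pulse = ticks / 2;                  // output high for half of the ticks
    t.actual_hz = timer_clock_hz / ticks; // truncated to whole hertz
    return t;
}
```

With an example timer clock of 84 MHz and a 5 MHz target, 16.8 truncates to 16 ticks, giving a period of 15, a pulse of 8, and an actual clock of 5.25 MHz. If the attached LEDs cannot tolerate the faster clock, rounding the tick count up instead of down is the safer choice.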
Clock Frequency Calculation:
Period = (Timer_Clock / 5000000) - 1
TIM2 Channels (Example for STM32F4):
TIM3 Channels (Example for STM32F4):
TIM4 Channels (Example for STM32F4):
Note: Pin availability varies by STM32 package (LQFP48, LQFP64, LQFP100, etc.)
STM32F1/F4/G4/L4:
STM32H7:
STM32U5:
Priority Scheme:
Reserve for Critical Functions:
Allocate for Parallel SPI:
Example Allocation (STM32F4):
DMA1:
Stream 0-3: Reserved for peripherals (UART, I2C, etc.)
Stream 4-7: Available for Parallel SPI
DMA2:
Stream 0-7: Available for Parallel SPI (preferred)
Dual-SPI Bus Allocation:
Quad-SPI Bus Allocation:
Octal-SPI Bus Allocation:
Transfer Mode: Memory to Peripheral (the GPIO output data register is the fixed destination, and each transfer is paced by the Timer Update request)
Register Settings:
// Pseudo-code for DMA setup
DMA_HandleTypeDef hdma;
hdma.Instance = DMA2_Stream0; // Or appropriate stream
hdma.Init.Channel = DMA_CHANNEL_0; // Or appropriate channel/request
hdma.Init.Direction = DMA_MEMORY_TO_PERIPH;
hdma.Init.PeriphInc = DMA_PINC_DISABLE; // Fixed peripheral address
hdma.Init.MemInc = DMA_MINC_ENABLE; // Increment memory address
hdma.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
hdma.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
hdma.Init.Mode = DMA_NORMAL; // Not circular
hdma.Init.Priority = DMA_PRIORITY_HIGH; // Or VERY_HIGH for time-critical
hdma.Init.FIFOMode = DMA_FIFOMODE_DISABLE;
DMA-Timer Linkage:
// Link DMA to Timer Update event
__HAL_LINKDMA(&htim, hdma[TIM_DMA_ID_UPDATE], hdma); // Third argument is the handle itself, not a pointer to it
// Enable Timer DMA request
__HAL_TIM_ENABLE_DMA(&htim, TIM_DMA_UPDATE);
// Start DMA transfer
HAL_DMA_Start(&hdma, (uint32_t)buffer, (uint32_t)&GPIOx->ODR, buffer_size);
// Start Timer
HAL_TIM_PWM_Start(&htim, TIM_CHANNEL_1);
Clock Pin (Timer Output):
Data Pins (DMA-Controlled):
Port Grouping (Optional but Recommended):
Example: Octal-SPI on Port B:
Clock: PA5 (TIM2_CH1)
Data Lanes: PB0-PB7 (8 consecutive pins)
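When all data lanes sit on the low pins of one port, a single byte written to the port's output register can update every lane at once. The sketch below shows one possible way to build such a byte stream on the host, one byte per clock tick. It is an assumption about the buffer layout, not the driver's confirmed format, and it simplifies the per-lane DMA channel allocation described in this document to a single buffer. `interleaveLanes` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a byte stream with one output byte per clock tick.
// Bit i of each output byte carries the current bit of data lane i
// (for example PB0 for lane 0, PB1 for lane 1, and so on).
// lanes[i] points to lane i's payload; every lane is bytes_per_lane long.
std::vector<uint8_t> interleaveLanes(const uint8_t* const* lanes,
                                     std::size_t num_lanes,  // 1..8
                                     std::size_t bytes_per_lane) {
    std::vector<uint8_t> out;
    out.reserve(bytes_per_lane * 8);
    for (std::size_t byte = 0; byte < bytes_per_lane; ++byte) {
        for (int bit = 7; bit >= 0; --bit) {  // most significant bit first
            uint8_t word = 0;
            for (std::size_t lane = 0; lane < num_lanes; ++lane) {
                const uint8_t value = (lanes[lane][byte] >> bit) & 1u;
                word |= static_cast<uint8_t>(value << lane);
            }
            out.push_back(word);
        }
    }
    return out;
}
```

With scattered pins this single-byte trick no longer applies, since the lanes live in different port registers.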
Alternate: Scattered Pins:
Clock: PA5 (TIM2_CH1)
Data0: PA0
Data1: PA1
Data2: PB0
Data3: PB1
Data4: PC0
Data5: PC1
Data6: PD0
Data7: PD1
GPIO Configuration:
GPIO_InitTypeDef GPIO_InitStruct = {0};
// Clock pin (Timer Alternate Function)
GPIO_InitStruct.Pin = CLOCK_PIN;
GPIO_InitStruct.Mode = GPIO_MODE_AF_PP;
GPIO_InitStruct.Pull = GPIO_NOPULL;
GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_VERY_HIGH;
GPIO_InitStruct.Alternate = GPIO_AF1_TIM2; // Or appropriate AF
HAL_GPIO_Init(CLK_GPIO_PORT, &GPIO_InitStruct);
// Data pins (GPIO Output)
for (int i = 0; i < num_lanes; i++) {
    GPIO_InitStruct.Pin = data_pins[i];
    GPIO_InitStruct.Mode = GPIO_MODE_OUTPUT_PP;
    GPIO_InitStruct.Pull = GPIO_NOPULL;
    GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_HIGH;
    HAL_GPIO_Init(DATA_GPIO_PORT[i], &GPIO_InitStruct);
}
Clock: 72 MHz max
Timers: TIM1-4 available (TIM1 advanced, TIM2-4 general-purpose)
DMA: 2 controllers × 7 channels = 14 total
GPIO Speed: 50 MHz max
Recommendation: Dual-SPI or Quad-SPI (limited DMA channels)
Example Configuration:
Clock: 84-180 MHz
Timers: TIM1-14 available
DMA: 2 controllers × 8 streams = 16 total
GPIO Speed: 100 MHz typical
Recommendation: Best balance - can run 2x Octal-SPI or 4x Quad-SPI
Example Configuration (High Throughput):
Example Configuration (Flexibility):
Clock: 80 MHz max
Timers: TIM1-2, TIM15-17 available
DMA: 2 controllers × 7 channels = 14 total
GPIO Speed: 80 MHz max
QSPI Peripheral: Yes (L476, L486) - native 4-lane SPI alternative
Recommendation: Use QSPI peripheral for Quad-SPI when available, or Dual-SPI with GPIO+Timer+DMA
Example Configuration:
Clock: 400-480 MHz
Timers: TIM1-17 available
DMA: 2 MDMA + BDMA + DMAMUX
DMA Channels: 16 per MDMA (32 total) + flexible routing
GPIO Speed: 100 MHz max
QSPI/OSPI Peripheral: Yes - native multi-lane SPI
Recommendation: Best platform for parallel SPI - can run 4x Octal-SPI buses
Example Configuration (Maximum Throughput):
DMAMUX Advantage:
Clock: 170 MHz max
Timers: TIM1-8, TIM15-17 available
DMA: 2 controllers × 8 channels = 16 total
GPIO Speed: 170 MHz max (very fast)
Recommendation: Excellent for parallel SPI - high GPIO speed helps
Example Configuration:
Clock: 160 MHz max
Timers: TIM1-8, TIM15-17 available
DMA: 4 GPDMA controllers × 16 channels = 64 total
GPIO Speed: 160 MHz max
OSPI Peripheral: Yes - native octal-lane SPI
Recommendation: Most flexible DMA allocation - can theoretically run 8x Octal-SPI buses
Example Configuration:
1. Timer Already Used for Other PWM:
2. DMA Channel Already Used:
3. GPIO Pin Already Used:
4. Insufficient DMA Channels:
Check Available Resources:
// Verify Timer clock enabled
__HAL_RCC_TIM2_IS_CLK_ENABLED()
// Verify DMA clock enabled
__HAL_RCC_DMA1_IS_CLK_ENABLED()
__HAL_RCC_DMA2_IS_CLK_ENABLED()
// Verify GPIO clock enabled
__HAL_RCC_GPIOA_IS_CLK_ENABLED()
Verify Pin Alternate Functions:
Hardware:
Resources:
Code Snippet:
// In user code
#include "FastLED.h"
#define NUM_LEDS 150
#define DATA_PIN_0 PA0
#define DATA_PIN_1 PA1
#define CLOCK_PIN PA5
CRGB leds[NUM_LEDS];
void setup() {
    // FastLED will automatically detect and use Dual-SPI
    FastLED.addLeds<WS2812, DATA_PIN_0>(leds, NUM_LEDS);
}
Hardware:
Resources:
Throughput:
Hardware:
Resources:
Configuration:
// FastLED automatically detects and allocates 2 octal-SPI buses
// No special configuration needed - just add LED strips
Hardware:
Resources:
Total DMA Usage: 6 channels out of 16 available
When implementing hardware support for a specific STM32 variant:
src/platforms/shared/spi_hw_*.h for interface definitions
stm32XXxx_hal_tim.h, stm32XXxx_hal_dma.h, stm32XXxx_hal_gpio.h
This section provides concrete allocation tables for implementing hardware initialization in the STM32 parallel SPI drivers.
Each parallel SPI bus (bus_id) maps to a specific Timer peripheral:
| Bus ID | Timer | Alternative Timer | Notes |
|---|---|---|---|
| 0 | TIM2 | TIM5 (F4/H7 only) | Primary bus - highest priority |
| 1 | TIM3 | TIM4 | Secondary bus |
| 2 | TIM4 | TIM8 | Tertiary bus (if available) |
| 3 | TIM5 | TIM15 (H7 only) | Quaternary bus (F4/H7 only) |
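The bus-to-timer table can be expressed as a small lookup with a fallback to the alternative timer. This is a sketch under assumptions: `TimerCaps` and `timerForBus` are hypothetical names, and the capability flags would in practice come from the platform detection macros rather than being passed by hand.

```cpp
#include <cstdint>

// Capability flags for the target family (hypothetical; a real driver
// would derive these at compile time).
struct TimerCaps {
    bool has_tim5;
    bool has_tim8;
    bool has_tim15;
};

// Returns the timer number for a bus (2 means TIM2), or 0 if none is usable.
// Preference follows the table above: primary timer first, then alternative.
int timerForBus(int bus_id, bool primary_in_use, TimerCaps caps) {
    switch (bus_id) {
        case 0:
            return !primary_in_use ? 2 : (caps.has_tim5 ? 5 : 0);
        case 1:
            return !primary_in_use ? 3 : 4;
        case 2:
            return !primary_in_use ? 4 : (caps.has_tim8 ? 8 : 0);
        case 3:
            if (!caps.has_tim5) {
                return 0;  // bus 3 exists only on families with TIM5
            }
            return !primary_in_use ? 5 : (caps.has_tim15 ? 15 : 0);
        default:
            return 0;
    }
}
```

The lookup does not check whether an alternative is already claimed by another bus. TIM4, for example, is both the alternative for bus 1 and the primary for bus 2, so a real allocator would need to track that.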
Rationale:
Dual-SPI (2 DMA channels per bus):
| Bus ID | Timer | DMA Controller | Channels | Request |
|---|---|---|---|---|
| 0 | TIM2 | DMA1 | CH2, CH7 | TIM2_UP, TIM2_UP |
| 1 | TIM3 | DMA1 | CH3, CH4 | TIM3_UP, TIM3_UP |
| 2 | TIM4 | DMA1 | CH1, CH5 | TIM4_UP, TIM4_UP |
Quad-SPI (4 DMA channels per bus):
| Bus ID | Timer | DMA Controller | Channels | Request |
|---|---|---|---|---|
| 0 | TIM2 | DMA1 | CH2, CH3, CH4, CH7 | TIM2_UP (shared) |
| 1 | TIM3 | DMA1 + DMA2 | CH1, CH5 (DMA1), CH1, CH2 (DMA2) | TIM3_UP (shared) |
Octal-SPI (8 DMA channels per bus):
| Bus ID | Timer | DMA Controller | Channels | Request |
|---|---|---|---|---|
| 0 | TIM2 | DMA1 (all 7) + DMA2 (1) | CH1-CH7 (DMA1), CH1 (DMA2) | TIM2_UP (shared) |
Note: STM32F1 has limited DMA channels (14 total). Octal-SPI is not recommended; prefer Dual or Quad-SPI.
Dual-SPI (2 DMA streams per bus):
| Bus ID | Timer | DMA Controller | Streams | Channel | Request |
|---|---|---|---|---|---|
| 0 | TIM2 | DMA1 | Stream 1, Stream 7 | CH3, CH3 | TIM2_UP |
| 1 | TIM3 | DMA1 | Stream 2, Stream 4 | CH5, CH5 | TIM3_UP |
| 2 | TIM4 | DMA1 | Stream 6, Stream 3 | CH2, CH2 | TIM4_UP |
| 3 | TIM5 | DMA1 | Stream 0, Stream 5 | CH6, CH6 | TIM5_UP |
Quad-SPI (4 DMA streams per bus):
| Bus ID | Timer | DMA Controller | Streams | Channel | Request |
|---|---|---|---|---|---|
| 0 | TIM2 | DMA1 | Stream 1, 3, 5, 7 | CH3 | TIM2_UP |
| 1 | TIM3 | DMA1 | Stream 0, 2, 4, 6 | CH5 | TIM3_UP |
Octal-SPI (8 DMA streams per bus):
| Bus ID | Timer | DMA Controller | Streams | Channel | Request |
|---|---|---|---|---|---|
| 0 | TIM2 | DMA1 | Stream 0-7 | CH3 | TIM2_UP |
| 1 | TIM3 | DMA2 | Stream 0-7 | CH5 | TIM3_UP |
STM32F4 DMA-Timer Mapping (Reference):
Critical mappings for Timer Update (UP) events:
Important: STM32F4 uses stream-based DMA with channel multiplexing. Multiple streams can share the same channel number if they use the same request source.
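Whatever mapping is chosen, two buses on the same DMA controller must not claim the same stream. A minimal host-side sketch of that check follows; `streamMask` and `overlappingStreams` are hypothetical helpers that represent each bus's streams as a bitmask.

```cpp
#include <cstdint>

// One bit per stream on a single DMA controller (bit n = Stream n).
uint8_t streamMask(const int* streams, int count) {
    uint8_t mask = 0;
    for (int i = 0; i < count; ++i) {
        mask = static_cast<uint8_t>(mask | (1u << streams[i]));
    }
    return mask;
}

// Returns the streams claimed by more than one bus (0 means no conflict).
uint8_t overlappingStreams(const uint8_t* bus_masks, int num_buses) {
    uint8_t seen = 0;
    uint8_t overlap = 0;
    for (int i = 0; i < num_buses; ++i) {
        overlap = static_cast<uint8_t>(overlap | (seen & bus_masks[i]));
        seen = static_cast<uint8_t>(seen | bus_masks[i]);
    }
    return overlap;
}
```

A non-zero result identifies the contested streams directly: two buses that both list Stream 7 would produce `0x80`.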
STM32H7 Advantage: DMAMUX allows any DMA stream to connect to any Timer Update request, providing maximum flexibility.
Dual-SPI (2 DMA streams per bus):
| Bus ID | Timer | DMA Controller | Streams | DMAMUX Request |
|---|---|---|---|---|
| 0 | TIM2 | MDMA1 | Stream 0-1 | DMAMUX_TIM2_UP |
| 1 | TIM3 | MDMA1 | Stream 2-3 | DMAMUX_TIM3_UP |
| 2 | TIM4 | MDMA1 | Stream 4-5 | DMAMUX_TIM4_UP |
| 3 | TIM5 | MDMA1 | Stream 6-7 | DMAMUX_TIM5_UP |
Quad-SPI (4 DMA streams per bus):
| Bus ID | Timer | DMA Controller | Streams | DMAMUX Request |
|---|---|---|---|---|
| 0 | TIM2 | MDMA1 | Stream 0-3 | DMAMUX_TIM2_UP |
| 1 | TIM3 | MDMA1 | Stream 4-7 | DMAMUX_TIM3_UP |
| 2 | TIM4 | MDMA2 | Stream 0-3 | DMAMUX_TIM4_UP |
| 3 | TIM5 | MDMA2 | Stream 4-7 | DMAMUX_TIM5_UP |
Octal-SPI (8 DMA streams per bus):
| Bus ID | Timer | DMA Controller | Streams | DMAMUX Request |
|---|---|---|---|---|
| 0 | TIM2 | MDMA1 | Stream 0-7 | DMAMUX_TIM2_UP |
| 1 | TIM3 | MDMA2 | Stream 0-7 | DMAMUX_TIM3_UP |
| 2 | TIM4 | MDMA1 | Stream 8-15 | DMAMUX_TIM4_UP |
| 3 | TIM5 | MDMA2 | Stream 8-15 | DMAMUX_TIM5_UP |
Note: STM32H7 has 16 streams per MDMA controller (32 total), providing excellent capacity for multiple octal-SPI buses.
Timer clock output requires GPIO pins configured in Alternate Function mode. The AF number varies by STM32 family and timer.
STM32F1 uses "remap" functionality instead of AF numbers:
| Timer | Channel | Default Pins | Partial Remap | Full Remap |
|---|---|---|---|---|
| TIM2 | CH1 | PA0 | PA15 | N/A |
| TIM2 | CH2 | PA1 | PB3 | N/A |
| TIM2 | CH3 | PA2 | PB10 | N/A |
| TIM2 | CH4 | PA3 | PB11 | N/A |
| TIM3 | CH1 | PA6 | PB4 | PC6 |
| TIM3 | CH2 | PA7 | PB5 | PC7 |
| TIM3 | CH3 | PB0 | - | PC8 |
| TIM3 | CH4 | PB1 | - | PC9 |
| TIM4 | CH1 | PB6 | PD12 | N/A |
| TIM4 | CH2 | PB7 | PD13 | N/A |
| TIM4 | CH3 | PB8 | PD14 | N/A |
| TIM4 | CH4 | PB9 | PD15 | N/A |
Configuration: Use __HAL_AFIO_REMAP_TIMx_ENABLE() or __HAL_AFIO_REMAP_TIMx_PARTIAL() macros.
| Timer | AF Number | CH1 Pins | CH2 Pins | CH3 Pins | CH4 Pins |
|---|---|---|---|---|---|
| TIM2 | AF1 | PA0, PA5, PA15 | PA1, PB3 | PA2, PB10 | PA3, PB11 |
| TIM3 | AF2 | PA6, PB4, PC6 | PA7, PB5, PC7 | PB0, PC8 | PB1, PC9 |
| TIM4 | AF2 | PB6, PD12 | PB7, PD13 | PB8, PD14 | PB9, PD15 |
| TIM5 | AF2 | PA0, PH10 | PA1, PH11 | PA2, PH12 | PA3, PI0 |
Configuration Example:
GPIO_InitStruct.Mode = GPIO_MODE_AF_PP;
GPIO_InitStruct.Alternate = GPIO_AF1_TIM2; // For TIM2
| Timer | AF Number | CH1 Pins | CH2 Pins | CH3 Pins | CH4 Pins |
|---|---|---|---|---|---|
| TIM2 | AF1 | PA0, PA5, PA15 | PA1, PB3 | PA2, PB10 | PA3, PB11 |
| TIM3 | AF2 | PA6, PB4, PC6 | PA7, PB5, PC7 | PB0, PC8 | PB1, PC9 |
| TIM4 | AF2 | PB6, PD12 | PB7, PD13 | PB8, PD14 | PB9, PD15 |
| TIM5 | AF2 | PA0, PH10, PF6 | PA1, PH11, PF7 | PA2, PH12, PF8 | PA3, PI0, PF9 |
| TIM8 | AF3 | PC6, PI5, PJ8 | PC7, PI6, PJ6 | PC8, PI7, PJ9 | PC9, PI2, PJ11 |
| TIM15 | AF4 | PA2, PE5 | PA3, PE6 | N/A | N/A |
Note: STM32H7 has more pins available in larger packages (LQFP144, LQFP176, TFBGA240).
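The AF numbers in the two tables above reduce to a short lookup keyed by timer number. The function below is only an illustration of that mapping for the AF-based families (F4 and H7 as tabulated here); `timerAlternateFunction` is a hypothetical name, and the values should be confirmed against the datasheet of the specific part.

```cpp
// Returns the alternate function number used for a timer's output channels,
// as listed in the tables above, or -1 if the timer is not listed.
int timerAlternateFunction(int timer) {
    switch (timer) {
        case 2:
            return 1;  // TIM2 -> AF1
        case 3:
        case 4:
        case 5:
            return 2;  // TIM3, TIM4, TIM5 -> AF2
        case 8:
            return 3;  // TIM8 -> AF3 (H7 table)
        case 15:
            return 4;  // TIM15 -> AF4 (H7 table)
        default:
            return -1;
    }
}
```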
For ease of use, the driver should provide sensible default pin mappings when users don't specify pins explicitly.
Bus 0 (Dual-SPI):
Bus 0 (Quad-SPI):
Bus 0 (Octal-SPI):
Bus 1 (Dual-SPI):
Rationale:
Bus 0 (Dual-SPI):
Bus 1 (Dual-SPI):
Note: Blue Pill has fewer available pins. Quad/Octal-SPI requires careful pin selection to avoid USB, UART, and SWD conflicts.
When initializing hardware, the driver must detect and handle resource conflicts:
bool isTimerAvailable(TIM_TypeDef* timer) {
    // Check if timer clock is already enabled (may indicate it's in use)
    if (timer == TIM2 && __HAL_RCC_TIM2_IS_CLK_ENABLED()) {
        // Timer 2 clock is enabled - may be in use
        // Check if timer is running
        if (timer->CR1 & TIM_CR1_CEN) {
            return false; // Timer is running
        }
    }
    // Repeat for other timers...
    return true;
}
bool isDMAStreamAvailable(DMA_Stream_TypeDef* stream) {
    // Check if stream is enabled
    if (stream->CR & DMA_SxCR_EN) {
        return false; // Stream is active
    }
    return true;
}
bool isGPIOPinAvailable(GPIO_TypeDef* port, uint16_t pin) {
    // 'pin' is the pin index (0-15), not a GPIO_PIN_x bit mask.
    // Read current pin configuration: two MODER bits per pin
    // (STM32F1 has no MODER register and uses CRL/CRH instead).
    uint32_t mode = (port->MODER >> (pin * 2)) & 0x3;
    // Pin is available if in input mode (default) or analog mode
    if (mode == GPIO_MODE_INPUT || mode == GPIO_MODE_ANALOG) {
        return true;
    }
    // Pin may be in use if configured as output or AF
    return false;
}
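Register checks like those above only show what is active at that moment. A driver could also keep its own record of what it has claimed. The class below is a sketch of such bookkeeping for 16 DMA streams; `DmaStreamPool` is a hypothetical name, and it deliberately ignores fixed request mapping, so it fits DMAMUX-style routing better than the fixed stream/channel pairs on STM32F4.

```cpp
#include <cstdint>

// Hypothetical bookkeeping for DMA streams claimed by the driver.
// One bit per stream: bits 0-7 = DMA1 streams 0-7, bits 8-15 = DMA2 streams 0-7.
class DmaStreamPool {
public:
    // reserved_mask marks streams already owned by other peripherals.
    explicit DmaStreamPool(uint16_t reserved_mask) : used_(reserved_mask) {}

    // Claims `count` (>= 1) free streams and returns their bitmask.
    // Returns 0 and claims nothing if not enough streams are free.
    uint16_t claim(int count) {
        uint16_t picked = 0;
        for (int i = 0; i < 16 && count > 0; ++i) {
            const uint16_t bit = static_cast<uint16_t>(1u << i);
            if (!(used_ & bit)) {
                picked = static_cast<uint16_t>(picked | bit);
                --count;
            }
        }
        if (count > 0) {
            return 0;  // not enough free streams; leave the pool unchanged
        }
        used_ = static_cast<uint16_t>(used_ | picked);
        return picked;
    }

    // Returns previously claimed streams to the pool.
    void release(uint16_t mask) {
        used_ = static_cast<uint16_t>(used_ & ~mask);
    }

    int freeCount() const {
        int n = 0;
        for (int i = 0; i < 16; ++i) {
            if (!(used_ & (1u << i))) {
                ++n;
            }
        }
        return n;
    }

private:
    uint16_t used_;
};
```

Constructing the pool with `0x000F` reserves DMA1 streams 0-3 for other peripherals, leaving 12 streams: enough for one Octal bus plus one Quad bus, but not two Octal buses.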
Each bus consumes RAM for DMA buffers. Buffer size depends on LED count and bit interleaving.
Formula:
DMA_Buffer_Size = num_leds × 3 bytes_per_led × (8 bits / num_lanes) × expansion_factor
Expansion Factor:
Examples:
| Mode | LEDs | Bytes/LED | Lanes | Expansion | Total RAM |
|---|---|---|---|---|---|
| Dual | 100 | 3 | 2 | 2x | 600 bytes |
| Quad | 100 | 3 | 4 | 2x | 600 bytes |
| Octal | 100 | 3 | 8 | 1x | 300 bytes |
| Dual | 500 | 3 | 2 | 2x | 3000 bytes (3 KB) |
| Quad | 500 | 3 | 4 | 2x | 3000 bytes (3 KB) |
| Octal | 500 | 3 | 8 | 1x | 1500 bytes (1.5 KB) |
RAM Availability by Platform:
Recommendation: For STM32F1, limit LED count to 500 per bus to avoid RAM exhaustion.
The following platforms have been successfully compiled and tested with the STM32 parallel SPI implementation:
Compilation Status: ✅ SUCCESS
Build Time: 370 seconds
Memory Usage:
Supported Features:
Runtime Behavior:
Technical Notes:
Compilation Status: ✅ SUCCESS
Build Time: 14 seconds
Memory Usage:
Supported Features:
Runtime Behavior:
Technical Notes:
Compilation Status: ✅ SUCCESS
Build Time: 182 seconds
Memory Usage:
Supported Features:
Runtime Behavior:
Technical Notes:
| Family | DMA Type | TIM5 | Max GPIO Speed | Status | Notes |
|---|---|---|---|---|---|
| STM32F1 | Channel | ❌ | 50 MHz | ⚠️ Partial | Compiles, no DMA yet |
| STM32F2 | Stream | ✅ | 60 MHz | 🟢 Expected | Similar to F4 |
| STM32F4 | Stream | ✅ | 100 MHz | ✅ Tested | Full support |
| STM32F7 | Stream | ✅ | 100 MHz | 🟢 Expected | Similar to F4 |
| STM32L4 | Channel | ❌ | 80 MHz | 🟢 Expected | Should work |
| STM32H7 | Stream+DMAMUX | ✅ | 100 MHz | ✅ Tested | Full support |
| STM32G4 | Channel+DMAMUX | ❌ | 170 MHz | 🟢 Expected | High-speed GPIO |
| STM32U5 | GPDMA | ❌ | 160 MHz | 🟢 Expected | 64 DMA channels |
Legend:
The implementation uses comprehensive platform detection to ensure compatibility:
1. GPIO Speed Compatibility
#define FASTLED_GPIO_SPEED_MAX
// F1: GPIO_SPEED_FREQ_HIGH
// F4+: GPIO_SPEED_FREQ_VERY_HIGH
2. TIM5 Conditional Compilation
#ifdef FASTLED_STM32_HAS_TIM5
// TIM5-specific code
#endif
3. DMA Architecture Detection
#ifdef FASTLED_STM32_HAS_DMA_STREAMS
// Stream-based DMA (F2/F4/F7/H7)
#else
// Channel-based DMA (F1/G4) - future implementation
#endif
4. Family Detection Macros
# STM32F1 (Blue Pill)
uv run ci/ci-compile.py stm32f103c8 --examples Blink
# STM32F4 (Black Pill)
uv run ci/ci-compile.py stm32f411ce --examples Blink
# STM32H7 (Giga R1 M7)
uv run ci/ci-compile.py stm32h747xi --examples Blink
STM32F1 Family:
All Platforms:
Priority 1 (High Demand):
Priority 2 (Enhanced Features):
Priority 3 (Advanced Features):
Document Version: 1.2
Last Updated: 2025-11-06