docs/data/simd-sum.md
Sums the values in a collection using SIMD instructions. The suffix (x2, x4, x8, x16, x32, x64) indicates the number of lanes processed simultaneously.
Note: Choose the variant matching your CPU's capabilities. Higher lane counts provide better performance but require newer CPU support.
// Using AVX2 variant (32 lanes at once) - Intel Haswell+ / AMD Excavator+
sum := simd.SumInt8x32([]int8{1, 2, 3, 4, 5})
// 15
// Using AVX-512 variant (16 lanes at once) - Intel Skylake-X+
sum := simd.SumFloat32x16([]float32{1.1, 2.2, 3.3, 4.4})
// 11
// Using AVX variant (4 lanes at once) - works on all amd64
sum := simd.SumInt32x4([]int32{1000000, 2000000, 3000000})
// 6000000
// Empty collection returns 0
sum := simd.SumUint16x16([]uint16{})
// 0