Back to Lo

Simd Sum

docs/data/simd-sum.md

1.53.0793 B
Original Source

Sums the values in a collection using SIMD instructions. The suffix (x2, x4, x8, x16, x32, x64) indicates the number of lanes processed simultaneously.

Note: Choose the variant matching your CPU's capabilities. Higher lane counts provide better performance but require newer CPU support.

go
// Using AVX2 variant (32 lanes at once) - Intel Haswell+ / AMD Excavator+
sum := simd.SumInt8x32([]int8{1, 2, 3, 4, 5})
// 15
go
// Using AVX-512 variant (16 lanes at once) - Intel Skylake-X+
sum := simd.SumFloat32x16([]float32{1.1, 2.2, 3.3, 4.4})
// 11
go
// Using AVX variant (4 lanes at once) - works on all amd64
sum := simd.SumInt32x4([]int32{1000000, 2000000, 3000000})
// 6000000
go
// Empty collection returns 0
sum := simd.SumUint16x16([]uint16{})
// 0