(I)FFT Benchmark

CPU

Run on 13th Gen Intel(R) Core(TM) i9-13900K (32 X 5500 MHz CPU s)
Compiler: clang-15
CPU Caches:
  L1 Data 48 KiB (x16)
  L1 Instruction 32 KiB (x16)
  L2 Unified 2048 KiB (x16)
  L3 Unified 36864 KiB (x1)

Run on Apple M3 Pro (12 X 4050 MHz)
CPU Caches:
  L1 Data 64 KiB (x12)
  L1 Instruction 128 KiB (x12)
  L2 Unified 4096 KiB (x12)

Note: These benchmarks were run with `build --@rules_rust//:extra_rustc_flags="-Ctarget-cpu=native"` set in your .bazelrc.user.
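
As a minimal sketch of the note above, the line can be appended to `.bazelrc.user` from the shell. This assumes the command is run from the root of the Tachyon checkout and that `.bazelrc.user` lives there; adjust the path if your setup differs.

```shell
# Append the flag from the note to .bazelrc.user.
# Assumption: the current directory is the repository root.
echo 'build --@rules_rust//:extra_rustc_flags="-Ctarget-cpu=native"' >> .bazelrc.user
```

Because the command uses `>>`, running it more than once adds duplicate lines, so check the file first if it already exists.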

FFT

```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --check_results
```
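
The command above repeats `-k` once per exponent. A small sketch of one way to generate that flag list in the shell follows; the variable name `K_FLAGS` is hypothetical and not part of the benchmark.

```shell
# Build "-k 16 -k 17 ... -k 23" instead of typing each flag by hand.
# K_FLAGS is an illustrative variable name, not something the benchmark reads.
K_FLAGS=""
for k in $(seq 16 23); do
  K_FLAGS="$K_FLAGS -k $k"
done
K_FLAGS="${K_FLAGS# }"  # drop the leading space
echo "$K_FLAGS"
```

The resulting string can then be substituted, unquoted, for the `-k ...` portion of the command so that the shell splits it back into separate arguments.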

On Intel i9-13900K

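| Exponent | Tachyon  | Arkworks | Bellman  | Halo2    |
| -------- | -------- | -------- | -------- | -------- |
| 16       | 0.002058 | 0.005143 | 0.006314 | 0.002249 |
| 17       | 0.002246 | 0.00334  | 0.015646 | 0.006193 |
| 18       | 0.010154 | 0.018807 | 0.046443 | 0.007574 |
| 19       | 0.022984 | 0.014652 | 0.076281 | 0.014506 |
| 20       | 0.02     | 0.02497  | 0.100082 | 0.042877 |
| 21       | 0.044831 | 0.075563 | 0.20222  | 0.067161 |
| 22       | 0.130201 | 0.179075 | 0.402452 | 0.169194 |
| 23       | 0.281398 | 0.394068 | 0.792004 | 0.372566 |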

On Mac M3 Pro

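| Exponent | Tachyon  | Arkworks | Bellman  | Halo2    |
| -------- | -------- | -------- | -------- | -------- |
| 16       | 0.002526 | 0.003804 | 0.00784  | 0.005689 |
| 17       | 0.004694 | 0.005769 | 0.015577 | 0.01121  |
| 18       | 0.009246 | 0.010243 | 0.027834 | 0.022379 |
| 19       | 0.018328 | 0.020404 | 0.055661 | 0.041394 |
| 20       | 0.039683 | 0.041085 | 0.110702 | 0.086299 |
| 21       | 0.079138 | 0.087336 | 0.230857 | 0.175599 |
| 22       | 0.166646 | 0.177959 | 0.474296 | 0.352872 |
| 23       | 0.33996  | 0.363612 | 0.971581 | 0.748284 |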

IFFT

```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --run_ifft --check_results
```

On Intel i9-13900K

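| Exponent | Tachyon  | Arkworks | Bellman  | Halo2    |
| -------- | -------- | -------- | -------- | -------- |
| 16       | 0.001392 | 0.012028 | 0.009913 | 0.002413 |
| 17       | 0.002511 | 0.00427  | 0.01418  | 0.005731 |
| 18       | 0.01762  | 0.021167 | 0.034676 | 0.010811 |
| 19       | 0.009646 | 0.01447  | 0.058714 | 0.016038 |
| 20       | 0.030303 | 0.034815 | 0.104936 | 0.05337  |
| 21       | 0.047463 | 0.072579 | 0.199788 | 0.093146 |
| 22       | 0.146697 | 0.181389 | 0.391296 | 0.19874  |
| 23       | 0.285937 | 0.403596 | 0.82276  | 0.347876 |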

On Mac M3 Pro

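| Exponent | Tachyon  | Arkworks | Bellman  | Halo2    |
| -------- | -------- | -------- | -------- | -------- |
| 16       | 0.002798 | 0.003867 | 0.008102 | 0.005665 |
| 17       | 0.004882 | 0.005737 | 0.015998 | 0.011672 |
| 18       | 0.010308 | 0.010962 | 0.028118 | 0.022723 |
| 19       | 0.018724 | 0.021338 | 0.056855 | 0.042554 |
| 20       | 0.037687 | 0.043237 | 0.113848 | 0.089899 |
| 21       | 0.078429 | 0.092134 | 0.234585 | 0.174939 |
| 22       | 0.162542 | 0.189442 | 0.484644 | 0.361127 |
| 23       | 0.338646 | 0.392674 | 0.989173 | 0.765592 |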

GPU

FFT

```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --check_results
```

On RTX-4090

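| Exponent | Tachyon CPU | Tachyon GPU |
| -------- | ----------- | ----------- |
| 16       | 0.002348    | 0.001       |
| 17       | 0.00204     | 0.001182    |
| 18       | 0.00393     | 0.002211    |
| 19       | 0.009317    | 0.004079    |
| 20       | 0.049204    | 0.008114    |
| 21       | 0.044158    | 0.01616     |
| 22       | 0.134064    | 0.032785    |
| 23       | 0.274101    | 0.066068    |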
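
To read the RTX-4090 FFT results above as a CPU-to-GPU ratio, the following sketch divides the Tachyon CPU column by the Tachyon GPU column for each exponent. The rows are copied from the table; the variable name `speedups` is illustrative, and the ratio is only meaningful under the assumption that both columns are reported in the same unit.

```shell
# Columns: exponent, Tachyon CPU, Tachyon GPU (values copied from the table above).
# Prints "<exponent> <CPU time / GPU time>" per row, rounded to two decimals.
speedups=$(awk '{ printf "%d %.2f\n", $1, $2 / $3 }' <<'EOF'
16 0.002348 0.001
17 0.00204 0.001182
18 0.00393 0.002211
19 0.009317 0.004079
20 0.049204 0.008114
21 0.044158 0.01616
22 0.134064 0.032785
23 0.274101 0.066068
EOF
)
echo "$speedups"
```

The same approach works for the IFFT table by swapping in its rows.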

IFFT

```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --run_ifft --check_results
```

On RTX-4090

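| Exponent | Tachyon CPU | Tachyon GPU |
| -------- | ----------- | ----------- |
| 16       | 0.002138    | 0.001341    |
| 17       | 0.00488     | 0.000933    |
| 18       | 0.003887    | 0.002502    |
| 19       | 0.00896     | 0.003806    |
| 20       | 0.017953    | 0.007745    |
| 21       | 0.043787    | 0.016268    |
| 22       | 0.132048    | 0.033012    |
| 23       | 0.291132    | 0.066022    |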