src/autoschedulers/li2018/README.md

This is a conservative autoscheduler that applies `compute_root` to most Funcs, except for the trivial ones (think of it as a -O1 optimizer for Halide). It recognizes large reduction patterns and uses `rfactor` or `atomic` to parallelize associative reductions when there is not enough parallelism in the pure variable domain. This strategy works reasonably well for gradient pipelines, and it is suitable as a default option when decent but not optimal performance is acceptable. This is also currently the only autoscheduler that generates GPU schedules.
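To illustrate why factoring an associative reduction exposes parallelism, here is a minimal sketch in plain C++. It does not use the Halide API, and the function names `sum_serial` and `sum_factored` are hypothetical, chosen only for this example. It shows the general idea behind a factored reduction: the reduction domain is split into independent partial results that are combined afterwards. The actual autoscheduler expresses this through Halide scheduling directives rather than hand-written loops.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Serial reduction: a single accumulator, so every step depends on the
// previous one and the loop cannot be parallelized as written.
int sum_serial(const std::vector<int> &in) {
    int acc = 0;
    for (int v : in) acc += v;
    return acc;
}

// Factored reduction (illustrative only, not Halide code): split the
// reduction domain into `chunks` independent partial sums, then reduce
// the partial sums. This is valid because integer addition is associative.
int sum_factored(const std::vector<int> &in, std::size_t chunks) {
    assert(chunks > 0);
    std::vector<int> partial(chunks, 0);
    const std::size_t n = in.size();
    const std::size_t step = (n + chunks - 1) / chunks;  // ceil(n / chunks)

    // Each iteration of `c` writes only partial[c], so these iterations
    // are independent and could be distributed across threads.
    for (std::size_t c = 0; c < chunks; ++c) {
        const std::size_t begin = c * step;
        const std::size_t end = std::min(n, begin + step);
        for (std::size_t i = begin; i < end; ++i) partial[c] += in[i];
    }

    // Final, much smaller reduction over the partial results.
    return std::accumulate(partial.begin(), partial.end(), 0);
}
```

The sketch uses integers so that both versions agree exactly. With floating-point data, reordering the additions can change the result slightly.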

Running some benchmarks in the `apps` directory gives the following timings (all GPU runs use `halide_reuse_device_allocations(nullptr, true)`):

| app | manual (CPU) | gradient-autoscheduler (CPU) | manual (GPU) | gradient-autoscheduler (GPU) |
| --- | --- | --- | --- | --- |
| bilateral filter | 7.93 ms | 12.92 ms | 0.29 ms | 1.05 ms |
| camera_pipe | 8823.33 us | 25126 us | 605.03 us | 3347.44 us |
| lens_blur | 7.77 ms | 22.41 ms | 0.73 ms | 5.60 ms |
| local_laplacian | 42.29 ms | 128.31 ms | 0.81 ms | 14.30 ms |
| nl_means | 145.003 ms | out-of-memory | N/A | 82.93 ms |
| conv_layer | 15.46 ms | 6.89 ms | N/A | 1.90 ms |
| stencil_chain | 18.86 ms | 21.46 ms | N/A | 6.35 ms |

Tested on an 8-core Intel CPU (16 logical cores with Hyper-Threading) and a TITAN Xp.

See `test/autoschedulers/li2018` for examples of using this autoscheduler.