apps/hannk/README.md

This app is an interpreter of machine learning pipelines, where many of the ops are implemented in Halide.

There are several front ends for the interpreter:

  • TFlite flat buffer parser
  • TFlite delegate
  • Direct API

This app is a work in progress. Currently, only quantized uint8 networks are supported. All of the TensorFlow hosted models run and perform well.

Benchmarks

The comparison data below was produced with TensorFlow v2.5.0 (the latest release as of this writing):

x86 OSX laptop w/ AVX2:

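| Network                     | TFlite (ms) | Halide (ms) | Speedup |
|-----------------------------|-------------|-------------|---------|
| inception_v1_224_quant      | 72.5        | 28.4        | 2.55    |
| inception_v2_224_quant      | 100         | 38.1        | 2.62    |
| inception_v3_quant          | 267         | 105.3       | 2.54    |
| inception_v4_299_quant      | 566         | 227         | 2.49    |
| mobilenet_v1_0.25_128_quant | 1.9         | 0.68        | 2.78    |
| mobilenet_v1_1.0_224_quant  | 38.4        | 12.7        | 3.02    |
| mobilenet_v2_1.0_224_quant  | 30.6        | 9.85        | 3.11    |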

Qualcomm Snapdragon 855 A76 core (Pixel 4):

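| Network                     | TFlite (ms) | Halide (ms) | Speedup |
|-----------------------------|-------------|-------------|---------|
| inception_v1_224_quant      | 24.7        | 25.0        | 0.99    |
| inception_v2_224_quant      | 49.8        | 33.5        | 1.49    |
| inception_v3_quant          | 97          | 87.6        | 1.11    |
| inception_v4_299_quant      | 198         | 183.4       | 1.09    |
| mobilenet_v1_0.25_128_quant | 0.97        | 0.63        | 1.54    |
| mobilenet_v1_1.0_128_quant  | 4.64        | 4.44        | 1.05    |
| mobilenet_v1_1.0_224_quant  | 12.9        | 11.6        | 1.11    |
| mobilenet_v2_1.0_224_quant  | 11.8        | 9.72        | 1.21    |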
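
The Speedup column in both tables is consistent with the TFlite time divided by the Halide time, to within rounding of the displayed values. If you want to recompute it from your own measurements, a minimal sketch could look like the following. The `speedup` helper is a hypothetical name used for illustration and is not part of hannk.

```shell
# Hypothetical helper (not part of hannk): print TFlite time / Halide time,
# rounded to two decimals. Both arguments are times in milliseconds.
speedup() {
  awk -v tflite_ms="$1" -v halide_ms="$2" \
    'BEGIN { printf "%.2f\n", tflite_ms / halide_ms }'
}

# Example: the inception_v1_224_quant row of the x86 table.
speedup 72.5 28.4
```

A value above 1 means the Halide implementation was faster in that run; a value below 1 means TFlite was faster.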

Planned but still TODO

  • More op support
  • More data type support
  • Multicore parallelism
  • Hexagon HVX support
  • More intelligent scheduling across ops, to save memory and improve locality

Usage

benchmark

benchmark is a binary that runs the provided .tflite flat buffer files and reports the time taken for each.

Usage:

benchmark a.tflite [b.tflite ...]
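
If one model in a batch fails, it can be convenient to run each file in its own invocation so the others still get timed. The sketch below does that with a plain loop. `run_each` is a hypothetical helper, and it assumes the binary exits with a non-zero status on failure, which this README does not state.

```shell
# Hypothetical helper (not part of hannk): run a binary once per model file,
# and report any file for which the binary exits with a non-zero status.
# Assumption: the binary signals failure through its exit status.
run_each() {
  bin="$1"
  shift
  for model in "$@"; do
    "$bin" "$model" || echo "FAILED: $model" >&2
  done
}
```

For example, `run_each ./benchmark a.tflite b.tflite` runs benchmark twice, once per file.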

compare_vs_tflite

This binary runs each provided network in three ways:

  • Directly via TFlite
  • Directly via HANNK
  • Via HANNK TFlite delegate

It reports the timing for each and compares the results, flagging any significant differences.

Usage:

compare_vs_tflite a.tflite [b.tflite ...]

WebAssembly

There is limited support for building and running hannk under WebAssembly.

Requirements:

  • You must use CMake to build (Make isn't supported).
  • You must have Emscripten v2.0.32 (or later) installed and activated.
  • You must have Node.js v16.13 (or later) installed for testing. Note that, as of this writing, EMSDK includes an older version of Node that will not work.

Building:

The simplest way is:

$ HL_TARGET=wasm-32-wasmrt-wasm_simd128 NODE_JS_EXECUTABLE=/path/to/good/version/of/node ./configure_cmake.sh
...output...
$ ninja

Note that wasm_simd128 is optional, but highly recommended.

Running:

If you've built as described above, you can just run ctest to run the basic self-tests.

To run benchmark or compare_vs_tflite yourself, you'll need to launch it under Node.js explicitly. As noted above, when EMSDK is activated, node will likely refer to a version of Node.js that won't work, so provide the path to a suitable version:

$ cd build
$ /path/to/good/version/of/node benchmark ../test/*/*.tflite
$ /path/to/good/version/of/node compare_vs_tflite ../test/*/*.tflite
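
Because the Node bundled with EMSDK may be too old, it can help to check a candidate binary's version before using it. The following is a minimal sketch. `node_at_least` is a hypothetical helper, not part of hannk or EMSDK, and it assumes the version string has the `vMAJOR.MINOR.PATCH` form that `node --version` prints.

```shell
# Hypothetical helper (not part of hannk): succeed if the version string in $1
# (e.g. "v16.13.0") is at least MAJOR.MINOR, given as $2 and $3.
node_at_least() {
  ver="${1#v}"        # strip the leading "v"
  major="${ver%%.*}"  # text before the first dot
  rest="${ver#*.}"    # text after the first dot
  minor="${rest%%.*}" # text before the next dot
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}
```

A possible use is `node_at_least "$(/path/to/good/version/of/node --version)" 16 13 || echo "Node.js is too old"`, run before launching the tools.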

Note that compare_vs_tflite doesn't actually build or use TFlite when compiled for WebAssembly. The only mode it supports there is parsing the .tflite files directly, which makes it behave much like the benchmark tool.