tensorflow/lite/tools/strip_buffers/README.md
NOTE: This is an advanced tool used to reduce bandwidth usage in Neural Architecture Search applications. Use with caution.
The tools in this directory make it easier to distribute TFLite models to multiple devices over networks with the sole aim of benchmarking latency performance. The intended workflow is as follows:
As an example, see the before/after sizes for MobileNetV1:
NOTE: This tool currently supports only models with a single subgraph.
There are two tools in this directory:
This tool takes an input flatbuffer and strips out (or 'empties') the
buffers (constant data) of tensors that meet the following guidelines:
The second rule above protects us from invalidating constant data that cannot be randomised (for example, Reshape 'shape' input).
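To see why a Reshape 'shape' input cannot be replaced with arbitrary values, the following minimal sketch checks whether a target shape is consistent with a tensor's element count. The helper `can_reshape` and the example values are hypothetical and are not part of this tool; they only illustrate the constraint.

```python
import math


def can_reshape(num_elements, shape):
    """Return True if `shape` describes exactly `num_elements` values."""
    return math.prod(shape) == num_elements


# A tensor holding 2 * 3 * 4 = 24 values.
num_elements = 2 * 3 * 4

# The constant 'shape' input as stored in the model (illustrative value).
real_shape = [6, 4]

# A stand-in for what refilling that buffer with random data might yield.
randomised_shape = [37, 5]

print(can_reshape(num_elements, real_shape))        # True
print(can_reshape(num_elements, randomised_shape))  # False
```

A model whose 'shape' buffer was randomised this way would fail at Reshape, which is why such buffers must be left intact.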
To run the associated script:
bazel run -c opt tensorflow/lite/tools/strip_buffers:strip_buffers_from_fb -- --input_flatbuffer=/input/path.tflite --output_flatbuffer=/output/path.tflite
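To check how much the stripping step saved for a given model, the input and output file sizes can be compared. This is a minimal sketch using only the Python standard library; the function name `size_reduction` is hypothetical and not part of this directory.

```python
import os


def size_reduction(original_path, stripped_path):
    """Return (original_bytes, stripped_bytes, fraction_saved)."""
    original = os.path.getsize(original_path)
    stripped = os.path.getsize(stripped_path)
    # Guard against an empty input file to avoid dividing by zero.
    saved = 1.0 - stripped / original if original else 0.0
    return original, stripped, saved


# Example call, using the same placeholder paths as the command above:
# original, stripped, saved = size_reduction("/input/path.tflite",
#                                            "/output/path.tflite")
# print(f"{original} B -> {stripped} B ({saved:.1%} saved)")
```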
The idea here is to reconstitute the lean flatbuffer Model generated in the
above step by filling in random data wherever necessary.
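The general idea can be sketched on a toy model. The sketch below is an assumption-laden illustration, not the tool's implementation: tensors are plain dictionaries rather than flatbuffer objects, and the `fill_empty_buffers` function, the `_DTYPES` table, and the value ranges are all hypothetical. An emptied buffer is refilled with random values sized from the tensor's shape and type, while buffers that still hold data are left alone.

```python
import math
import random
import struct

# Hypothetical dtype table: name -> (struct format code, random value source).
_DTYPES = {
    "float32": ("f", lambda rng: rng.uniform(-1.0, 1.0)),
    "int8": ("b", lambda rng: rng.randint(-128, 127)),
    "uint8": ("B", lambda rng: rng.randint(0, 255)),
}


def fill_empty_buffers(tensors, seed=0):
    """Fill every empty 'data' buffer in place with random values."""
    rng = random.Random(seed)
    for tensor in tensors:
        if tensor["data"]:
            # Buffer was not stripped (e.g. a Reshape 'shape'): keep it.
            continue
        fmt, draw = _DTYPES[tensor["dtype"]]
        count = math.prod(tensor["shape"])  # number of elements to generate
        values = [draw(rng) for _ in range(count)]
        # Pack as little-endian raw bytes.
        tensor["data"] = struct.pack(f"<{count}{fmt}", *values)
    return tensors


toy_model = [
    {"name": "conv_weights", "shape": [2, 3], "dtype": "float32", "data": b""},
    {"name": "reshape_shape", "shape": [2], "dtype": "int8", "data": b"\x06\x04"},
]
fill_empty_buffers(toy_model)
print(len(toy_model[0]["data"]))  # 24 bytes: 6 float32 values
print(toy_model[1]["data"])       # unchanged: b'\x06\x04'
```

Because the data is random, the reconstituted model is only useful for measuring latency, not accuracy.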
The prototype script can be called as:
bazel run -c opt tensorflow/lite/tools/strip_buffers:reconstitute_buffers_into_fb -- --input_flatbuffer=/input/path.tflite --output_flatbuffer=/output/path.tflite
Both of the above tools are also available as a library, stripping_lib, in
this directory. It mutates the flatbuffer(s) in memory, so the above two steps
can be performed without touching the filesystem again.