# TensorRT Engine Explorer
This repository contains Python code (trex package) to explore various aspects of a TensorRT engine plan and its associated inference profiling data.
An engine plan file is a serialized TensorRT engine format. It contains information about the final inference graph and can be deserialized for inference runtime execution. An engine plan is specific to the hardware and software versions of the system used to build the engine.
trex is useful for initial model performance debugging, visualization of plan graphs, and understanding the characteristics of an engine plan. **For in-depth performance analysis, NVIDIA® Nsight Systems™ is the recommended performance analysis tool.** For engine visualizations, see also NVIDIA® Nsight Deep Learning Designer™.
The trex package contains an API and Jupyter notebooks for viewing and inspecting TensorRT engine-plan files and profiling data.
Get started with:

* trex tutorial
* trex API examples

trex operates on JSON input files and does not require a GPU.

When trtexec times individual layers, the total engine latency (computed by summing the average latency of each layer) is higher than the latency reported for the entire engine. This is due to per-layer measurement overheads.
To measure per-layer execution times, when trtexec enqueues kernel layers for execution in a stream, it places CUDA event objects between the layers to monitor the start and completion of each layer. These CUDA events add a small overhead which is more noticeable with smaller networks (shallow and narrow networks or networks with small activation data).
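As a rough sketch of how that gap shows up, the snippet below sums per-layer average latencies from profiling data and compares the total to the engine-level latency. The field names and all numbers here are illustrative stand-ins, not the exact trtexec JSON schema:

```python
import json

# Illustrative per-layer profiling data; the real trtexec profiling
# JSON uses its own, richer schema.
profile = json.loads("""
[
  {"name": "conv1", "averageMs": 0.120},
  {"name": "relu1", "averageMs": 0.015},
  {"name": "fc1",   "averageMs": 0.080}
]
""")

# Summing per-layer averages: each layer is bracketed by CUDA events,
# so this total includes the per-layer measurement overhead.
per_layer_total_ms = sum(layer["averageMs"] for layer in profile)

# The latency reported for the whole engine (no inter-layer events) is
# typically lower; 0.200 ms is a made-up number for illustration.
engine_latency_ms = 0.200
overhead_ms = per_layer_total_ms - engine_latency_ms

print(f"per-layer total: {per_layer_total_ms:.3f} ms")
print(f"apparent measurement overhead: {overhead_ms:.3f} ms")
```

The relative overhead shrinks as layers get larger, which is why the effect is most visible on shallow or narrow networks.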
Starting with TensorRT 8.2, engine-plan graph and profiling data can be exported to JSON files. trex supports TensorRT 8.x, 9.x and 10.0.
trex has only been tested on Ubuntu 22.04 with Python 3.10.12.
trex does not require a GPU, but generating the input JSON file(s) does require a GPU.
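Because trex consumes plain JSON, the exported files can be inspected on any machine, with or without trex installed. As a minimal illustration, the snippet below tallies layer types from a hand-written, simplified stand-in for an engine-graph JSON (the field names are assumptions, not the exact schema):

```python
import json
from collections import Counter

# A tiny, hand-written stand-in for an engine-graph JSON; the real file
# exported by trtexec has a richer schema.
graph_json = """
{"layers": [
  {"Name": "conv1", "LayerType": "Convolution"},
  {"Name": "relu1", "LayerType": "Activation"},
  {"Name": "conv2", "LayerType": "Convolution"}
]}
"""

graph = json.loads(graph_json)
# Tally layer types -- no GPU or TensorRT installation is needed here.
type_counts = Counter(layer["LayerType"] for layer in graph["layers"])
print(dict(type_counts))
```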
The instructions below detail how to use a Python3 virtualenv for installing and using trex (Python 3.8+ is required).
$ git clone https://github.com/NVIDIA/TensorRT.git
The commands listed below create and activate a Python virtual environment named env_trex, stored in a directory of the same name, and configure the current shell to use it as the default Python environment.
$ cd TensorRT/tools/experimental/trt-engine-explorer
$ python3 -m virtualenv env_trex
$ source env_trex/bin/activate
To install core functionality only:
$ python3 -m pip install -e .
To install all packages (core + packages required for using Jupyter notebooks):
$ python3 -m pip install -e .[notebook]
Generating dot and SVG graphs requires Graphviz, an open-source graph-visualization tool:
$ sudo apt --yes install graphviz
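trex renders plan graphs through Graphviz's dot format. As a toy illustration of that dot-to-SVG path (not the trex API itself), the snippet below builds a DOT description of a hypothetical three-layer graph by hand:

```python
# Build a minimal Graphviz DOT description of a toy engine graph by hand;
# trex generates much richer graphs, this only illustrates the DOT step.
edges = [("conv1", "relu1"), ("relu1", "conv2")]

lines = ["digraph engine {"]
for src, dst in edges:
    lines.append(f'  "{src}" -> "{dst}";')
lines.append("}")
dot_source = "\n".join(lines)
print(dot_source)

# Render the saved file with Graphviz, e.g.:
#   dot -Tsvg engine.dot -o engine.svg
```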
The typical trex workflow is depicted below:

1. Convert a model to a TensorRT `INetworkDefinition` and build an engine plan.
2. Export the engine-graph and profiling JSON files.
3. Explore the data in a trex notebook.

The Python script `utils/process_engine.py` implements this workflow for ONNX models:

1. Use `trtexec` to import an ONNX model and create an engine.
2. Use `trtexec` to profile the engine's inference execution and store the results in an engine profiling JSON file.

For more information see TensorRT Engine Inspector and the Tutorial notebook.
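The steps wrapped by `utils/process_engine.py` boil down to two `trtexec` invocations, sketched below as Python argument lists. The flag names follow common `trtexec` usage, but verify them against `trtexec --help` for your TensorRT version; the file names are placeholders:

```python
# Sketch of the two trtexec invocations that utils/process_engine.py wraps.
# Flag names follow common trtexec usage; file names are placeholders.
build_cmd = [
    "trtexec",
    "--onnx=model.onnx",          # import the ONNX model
    "--saveEngine=engine.plan",   # serialize the built engine plan
]
profile_cmd = [
    "trtexec",
    "--loadEngine=engine.plan",
    "--profilingVerbosity=detailed",  # needed for per-layer detail
    "--exportLayerInfo=graph.json",   # engine-graph JSON for trex
    "--exportProfile=profile.json",   # per-layer timing JSON for trex
]
print(" ".join(build_cmd))
print(" ".join(profile_cmd))
```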
# Jupyter Server

Launch the Jupyter notebook server as detailed below and open your browser at http://localhost:8888 or http://<your-ip-address>:8888
$ jupyter-notebook --ip=0.0.0.0 --no-browser
If you're using JupyterLab, you can launch the server with:
$ jupyter lab --ip=0.0.0.0 --port=8888
The TensorRT Engine Explorer license can be found in the LICENSE file.