# GhostNet
GhostNetv1 architecture is from the paper "GhostNet: More Features from Cheap Operations" (https://arxiv.org/abs/1911.11907). GhostNetv2 architecture is from the paper "GhostNetV2: Enhance Cheap Operation with Long-Range Attention" (https://arxiv.org/abs/2211.12905).
For the PyTorch implementations, you can refer to huawei-noah/ghostnet.
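As a rough illustration of the "more features from cheap operations" idea, below is a hedged numpy sketch of a ghost module: a small number of "primary" feature maps are produced by an ordinary 1x1 convolution, and the remaining "ghost" maps are generated from them by a cheap depthwise 3x3 operation. The function name, shapes, and the specific choice of operations are illustrative assumptions, not code from this repository:

```python
import numpy as np

def ghost_module(x, primary_filters, cheap_kernels):
    """Toy ghost module (illustrative, not the repository's implementation).

    x:               input feature map, shape (C_in, H, W)
    primary_filters: 1x1 conv weights, shape (C_p, C_in)
    cheap_kernels:   depthwise 3x3 weights, shape (C_p, 3, 3)
    Returns (2 * C_p, H, W): primary features concatenated with ghost features.
    """
    _, H, W = x.shape
    # Primary features via a pointwise (1x1) convolution.
    primary = np.tensordot(primary_filters, x, axes=([1], [0]))  # (C_p, H, W)
    # Ghost features via a cheap depthwise 3x3 convolution with zero padding.
    padded = np.pad(primary, ((0, 0), (1, 1), (1, 1)))
    ghost = np.zeros_like(primary)
    for c in range(primary.shape[0]):
        for i in range(H):
            for j in range(W):
                ghost[c, i, j] = np.sum(padded[c, i:i + 3, j:j + 3] * cheap_kernels[c])
    return np.concatenate([primary, ghost], axis=0)
```

The point of the design is that the depthwise step is far cheaper than producing all output channels with a full convolution, which is where GhostNet's FLOP savings come from.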
Both versions use the following technique in their TensorRT implementations:

- Global average pooling is implemented with an IReduceLayer rather than an IPoolingLayer. The IReduceLayer performs reduction operations (such as sum, average, or max) over specified dimensions without being constrained by the kernel-size limitations of pooling layers.

## Project structure

```
ghostnet
│
├── ghostnetv1
│   ├── CMakeLists.txt
│   ├── gen_wts.py
│   ├── ghostnetv1.cpp
│   └── logging.h
│
├── ghostnetv2
│   ├── CMakeLists.txt
│   ├── gen_wts.py
│   ├── ghostnetv2.cpp
│   └── logging.h
│
└── README.md
```
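For reference, the IReduceLayer-based global average pooling mentioned above amounts to averaging each channel over its full spatial extent (the H and W axes of an NCHW tensor), whatever the spatial size happens to be. A minimal numpy sketch of the equivalent computation (the function name is illustrative):

```python
import numpy as np

def global_avg_pool(x):
    # x: NCHW feature map. Averaging over axes 2 (H) and 3 (W) with keepdims
    # mirrors addReduce with the average operation and keepDimensions=true,
    # and works for any spatial size, unlike a fixed pooling kernel.
    return x.mean(axis=(2, 3), keepdims=True)
```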
## How to run

Generate the .wts files for both GhostNetv1 and GhostNetv2:

```shell
# For ghostnetv1
python ghostnetv1/gen_wts.py

# For ghostnetv2
python ghostnetv2/gen_wts.py
```
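For intuition, here is a hedged numpy sketch of the text-based .wts format that tensorrtx-style gen_wts.py scripts typically emit: a first line with the tensor count, then one line per tensor containing its name, its element count, and the values as big-endian float32 hex. The `write_wts` helper is an assumption for illustration; consult the actual gen_wts.py for the real format:

```python
import struct
import numpy as np

def write_wts(weights, path):
    # weights: dict mapping layer name -> numpy float32 array
    # (a real script would take these from a PyTorch model's state_dict).
    with open(path, "w") as f:
        f.write(f"{len(weights)}\n")
        for name, arr in weights.items():
            flat = np.asarray(arr, dtype=np.float32).reshape(-1)
            f.write(f"{name} {len(flat)}")
            for v in flat:
                # Each value is serialized as 8 hex chars of a big-endian float32.
                f.write(" " + struct.pack(">f", float(v)).hex())
            f.write("\n")
```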
Build the project:

```shell
cd tensorrtx/ghostnet
mkdir build
cd build
cmake ..
make
```
Use the following commands to build and serialize the TensorRT engines (ghostnetv1.engine and ghostnetv2.engine) from the generated .wts files:

```shell
# For ghostnetv1
sudo ./ghostnetv1 -s

# For ghostnetv2
sudo ./ghostnetv2 -s
```
Once the engine files are generated, you can run inference with the following commands:

```shell
# For ghostnetv1
sudo ./ghostnetv1 -d

# For ghostnetv2
sudo ./ghostnetv2 -d
```
Compare the output with the PyTorch implementation from huawei-noah/ghostnet to ensure that the TensorRT results are consistent with the PyTorch model.
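One way to perform this comparison is to dump the output logits from both runs and check that the top-1 class agrees and the element-wise difference is small. A small numpy sketch (the function name and the tolerance value are illustrative assumptions):

```python
import numpy as np

def outputs_match(trt_out, torch_out, atol=1e-3):
    # Compare flattened logits from the TensorRT engine and the PyTorch model.
    trt_out = np.asarray(trt_out, dtype=np.float32).reshape(-1)
    torch_out = np.asarray(torch_out, dtype=np.float32).reshape(-1)
    same_top1 = trt_out.argmax() == torch_out.argmax()
    max_abs_diff = np.abs(trt_out - torch_out).max()
    return bool(same_top1) and max_abs_diff < atol
```

Small numerical differences are expected (e.g. from FP32 reassociation in fused kernels), so an exact match should not be required.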