docs/en/integrations/deepx.md
Deploying computer vision models on specialized NPU hardware requires a compatible and optimized model format. Exporting Ultralytics YOLO models to DeepX format enables efficient, INT8-quantized inference on DeepX NPU accelerators. This guide walks you through converting your YOLO models to DeepX format and deploying them on DeepX-powered hardware.
DeepX is an AI semiconductor company specializing in Neural Processing Units (NPUs) designed for power-efficient deep learning inference at the edge. DeepX NPUs are engineered for demanding embedded and industrial AI applications, delivering high throughput with minimal power consumption. Their hardware is well suited for deployment scenarios where cloud connectivity is unreliable or undesirable, such as robotics, smart cameras, and industrial automation systems.
The DeepX export produces a compiled .dxnn model binary that is optimized for execution on DeepX NPU hardware. The compilation pipeline uses the dx_com toolkit to perform INT8 quantization and hardware-specific optimization, generating a self-contained model directory ready for deployment.
DeepX models offer several advantages for edge deployment. The `.dxnn` format is compiled specifically for DeepX NPU hardware, leveraging dedicated acceleration units for fast, efficient inference.

All standard Ultralytics tasks are supported for DeepX export across the YOLO26, YOLO11, and YOLOv8 model families.
| Task | Supported |
|---|---|
| Object Detection | ✅ |
| Segmentation | ✅ |
| Pose Estimation | ✅ |
| OBB Detection | ✅ |
| Classification | ✅ |
Export an Ultralytics YOLO model to DeepX format and run inference with the exported model.
!!! note
DeepX export is only supported on x86-64 Linux machines. ARM64 (aarch64) is not supported for the export step.
To install the required packages, run:
!!! tip "Installation"
=== "CLI"
```bash
# Install the required package for YOLO
pip install ultralytics
```
The `dx_com` compiler package is installed automatically from the DeepX SDK repository on first export. For detailed instructions and best practices related to the installation process, check our Ultralytics Installation guide. If you encounter any difficulties while installing the required packages for YOLO, consult our Common Issues guide for solutions and tips.
!!! example "Usage"
=== "Python"
```python
from ultralytics import YOLO
# Load the YOLO26 model
model = YOLO("yolo26n.pt")
# Export the model to DeepX format (int8=True is enforced automatically)
model.export(format="deepx") # creates 'yolo26n_deepx_model/'
```
=== "CLI"
```bash
# Export a YOLO26n PyTorch model to DeepX format
yolo export model=yolo26n.pt format=deepx # creates 'yolo26n_deepx_model/'
```
| Argument | Type | Default | Description |
|---|---|---|---|
| `format` | `str` | `'deepx'` | Target format for the exported model, defining compatibility with DeepX NPU hardware. |
| `imgsz` | `int` or `tuple` | `640` | Desired image size for the model input. DeepX export requires a square input: pass an integer (e.g., `640`) or a tuple where height equals width. |
| `int8` | `bool` | `True` | Enables INT8 quantization. Required for DeepX export; automatically set to `True` if not specified. |
| `data` | `str` | `'coco128.yaml'` | Dataset configuration file used for INT8 calibration. Specifies the calibration image source. |
| `device` | `str` | `None` | Specifies the device for exporting: GPU (`device=0`) or CPU (`device=cpu`). |
| `optimize` | `bool` | `False` | Enables higher compiler optimization, which reduces inference latency at the cost of longer compilation time. |
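Because DeepX export only accepts a square input, it can be convenient to validate `imgsz` before calling `export()`. Below is a minimal sketch; `square_imgsz` is a hypothetical helper, not part of the `ultralytics` API:

```python
# Hypothetical helper (not part of ultralytics): normalize an imgsz value
# to the single square dimension that the DeepX compiler expects.
def square_imgsz(imgsz):
    """Return a validated square image size for DeepX export."""
    if isinstance(imgsz, int):
        return imgsz
    h, w = imgsz
    if h != w:
        raise ValueError(f"DeepX export requires a square input, got {h}x{w}")
    return h


print(square_imgsz(640))  # 640
print(square_imgsz((640, 640)))  # 640
```

You could call this before `model.export(format="deepx", imgsz=...)` to fail fast with a clear message instead of waiting for the compiler to reject the shape.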
!!! tip
Always run DeepX export on an **x86-64 Linux** host. The `dx_com` compiler does not support ARM64.
For more details about the export process, visit the Ultralytics documentation page on exporting.
After a successful export, a model directory is created with the following layout:
```text
yolo26n_deepx_model/
├── yolo26n.dxnn    # Compiled DeepX model binary (NPU executable)
├── config.json     # Calibration and preprocessing configuration
└── metadata.yaml   # Model metadata (classes, image size, task, etc.)
```
The .dxnn file is the compiled model binary that the dx_engine runtime loads directly on the NPU. The metadata.yaml contains class names, image size, and other information used by the Ultralytics inference pipeline.
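Before copying an exported directory to the device, it can be worth checking that all three files made it through. The sketch below assumes the layout described above; `check_deepx_dir` is a hypothetical helper, not part of any DeepX or Ultralytics package:

```python
from pathlib import Path

# Expected companion files in an exported DeepX model directory
# (layout assumption based on the directory listing above).
EXPECTED = {"config.json", "metadata.yaml"}


def check_deepx_dir(model_dir):
    """Return True if the directory holds a .dxnn binary plus its companions."""
    entries = list(Path(model_dir).iterdir())
    names = {p.name for p in entries}
    has_dxnn = any(p.suffix == ".dxnn" for p in entries)
    return has_dxnn and not (EXPECTED - names)
```

A deployment script might call `check_deepx_dir("yolo26n_deepx_model")` and abort early if anything is missing.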
Once you've successfully exported your Ultralytics YOLO model to DeepX format, the next step is deploying these models on DeepX NPU hardware.
Inference requires the DeepX NPU driver, the libdxrt runtime, and the dx_engine Python package.
!!! note
DeepX runtime is only supported on x86-64 Linux machines and ARM64 Debian Trixie machines (Raspberry Pi 5).
```bash
# Install the NPU driver and libdxrt runtime
sudo apt update
wget https://github.com/DEEPX-AI/dx_rt_npu_linux_driver/raw/main/release/2.4.0/dxrt-driver-dkms_2.4.0-2_all.deb
sudo apt install ./dxrt-driver-dkms_2.4.0-2_all.deb
wget https://github.com/DEEPX-AI/dx_rt/raw/main/release/3.3.2/libdxrt_3.3.2_all.deb
sudo apt install ./libdxrt_3.3.2_all.deb

# Build the dx_engine wheel
cd /usr/share/libdxrt/python_package && sudo ./make_whl.sh

# Install the bundled dx_engine Python wheel
pip install dx_engine-*.whl
```
Verify the runtime is installed correctly with `dxrt-cli --version`. You should see output similar to:

```text
DXRT v3.3.2
Minimum Driver Versions
Device Driver: v2.4.0
PCIe Driver: v2.2.0
Firmware: v2.5.2
Minimum Compiler Versions
Compiler: v1.18.1
.dxnn File Format: v6
```
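Since `dxrt-cli` reports minimum required component versions, a setup script may want to compare `vX.Y.Z` strings numerically rather than lexically (where `"v2.10.0" < "v2.9.0"` would compare wrong). A small sketch, not part of the DeepX tooling:

```python
# Sketch: compare "vX.Y.Z" version strings component-by-component,
# e.g. against the minimums printed by `dxrt-cli --version`.
def parse_version(s):
    """Turn 'v3.3.2' into the tuple (3, 3, 2)."""
    return tuple(int(part) for part in s.lstrip("v").split("."))


def meets_minimum(installed, minimum):
    """True if the installed version satisfies the required minimum."""
    return parse_version(installed) >= parse_version(minimum)


print(meets_minimum("v3.3.2", "v3.3.2"))  # True
print(meets_minimum("v2.4.0", "v2.5.2"))  # False
```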
!!! example "Usage"
=== "Python"
```python
from ultralytics import YOLO
# Load the exported DeepX model
model = YOLO("yolo26n_deepx_model")
# Run inference
results = model("https://ultralytics.com/images/bus.jpg")
# Process results
for r in results:
print(f"Detected {len(r.boxes)} objects")
r.show()
```
=== "CLI"
```bash
# Run inference with the exported DeepX model
yolo predict model='yolo26n_deepx_model' source='https://ultralytics.com/images/bus.jpg'
```
dxtron is DeepX's graph visualizer for inspecting the compiled .dxnn model.
Install dxtron on x86-64 Linux by downloading the `.deb` package from the DeepX SDK and installing it via `dpkg`:

```bash
wget https://sdk.deepx.ai/release/dxtron/v2.0.1/dxtron_2.0.1_amd64.deb
sudo dpkg -i dxtron_2.0.1_amd64.deb
```

Then open your exported model:

```bash
dxtron yolo26n_deepx_model/yolo26n.dxnn
```
!!! note
`dxtron` is only available for **x86-64 Linux**. ARM64/aarch64 and non-Linux platforms are not supported.
The Ultralytics team benchmarked YOLO26 models, comparing speed and accuracy between PyTorch and DeepX.
!!! tip "Performance"
=== "Raspberry Pi 5 + DX-M1 M.2 Module"
| Model | Format | Status | Size (MB) | metrics/mAP50-95(B) | Inference time (ms/im) |
|-------------- |--------- |-------- |----------- |--------------------- |------------------------ |
| YOLO26n | PyTorch | ✅ | 5.3 | 0.4760 | 315.2 |
| YOLO26n | DeepX | ✅ | 6.6 | 0.4660 | 34.6 |
| YOLO26n-seg | PyTorch | ✅ | 6.5 | 0.4080 | 485.4 |
| YOLO26n-seg | DeepX | ✅ | 7.9 | 0.3920 | 53.8 |
| YOLO26n-pose | PyTorch | ✅ | 7.6 | 0.4230 | 506.3 |
| YOLO26n-pose | DeepX | ✅ | 8.8 | 0.4590 | 37.6 |
| YOLO26n-obb | PyTorch | ✅ | 5.7 | 0.817 | 1094.4 |
| YOLO26n-obb | DeepX | ✅ | 7.3 | 0.783 | 56.4 |
| Model | Format | Status | Size (MB) | acc (top1) | acc (top5) | Inference time (ms/im) |
|------------- |--------- |-------- |----------- |------------ |------------ |------------------------ |
| YOLO26n-cls | PyTorch | ✅ | 5.6 | 0.431 | 0.716 | 23.8 |
| YOLO26n-cls | DeepX | ✅ | 5.9 | 0.333 | 0.686 | 2.7 |
=== "More devices coming soon!"
Benchmarked with Ultralytics 8.4.48
!!! note
Validation for the above benchmarks was done using coco128 for detection, coco128-seg for segmentation, coco8-pose for pose estimation, imagenet100 for classification, and dota128 for OBB models. Inference time does not include pre/post-processing.
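The speedups implied by the Raspberry Pi 5 + DX-M1 table can be computed directly from its inference-time columns (PyTorch ms divided by DeepX ms):

```python
# Inference times (ms/im) copied from the benchmark table above:
# (PyTorch, DeepX) per model on Raspberry Pi 5 + DX-M1.
benchmarks = {
    "YOLO26n": (315.2, 34.6),
    "YOLO26n-seg": (485.4, 53.8),
    "YOLO26n-pose": (506.3, 37.6),
    "YOLO26n-obb": (1094.4, 56.4),
}

for name, (pt_ms, dx_ms) in benchmarks.items():
    print(f"{name}: {pt_ms / dx_ms:.1f}x faster on DeepX")
```

Detection, for example, runs roughly 9x faster on the NPU than PyTorch on the Pi's CPU.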
!!! tip "Performance Optimization Tips"
To get the best inference throughput from the DX-M1 NPU connected to a Raspberry Pi 5, open the boot configuration file and enable PCIe Gen 3 support.
```sh
sudo nano /boot/firmware/config.txt
```
Add the following lines at the end of the file:
```
dtparam=pciex1
dtparam=pciex1_gen=3
```
Save and exit (Ctrl+X, then Y, then Enter), then reboot:
```sh
sudo reboot
```
In summary, the recommended workflow is:

1. Export the model with `model.export(format="deepx")`
2. Run `yolo val` to verify minimal quantization loss
3. Use `yolo predict` for qualitative validation
4. Deploy the `_deepx_model/` directory to DeepX NPU hardware using the `dx_engine` runtime

YOLO models deployed on DeepX NPU hardware are well suited for a wide range of edge AI applications.
In this guide, you've learned how to export Ultralytics YOLO models to DeepX format and deploy them on DeepX NPU hardware. The export pipeline uses INT8 calibration and the dx_com compiler to produce a hardware-optimized .dxnn binary, while the dx_engine runtime handles inference on the device.
The combination of Ultralytics YOLO and DeepX's NPU technology provides an effective solution for running advanced computer vision workloads on embedded and edge devices — delivering high throughput with low power consumption for real-time applications.
For further details on usage, visit the DeepX official website.
Also, if you'd like to know more about other Ultralytics YOLO integrations, visit our integration guide page. You'll find plenty of useful resources and insights there.
You can export your model using the export() method in Python or via the CLI. The export automatically enables INT8 quantization and uses a calibration dataset to minimize accuracy loss. The dx_com compiler package is installed automatically if not already present.
!!! example
=== "Python"
```python
from ultralytics import YOLO
model = YOLO("yolo26n.pt")
model.export(format="deepx")
```
=== "CLI"
```bash
yolo export model=yolo26n.pt format=deepx
```
DeepX NPUs are designed to execute INT8 computations at maximum efficiency. The dx_com compiler quantizes the model during export using EMA-based calibration with real dataset images, enabling the NPU to deliver its full performance. INT8 is always enforced for DeepX exports — if you pass int8=False, it will be overridden with a warning.
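The idea behind EMA-based calibration can be sketched in a few lines. This is an illustration of the technique, not `dx_com`'s actual implementation: an exponential moving average smooths the per-batch min/max so outlier batches don't blow up the quantization range.

```python
# Sketch of EMA-based range calibration for symmetric INT8 quantization
# (illustrative only, not the dx_com implementation).
def ema_calibrate(batches, momentum=0.99):
    """Track smoothed min/max over calibration batches; return the INT8 scale."""
    ema_min = ema_max = None
    for batch in batches:
        b_min, b_max = min(batch), max(batch)
        if ema_min is None:
            ema_min, ema_max = b_min, b_max
        else:
            ema_min = momentum * ema_min + (1 - momentum) * b_min
            ema_max = momentum * ema_max + (1 - momentum) * b_max
    # Symmetric INT8: map the largest magnitude to 127.
    return max(abs(ema_min), abs(ema_max)) / 127


print(ema_calibrate([[-1.0, 2.0], [-0.5, 1.8]]))
```

With high momentum, a single unusual batch barely moves the range, which is what keeps the quantized accuracy close to the FP32 baseline.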
DeepX model export (compilation) requires an x86-64 Linux host. The export step is not supported on ARM64 (aarch64) and Windows machines. Inference using the exported .dxnn model can be run on any Linux platform (x86-64 and ARM64) supported by the dx_engine runtime.
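A build script can guard the compilation step with a simple host check before attempting export. `can_compile_deepx` is a hypothetical helper using only the standard library:

```python
import platform


# Hypothetical guard: dx_com compilation requires an x86-64 Linux host.
def can_compile_deepx():
    return platform.system() == "Linux" and platform.machine() == "x86_64"
```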
The export creates a directory (e.g., yolo26n_deepx_model/) containing:
- `yolo26n.dxnn` — the compiled NPU binary
- `config.json` — calibration and preprocessing settings
- `metadata.yaml` — model metadata including class names and image size

Yes. Any model trained using Ultralytics Train Mode and exported with `format="deepx"` can be deployed on DeepX NPU hardware, provided it uses supported layer operations. Export supports detection, segmentation, pose estimation, oriented bounding box (OBB), and classification tasks.
The DeepX export pipeline uses every image in the calibration dataset (after fraction filtering) with the EMA calibration method. A few hundred images is usually sufficient for good quantization accuracy. Point data at a smaller dataset (or set fraction below 1.0) if compilation time becomes a concern on large datasets.
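The effect of fraction filtering on the calibration set can be sketched as follows. The exact selection logic is an assumption for illustration (here: keep the first `ceil(fraction * N)` images):

```python
import math


# Sketch (assumed behavior): trim a calibration set with a `fraction` arg,
# keeping at least one image.
def calibration_subset(images, fraction=1.0):
    keep = max(1, math.ceil(fraction * len(images)))
    return images[:keep]


imgs = [f"img_{i}.jpg" for i in range(1000)]
print(len(calibration_subset(imgs, 0.25)))  # 250
```

Lowering `fraction` this way trades compilation time against calibration coverage; a few hundred images usually remains the sweet spot.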
The DeepX runtime is not bundled with ultralytics and must be installed separately before running inference. On x86-64 Linux machines and ARM64 Debian Trixie machines (Raspberry Pi 5), install the NPU driver (dxrt-driver-dkms) and runtime (libdxrt) from the DEEPX-AI GitHub releases, then install the bundled dx_engine Python wheel. See the Runtime Installation section above for step-by-step commands.