# No Standard Library

In this section, you will learn how to run an ONNX inference model on an embedded system with no standard library support, using a Raspberry Pi Pico 2 as the target. The same approach should apply to other platforms. All the code can be found in the burn-onnx examples.

## Step-by-Step Guide

Let's walk through the process of running an embedded ONNX model:

### Setup

Follow the embassy guide for your specific environment. Once set up, you should have something similar to the following:

```
./inference
├── Cargo.lock
├── Cargo.toml
├── build.rs
├── memory.x
└── src
    └── main.rs
```
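The embassy guide will also have you pin the build target for Cargo. As a sketch of what that looks like for the Pico 2's Cortex-M33 cores (the exact file comes from the guide for your chip):

```toml
# .cargo/config.toml
[build]
# Cortex-M33 with FPU, as on the RP2350 (Pico 2)
target = "thumbv8m.main-none-eabihf"
```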

Some additional dependencies have to be added:

```toml
[dependencies]
embedded-alloc = "0.6.0" # Only if there is no default allocator for your chip
burn = { version = "0.21", default-features = false, features = ["ndarray"] } # Backend must be ndarray
burn-store = { version = "0.21", default-features = false, features = ["burnpack"] }

[build-dependencies]
burn-onnx = { version = "0.21" } # Used to auto-generate the Rust code that imports the model
```

### Import the Model

Follow the directions in ONNX Import.

Use the following `ModelGen` config in your `build.rs`:

```rs
// build.rs
use burn_onnx::ModelGen;

fn main() {
    ModelGen::new()
        .input("src/model/your_model.onnx") // path to your ONNX file
        .out_dir("model/")
        .embed_states(true) // embed the weights in the binary; no filesystem on the target
        .run_from_script();
}
```
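
The generated code then needs to be included in the crate. A minimal sketch, assuming the ONNX file was named `your_model.onnx` (matching the `use your_model::Model;` below):

```rs
// In src/main.rs: pull in the code generated by build.rs
mod your_model {
    include!(concat!(env!("OUT_DIR"), "/model/your_model.rs"));
}
```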

### Global Allocator

First, define a global allocator (needed on a `no_std` system that does not already provide one).

```rs
use embassy_executor::Spawner;
use embedded_alloc::LlffHeap as Heap;

#[global_allocator]
static HEAP: Heap = Heap::empty();

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    {
        use core::mem::MaybeUninit;
        // Watch out: if this is too big for the device's RAM or too small for
        // your model, the program may crash. The size is in bytes, so this is
        // a total of 100 KiB.
        const HEAP_SIZE: usize = 100 * 1024;
        static mut HEAP_MEM: [MaybeUninit<u8>; HEAP_SIZE] = [MaybeUninit::uninit(); HEAP_SIZE];
        unsafe { HEAP.init(&raw mut HEAP_MEM as usize, HEAP_SIZE) } // Initialize the heap
    }
}
```
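With the allocator registered and initialized, the `alloc` crate (which Burn allocates tensors through) becomes usable. A quick sanity check, as a sketch inside `main` after the heap is initialized:

```rs
extern crate alloc; // at the crate root
use alloc::vec::Vec;

// Allocations now draw from the static heap defined above
let mut buffer: Vec<f32> = Vec::with_capacity(8);
buffer.push(1.0);
```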

### Define Backend

We are using ndarray, so we just need to define the NdArray backend as usual:

```rs
use burn::{backend::NdArray, tensor::Tensor};

type Backend = NdArray<f32>;
type BackendDevice = <Backend as burn::tensor::backend::Backend>::Device;
```

Then, inside the main function, add:

```rs
use your_model::Model;

// Get a default device for the backend
let device = BackendDevice::default();

// Create a new model and load the state
let model: Model<Backend> = Model::default();
```

### Running the Model

To run the model, call it as you normally would:

```rs
// Define the input tensor
let input = Tensor::<Backend, 2>::from_floats([[input]], &device);

// Run the model on the input
let output = model.forward(input);
```
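
To act on the result on-device, pull the values back out of the output tensor. A minimal sketch, assuming a small float output and that `alloc` is set up as above:

```rs
use alloc::vec::Vec;

// Move the tensor's contents into plain data
let data = output.into_data();

// Convert to a Vec<f32> for postprocessing (argmax, thresholding, ...)
let values: Vec<f32> = data.to_vec().expect("element type should be f32");
```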

## Conclusion

Running a model in a `no_std` environment is nearly identical to running one in a standard environment. All that is needed is a global allocator.