Inference

InferenceModel is the runtime API for exported policies. It loads the exported package, runs preprocessing and postprocessing, and produces actions.

```python
# Load the exported policy package, then request one action.
model = InferenceModel.load("./exports/act_policy")
action = model.select_action(observation)
```

Pipeline

```text
observation
  -> preprocessors
  -> runner
  -> postprocessors
  -> action
```
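The flow above can be sketched as plain function composition. This is a minimal illustration, not the library's implementation: `run_pipeline`, `scale`, `toy_runner`, and `clip` are hypothetical names, and the stages here operate on plain Python lists rather than real observations.

```python
from typing import Any, Callable, Sequence

Step = Callable[[Any], Any]


def run_pipeline(
    observation: Any,
    preprocessors: Sequence[Step],
    runner: Step,
    postprocessors: Sequence[Step],
) -> Any:
    """Apply preprocessors in order, call the runner, then apply postprocessors."""
    data = observation
    for step in preprocessors:
        data = step(data)
    output = runner(data)
    for step in postprocessors:
        output = step(output)
    return output


# Hypothetical toy stages, used only to exercise the composition.
def scale(obs):
    return [x / 10.0 for x in obs]


def toy_runner(inputs):
    # Stands in for the backend call; a real runner would execute the model.
    return [2.0 * x for x in inputs]


def clip(action):
    return [max(-1.0, min(1.0, a)) for a in action]


action = run_pipeline([1.0, 8.0], [scale], toy_runner, [clip])
```

Keeping each stage a plain callable means the order of operations is the only contract between them, which mirrors the arrow diagram.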

APIs

| Method | Use |
| --- | --- |
| select_action(observation) | Returns one action immediately. |
| predict_action_chunk(observation) | Returns a chunk for runtime queueing. |
| reset() | Clears state for a new episode. |
| close() | Releases backend resources. |
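As a rough illustration of how these four methods fit together across episodes, the sketch below drives a stand-in object. `StubModel` and `run_episodes` are hypothetical and exist only so the example runs without an exported policy; the stub's return values and chunk length are made up.

```python
class StubModel:
    """Hypothetical stand-in exposing the four methods listed above."""

    def __init__(self):
        self.steps = 0
        self.closed = False

    def select_action(self, observation):
        self.steps += 1
        return [0.0 for _ in observation]

    def predict_action_chunk(self, observation):
        # Chunk length of 4 is an arbitrary choice for this stub.
        return [[0.0 for _ in observation] for _ in range(4)]

    def reset(self):
        self.steps = 0

    def close(self):
        self.closed = True


def run_episodes(model, episodes):
    """Reset before every episode and release the backend even on error."""
    actions = []
    try:
        for observations in episodes:
            model.reset()  # clear state left over from the previous episode
            for obs in observations:
                actions.append(model.select_action(obs))
    finally:
        model.close()  # always release backend resources
    return actions


stub = StubModel()
collected = run_episodes(stub, [[[1, 2]], [[3, 4], [5, 6]]])
```

Calling `reset()` at the start of each episode rather than at the end keeps the first episode on the same footing as the rest.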

Chunked Policies

Chunk-producing policies still support select_action(). The caller does not need to branch on runner type.

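```python
# Sketch of how select_action() serves a chunk-producing policy.
def select_action(obs):
    if cursor.empty():
        # The previous chunk is used up: run inference once and buffer the result.
        cursor.push_chunk(model.predict_action_chunk(obs))

    return cursor.pop()
```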

The cursor is a convenience inside the model layer. It is not the runtime action queue and it should not be treated as one.
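To make the cursor's role concrete, here is a minimal sketch of one possible implementation. The method names `empty`, `push_chunk`, and `pop` come from the snippet above; the `ChunkCursor` and `ChunkedPolicy` classes, the deque-based storage, and `fake_predict` are assumptions for illustration, not the library's code.

```python
from collections import deque


class ChunkCursor:
    """Hypothetical cursor: buffers one chunk and hands out actions in order."""

    def __init__(self):
        self._pending = deque()

    def empty(self):
        return not self._pending

    def push_chunk(self, chunk):
        self._pending.extend(chunk)

    def pop(self):
        return self._pending.popleft()


class ChunkedPolicy:
    """Toy policy that hides a chunk predictor behind select_action()."""

    def __init__(self, predict_action_chunk):
        self._predict = predict_action_chunk
        self._cursor = ChunkCursor()

    def select_action(self, obs):
        if self._cursor.empty():
            self._cursor.push_chunk(self._predict(obs))
        return self._cursor.pop()

    def reset(self):
        # Drop buffered actions so a new episode starts from a fresh chunk.
        self._cursor = ChunkCursor()


calls = []


def fake_predict(obs):
    # Records each inference call and returns a chunk of three toy actions.
    calls.append(obs)
    return [obs + i for i in range(3)]


policy = ChunkedPolicy(fake_predict)
actions = [policy.select_action(obs) for obs in (0, 10, 20, 30)]
```

In this run the predictor is called only on the first and fourth steps; the two steps in between are served from the buffered chunk, so the observations passed on those steps are ignored.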

Runtime Boundary

Use select_action() for scripts, tests, demos, and evaluation loops.

Use predict_action_chunk() through PolicyRuntime when the policy is driving a robot.

```text
PolicyRuntime
  -> Execution.maybe_request(obs)
  -> InferenceModel.predict_action_chunk(obs)
  -> ActionQueue.push_chunk(chunk)
  -> ActionQueue.pop_or_none()
  -> robot.send_action(action)
```
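The sequence above can be approximated by a single synchronous control tick. This is only a sketch under stated assumptions: `ToyActionQueue` and `control_step` are hypothetical, the low-water refill rule stands in for whatever `Execution.maybe_request` actually decides, and a real runtime may request chunks asynchronously.

```python
from collections import deque


class ToyActionQueue:
    """Hypothetical queue with the two operations named in the flow above."""

    def __init__(self):
        self._actions = deque()

    def __len__(self):
        return len(self._actions)

    def push_chunk(self, chunk):
        self._actions.extend(chunk)

    def pop_or_none(self):
        # Return None instead of raising when nothing is queued.
        return self._actions.popleft() if self._actions else None


def control_step(obs, predict_chunk, queue, send_action, low_water=1):
    """One tick: refill when the queue runs low, then send the next action."""
    if len(queue) <= low_water:  # assumed trigger, standing in for maybe_request
        queue.push_chunk(predict_chunk(obs))
    action = queue.pop_or_none()
    if action is not None:
        send_action(action)
    return action


sent = []
queue = ToyActionQueue()
for obs in (0, 10, 20, 30):
    control_step(obs, lambda o: [o, o + 1, o + 2], queue, sent.append)
```

Refilling before the queue is empty lets the last action of the old chunk go out while the new chunk is already queued behind it, which is the main behavioral difference from the cursor used by select_action().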