AUTO Plugin Integration

src/plugins/auto/docs/integration.md


Implement a New Plugin

Refer to the OpenVINO Plugin Developer Guide for detailed information on how to implement a new plugin.

Implementing the query model method ov::IPlugin::query_model() is recommended: AUTO relies on it to make device selection decisions quickly, which reduces selection time.

AUTO Plugin Property Requirements

AUTO Plugin requires the following plugin properties:

| Property | Mandatory | Purpose |
| --- | --- | --- |
| ov::device::id | Yes | Distinguish devices with the same type. |
| ov::enable_profiling | Yes | Performance profiling. |
| ov::hint::performance_mode | Yes | Performance mode hint. |
| ov::hint::num_requests | Yes | num_requests hint. |
| ov::device::full_name | Yes | Automatic device selection. |
| ov::model_name | Yes | Return model name. |
| ov::optimal_batch_size | No | Decide batch size in automatic batching case. |
| ov::optimal_number_of_infer_requests | Yes | Decide AUTO optimal_number_of_infer_requests. |
| ov::range_for_streams | Yes | Decide AUTO optimal_number_of_infer_requests in automatic batching case. |
| ov::supported_properties | Yes | Check if a property is supported by HW plugin. |
| ov::device::capabilities | Yes | Automatic device selection. |
| ov::device::gops | No | Improve automatic device selection. |
| ov::compilation_num_threads | No | Limit the compilation threads for a single device when compiling a model to multiple devices. |

AUTO Plugin Tests

Refer to the Testing the AUTO Plugin page for detailed instructions.

AUTO Plugin Integration Tests

Test AUTO and Hardware Plugins Using benchmark_app

```sh
benchmark_app -d ${device} -hint ${hint} -m <any model works on HW plugin>
```

| hint | device |
| --- | --- |
| throughput | \<HW\> |
| throughput | AUTO:\<HW\> |
| throughput | AUTO:\<HW\>,CPU |
| latency | \<HW\> |
| latency | AUTO:\<HW\> |
| latency | AUTO:\<HW\>,CPU |
| cumulative_throughput | AUTO:\<HW\> |
| cumulative_throughput | AUTO:\<HW\>,CPU |
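
The hint and device combinations listed above can be generated with a small script instead of being typed by hand. The following is a minimal sketch, not part of the OpenVINO tooling: `gen_matrix` is a hypothetical helper name, and the device name and model path are placeholders supplied by the caller. It uses only the `-d`, `-hint`, and `-m` options of the `benchmark_app` command shown above, and it prints the commands rather than running them.

```sh
# gen_matrix is a hypothetical helper (not an OpenVINO tool).
# It prints one benchmark_app command per hint/device combination.
#   $1: hardware plugin device name, for example GPU (placeholder)
#   $2: path to any model that works on the HW plugin (placeholder)
gen_matrix() {
    hw="$1"
    model="$2"

    # throughput and latency: bare device, AUTO:<HW>, and AUTO:<HW>,CPU
    for hint in throughput latency; do
        for device in "$hw" "AUTO:$hw" "AUTO:$hw,CPU"; do
            echo "benchmark_app -d $device -hint $hint -m $model"
        done
    done

    # cumulative_throughput: only the AUTO device strings are listed
    for device in "AUTO:$hw" "AUTO:$hw,CPU"; do
        echo "benchmark_app -d $device -hint cumulative_throughput -m $model"
    done
}
```

For example, `gen_matrix GPU model.xml` prints eight commands, starting with `benchmark_app -d GPU -hint throughput -m model.xml`. Printing first makes it easy to review the list before executing it.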

Test Multiple Devices Running Simultaneously

The HW plugin must guarantee simultaneous execution of multiple devices in different threads. To verify this, it is recommended to test the HW plugin together with the CPU plugin by running both plugins simultaneously in different threads.

For example, a system may have two GPUs with the device names GPU.0 and GPU.1. The GPU plugin must guarantee simultaneous execution of GPU.0 and GPU.1 in different threads.