docs/articles_en/openvino-workflow/model-preparation/convert-model-to-ir.rst
.. meta::
   :description: Convert models from the original framework to OpenVINO representation.
.. toctree::
   :maxdepth: 1
   :hidden:

   Convert from PyTorch <convert-model-pytorch>
   Convert from TensorFlow <convert-model-tensorflow>
   Convert from ONNX <convert-model-onnx>
   Convert from TensorFlow Lite <convert-model-tensorflow-lite>
   Convert from PaddlePaddle <convert-model-paddle>
   Convert from JAX/Flax <convert-model-jax>
   Convert from Keras <convert-model-keras>
:doc:`OpenVINO IR <../../documentation/openvino-ir-format>` is the proprietary model format
used by OpenVINO, typically obtained by converting models from supported frameworks:
.. tab-set::
.. tab-item:: PyTorch
   :sync: torch
.. tab-set::
.. tab-item:: Python
:sync: py
* The ``convert_model()`` method:
This is the only method applicable to PyTorch models.
.. dropdown:: List of supported formats:
* **Python objects**:
* ``torch.nn.Module``
* ``torch.jit.ScriptModule``
* ``torch.jit.ScriptFunction``
* ``torch.export.ExportedProgram``
.. code-block:: py
:force:
import torchvision
import openvino as ov
model = torchvision.models.resnet50(weights='DEFAULT')
ov_model = ov.convert_model(model)
compiled_model = ov.compile_model(ov_model, "AUTO")
For more details on conversion, refer to the
:doc:`guide <convert-model-pytorch>`
and an example `tutorial <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/pytorch-to-openvino/pytorch-onnx-to-openvino.ipynb>`__
on this topic.
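A ``torch.export.ExportedProgram`` from the list above can be converted the same way; a
minimal sketch, assuming a PyTorch 2.x installation where ``torch.export`` is available:

.. code-block:: py
   :force:

   import torch
   import torchvision
   import openvino as ov

   model = torchvision.models.resnet50(weights='DEFAULT')
   # torch.export.export() returns an ExportedProgram, which convert_model() also accepts
   exported_program = torch.export.export(model, (torch.randn(1, 3, 224, 224),))
   ov_model = ov.convert_model(exported_program)
   compiled_model = ov.compile_model(ov_model, "AUTO")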
.. tab-item:: TensorFlow
   :sync: tf
.. tab-set::
.. tab-item:: Python
:sync: py
* The ``convert_model()`` method:
The ``convert_model()`` method gives you more control: you can specify additional
adjustments for the resulting ``ov.Model``. The ``read_model()`` and
``compile_model()`` methods are easier to use but do not offer such
capabilities. With an ``ov.Model`` object you can choose to optimize it, compile and run
inference on it, or serialize it into a file for subsequent use (see the serialization
sketch after the conversion example below).
.. dropdown:: List of supported formats:
* **Files**:
* SavedModel - ``<SAVED_MODEL_DIRECTORY>`` or ``<INPUT_MODEL>.pb``
* Checkpoint - ``<INFERENCE_GRAPH>.pb`` or ``<INFERENCE_GRAPH>.pbtxt``
* MetaGraph - ``<INPUT_META_GRAPH>.meta``
* **Python objects**:
* ``tf.keras.Model``
* ``tf.keras.layers.Layer``
* ``tf.Module``
* ``tf.compat.v1.Graph``
* ``tf.compat.v1.GraphDef``
* ``tf.function``
* ``tf.compat.v1.Session``
* ``tf.train.Checkpoint``
.. code-block:: py
:force:
import openvino as ov
ov_model = ov.convert_model("saved_model.pb")
compiled_model = ov.compile_model(ov_model, "AUTO")
For more details on conversion, refer to the
:doc:`guide <convert-model-tensorflow>`
and an example `tutorial <https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/tensorflow-classification-to-openvino>`__
on this topic.
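As noted above, the converted ``ov.Model`` can also be serialized for subsequent use; a
minimal sketch, assuming the same ``saved_model.pb`` input:

.. code-block:: py
   :force:

   import openvino as ov

   ov_model = ov.convert_model("saved_model.pb")
   # serialize to OpenVINO IR: model.xml plus the accompanying model.bin
   ov.save_model(ov_model, "model.xml")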
* The ``read_model()`` and ``compile_model()`` methods:
.. dropdown:: List of supported formats:
* **Files**:
* SavedModel - ``<SAVED_MODEL_DIRECTORY>`` or ``<INPUT_MODEL>.pb``
* Checkpoint - ``<INFERENCE_GRAPH>.pb`` or ``<INFERENCE_GRAPH>.pbtxt``
* MetaGraph - ``<INPUT_META_GRAPH>.meta``
.. code-block:: py
:force:
import openvino as ov
core = ov.Core()
ov_model = core.read_model("saved_model.pb")
compiled_model = ov.compile_model(ov_model, "AUTO")
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: C++
:sync: cpp
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* SavedModel - ``<SAVED_MODEL_DIRECTORY>`` or ``<INPUT_MODEL>.pb``
* Checkpoint - ``<INFERENCE_GRAPH>.pb`` or ``<INFERENCE_GRAPH>.pbtxt``
* MetaGraph - ``<INPUT_META_GRAPH>.meta``
.. code-block:: cpp
ov::CompiledModel compiled_model = core.compile_model("saved_model.pb", "AUTO");
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: C
:sync: c
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* SavedModel - ``<SAVED_MODEL_DIRECTORY>`` or ``<INPUT_MODEL>.pb``
* Checkpoint - ``<INFERENCE_GRAPH>.pb`` or ``<INFERENCE_GRAPH>.pbtxt``
* MetaGraph - ``<INPUT_META_GRAPH>.meta``
.. code-block:: c
ov_compiled_model_t* compiled_model = NULL;
ov_core_compile_model_from_file(core, "saved_model.pb", "AUTO", 0, &compiled_model);
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: CLI
:sync: cli
You can use the ``ovc`` command-line tool to convert a model to IR. The obtained IR can
then be read by ``read_model()`` and inferred.
.. code-block:: sh
ovc <INPUT_MODEL>.pb
For details on the conversion, refer to the
:doc:`article <convert-model-tensorflow>`.
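The IR produced by ``ovc`` can then be loaded in Python; a minimal sketch, assuming the
default output name derived from the input file:

.. code-block:: py
   :force:

   import openvino as ov

   core = ov.Core()
   # ovc writes <INPUT_MODEL>.xml together with the matching <INPUT_MODEL>.bin
   ov_model = core.read_model("<INPUT_MODEL>.xml")
   compiled_model = ov.compile_model(ov_model, "AUTO")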
.. tab-item:: TensorFlow Lite
   :sync: tflite
.. tab-set::
.. tab-item:: Python
:sync: py
* The ``convert_model()`` method:
The ``convert_model()`` method gives you more control: you can specify additional
adjustments for the resulting ``ov.Model``. The ``read_model()`` and
``compile_model()`` methods are easier to use but do not offer such
capabilities. With an ``ov.Model`` object you can choose to optimize it, compile and run
inference on it, or serialize it into a file for subsequent use.
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.tflite``
.. code-block:: py
:force:
import openvino as ov
ov_model = ov.convert_model("<INPUT_MODEL>.tflite")
compiled_model = ov.compile_model(ov_model, "AUTO")
For more details on conversion, refer to the
:doc:`guide <convert-model-tensorflow-lite>`
and an example `tutorial <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/tflite-to-openvino/tflite-to-openvino.ipynb>`__
on this topic.
* The ``read_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.tflite``
.. code-block:: py
:force:
import openvino as ov
core = ov.Core()
ov_model = core.read_model("<INPUT_MODEL>.tflite")
compiled_model = ov.compile_model(ov_model, "AUTO")
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.tflite``
.. code-block:: py
:force:
import openvino as ov
compiled_model = ov.compile_model("<INPUT_MODEL>.tflite", "AUTO")
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: C++
:sync: cpp
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.tflite``
.. code-block:: cpp
ov::CompiledModel compiled_model = core.compile_model("<INPUT_MODEL>.tflite", "AUTO");
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: C
:sync: c
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.tflite``
.. code-block:: c
ov_compiled_model_t* compiled_model = NULL;
ov_core_compile_model_from_file(core, "<INPUT_MODEL>.tflite", "AUTO", 0, &compiled_model);
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: CLI
:sync: cli
You can use the ``ovc`` command-line tool to convert a model to IR. The obtained IR can
then be read by ``read_model()`` and inferred.
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.tflite``
.. code-block:: sh
ovc <INPUT_MODEL>.tflite
For details on the conversion, refer to the
:doc:`article <convert-model-tensorflow-lite>`.
.. tab-item:: ONNX
   :sync: onnx
.. tab-set::
.. tab-item:: Python
:sync: py
* The ``convert_model()`` method:
The ``convert_model()`` method gives you more control: you can specify additional
adjustments for the resulting ``ov.Model``. The ``read_model()`` and
``compile_model()`` methods are easier to use but do not offer such
capabilities. With an ``ov.Model`` object you can choose to optimize it, compile and run
inference on it, or serialize it into a file for subsequent use.
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.onnx``
.. code-block:: py
:force:
import openvino as ov
ov_model = ov.convert_model("<INPUT_MODEL>.onnx")
compiled_model = ov.compile_model(ov_model, "AUTO")
For more details on conversion, refer to the
:doc:`guide <convert-model-onnx>`
and an example `tutorial <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/pytorch-to-openvino/pytorch-onnx-to-openvino.ipynb>`__
on this topic.
* The ``read_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.onnx``
.. code-block:: py
:force:
import openvino as ov
core = ov.Core()
ov_model = core.read_model("<INPUT_MODEL>.onnx")
compiled_model = ov.compile_model(ov_model, "AUTO")
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.onnx``
.. code-block:: py
:force:
import openvino as ov
compiled_model = ov.compile_model("<INPUT_MODEL>.onnx", "AUTO")
For a guide on how to run inference, see how to :doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: C++
:sync: cpp
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.onnx``
.. code-block:: cpp
ov::CompiledModel compiled_model = core.compile_model("<INPUT_MODEL>.onnx", "AUTO");
For a guide on how to run inference, see how to :doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: C
:sync: c
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.onnx``
.. code-block:: c
ov_compiled_model_t* compiled_model = NULL;
ov_core_compile_model_from_file(core, "<INPUT_MODEL>.onnx", "AUTO", 0, &compiled_model);
For details on the conversion, refer to the :doc:`article <convert-model-onnx>`.
.. tab-item:: CLI
:sync: cli
You can use the ``ovc`` command-line tool to convert a model to IR. The obtained IR
can then be read by ``read_model()`` and inferred.
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.onnx``
.. code-block:: sh
ovc <INPUT_MODEL>.onnx
For details on the conversion, refer to the
:doc:`article <convert-model-onnx>`.
.. tab-item:: PaddlePaddle
   :sync: pdpd
.. tab-set::
.. tab-item:: Python
:sync: py
* The ``convert_model()`` method:
The ``convert_model()`` method gives you more control: you can specify additional
adjustments for the resulting ``ov.Model``. The ``read_model()`` and
``compile_model()`` methods are easier to use but do not offer such
capabilities. With an ``ov.Model`` object you can choose to optimize it, compile and run
inference on it, or serialize it into a file for subsequent use.
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.pdmodel``
* **Python objects**:
* ``paddle.hapi.model.Model``
* ``paddle.fluid.dygraph.layers.Layer``
* ``paddle.fluid.executor.Executor``
.. code-block:: py
:force:
import openvino as ov
ov_model = ov.convert_model("<INPUT_MODEL>.pdmodel")
compiled_model = ov.compile_model(ov_model, "AUTO")
For more details on conversion, refer to the
:doc:`guide <convert-model-paddle>`
and an example `tutorial <https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/paddle-to-openvino/paddle-to-openvino-classification.ipynb>`__
on this topic.
* The ``read_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.pdmodel``
.. code-block:: py
:force:
import openvino as ov
core = ov.Core()
ov_model = core.read_model("<INPUT_MODEL>.pdmodel")
compiled_model = ov.compile_model(ov_model, "AUTO")
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.pdmodel``
.. code-block:: py
:force:
import openvino as ov
compiled_model = ov.compile_model("<INPUT_MODEL>.pdmodel", "AUTO")
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: C++
:sync: cpp
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.pdmodel``
.. code-block:: cpp
ov::CompiledModel compiled_model = core.compile_model("<INPUT_MODEL>.pdmodel", "AUTO");
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: C
:sync: c
* The ``compile_model()`` method:
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.pdmodel``
.. code-block:: c
ov_compiled_model_t* compiled_model = NULL;
ov_core_compile_model_from_file(core, "<INPUT_MODEL>.pdmodel", "AUTO", 0, &compiled_model);
For a guide on how to run inference, see how to
:doc:`Integrate OpenVINO™ with Your Application <../running-inference>`.
.. tab-item:: CLI
:sync: cli
You can use the ``ovc`` command-line tool to convert a model to IR. The obtained IR
can then be read by ``read_model()`` and inferred.
.. dropdown:: List of supported formats:
* **Files**:
* ``<INPUT_MODEL>.pdmodel``
.. code-block:: sh
ovc <INPUT_MODEL>.pdmodel
For details on the conversion, refer to the
:doc:`article <convert-model-paddle>`.
.. tab-item:: JAX/Flax
   :sync: jax
.. tab-set::
.. tab-item:: Python
:sync: py
The ``convert_model()`` method is the only method applicable to JAX/Flax models.
.. dropdown:: List of supported formats:
* **Python objects**:
* ``jax._src.core.ClosedJaxpr``
* ``flax.linen.Module``
* Conversion of the ``jax._src.core.ClosedJaxpr`` object
.. code-block:: py
:force:
import jax
import jax.numpy as jnp
import openvino as ov
# assume the user has some JAX function
def jax_func(x, y):
return jax.lax.tanh(jax.lax.max(x, y))
# use example inputs to create a ClosedJaxpr object
x = jnp.array([1.0, 2.0])
y = jnp.array([-1.0, 10.0])
jaxpr = jax.make_jaxpr(jax_func)(x, y)
ov_model = ov.convert_model(jaxpr)
compiled_model = ov.compile_model(ov_model, "AUTO")
* Conversion of the ``flax.linen.Module`` object
.. code-block:: py
:force:
import flax.linen as nn
import jax
import jax.numpy as jnp
import openvino as ov
# assume the user has some Flax module
class SimpleDenseModule(nn.Module):
features: int
@nn.compact
def __call__(self, x):
return nn.Dense(features=self.features)(x)
module = SimpleDenseModule(features=4)
# create an example input, as used in training
example_input = jnp.ones((2, 3))
# prepare parameters to initialize the module; they can also be
# loaded from disk, for example, with pickle or
# flax.serialization for deserialization
key = jax.random.PRNGKey(0)
params = module.init(key, example_input)
module = module.bind(params)
ov_model = ov.convert_model(module, example_input=example_input)
compiled_model = ov.compile_model(ov_model, "AUTO")
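Continuing the example above, parameters can also be restored from disk instead of
initialized; a minimal sketch using ``flax.serialization``, where ``params.msgpack`` is a
hypothetical file assumed to have been written earlier with
``flax.serialization.to_bytes(params)``:

.. code-block:: py
   :force:

   import flax.serialization

   # hypothetical file written earlier with flax.serialization.to_bytes(params)
   with open("params.msgpack", "rb") as f:
       restored_params = flax.serialization.from_bytes(params, f.read())
   bound_module = SimpleDenseModule(features=4).bind(restored_params)
   ov_model = ov.convert_model(bound_module, example_input=example_input)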
For more details on conversion, refer to the :doc:`conversion guide <convert-model-jax>`.
These are basic examples. For detailed conversion instructions, see the individual guides on
:doc:`PyTorch <convert-model-pytorch>`, :doc:`ONNX <convert-model-onnx>`,
:doc:`TensorFlow <convert-model-tensorflow>`, :doc:`TensorFlow Lite <convert-model-tensorflow-lite>`,
:doc:`PaddlePaddle <convert-model-paddle>`, and :doc:`JAX/Flax <convert-model-jax>`.
Refer to the list of all supported conversion options in :doc:`Conversion Parameters <conversion-parameters>`.
IR Conversion Benefits
################################################
| Saving to IR to improve first inference latency
|    When first inference latency matters, rather than converting the framework model each time
     it is loaded, which may take some time depending on its size, it is better to do it once.
     Save the model as an OpenVINO IR with ``save_model`` and then load it with ``read_model``
     as needed. This should improve the time it takes the model to make the first inference,
     as it avoids the conversion step.
| Saving to IR in FP16 to save space
|    Saving the IR in FP16 may cut the model size by about 50%, which is especially useful
     for large models, like Llama2-7B.
| Saving to IR to avoid large dependencies in inference code
|    Frameworks such as TensorFlow, PyTorch, and JAX/Flax tend to be large dependencies
     (multiple gigabytes) for applications running inference. Converting models to OpenVINO IR
     removes this dependency, as OpenVINO can run inference with no additional components.
     This way, much less disk space is needed, while loading and compiling usually takes less
     runtime memory than loading the model in the source framework and then converting and
     compiling it.
Here is an example of how to benefit from OpenVINO IR, saving a model once and running it
multiple times:

.. code-block:: py

   # conversion script: run once to convert the model and save it as OpenVINO IR
   import openvino as ov
   import tensorflow as tf

   model = tf.keras.applications.resnet50.ResNet50(weights="imagenet")
   ov_model = ov.convert_model(model)
   ov.save_model(ov_model, 'model.xml', compress_to_fp16=True)  # enabled by default

   # inference script: run as many times as needed, skipping the conversion step
   import openvino as ov

   core = ov.Core()
   ov_model = core.read_model("model.xml")
   compiled_model = ov.compile_model(ov_model)
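A brief sketch of running an inference request on the compiled model; the input shape here
is an assumption matching the Keras ResNet50 defaults:

.. code-block:: py

   import numpy as np

   # dummy input; the (1, 224, 224, 3) NHWC shape is assumed from the Keras ResNet50 defaults
   input_data = np.random.rand(1, 224, 224, 3).astype(np.float32)
   results = compiled_model(input_data)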
Additional Resources
####################
* :doc:`Parameters to adjust model conversion <./conversion-parameters>`
* `Download models from Hugging Face <https://huggingface.co/models>`__