.. py:currentmodule:: oxidized_importer
.. _oxidized_importer_api_reference:
.. py:function:: decode_source(io_module, source_bytes) -> str
Decodes Python source code bytes to a str.
This is effectively a reimplementation of
``importlib._bootstrap_external.decode_source()``.
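For comparison, the standard library exposes the same behavior publicly as ``importlib.util.decode_source()``. The sketch below uses that stdlib function rather than ``oxidized_importer``, and assumes the two produce equivalent results for the same input:

```python
import importlib.util

# Source bytes carrying a PEP 263 encoding cookie and Windows line endings.
source_bytes = b"# -*- coding: latin-1 -*-\r\nname = '\xe9'\r\n"

# The encoding is detected from the cookie and universal newline
# translation is applied, so "\r\n" becomes "\n" in the result.
text = importlib.util.decode_source(source_bytes)
print(repr(text))
```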
.. py:function:: find_resources_in_path(path) -> List
This function scans the specified filesystem path and returns an
iterable of objects representing found resources. Each object is one
of the types documented in :ref:`oxidized_importer_python_resource_types`.
Only directories can be scanned.
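A minimal sketch of consuming the scan results, grouping them by resource type. The helper ``summarize_resources`` is hypothetical; the scanner is passed in as a callable so the sketch does not import ``oxidized_importer`` itself:

```python
from collections import Counter
from pathlib import Path


def summarize_resources(path, scan):
    """Count scanned resources by type name.

    ``scan`` is any callable shaped like ``find_resources_in_path``:
    it takes a path and returns an iterable of resource objects.
    """
    # Only directories can be scanned, so reject anything else up front.
    if not Path(path).is_dir():
        raise NotADirectoryError(f"not a directory: {path}")
    return Counter(type(resource).__name__ for resource in scan(path))
```

With the extension installed, a call would look like ``summarize_resources(some_dir, oxidized_importer.find_resources_in_path)``.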
.. py:function:: register_pkg_resources()
Enables ``pkg_resources`` integration.
This function effectively does the following:

* Calls ``pkg_resources.register_finder()`` to map
  :py:class:`OxidizedPathEntryFinder` to
  :py:func:`pkg_resources_find_distributions`.
* Calls ``pkg_resources.register_loader_type()`` to map
  :py:class:`OxidizedFinder` to :py:class:`OxidizedPkgResourcesProvider`.

It is safe to call this function multiple times, as behavior should be
deterministic.
.. py:function:: pkg_resources_find_distributions(finder: OxidizedPathEntryFinder, path_item: str, only=False) -> list
Resolve ``pkg_resources.Distribution`` instances given a
:py:class:`OxidizedPathEntryFinder` and search criteria.
This function is what is registered with ``pkg_resources`` for distribution
resolution and you likely don't need to call it directly.

OxidizedFinder Class
====================

.. py:class:: OxidizedFinder
A `meta path finder`_ that resolves indexed resources.
See :ref:`oxidized_finder` for more high-level documentation.
This type implements the following interfaces:
* ``importlib.abc.MetaPathFinder``
* ``importlib.abc.Loader``
* ``importlib.abc.InspectLoader``
* ``importlib.abc.ExecutionLoader``
See the `importlib.abc documentation <https://docs.python.org/3/library/importlib.html#module-importlib.abc>`_
for more on these interfaces.
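To illustrate what these interfaces require, here is a minimal sketch of a meta path finder that is also its own loader. ``DictFinder`` and the module name ``hello_from_memory`` are hypothetical and use only the standard library; this is not how :py:class:`OxidizedFinder` is implemented:

```python
import importlib.abc
import importlib.util
import sys


class DictFinder(importlib.abc.MetaPathFinder, importlib.abc.InspectLoader):
    """Hypothetical finder serving module source from an in-memory dict."""

    def __init__(self, sources):
        self._sources = sources  # maps module name -> source text

    # MetaPathFinder: locate a module and describe how to load it.
    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self._sources:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    # InspectLoader: the inherited get_code() and exec_module() are
    # built on top of get_source().
    def get_source(self, fullname):
        try:
            return self._sources[fullname]
        except KeyError:
            raise ImportError(f"unknown module: {fullname}") from None

    def is_package(self, fullname):
        return False  # this sketch only serves plain modules


finder = DictFinder({"hello_from_memory": "GREETING = 'hi'\n"})
sys.meta_path.insert(0, finder)
try:
    import hello_from_memory
    print(hello_from_memory.GREETING)
finally:
    # Leave the interpreter's import machinery as we found it.
    sys.meta_path.remove(finder)
```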
In addition to the methods on the above interfaces, the following methods
defined elsewhere in ``importlib`` are exposed:
* ``get_resource_reader(fullname: str) -> importlib.abc.ResourceReader``
* ``find_distributions(context: Optional[DistributionFinder.Context]) -> [Distribution]``
``ResourceReader`` is documented alongside other ``importlib.abc`` interfaces.
``find_distributions()`` is documented in
`importlib.metadata <https://docs.python.org/3/library/importlib.metadata.html>`_.
Instances have additional functionality beyond what is defined by
``importlib``. This functionality allows you to construct, inspect, and
manipulate instances.
.. py:attribute:: multiprocessing_set_start_method
(``Optional[str]``) Value to pass to :py:func:`multiprocessing.set_start_method` on
import of :py:mod:`multiprocessing` module.
``None`` means the method won't be called.
.. py:attribute:: origin
(``str``) The path this instance is using as the anchor for relative path
references.
.. py:attribute:: path_hook_base_str
(``str``) The base path that the path hook handler on this instance
will respond to.
This value is often the same as ``sys.executable`` but isn't guaranteed
to be that exact value.
.. py:attribute:: pkg_resources_import_auto_register
(``bool``) Whether this instance will be registered via
``pkg_resources.register_finder()`` upon this instance importing the
``pkg_resources`` module.
.. py:method:: __new__(cls, relative_path_origin: Optional[os.PathLike]) -> OxidizedFinder
Construct a new instance of :py:class:`OxidizedFinder`.
New instances of :py:class:`OxidizedFinder` can be constructed like
normal Python types:
.. code-block:: python

   from oxidized_importer import OxidizedFinder

   finder = OxidizedFinder()
The constructor takes the following named arguments:
``relative_path_origin``
A path-like object denoting the filesystem path that should be used as the
*origin* value for relative path resources. Filesystem-based resources are
stored as a relative path to an *anchor* value. This is that *anchor* value.
If not specified, the directory of the current executable will be used.
See the `python_packed_resources <https://docs.rs/python-packed-resources/0.1.0/python_packed_resources/>`_
Rust crate for the specification of the binary data blob defining *packed
resources data*.
.. important::
The *packed resources data* format is still evolving. It is recommended
to use the same version of the ``oxidized_importer`` extension to
produce and consume this data structure to ensure compatibility.
.. py:method:: index_bytes(data: bytes) -> None
This method parses any bytes-like object and indexes the resources within.
.. py:method:: index_file_memory_mapped(path: pathlib.Path) -> None
This method parses the given Path-like argument and indexes the resources
within. Memory mapped I/O is used to read the file. Rust manages the
memory map via the ``memmap`` crate: this does not use the Python
interpreter's memory mapping code.
.. py:method:: index_interpreter_builtins() -> None
This method indexes Python resources that are built-in to the Python
interpreter itself. This indexes built-in extension modules and frozen
modules.
.. py:method:: index_interpreter_builtin_extension_modules() -> None
This method will index Python extension modules that are compiled into
the Python interpreter itself.
.. py:method:: index_interpreter_frozen_modules() -> None
This method will index Python modules whose bytecode is frozen into
the Python interpreter itself.
.. py:method:: indexed_resources() -> List[OxidizedResource]
This method returns a list of resources that are indexed by the
instance. It allows Python code to inspect what the finder knows about.
Any mutations to returned values are not reflected in the finder.
See :ref:`oxidized_resource` for more on the returned type.
.. py:method:: add_resource(resource: OxidizedResource)
This method registers an :ref:`oxidized_resource` instance with the finder,
enabling the finder to use it to service lookups.
When an ``OxidizedResource`` is registered, its data is copied into the
finder instance. So changes to the original ``OxidizedResource`` are not
reflected on the finder. (This is because :py:class:`OxidizedFinder`
maintains an index and it is important for the data behind that index to
not change out from under it.)
Resources are stored in an internal hash map where they are indexed by
the ``name`` attribute. When a resource is added, any existing resource
under the same name has its data replaced by the incoming
``OxidizedResource`` instance.
If you have source code and want to produce bytecode, you can do something
like the following:
.. code-block:: python

   import marshal

   from oxidized_importer import OxidizedResource


   def register_module(finder, module_name, source):
       # Compile the source, then serialize the code object to raw bytecode.
       code = compile(source, module_name, "exec")
       bytecode = marshal.dumps(code)

       resource = OxidizedResource()
       resource.name = module_name
       resource.is_module = True
       resource.in_memory_bytecode = bytecode
       resource.in_memory_source = source  # module source as bytes
       finder.add_resource(resource)
.. py:method:: add_resources(resources: List[OxidizedResource])
This method is syntactic sugar for calling ``add_resource()`` for every
item in an iterable. It is exposed because function call overhead in Python
can be non-trivial and it can be quicker to pass in an iterable of
``OxidizedResource`` than to call ``add_resource()`` potentially hundreds
of times.
.. py:method:: serialize_indexed_resources(ignore_builtin=True, ignore_frozen=True) -> bytes
This method serializes all resources currently indexed by the instance
into an opaque ``bytes`` instance. The returned data can be fed into a
separate :py:class:`OxidizedFinder` instance by passing it to
:py:meth:`OxidizedFinder.index_bytes`.
Arguments:
``ignore_builtin`` (bool)
Whether to ignore ``builtin`` extension modules from the serialized data.
Default is ``True``.
``ignore_frozen`` (bool)
Whether to ignore ``frozen`` modules from the serialized data.
Default is ``True``.
Entries for *built-in* and *frozen* modules are ignored by default because
they aren't portable, as they are compiled into the interpreter and aren't
guaranteed to work from one Python interpreter to another. The serialized
format does support expressing them. Use at your own risk.
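A hedged sketch of moving an index from one finder to another. The helper ``clone_index`` is hypothetical and duck-typed; it assumes the serialized bytes are accepted by the ``index_bytes()`` method documented above:

```python
def clone_index(finder, new_finder):
    """Copy the indexed resources of ``finder`` into ``new_finder``.

    Both arguments are expected to behave like ``OxidizedFinder``.
    Returns the size of the serialized data in bytes.
    """
    # Built-in and frozen entries are skipped by default because they
    # are not portable from one interpreter to another.
    data = finder.serialize_indexed_resources()
    new_finder.index_bytes(data)
    return len(data)
```

As recommended for the packed resources format in general, producing and consuming the data with the same version of the extension avoids compatibility surprises.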
.. py:method:: path_hook(path: Union[str, bytes, os.PathLike[AnyStr]]) -> OxidizedPathEntryFinder
Implements a *path hook* for obtaining a
`PathEntryFinder <https://docs.python.org/3/library/importlib.html#importlib.abc.PathEntryFinder>`_
from a ``sys.path`` entry. See :ref:`oxidized_finder_path_hooks` for details.
Raises ``ImportError`` if the given path isn't serviceable. The exception
should have ``.__cause__`` set to an inner exception with more details on why
the path was rejected.
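A small sketch of inspecting why a path was rejected. The helper ``try_path_hook`` is hypothetical and works with any path hook callable, including ``finder.path_hook``:

```python
def try_path_hook(hook, path):
    """Return ``(finder, reason)``; exactly one of the two is ``None``."""
    try:
        return hook(path), None
    except ImportError as exc:
        # Prefer the inner exception, when present, for the detailed reason.
        detail = exc.__cause__ if exc.__cause__ is not None else exc
        return None, f"{type(detail).__name__}: {detail}"
```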

OxidizedDistribution Class
==========================

.. py:class:: OxidizedDistribution
Represents the metadata of a Python package. Comparable to
``importlib.metadata.Distribution``. Instances of this type are emitted by
``OxidizedFinder.find_distributions()``.
.. py:method:: from_name(cls, name: str) -> OxidizedDistribution
:classmethod:
Resolve the instance for the given package name.
.. py:method:: discover(cls, **kwargs) -> list[OxidizedDistribution]
:classmethod:
Resolve instances for all known packages.
.. py:method:: read_text(filename) -> str
Attempt to read metadata file given its filename.
.. py:property:: metadata
:type: email.message.EmailMessage
Return the parsed metadata for this distribution.
.. py:property:: name
:type: str
Return the ``Name`` metadata for this distribution package.
.. py:property:: _normalized_name
:type: str
Return the normalized version of the ``Name``.
.. py:property:: version
:type: str
Return the ``Version`` metadata for this distribution package.
.. py:property:: entry_points
Resolve entry points for this distribution package.
.. py:property:: files
Not implemented. Always raises when accessed.
.. py:property:: requires
Generated requirements specified for this distribution.

OxidizedResourceReader Class
============================

.. py:class:: OxidizedResourceReader
``importlib.abc.ResourceReader`` implementer for :py:class:`OxidizedFinder`.
.. py:method:: open_resource(resource: str)
.. py:method:: resource_path(resource: str)
.. py:method:: is_resource(name: str) -> bool
.. py:method:: contents() -> list[str]

OxidizedPathEntryFinder Class
=============================

.. py:class:: OxidizedPathEntryFinder
A `path entry finder`_ that can find resources contained in an associated
:py:class:`OxidizedFinder` instance.
Instances are created via :py:meth:`OxidizedFinder.path_hook <OxidizedFinder.path_hook>`.
Direct use of :class:`OxidizedPathEntryFinder` is generally unnecessary:
:py:class:`OxidizedFinder` is the primary interface to the custom importer.
See :ref:`oxidized_finder_path_hooks` for more on path hook and path entry finder
behavior in ``oxidized_importer``.
.. py:method:: find_spec(fullname: str, target: Optional[types.ModuleType] = None) -> Optional[importlib.machinery.ModuleSpec]
Search for modules visible to the instance.
.. py:method:: invalidate_caches() -> None
Invoke the same method on the :py:class:`OxidizedFinder` instance with
which the :class:`OxidizedPathEntryFinder` instance was constructed.
.. py:method:: iter_modules(prefix: str = "") -> List[pkgutil.ModuleInfo]
Iterate over the visible modules. This method complies with
``pkgutil.iter_modules``'s protocol.

OxidizedPkgResourcesProvider Class
==================================

.. py:class:: OxidizedPkgResourcesProvider
A ``pkg_resources.IMetadataProvider`` and ``pkg_resources.IResourceProvider``
enabling ``pkg_resources`` to access package metadata and resources.
All members of the aforementioned interfaces are implemented. Divergence
from ``pkg_resources`` defined behavior is documented next to the method.
.. py:method:: has_metadata(name: str) -> bool
.. py:method:: get_metadata(name: str) -> str
.. py:method:: get_metadata_lines(name: str) -> List[str]
Returns a ``list`` instead of a generator.
.. py:method:: metadata_isdir(name: str) -> bool
.. py:method:: metadata_listdir(name: str) -> List[str]
.. py:method:: run_script(script_name: str, namespace: Any)
Always raises ``NotImplementedError``.
Please leave a comment in
`#384 <https://github.com/indygreg/PyOxidizer/issues/384>`_ if you would like
this functionality implemented.
.. py:method:: get_resource_filename(manager, resource_name: str)
Always raises ``NotImplementedError``.
This behavior appears to be allowed given code in ``pkg_resources``.
However, it means that ``pkg_resources.resource_filename()`` will not
work. Please leave a comment in
`#383 <https://github.com/indygreg/PyOxidizer/issues/383>`_ if you would like
this functionality implemented.
.. py:method:: get_resource_stream(manager, resource_name: str) -> io.BytesIO
.. py:method:: get_resource_string(manager, resource_name: str) -> bytes
.. py:method:: has_resource(resource_name: str) -> bool
.. py:method:: resource_isdir(resource_name: str) -> bool
.. py:method:: resource_listdir(resource_name: str) -> List[str]
Returns a ``list`` instead of a generator.

OxidizedResource Class
======================

.. py:class:: OxidizedResource
Represents a resource that is indexed by a
:py:class:`OxidizedFinder` instance.
Each instance represents a named entity with associated metadata and data.
For example, an instance can represent a Python module with associated
source and bytecode.
New instances can be constructed via ``OxidizedResource()``. This returns
an instance whose ``name = ""`` and whose other properties are ``None`` or
``False``.
.. py:attribute:: is_module
A ``bool`` indicating if this resource is a Python module. Python modules
are backed by source or bytecode.
.. py:attribute:: is_builtin_extension_module
A ``bool`` indicating if this resource is a Python extension module
built-in to the Python interpreter.
.. py:attribute:: is_frozen_module
A ``bool`` indicating if this resource is a Python module whose bytecode
is frozen into the Python interpreter.
.. py:attribute:: is_extension_module
A ``bool`` indicating if this resource is a Python extension module.
.. py:attribute:: is_shared_library
A ``bool`` indicating if this resource is a shared library.
.. py:attribute:: name
The ``str`` name of the resource.
.. py:attribute:: is_package
A ``bool`` indicating if this resource is a Python package.
.. py:attribute:: is_namespace_package
A ``bool`` indicating if this resource is a Python namespace package.
.. py:attribute:: in_memory_source
``bytes`` or ``None`` holding Python module source code that should be
imported from memory.
.. py:attribute:: in_memory_bytecode
``bytes`` or ``None`` holding Python module bytecode that should be
imported from memory.
This is raw Python bytecode, as produced from the ``marshal`` module.
``.pyc`` files have a header before this data that will need to be
stripped should you want to move data from a ``.pyc`` file into this
field.
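A minimal sketch of stripping that header, assuming Python 3.7 or newer, where the ``.pyc`` header is 16 bytes (PEP 552). The helper ``bytecode_from_pyc`` is hypothetical:

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile

# Python 3.7+ .pyc header: 4-byte magic number, 4-byte flags, and
# 8 bytes holding either mtime + source size or a source hash.
PYC_HEADER_SIZE = 16


def bytecode_from_pyc(pyc_path):
    """Return the marshalled code object stored in a .pyc file."""
    with open(pyc_path, "rb") as fh:
        data = fh.read()
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ValueError("pyc file was produced by a different Python version")
    return data[PYC_HEADER_SIZE:]


with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "example.py")
    with open(source_path, "w", encoding="utf-8") as fh:
        fh.write("ANSWER = 6 * 7\n")
    pyc_path = py_compile.compile(
        source_path, cfile=os.path.join(tmp, "example.pyc"), doraise=True
    )
    bytecode = bytecode_from_pyc(pyc_path)

# The stripped data loads directly with marshal, which is the form
# expected by the in-memory bytecode attributes.
namespace = {}
exec(marshal.loads(bytecode), namespace)
print(namespace["ANSWER"])
```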
.. py:attribute:: in_memory_bytecode_opt1
``bytes`` or ``None`` holding Python module bytecode at optimization level 1
that should be imported from memory.
This is raw Python bytecode, as produced from the ``marshal`` module.
``.pyc`` files have a header before this data that will need to be
stripped should you want to move data from a ``.pyc`` file into this
field.
.. py:attribute:: in_memory_bytecode_opt2
``bytes`` or ``None`` holding Python module bytecode at optimization level 2
that should be imported from memory.
This is raw Python bytecode, as produced from the ``marshal`` module.
``.pyc`` files have a header before this data that will need to be
stripped should you want to move data from a ``.pyc`` file into this
field.
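The three bytecode attributes differ only in the optimization level used at compile time. The sketch below produces all three variants with the built-in ``compile()``; the helper names and sample source are hypothetical:

```python
import marshal

SOURCE = '''
def check(value):
    """Return value after asserting it is positive."""
    assert value > 0, "value must be positive"
    return value
'''


def compile_at_level(source, module_name, optimize):
    """Return raw marshalled bytecode for the given optimization level."""
    code = compile(source, module_name, "exec", optimize=optimize)
    return marshal.dumps(code)


# Level 0 keeps everything, level 1 drops assert statements,
# level 2 additionally drops docstrings.
bytecode_by_level = {
    level: compile_at_level(SOURCE, "example", level) for level in (0, 1, 2)
}


def load_check(level):
    """Execute the bytecode for ``level`` and return its ``check`` function."""
    namespace = {}
    exec(marshal.loads(bytecode_by_level[level]), namespace)
    return namespace["check"]
```

The values for levels 0, 1, and 2 correspond to ``in_memory_bytecode``, ``in_memory_bytecode_opt1``, and ``in_memory_bytecode_opt2`` respectively.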
.. py:attribute:: in_memory_extension_module_shared_library
``bytes`` or ``None`` holding native machine code defining a Python extension
module shared library that should be imported from memory.
.. py:attribute:: in_memory_package_resources
``dict[str, bytes]`` or ``None`` holding resource files to make available to
the ``importlib.resources`` APIs via in-memory data access. The ``name`` of
this object will be a Python package name. Keys in this dict are virtual
filenames under that package. Values are raw file data.
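A sketch of building such a dict from a directory on disk. The helper ``load_package_resources`` is hypothetical, and two choices in it are assumptions rather than documented behavior: keys use ``/`` as the separator, and ``.py`` and ``.pyc`` files are skipped:

```python
from pathlib import Path


def load_package_resources(package_dir):
    """Map virtual filenames (relative, '/'-separated) to raw file data."""
    root = Path(package_dir)
    resources = {}
    for path in sorted(root.rglob("*")):
        # Assumption: modules and bytecode caches are not package resources.
        if not path.is_file() or path.suffix in {".py", ".pyc"}:
            continue
        resources[path.relative_to(root).as_posix()] = path.read_bytes()
    return resources
```

The result could then be assigned to ``resource.in_memory_package_resources`` on a resource whose ``name`` is the package name.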
.. py:attribute:: in_memory_distribution_resources
``dict[str, bytes]`` or ``None`` holding resource files to make available to
the ``importlib.metadata`` API via in-memory data access. The ``name`` of
this object will be a Python package name. Keys in this dict are virtual
filenames. Values are raw file data.
.. py:attribute:: in_memory_shared_library
``bytes`` or ``None`` holding a shared library that should be imported from
memory.
.. py:attribute:: shared_library_dependency_names
``list[str]`` or ``None`` holding the names of shared libraries that this
resource depends on. If this resource defines a loadable shared library,
this list can be used to express what other shared libraries it depends on.
.. py:attribute:: relative_path_module_source
``pathlib.Path`` or ``None`` holding the relative path to Python module
source that should be imported from the filesystem.
.. py:attribute:: relative_path_module_bytecode
``pathlib.Path`` or ``None`` holding the relative path to Python module
bytecode that should be imported from the filesystem.
.. py:attribute:: relative_path_module_bytecode_opt1
``pathlib.Path`` or ``None`` holding the relative path to Python module
bytecode at optimization level 1 that should be imported from the filesystem.
.. py:attribute:: relative_path_module_bytecode_opt2
``pathlib.Path`` or ``None`` holding the relative path to Python module
bytecode at optimization level 2 that should be imported from the filesystem.
.. py:attribute:: relative_path_extension_module_shared_library
``pathlib.Path`` or ``None`` holding the relative path to a Python extension
module that should be imported from the filesystem.
.. py:attribute:: relative_path_package_resources
``dict[str, pathlib.Path]`` or ``None`` holding resource files to make
available to the ``importlib.resources`` APIs via filesystem access. The
``name`` of this object will be a Python package name. Keys in this dict are
filenames under that package. Values are relative paths to files from which
to read data.
.. py:attribute:: relative_path_distribution_resources
``dict[str, pathlib.Path]`` or ``None`` holding resource files to make
available to the ``importlib.metadata`` APIs via filesystem access. The
``name`` of this object will be a Python package name. Keys in this dict are
filenames under that package. Values are relative paths to files from which
to read data.

OxidizedResourceCollector Class
===============================

.. py:class:: OxidizedResourceCollector
Provides functionality for turning instances of Python resource types into a
collection of :py:class:`OxidizedResource` for loading into an
:py:class:`OxidizedFinder` instance.
.. py:method:: __new__(cls, allowed_locations: list[str])
Construct an instance by defining locations that resources can be loaded
from.
The accepted string values are ``in-memory`` and ``filesystem-relative``.
.. py:attribute:: allowed_locations
(``list[str]``) Exposes allowed locations where resources can be loaded from.
.. py:method:: add_in_memory_resource(resource)
Adds a Python resource type (:py:class:`PythonModuleSource`,
:py:class:`PythonModuleBytecode`, etc) to the collector and marks it for
loading via in-memory mechanisms.
.. py:method:: add_filesystem_relative(prefix, resource)
Adds a Python resource type (:py:class:`PythonModuleSource`,
:py:class:`PythonModuleBytecode`, etc) to the collector and marks it for
loading via a relative path next to some *origin* path (as specified to the
:py:class:`OxidizedFinder`). That relative path can have a ``prefix`` value
prepended to it. If no prefix is desired and you want the resource placed
next to the *origin*, use an empty ``str`` for ``prefix``.
.. py:method:: oxidize() -> tuple[list[OxidizedResource], list[tuple[pathlib.Path, bytes, bool]]]
Takes all the resources collected so far and turns them into data
structures to facilitate later use.
The first element in the returned tuple is a list of
:py:class:`OxidizedResource` instances.
The second is a list of 3-tuples containing the relative filesystem
path for a file, the content to write to that path, and whether the file
should be marked as executable.
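A sketch of materializing the second element of the returned tuple on disk. The helper ``install_files`` is hypothetical; it takes the list of 3-tuples and a destination directory that would serve as the finder's *origin*:

```python
import stat
from pathlib import Path


def install_files(dest_dir, file_installs):
    """Write ``(relative_path, data, executable)`` entries beneath ``dest_dir``."""
    dest_dir = Path(dest_dir)
    written = []
    for relative_path, data, executable in file_installs:
        target = dest_dir / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        if executable:
            # Add the execute bits for user, group, and other.
            mode = target.stat().st_mode
            target.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        written.append(target)
    return written
```

Typical use would be ``resources, file_installs = collector.oxidize()`` followed by ``install_files(origin_dir, file_installs)``, where ``origin_dir`` is a placeholder for your own directory.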

OxidizedResourceReader Class
============================

.. py:class:: OxidizedResourceReader
An implementation of
`importlib.abc.ResourceReader <https://docs.python.org/3.9/library/importlib.html#importlib.abc.ResourceReader>`_
to facilitate resource reading from an :py:class:`OxidizedFinder`.
See :ref:`resource_reader_support` for more.

OxidizedZipFinder Class
=======================

.. py:class:: OxidizedZipFinder
A `meta path finder`_ that operates on zip files.
This type attempts to be a pure Rust reimplementation of the Python standard
library ``zipimport.zipimporter`` type.
This type implements the following interfaces:

* ``importlib.abc.MetaPathFinder``
* ``importlib.abc.Loader``
* ``importlib.abc.InspectLoader``

.. py:method:: from_zip_data(cls, source: bytes, path: Union[bytes, str, pathlib.Path, None] = None) -> OxidizedZipFinder
Construct an instance from zip archive data.
The ``source`` argument can be any bytes-like object. A reference to the
original Python object will be kept and zip I/O will be performed against
the memory tracked by that object. It is possible to trigger an
out-of-bounds memory read if the source object is mutated after being
passed into this function.
The ``path`` argument denotes the path to the zip archive. This path will
be advertised in ``__file__`` attributes. If not defined, the path of the
current executable will be used.
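A sketch of preparing zip data in memory and handing it to the finder. The helper ``build_zip`` and the module name ``zipped_example`` are hypothetical. The import of ``oxidized_importer`` is guarded so the example still runs where the extension is not installed:

```python
import io
import zipfile


def build_zip(modules):
    """Return zip archive bytes holding one .py file per module name."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, source in modules.items():
            archive.writestr(name.replace(".", "/") + ".py", source)
    # getvalue() returns an immutable bytes object, so the data cannot be
    # mutated after it has been handed to the finder.
    return buffer.getvalue()


zip_data = build_zip({"zipped_example": "VALUE = 1\n"})

try:
    from oxidized_importer import OxidizedZipFinder
except ImportError:
    OxidizedZipFinder = None  # extension not installed; skip the demo

if OxidizedZipFinder is not None:
    # path is omitted, so the current executable's path is advertised.
    zip_finder = OxidizedZipFinder.from_zip_data(zip_data)
```

Passing ``bytes`` rather than a mutable ``bytearray`` sidesteps the out-of-bounds read described above.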
.. py:method:: from_path(cls, path: Union[bytes, str, pathlib.Path]) -> OxidizedZipFinder
Construct an instance from a filesystem path.
The ``path`` argument is the path to a file containing zip archive data.
The file will be opened using Rust file I/O. The content of the file
will be read lazily.
If you don't already have a copy of the zip data and the zip file will
be immutable for the lifetime of the constructed instance, this method
may yield better performance than opening the file, reading its content,
and calling :py:meth:`OxidizedZipFinder.from_zip_data` because it may
incur less overall I/O.

PythonModuleSource Class
========================

.. py:class:: PythonModuleSource
Represents Python module source code. e.g. a ``.py`` file.
.. py:attribute:: module
(``str``) The fully qualified Python module name. e.g.
``my_package.foo``.
.. py:attribute:: source
(``bytes``) The source code of the Python module.
Note that source code is stored as ``bytes``, not ``str``. Most Python
source is stored as ``utf-8``, so you can ``.encode("utf-8")`` or
``.decode("utf-8")`` to convert between ``bytes`` and ``str``.
.. py:attribute:: is_package
(``bool``) Whether this module is a Python package.

PythonModuleBytecode Class
==========================

.. py:class:: PythonModuleBytecode
Represents Python module bytecode. e.g. what a ``.pyc`` file holds (but
without the header that a ``.pyc`` file has).
.. py:attribute:: module
(``str``) The fully qualified Python module name.
.. py:attribute:: bytecode
(``bytes``) The bytecode of the Python module.
This is what you would get by compiling Python source code via
something like ``marshal.dumps(compile(source, filename, "exec"))``. The bytecode
does **not** contain a header, like what would be found in a ``.pyc``
file.
.. py:attribute:: optimize_level
(``int``) The bytecode optimization level. Either ``0``, ``1``, or ``2``.
.. py:attribute:: is_package
(``bool``) Whether this module is a Python package.

PythonPackageResource Class
===========================

.. py:class:: PythonPackageResource
Represents a non-module resource file. These are files that live next
to Python modules that are typically accessed via the APIs in
``importlib.resources``.
.. py:attribute:: package
(``str``) The name of the leaf-most Python package this resource is
associated with.
With :py:class:`OxidizedFinder`, an ``importlib.abc.ResourceReader``
associated with this package will be used to load the resource.
.. py:attribute:: name
(``str``) The name of the resource within its ``package``. This is
typically the filename of the resource. e.g. ``resource.txt`` or
``child/foo.png``.
.. py:attribute:: data
(``bytes``) The raw binary content of the resource.

PythonPackageDistributionResource Class
=======================================

.. py:class:: PythonPackageDistributionResource
Represents a non-module resource file living in a package distribution
directory (e.g. ``<package>-<version>.dist-info`` or
``<package>-<version>.egg-info``).
These resources are typically accessed via the APIs in ``importlib.metadata``.
.. py:attribute:: package
(``str``) The name of the Python package this resource is associated with.
.. py:attribute:: version
(``str``) Version string of Python package this resource is associated with.
.. py:attribute:: name
(``str``) The name of the resource within the metadata distribution. This
is typically the filename of the resource. e.g. ``METADATA``.
.. py:attribute:: data
(``bytes``) The raw binary content of the resource.

PythonExtensionModule Class
===========================

.. py:class:: PythonExtensionModule
Represents a Python extension module. This is a shared library
defining a Python extension implemented in native machine code that
can be loaded into a process and defines a Python module. Extension
modules are typically defined by ``.so``, ``.dylib``, or ``.pyd``
files.
.. py:attribute:: name
(``str``) The name of the extension module.
.. note::
Properties of this type are read-only.
.. rubric:: Footnotes
.. _meta path finder: https://docs.python.org/3/library/importlib.html#importlib.abc.MetaPathFinder
.. _path entry finder: https://docs.python.org/3/reference/import.html#path-entry-finders