Using ENOT Lite

ENOT Lite provides a unified interface for running neural network inference with various technologies.

To run neural network inference using ENOT Lite, all you need to do is:

  1. Create a Backend instance by using the create() method of BackendFactory.

  2. Pass your input data to the created Backend instance by using its __call__() method to obtain a prediction.

Here is an example which fully covers the basic usage of ENOT Lite:

1  from enot_lite.backend import BackendFactory
2  from enot_lite.type import BackendType
3
4  backend = BackendFactory().create('path/to/model.onnx', BackendType.AUTO_CPU)
5  prediction = backend(inputs)
  • At line 1 in the example above we import BackendFactory, which is used to create an instance of Backend.

  • At line 2 we import BackendType, which lets us easily choose among the available backends.

  • At line 4 we create a Backend instance by using the create() method of BackendFactory. The created backend is a wrapper around your model that provides an easy-to-use interface for inference.

  • And finally, at line 5 inference is performed by passing inputs (images, text, or anything else) into the backend; the results are stored in the prediction variable.

BackendType lets you choose among various inference technologies, so you don’t need to do anything special: just create a Backend instance with BackendFactory and use it for inference.

To fine-tune the Backend configuration, see BackendType and ModelType.
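For instance, a minimal end-to-end run on CPU might look like the sketch below. The NumPy input and its shape (1, 3, 224, 224) are placeholders for illustration only; substitute the model path, dtype, and shape that match your model.

import numpy as np

from enot_lite.backend import BackendFactory
from enot_lite.type import BackendType

# Wrap the ONNX model with an automatically selected CPU backend.
backend = BackendFactory().create('path/to/model.onnx', BackendType.AUTO_CPU)

# NumPy arrays are the recommended input type for CPU backends.
# The shape below is a placeholder for a typical image model.
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

prediction = backend(dummy_input)  # Single-input call, as in the example above.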

class Backend

Interface for running inference.

All backends implemented in the ENOT Lite framework follow this interface.

__call__(inputs, outputs=None, **kwargs)

Computes predictions for the given inputs.

Parameters
  • inputs (Any) – Model input. The input can be a single value, a dictionary, or a list of values. Allowed types of values in the inputs are: numpy array, torch tensor or ORT value. It is recommended to use numpy arrays as input values for CPU backends and torch tensors on GPU (device='cuda') for GPU backends. Choosing the right device for input values prevents unnecessary copies, saves resources, and reduces inference latency (see the sketch after the examples below).

  • outputs (Optional[Union[Dict, List]]) – Preallocated model output. The output can be a dictionary or a list of values. Allowed types of values in the outputs are: numpy array, torch tensor or ORT value. This parameter is useful for GPU backends: preallocating output arrays manually saves memory and reduces allocation time. It is ignored by the OPENVINO backend.

Returns

Prediction.

Return type

Any

Examples

>>> backend(input_0)  # For models with only one input.
>>> backend([input_0, input_1, ..., input_n])  # For models with several inputs.
>>> backend((input_0, input_1, ..., input_n))  # Equivalent to the previous one.
>>> backend({
...     'input_name_0': input_0,  # Explicitly specifying mapping between
...     'input_name_1': input_1,  # input names and input data.
...     ...
...     'input_name_n': input_n,
... })
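
To make the device recommendations above concrete, here is a hedged sketch of both patterns. Only numpy inputs for CPU backends, torch CUDA tensors for GPU backends, and the outputs parameter itself come from the documentation above; the backend-type name AUTO_GPU, the output name 'output', and all shapes are assumptions for illustration.

import numpy as np
import torch

from enot_lite.backend import BackendFactory
from enot_lite.type import BackendType

# CPU backend: numpy arrays are the recommended input type.
cpu_backend = BackendFactory().create('path/to/model.onnx', BackendType.AUTO_CPU)
cpu_prediction = cpu_backend(np.random.rand(1, 3, 224, 224).astype(np.float32))

# GPU backend: torch tensors already on 'cuda' avoid an extra host-to-device copy.
# NOTE: BackendType.AUTO_GPU is an assumed name; check enot_lite.type for the
# backend types actually available in your installation.
gpu_backend = BackendFactory().create('path/to/model.onnx', BackendType.AUTO_GPU)
gpu_input = torch.rand(1, 3, 224, 224, device='cuda')

# Optionally preallocate the output buffer to save memory and allocation time.
# The output name 'output' and its shape are placeholders for your model.
preallocated = {'output': torch.empty(1, 1000, device='cuda')}
gpu_prediction = gpu_backend(gpu_input, outputs=preallocated)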