Backend module
The module provides a unified interface for running inference, together with implementations of several types of backends.
Each concrete backend wraps an execution provider or a similar technology, so no special setup is required:
create a backend instance and use it.
There are also preconfigured backends (presets), which are useful when searching for the optimal backend and its parameters.
Backend interface
Interface for running inference. All backends implemented in the ENOT Lite framework follow this basic interface.
- class Backend
Interface for running inference.
ORT Backend interface
Interface for backends based on ONNX Runtime.
- class OrtBackend(model, provider_name, provider_options=None, sess_opt=None)
ORT-based backend.
There are subclasses with different presets based on this backend:
OrtTensorrtBackend
OrtOpenvinoBackend
OrtCpuBackend
OrtCudaBackend
- __init__(model, provider_name, provider_options=None, sess_opt=None)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
provider_name (str) – Name of the ORT execution provider to use for inference.
provider_options (Optional[Dict]) – Execution provider options or None.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
- get_inputs()
Returns the model inputs.
- get_outputs()
Returns the model outputs.
- run(output_names, input_feed, run_options=None)
Computes the predictions.
- Parameters
output_names – Names of the outputs.
input_feed – Dictionary { input_name: input_value }.
run_options – Native backend options.
ORT CPU Backend
- class OrtCpuBackend(model, provider_options=None, sess_opt=None)
ORT backend with a CPU execution provider.
- __init__(model, provider_options=None, sess_opt=None)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
provider_options (Optional[Dict]) – CPU execution provider options or None.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
ORT CUDA Backend
- class OrtCudaBackend(model, provider_options=None, sess_opt=None)
ORT backend with a CUDA execution provider.
- __init__(model, provider_options=None, sess_opt=None)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
provider_options (Optional[Dict]) – CUDA execution provider options or None.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
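The `provider_options` argument is described here only as a dictionary of CUDA execution provider options. As a sketch, and assuming the dictionary is forwarded unchanged to ONNX Runtime's CUDA execution provider, it could be built as follows. `device_id` and `gpu_mem_limit` are option names of that ONNX Runtime provider; `cuda_provider_options` is a hypothetical helper, not part of ENOT Lite.

```python
def cuda_provider_options(device_id=0, gpu_mem_limit_mb=None):
    """Build a provider_options dict for a CUDA-based backend."""
    if device_id < 0:
        raise ValueError("device_id must be non-negative")
    options = {"device_id": device_id}
    if gpu_mem_limit_mb is not None:
        # ONNX Runtime expects the memory arena limit in bytes.
        options["gpu_mem_limit"] = gpu_mem_limit_mb * 1024 * 1024
    return options


options = cuda_provider_options(device_id=0, gpu_mem_limit_mb=2048)
print(options)

# Assumed usage (needs enot_lite, a CUDA-enabled ONNX Runtime build and a model file):
# from enot_lite.backend import OrtCudaBackend
# backend = OrtCudaBackend('model.onnx', provider_options=options)
```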
ORT OpenVINO Backend
- class OrtOpenvinoBackend(model, provider_options=None, sess_opt=None)
ORT backend with an OpenVINO execution provider.
There are presets based on this backend:
OrtOpenvinoFloatBackend
- __init__(model, provider_options=None, sess_opt=None)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
provider_options (Optional[Dict]) – OpenVINO execution provider options or None.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
ORT OpenVINO Float Backend
- class OrtOpenvinoFloatBackend(model, sess_opt=None)
ORT backend with an OpenVINO execution provider configured with the CPU_FP32 option.
Examples
>>> from enot_lite.backend import OrtOpenvinoFloatBackend
>>> backend = OrtOpenvinoFloatBackend('model.onnx')
>>> input_name = backend.get_inputs()[0].name
>>> backend.run(None, {input_name: sample})
- __init__(model, sess_opt=None)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
ORT TensorRT Backend
- class OrtTensorrtBackend(model, provider_options=None, sess_opt=None)
ORT backend with a TensorRT execution provider.
There are presets based on this backend:
OrtTensorrtFloatBackend
OrtTensorrtFloatOptimBackend
OrtTensorrtInt8Backend
Notes
The first launch of this backend can take a long time.
- __init__(model, provider_options=None, sess_opt=None)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
provider_options (Optional[Dict]) – TensorRT execution provider options or None.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
ORT TensorRT Float Backend
- class OrtTensorrtFloatBackend(model, sess_opt=None)
ORT backend with a TensorRT execution provider with default options.
Notes
The first launch of this backend can take a long time.
Examples
>>> from enot_lite.backend import OrtTensorrtFloatBackend
>>> backend = OrtTensorrtFloatBackend('model.onnx')
>>> input_name = backend.get_inputs()[0].name
>>> backend.run(None, {input_name: sample})
- __init__(model, sess_opt=None)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
ORT TensorRT Optimal Float Backend
- class OrtTensorrtFloatOptimBackend(model, sess_opt=None)
ORT backend with a TensorRT execution provider configured with the optimal floating-point precision.
Notes
The first launch of this backend can take a long time.
Examples
>>> from enot_lite.backend import OrtTensorrtFloatOptimBackend
>>> backend = OrtTensorrtFloatOptimBackend('model.onnx')
>>> input_name = backend.get_inputs()[0].name
>>> backend.run(None, {input_name: sample})
- __init__(model, sess_opt=None)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
ORT TensorRT Int-8 Backend
- class OrtTensorrtInt8Backend(model, calibration_table, sess_opt=None, allow_graph_optimization=False)
ORT backend with a TensorRT execution provider configured with int8 precision.
Notes
The first launch of this backend can take a long time.
Examples
>>> from enot_lite.backend import OrtTensorrtInt8Backend
>>> from enot_lite.calibration import CalibrationTableTensorrt
>>> from enot_lite.calibration import calibrate
>>> # Either load a precalculated calibration table from a file...
>>> calibration_table = CalibrationTableTensorrt.from_file_flatbuffers('table.flatbuffers')
>>> # ...or create a calibration table using a PyTorch DataLoader.
>>> calibration_table = calibrate('model.onnx', dataloader)
>>> backend = OrtTensorrtInt8Backend('model.onnx', calibration_table)
>>> input_name = backend.get_inputs()[0].name
>>> backend.run(None, {input_name: sample})
- __init__(model, calibration_table, sess_opt=None, allow_graph_optimization=False)
- Parameters
model – Filename or serialized ONNX or ORT format model in a byte string.
calibration_table (Union[CalibrationTableTensorrt, Union[str, Path]]) – Precalculated calibration table.
sess_opt (Optional[ort.SessionOptions]) – Session options or None.
allow_graph_optimization (bool) – Whether to allow graph optimization. Note that graph optimization can break quantization; enable it only if you are sure it is necessary. Optimization is applied only if sess_opt is passed.