It's possible to quantize ONNX networks to reduce their storage requirements and accelerate inference: https://www.onnxruntime.ai/docs/how-to/quantization.html
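For reference, a minimal sketch of producing such a model with ONNX Runtime's Python quantization API (file paths, the input name, and the input shape are placeholders); static quantization with `QuantFormat.QOperator` is what emits QLinearConv / QLinearMatMul nodes in the resulting graph:

```python
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader, QuantFormat, QuantType, quantize_static,
)

class RandomCalibrationReader(CalibrationDataReader):
    """Feeds a few random tensors as calibration data (real samples are better)."""
    def __init__(self, input_name="input", shape=(1, 3, 224, 224), n=8):
        self._data = iter(
            [{input_name: np.random.rand(*shape).astype(np.float32)}
             for _ in range(n)]
        )

    def get_next(self):
        return next(self._data, None)

quantize_static(
    "model_fp32.onnx",                   # placeholder: original FP32 model
    "model_int8.onnx",                   # placeholder: quantized output
    RandomCalibrationReader(),
    quant_format=QuantFormat.QOperator,  # emit QLinearConv / QLinearMatMul
    weight_type=QuantType.QInt8,
)
```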
However, OpenCV 4.x (pre-5.0) is unable to load such networks, because it lacks support for the QLinearConv and QLinearMatMul layers that they contain.
It would be nice to add support for these layers to OpenCV. By default, the weights can be converted to FP32 (or maybe FP16), but the original INT8 weights should be preserved as well, since we will be adding fixed-point paths to our implementations of the convolution and fully-connected layers.
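For context, the ONNX QLinear ops use the affine quantization mapping `real = scale * (quantized - zero_point)` (as in DequantizeLinear). A minimal sketch of converting stored INT8 weights back to FP32 (the function name is illustrative, not an OpenCV API):

```python
import numpy as np

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Affine dequantization per the ONNX DequantizeLinear definition:
    real = scale * (quantized - zero_point)."""
    # Widen to int32 first so the subtraction cannot overflow int8.
    return (scale * (q.astype(np.int32) - zero_point)).astype(np.float32)

# Example: an INT8 weight tensor with scale 0.02 and zero point 0
w_int8 = np.array([[-128, 0, 127]], dtype=np.int8)
w_fp32 = dequantize(w_int8, scale=0.02, zero_point=0)
```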
For testing, here is the original ONNX model:
https://drive.google.com/file/d/1JW6_zrgzjeSZQcKKEDhTvp3aseNu0pe9/view?usp=sharing
and its quantized variant:
https://drive.google.com/file/d/1RHkF8pGMfo0covNR0_GQhB11JvrzogFO/view?usp=sharing
(provided by @SamFC10)
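A minimal repro sketch, assuming the two models above have been downloaded locally (file names are placeholders):

```python
import cv2

# The original FP32 model loads fine...
net = cv2.dnn.readNetFromONNX("model_fp32.onnx")

# ...but the quantized variant currently fails because the importer
# does not recognize the QLinearConv / QLinearMatMul nodes:
try:
    qnet = cv2.dnn.readNetFromONNX("model_int8.onnx")
except cv2.error as e:
    print("Failed to load quantized model:", e)
```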