[GSoC] Add block quantized models #270
Conversation
- constant weight category supported
- add data type saturation
- handled the case in which all the elements within a block are the same
- benchmark script modified to support block quantized models
- block quantized some models
- …dpose blocked model fix, removed blocked CRNN EN
vpisarev commented Oct 25, 2024
@fengyuentau, when is it expected to be merged? I believe the patch is very useful.
fengyuentau commented Oct 25, 2024
We can, but since block quantization in OpenCV does not yet provide any acceleration in terms of inference speed, merging this patch only increases the size of the zoo, and we gain no benefit from it for now. Merging is postponed until block quantization works better in OpenCV in practice.
vpisarev commented Oct 25, 2024
I don't quite get it. We don't see any performance degradation with block-wise quantized models, and the models get smaller, not bigger. That is, we get a smaller size at the same speed. But if you are still in doubt, let's discuss it next week at our meeting; please include it in the agenda.
fengyuentau commented Oct 25, 2024
Okay, let's do it in the next meeting.
DaniAffCH commented Oct 25, 2024 • edited
Just to add my two cents: despite not achieving an inference speed improvement, this PR significantly reduces the network size while retaining the original accuracy. Further optimization could be done in the future, including adapting block-wise quantization to the new inference engine.
fengyuentau commented Oct 30, 2024
@DaniAffCH Thank you for all the effort on this pull request. We decided to merge this pull request with the following changes:
fengyuentau commented Oct 30, 2024
Also, please provide each command that you used to generate the block-quantized models, just to ensure reproducibility.
fengyuentau commented Nov 6, 2024
@DaniAffCH Do you plan to push commits to finalize this PR? If not, I will merge this one first and then make the changes in subsequent PRs.
DaniAffCH commented Nov 6, 2024
Yes, I'll definitely address your comments. I've been busy with other projects, but now I can finalize this PR.
DaniAffCH commented Nov 6, 2024
I've just updated the file suffixes and the related README files.
Regarding text detection, I couldn't find a suitable evaluation script in `eval`, so I don't know how to test it. Regarding text recognition, I decided not to include the English version because of a severe drop in accuracy. The drop doesn't occur in the Chinese version, so I decided to include only that one.
DaniAffCH commented Nov 6, 2024
All the models have been block-quantized using the block quantization script with the following command: `python block_quantize.py --input_model INPUT_PATH --block_size 64`
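For reproducibility, here is a minimal sketch of how that command could be run over the whole zoo. The `models/` directory layout, the script path, and the "int8"-in-filename skip filter are assumptions, not part of this PR:

```python
import subprocess
from pathlib import Path

# Assumption: fp32 models are .onnx files under models/ and already-quantized
# artifacts contain "int8" in their file name.
for model in list(Path("models").rglob("*.onnx")):
    if "int8" in model.name:
        continue  # skip models that are already quantized
    subprocess.run(
        ["python", "tools/quantize/block_quantize.py",
         "--input_model", str(model), "--block_size", "64"],
        check=True,
    )
```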
fengyuentau left a comment
Generally looks good to me. It is suggested to finish all the renamings from bint8 to int8bq (see the sketch after this comment).
Also add the following content to the section "Blockwise quantization usage" in `tools/quantize/README.md`:
> Block-quantized models under each model directory are generated with `--block_size=64`
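The suffix renaming suggested above could be scripted along these lines. This is a sketch only; the `models/` directory layout is an assumption:

```python
from pathlib import Path

# Rename every file carrying the old "bint8" suffix to the agreed "int8bq".
# Materialize the generator first so renaming doesn't disturb the traversal.
for path in list(Path("models").rglob("*bint8*")):
    new_name = path.name.replace("bint8", "int8bq")
    path.rename(path.with_name(new_name))
    print(f"renamed {path.name} -> {new_name}")
```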
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
DaniAffCH commented Nov 6, 2024
Done!
fengyuentau left a comment
Great! Thank you! 👍
This PR introduces block-quantized versions for most of the opencv_zoo models.
All the models have been quantized using a block size of 64, as this configuration demonstrated good performance empirically.
Additionally, the block quantization tool has been enhanced to handle more cases (an illustrative sketch follows this list):
- the `Constant` weight category in ONNX (previously, it only supported `initializers`)
- data type saturation
- the case in which all the elements within a block are the same

Finally, the benchmark tool has been modified to support block-quantized models.
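For illustration, here is a minimal sketch of per-block asymmetric int8 quantization covering two of the edge cases above. It is not the actual `block_quantize.py` implementation; the function name and fallback logic are hypothetical:

```python
import numpy as np

def quantize_block(block: np.ndarray, bits: int = 8):
    """Illustrative per-block asymmetric quantization (hypothetical helper)."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1  # int8: -128..127
    lo, hi = float(block.min()), float(block.max())
    if hi == lo:
        # Edge case: all elements in the block are identical. A zero range
        # would give a degenerate scale, so fall back to scale = 1.
        scale = 1.0
    else:
        scale = (hi - lo) / (qmax - qmin)
    zero_point = int(np.clip(round(qmin - lo / scale), qmin, qmax))
    q = np.round(block / scale + zero_point)
    # Data type saturation: clip to the int8 range before casting so that
    # out-of-range values saturate instead of wrapping around.
    q = np.clip(q, qmin, qmax).astype(np.int8)
    return q, scale, zero_point
```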
The following table contains block quantization statistics for the quantized models:
The tables below summarize the metric changes between the original fp32 models and their block-quantized and int8-quantized versions:
The following models haven't been quantized: