[GSoC] Add block quantized models#270


Merged

fengyuentau merged 14 commits into opencv:main from DaniAffCH:main on Nov 6, 2024

Conversation

@DaniAffCH (Contributor) commented Aug 15, 2024 (edited)

This PR introduces block-quantized versions for most of the opencv_zoo models.
All the models have been quantized using a block size of 64, as this configuration demonstrated good performance empirically.

Additionally, the block quantization tool has been enhanced to handle more cases:

  • The tool now supports weights categorized as constant in ONNX (previously, it only supported initializers).
  • It now properly handles blocks where all elements are identical.
  • Data type saturation has been added to prevent overflow during quantization.
  • The error metric has been updated to be independent of the weights' order of magnitude.
  • A new verbose mode has been added to assist in troubleshooting.

Finally, the benchmark tool has been modified to support block quantized models.
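For intuition, the per-block scheme these changes refine can be sketched as follows. This is a hypothetical illustration, not the actual `block_quantize.py` implementation; all names are assumptions. It shows symmetric int8 quantization with one scale per block, saturation before the int8 cast, a guard for degenerate (all-zero) blocks, and a magnitude-independent relative error metric:

```python
import numpy as np

# Illustrative sketch only: the real tool lives in tools/quantize/ and
# operates on ONNX initializers/constants rather than raw arrays.

def block_quantize(weights, block_size=64):
    """Quantize a float32 array to int8 in fixed-size blocks."""
    flat = weights.astype(np.float32).ravel()
    pad = (-flat.size) % block_size
    flat = np.pad(flat, (0, pad))          # pad to a multiple of block_size
    blocks = flat.reshape(-1, block_size)

    scales = np.ones(blocks.shape[0], dtype=np.float32)
    quantized = np.zeros(blocks.shape, dtype=np.int8)
    for i, block in enumerate(blocks):
        amax = np.abs(block).max()
        if amax == 0.0:                    # degenerate constant block: skip the
            continue                       # division, keep scale=1 and zeros
        scales[i] = amax / 127.0
        # saturate before casting so rounding can never overflow int8
        quantized[i] = np.clip(np.round(block / scales[i]), -128, 127).astype(np.int8)
    return quantized, scales

def block_dequantize(quantized, scales):
    """Reconstruct float32 values from int8 blocks and per-block scales."""
    return (quantized.astype(np.float32) * scales[:, None]).ravel()

def relative_error(original, reconstructed):
    # Normalized by the weights' magnitude, so the metric stays comparable
    # across layers with very different orders of magnitude.
    return np.linalg.norm(original - reconstructed) / max(np.linalg.norm(original), 1e-12)
```

With a block size of 64 the reconstruction error per element is bounded by half a quantization step of that block's scale, which is why accuracy stays close to the fp32 baseline in the tables below.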

The following table contains block quantization statistics for the quantized models:

| Model | Original Size (KB) | Block Quantized Size (KB) |
| --- | --- | --- |
| face_detection_yunet | 227.14 | 119.62 |
| face_recognition_sface | 37,789.41 | 10,417.82 |
| facial_expression_recognition_mobilefacenet | 4,679.58 | 1,344.44 |
| handpose_estimation_mediapipe | 4,003.54 | 1,199.43 |
| human_segmentation_pphumanseg | 6,019.47 | 1,694.07 |
| image_classification_mobilenetv1 | 16,494.27 | 4,491.59 |
| image_classification_mobilenetv2 | 13,637.28 | 3,782.18 |
| image_classification_ppresnet50 | 100,163.12 | 27,435.20 |
| license_plate_detection_lpd_yunet | 4,049.04 | 1,158.07 |
| object_detection_nanodet | 3,711.87 | 1,097.62 |
| object_detection_yolox | 35,017.58 | 9,516.03 |
| object_tracking_vittrack | 697.97 | 264.97 |
| optical_flow_estimation_raft | 62,616.54 | 47,700.30 |
| palm_detection_mediapipe | 3,814.19 | 1,141.94 |
| person_detection_mediapipe | 11,709.14 | 3,400.44 |
| person_reid_youtu | 104,373.44 | 28,518.79 |
| pose_estimation_mediapipe | 5,426.99 | 1,655.17 |
| text_detection_en_ppocrv3 | 2,366.69 | 835.33 |
| text_detection_cn_ppocrv3 | 2,366.69 | 835.33 |
| text_recognition_CRNN_CN | 71,100.74 | 28,346.08 |
| text_recognition_CRNN_CH | 63,385.71 | 26,257.37 |

The tables below summarize the metric changes between the original fp32 models and their block-quantized and int8-quantized versions:

| Models | Accuracy |
| --- | --- |
| SFace | 0.9940 |
| SFace block | 0.9942 |
| SFace quant | 0.9932 |

| Models | Easy AP | Medium AP | Hard AP |
| --- | --- | --- | --- |
| YuNet | 0.8844 | 0.8656 | 0.7503 |
| YuNet block | 0.8845 | 0.8652 | 0.7504 |
| YuNet quant | 0.8810 | 0.8629 | 0.7503 |

| Models | Accuracy | mIoU |
| --- | --- | --- |
| PPHumanSeg | 0.9656 | 0.9164 |
| PPHumanSeg block | 0.9655 | 0.9162 |
| PPHumanSeg quant | 0.7285 | 0.3642 |

| Models | Top-1 Accuracy | Top-5 Accuracy |
| --- | --- | --- |
| MobileNet V1 | 67.64 | 87.97 |
| MobileNet V1 block | 67.21 | 87.62 |
| MobileNet V1 quant | 55.53 | 78.74 |
| MobileNet V2 | 69.44 | 89.23 |
| MobileNet V2 block | 68.66 | 88.90 |
| MobileNet V2 quant | 68.37 | 88.56 |

| Models | Top-1 Accuracy | Top-5 Accuracy |
| --- | --- | --- |
| PP-ResNet | 82.28 | 96.15 |
| PP-ResNet block | 82.27 | 96.15 |
| PP-ResNet quant | 0.22 | 0.96 |

The following models haven't been quantized:

  • image_segmentation_efficientsam: Not compliant with the ONNX standard (Efficient SAM is not compliant with onnx standard #269).
  • text_recognition_CRNN_EN: Even a minor error prevents the model from correctly predicting the characters, despite accurately predicting the text bounding box. The reason for this issue remains unclear, as the Chinese version of the model works properly.

Commits included:

  • constant weight category supported
  • add data type saturation
  • handled the case in which all the elements within a block are the same
  • benchmark script modified to support block quantized models
  • block quantized some models
@DaniAffCH changed the title from "Add block quantized models" to "[GSoC] Add block quantized models" on Aug 15, 2024
@fengyuentau self-requested a review on August 16, 2024
@fengyuentau self-assigned this on Aug 16, 2024
@fengyuentau added the GSoC (Google Summer of Code project related) label on Aug 16, 2024
@DaniAffCH marked this pull request as ready for review on August 18, 2024
@vpisarev (Collaborator)

@fengyuentau, when is it expected to be merged? I believe the patch is very useful.


@fengyuentau (Member)

> @fengyuentau, when is it expected to be merged? I believe the patch is very useful.

We can, but since block quantization in OpenCV does not provide any acceleration in terms of inference speed, merging this patch only increases the size of the zoo, and we gain no benefit from it for now. Merging is postponed until block quantization works practically better in OpenCV.


@vpisarev (Collaborator)

I don't quite get it. We don't see any performance degradation with block-wise quantized models, and the models get smaller, not bigger. That is, we get smaller size at the same speed. But if you are still in doubt, let's discuss it next week at our meeting; please include it in the agenda.


@fengyuentau (Member)

> I don't quite get it. We don't see any performance degradation with block-wise quantized models, and the models get smaller, not bigger. That is, we get smaller size at the same speed. But if you are still in doubt, let's discuss it next week at our meeting; please include it in the agenda.

Okay, let's do it in the next meeting.

@DaniAffCH (Contributor, Author) commented Oct 25, 2024 (edited)

Just to add my two cents: despite not achieving an inference speed improvement, this PR significantly reduces the network size while retaining the original accuracy.

Further optimization could be done in the future, including adapting block-wise quantization to the new inference engine.


@fengyuentau (Member)

@DaniAffCH Thank you for all the effort on this pull request. We decided to merge this pull request with the following changes:

  1. Could you report the accuracy of text detection and recognition as well? Even though they may not have good numbers, we still want to know, since they are part of this pull request.
  2. Change the model naming suffix from `_blocked` to `_int8bq`, and add a note in each README explaining that models with the suffix `_int8bq` are block-quantized in int8 precision.

@fengyuentau (Member)

Also, please provide each command that you used to generate the block-quantized models, just to ensure reproducibility.


@fengyuentau (Member)

@DaniAffCH Do you plan to push commits to finalize this PR? If not, I will merge this one first and then do it in subsequent PRs.

@DaniAffCH (Contributor, Author)

Yes I'll definitely address your comments. I've been busy with other projects, but now I can finalize this PR.


@DaniAffCH (Contributor, Author)

I've just updated the file suffixes and the related README files.

> Could you report the accuracy of text detection and recognition as well? Even though they may not have good numbers, we still want to know, since they are part of this pull request.

Regarding text detection, I couldn't find a suitable evaluation script in `eval`, so I don't know how to test it.

Regarding text recognition, I decided not to include the English version because of a severe drop in accuracy. Such a drop doesn't occur in the Chinese version, so I decided to include only text_recognition_CRNN_CH and text_recognition_CRNN_CN.
However, the evaluation script only supports IIIT5K and ICDAR2003 datasets containing English words, so it's not possible to test the Chinese models on them.


@DaniAffCH (Contributor, Author)

All the models have been block-quantized using the block quantization script with the following command:

```shell
python block_quantize.py --input_model INPUT_PATH --block_size 64
```
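As a rough cross-check of the size table earlier in the thread, the expected compression ratio for this scheme can be computed by hand. This is an illustrative back-of-the-envelope sketch (assuming one fp32 scale per 64-element block and no layers kept in full precision), not part of the actual tooling:

```python
def expected_ratio(block_size=64, weight_bytes=1, scale_bytes=4, fp32_bytes=4):
    """Expected (block-quantized size) / (original size): each fp32 weight
    (4 bytes) becomes one int8 value (1 byte), and every block of
    `block_size` weights shares one fp32 scale (4 bytes)."""
    return (weight_bytes + scale_bytes / block_size) / fp32_bytes

print(expected_ratio())  # (1 + 4/64) / 4 = 0.265625, roughly a 3.8x reduction
```

Most entries in the size table sit close to this bound (e.g. person_reid_youtu: 28,518.79 / 104,373.44 ≈ 0.27); models further from it, such as optical_flow_estimation_raft, presumably keep a larger share of weights or metadata in full precision.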

@fengyuentau (Member) left a comment

Generally looks good to me. It is suggested to finish all renamings from bint8 to int8bq.


Also add the following content to the section "Blockwise quantization usage" in tools/quantize/README.md:

> Block-quantized models under each model directory are generated with `--block_size=64`

@fengyuentau added the quantization (Anything related to model quantization) label on Nov 6, 2024
@fengyuentau added this to the 5.0 milestone on Nov 6, 2024
@DaniAffCH (Contributor, Author)

Done!

@fengyuentau (Member) left a comment

Great! Thank you! 👍

@fengyuentau merged commit 25f423d into opencv:main on Nov 6, 2024

Reviewers

@fengyuentau (approved these changes)

Assignees

@fengyuentau

Labels

GSoC (Google Summer of Code project related), quantization (Anything related to model quantization)

Milestone

5.0

Participants

@DaniAffCH, @vpisarev, @fengyuentau
