8-bit quantization in dnn module and int8 layers #20228


Merged
opencv-pushbot merged 1 commit into opencv:master from jebastin-nadar:int8 on Aug 19, 2021

Conversation

@jebastin-nadar (Contributor) commented Jun 7, 2021 (edited)

PR for the GSoC'21 project on quantization in the DNN module. This PR adds functions to quantize FP32 models, int8 versions of several layers, and tests for the new layers.

| Layer | Status | Remarks |
| --- | --- | --- |
| Convolution | ✔️ | Variable weights unsupported |
| Inner Product | ✔️ | Variable weights unsupported |
| Pooling | ✔️ | Only Max and Average pooling |
| Padding | ✔️ | |
| Flatten | ✔️ | |
| Activations | ✔️ | |
| Concat | ✔️ | |
| Eltwise | ✔️ | Eltwise division unsupported |
| BatchNorm, Scale, Shift | ✔️ | |
| Data Permutation layers | ✔️ | |

A second PR is planned later this summer to load 8-bit quantized models from other frameworks (ONNX/TensorFlow) and perform inference using int8 layers and weights, without converting them to FP32 (as is done currently).
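For orientation, here is a minimal sketch of how the new entry point can be invoked, assuming the `Net::quantize(calibData, inputsDtype, outputsDtype)` interface this PR introduces; the model path and calibration shapes are placeholders:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/dnn.hpp>
#include <vector>

int main()
{
    // Load a regular FP32 model (the path is a placeholder).
    cv::dnn::Net net = cv::dnn::readNet("model.onnx");

    // A few representative inputs, used to calibrate per-tensor
    // scales and zero points during quantization.
    std::vector<cv::Mat> calibData;
    calibData.push_back(cv::Mat(224, 224, CV_32FC3, cv::Scalar::all(0.5f)));

    // Produce an int8 version of the net. Keeping inputs/outputs as
    // CV_32F means the caller still feeds and reads float blobs.
    cv::dnn::Net qnet = net.quantize(calibData, CV_32F, CV_32F);
    return 0;
}
```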

mentor: @vpisarev
relates: #16633, #20188

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There are accuracy tests, performance tests and test data in the opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@jebastin-nadar (Contributor, Author)

@vpisarev MaxPool int8 tests pass now. Tests with max pooling as the last layer are commented out, since such layers also compute max indices, which the int8 version does not support.

Although I still don't get why max pooling as the last layer should compute the max indices by default:

int numOutputs = requiredOutputs ? requiredOutputs : (type == MAX ? 2 : 1);
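(i.e. with requiredOutputs == 0, a terminal MAX pooling layer defaults to two outputs: the pooled values plus the max-element indices.)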

@vpisarev (Contributor) commented Jun 10, 2021 (edited)

> @vpisarev MaxPool int8 tests pass now. Tests with max pooling as the last layer are commented out, since such layers also compute max indices, which the int8 version does not support.
>
> Although I still don't get why max pooling as the last layer should compute the max indices by default:
>
> int numOutputs = requiredOutputs ? requiredOutputs : (type == MAX ? 2 : 1);

I actually suggest supporting the computation of indices (in FP32, as before) together with computing the max value in INT8. Yes, it will be slower, but it will let us provide 100% compatibility.
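As a rough illustration of that suggestion (not the PR's actual kernel; all names here are hypothetical), the int8 comparison can drive both outputs at once:

```cpp
#include <cstdint>

// Hypothetical inner loop for one pooling window: produces both the
// quantized max value (int8) and its flattened index, stored as float
// to match the FP32 indices output of the regular MaxPool layer.
static void maxPoolWindowInt8(const int8_t* src, const int* offsets, int n,
                              int8_t* dstVal, float* dstIdx)
{
    int8_t best = src[offsets[0]];
    int bestIdx = offsets[0];
    for (int i = 1; i < n; i++)
    {
        if (src[offsets[i]] > best)
        {
            best = src[offsets[i]];
            bestIdx = offsets[i];
        }
    }
    *dstVal = best;            // int8 output blob
    *dstIdx = (float)bestIdx;  // FP32 indices, as before
}
```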

@jebastin-nadar (Contributor, Author)

> support computing indices (in FP32, as before) together with computing the max value in INT8

This is not possible right now, as a single variable determines the datatype of all the outputs of a layer, so both outputs[0] and outputs[1] can be either CV_32F or CV_8S. One as CV_32F and the other as CV_8S is currently not possible:

int dtype;  // Datatype of output blobs.

dst.create(shape, dtype);

Of course, changing `dtype` to `std::vector<int>` would solve it, but that would introduce a lot of complexity in allocating blobs and a lot of work for a rarely used feature. Maybe we can keep it as low priority and look at it later.
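For illustration only, the deferred change would amount to something like the following (a hypothetical structure, not the actual Net internals):

```cpp
#include <opencv2/core.hpp>
#include <vector>

// Hypothetical per-output replacement for the single `int dtype` field:
// each output blob carries its own depth, so outputs[0] could be CV_8S
// while outputs[1] stays CV_32F.
struct OutputAllocSketch
{
    std::vector<int> dtypes;  // one entry per output blob

    void allocateOutputs(const std::vector<std::vector<int> >& shapes,
                         std::vector<cv::Mat>& outputs) const
    {
        outputs.resize(shapes.size());
        for (size_t i = 0; i < shapes.size(); i++)
            outputs[i].create((int)shapes[i].size(), shapes[i].data(), dtypes[i]);
    }
};
```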

@jebastin-nadar (Contributor, Author)

Some build issues:

  1. The AVX-512 path in int8layers/layers_common.simd.hpp causes build warnings and convolution test failures (segmentation fault). The same tests pass locally and on some other builders, so I suspect there is an issue with that specific path. Also, I cannot reproduce this locally, as my CPU only supports up to AVX2.
  2. DNN test failures on the OpenCL builders. From what I remember, I haven't modified any OpenCL-related code, so I don't know what's causing the failures.

@vpisarev (Contributor) commented Jun 12, 2021 (edited)

> Of course, changing `dtype` to `std::vector<int>` would solve it, but that would introduce a lot of complexity in allocating blobs and a lot of work for a rarely used feature. Maybe we can keep it as low priority and look at it later.

OK, sounds good to me. Let's keep it as a low-priority item.

@vpisarev (Contributor)

> Some build issues:
>
> 1. The AVX-512 path in int8layers/layers_common.simd.hpp causes build warnings and convolution test failures (segmentation fault). The same tests pass locally and on some other builders, so I suspect there is an issue with that specific path. Also, I cannot reproduce this locally, as my CPU only supports up to AVX2.

I suggest commenting out the AVX-512 branches for now (in your newly added code, not everywhere).

> 2. DNN test failures on the OpenCL builders. From what I remember, I haven't modified any OpenCL-related code, so I don't know what's causing the failures.

Well, you need to figure that out. If you get stuck on it, we can look at it together.
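For reference, a schematic of the usual OpenCV SIMD-dispatch shape, showing how only the new AVX-512 call site can be commented out while the fallback keeps working; the guard macro and function names are illustrative, not the PR's exact code:

```cpp
#include <cstdint>

// The scalar fallback stays untouched.
static void convRowFallback(const int8_t* a, const int8_t* b, int* acc, int n)
{
    for (int i = 0; i < n; i++)
        acc[i] += (int)a[i] * (int)b[i];
}

static void convRow(const int8_t* a, const int8_t* b, int* acc, int n)
{
#ifdef CV_TRY_AVX512_SKX
    // AVX-512 branch disabled while the builder segfault is investigated:
    // if (useAVX512)
    //     { opt_AVX512_SKX::convRow(a, b, acc, n); return; }
#endif
    convRowFallback(a, b, acc, n);
}
```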

@jebastin-nadar (Contributor, Author)

@alalek @vpisarev How do I ensure that only dnn module tests are run on the CI Linux OpenCL builder? I edited my original comment, but it looks like tests for all modules are being checked.

@jebastin-nadar force-pushed the int8 branch 9 times, most recently from 28d7c78 to fc8350d, on June 29, 2021 03:34
@jebastin-nadar force-pushed the int8 branch 2 times, most recently from 1afb5b9 to 7b5a392, on July 7, 2021 13:43
@jebastin-nadar (Contributor, Author)

@vpisarev As discussed, the int8 layers that had mostly duplicated code (concat, flatten, padding) have been removed, and the original FP32 layers have been modified to support 8-bit inputs as well.

In some parallel_for() bodies, I have used templates to support multiple datatypes; please check the latest commit to see if any changes have to be made.
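Roughly the pattern in question, as a self-contained sketch (the invoker name and the op are illustrative, not the PR's exact code): one templated ParallelLoopBody is instantiated for both CV_32F and CV_8S inputs, and the dispatch on Mat depth picks the instantiation.

```cpp
#include <opencv2/core.hpp>
#include <algorithm>
#include <cstdint>

// Illustrative templated invoker: one body handles both float and int8
// rows. (A real int8 ReLU would clamp against the layer's zero point;
// for simplicity this sketch clamps at 0.)
template<typename T>
class ReLUInvoker : public cv::ParallelLoopBody
{
public:
    ReLUInvoker(const cv::Mat& src, cv::Mat& dst) : src_(src), dst_(dst) {}
    void operator()(const cv::Range& r) const CV_OVERRIDE
    {
        for (int i = r.start; i < r.end; i++)
        {
            const T* s = src_.ptr<T>(i);
            T* d = dst_.ptr<T>(i);
            for (int j = 0; j < src_.cols; j++)
                d[j] = std::max(s[j], (T)0);
        }
    }
private:
    const cv::Mat& src_;
    cv::Mat& dst_;
};

static void reluForward(const cv::Mat& src, cv::Mat& dst)
{
    dst.create(src.size(), src.type());
    if (src.depth() == CV_8S)
        cv::parallel_for_(cv::Range(0, src.rows), ReLUInvoker<int8_t>(src, dst));
    else
        cv::parallel_for_(cv::Range(0, src.rows), ReLUInvoker<float>(src, dst));
}
```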

@jebastin-nadar changed the title from "WIP : 8-bit quantization in dnn module and int8 layers" to "8-bit quantization in dnn module and int8 layers" on Jul 12, 2021
@jebastin-nadar marked this pull request as ready for review on July 12, 2021 07:10
@vpisarev (Contributor)

@SamFC10, could you please fix the merge conflicts once again? And then squash commits? We will try to merge your pull request quickly.

@jebastin-nadar (Contributor, Author)

> And then squash commits

Looks like I messed up something.
Commands used:

git reset --soft HEAD~38
git commit -m ""
git push -f origin int8

@alalek (Member)

You can roll back the changes:

git checkout -B int8 79eca09675fecd2b58c237233a0aaaa7197ace6f

@jebastin-nadar (Contributor, Author)

Managed to restore my commits and squashed them. Thanks for the help @alalek @vpisarev

@vpisarev self-requested a review on August 19, 2021 08:08
@vpisarev (Contributor)

👍


@opencv-pushbot merged commit f787c49 into opencv:master on Aug 19, 2021
asmorkalov pushed a commit that referenced this pull request on Feb 16, 2024:

> dnn cleanup: on-fly-quantization removal #2498
>
> On-fly-quantization was first introduced via #20228. We decided to remove it but keep the int8 layers implementation, because on-fly-quantization is less practical given that there are now so many dedicated tools for model quantization.

Reviewers

@vpisarev approved these changes

Assignees

@vpisarev

Projects

None yet

Milestone

4.5.4


5 participants

@jebastin-nadar @vpisarev @alalek @opencv-pushbot @asmorkalov
