8-bit quantization in dnn module and int8 layers #20228
Conversation
jebastin-nadar commented Jun 9, 2021
@vpisarev MaxPool int8 tests pass now. Tests with maxpooling as the last layer are commented out, as they compute max indices, which is not supported in the int8 version. Although I still don't get why maxpooling as the last layer should compute the max index as well by default.
vpisarev commented Jun 10, 2021 • edited
I actually suggest supporting the computation of indices (in FP32, as before) together with computing the max value in INT8. Yes, it will be slower, but it will let us provide 100% compatibility.
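A minimal NumPy sketch of that hybrid scheme (a hypothetical helper, not the OpenCV implementation): max values stay int8 while the argmax indices are emitted as float32, mirroring the FP32 index output of the original layer.

```python
import numpy as np

def maxpool_int8_with_indices(x, ksize=2, stride=2):
    """2D max pooling on an int8 map. Returns the int8 max values and,
    separately, the flat indices of the maxima as float32."""
    h, w = x.shape
    oh, ow = (h - ksize) // stride + 1, (w - ksize) // stride + 1
    vals = np.empty((oh, ow), dtype=np.int8)
    idxs = np.empty((oh, ow), dtype=np.float32)
    for i in range(oh):
        for j in range(ow):
            y0, x0 = i * stride, j * stride
            win = x[y0:y0 + ksize, x0:x0 + ksize]
            k = int(np.argmax(win))  # window-local flat argmax
            vals[i, j] = win.flat[k]
            # convert the window-local argmax to a flat index in the input map
            idxs[i, j] = (y0 + k // ksize) * w + (x0 + k % ksize)
    return vals, idxs

x = np.array([[1, 3, 2, 0],
              [5, 4, 1, 7],
              [0, 2, 9, 6],
              [8, 1, 3, 4]], dtype=np.int8)
vals, idxs = maxpool_int8_with_indices(x)
```

The point of the sketch is only that the two outputs can legitimately carry different element types, which is exactly where the dtype discussion below comes in.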
jebastin-nadar commented Jun 11, 2021
This is not possible right now, as a single variable determines the datatype of all the outputs of a layer. So outputs[0] and outputs[1] can both be either CV_32F or CV_8S; one as CV_32F and the other as CV_8S is not possible currently.

opencv/modules/dnn/src/dnn.cpp, line 582 in c2c67c2
opencv/modules/dnn/src/dnn.cpp, line 980 in c2c67c2
Of course, changing "dtype" to a std::vector would solve it, but that would introduce a lot of complexity in allocating blobs and a lot of work for a feature which is rarely used. Maybe we can keep it as low priority and look at it later.
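To make the trade-off concrete, here is a toy sketch (hypothetical names, loosely mirroring the layer records in dnn.cpp) of the single-dtype scheme versus the per-output vector being discussed:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LayerDataSingle:
    # current scheme (simplified): one dtype field that every output shares
    dtype: str = "CV_8S"

@dataclass
class LayerDataPerOutput:
    # the std::vector-based alternative: one dtype per output blob,
    # so outputs[0] could be CV_8S while outputs[1] stays CV_32F
    dtypes: List[str] = field(default_factory=list)

# a maxpool layer emitting int8 values plus FP32 indices would need:
pool = LayerDataPerOutput(dtypes=["CV_8S", "CV_32F"])
```

Blob allocation would then have to consult `dtypes[i]` for each output instead of one shared field, which is the complexity the comment refers to.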
jebastin-nadar commented Jun 11, 2021
Some build issues:
vpisarev commented Jun 12, 2021 • edited
OK, sounds good to me. Let's keep it as a low-priority item.
vpisarev commented Jun 12, 2021
I suggest commenting out the AVX-512 branches for now (in your newly added code, not everywhere).

Well, you need to figure that out. If you get stuck on that, we can look at it together.
Force-pushed from f11678f to b36cf00 (Compare)

jebastin-nadar commented Jun 22, 2021
Force-pushed from 28d7c78 to fc8350d (Compare)
Force-pushed from 1afb5b9 to 7b5a392 (Compare)

jebastin-nadar commented Jul 7, 2021
@vpisarev As discussed, the int8 layers which had mostly duplicated code (concat, flatten, padding) have been removed, and the original fp32 layers were modified to support 8-bit inputs as well. In some parallel_for() bodies I have used templates to support multiple datatypes; please check the latest commit to see if any changes have to be made.
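The dtype-generic kernel idea can be sketched outside C++ as well. In the PR this is a `template<typename T>` functor handed to `parallel_for_`; the NumPy analogue below (an illustrative sketch, not the OpenCV code) relies on dtype dispatch so one body serves both the FP32 and the int8 path:

```python
import numpy as np

def relu_kernel(src: np.ndarray, zero) -> np.ndarray:
    """Dtype-generic ReLU body: clamps every element to `zero` from below.
    NumPy dispatches on src.dtype, so the same body handles both paths."""
    return np.maximum(src, zero)

# the FP32 path clamps at 0.0 ...
fp32 = relu_kernel(np.array([-1.5, 0.5, 2.0], dtype=np.float32), np.float32(0))
# ... while the int8 path clamps at the (assumed) quantized zero point
int8 = relu_kernel(np.array([-20, 3, 100], dtype=np.int8), np.int8(-5))
```

In C++ the compiler instantiates one copy of the functor per element type, so the fp32 and int8 layers share a single maintained implementation.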
Force-pushed from 1ff055e to c8a294b (Compare)
Force-pushed from 80998b3 to 6f0162c (Compare)

vpisarev commented Aug 18, 2021
@SamFC10, could you please fix the merge conflicts once again? And then squash the commits? We will try to merge your pull request quickly.
jebastin-nadar commented Aug 18, 2021
Looks like I messed up something.
alalek commented Aug 18, 2021
You can roll back the changes:
jebastin-nadar commented Aug 19, 2021
vpisarev commented Aug 19, 2021
👍
dnn cleanup: On-fly-quantization removal #2498

On-fly-quantization was first introduced via #20228. We decided to remove it but keep the int8 layers implementation, because on-fly-quantization is less practical given that there are so many dedicated tools for model quantization.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
PR for GSoC'21 project on quantization in DNN module. This PR adds functions to quantize FP32 models and int8 versions of some layers along with tests for the new layers.
A second PR is planned later this summer to load 8-bit quantized models from other frameworks (ONNX/Tensorflow) and perform inference using int8 layers and weights without converting them to FP32 (as done currently).
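For readers unfamiliar with 8-bit quantization, the core transform behind "functions to quantize FP32 models" is the standard affine scheme (sketched below with assumed example scale/zero-point values, not ones taken from this PR):

```python
import numpy as np

def quantize_int8(x, scale, zero_point):
    """Affine quantization: q = clip(round(x / scale) + zero_point, -128, 127),
    mapping a float tensor onto signed 8-bit integers."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale, zero_point):
    """Inverse map back to float: x ≈ (q - zero_point) * scale."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
scale, zp = 0.02, 10          # example calibration parameters (assumed)
q = quantize_int8(x, scale, zp)
x_hat = dequantize(q, scale, zp)
```

The scale and zero point per tensor are what calibration estimates; int8 layers then operate on `q` directly instead of round-tripping through FP32.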
Mentor: @vpisarev
Relates: #16633, #20188
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.