Added flag to GaussianBlur for faster but not bit-exact implementation #25792

Merged

asmorkalov merged 8 commits into opencv:4.x from asmorkalov:as/HAL_fast_GaussianBlur

Jul 12, 2024

Conversation

Contributor

asmorkalov commented Jun 20, 2024 (edited)

Rationale:
The current implementation of GaussianBlur is almost always bit-exact. This helps to get predictable results across platforms, but it prohibits most approximations and optimization tricks.

The patch converts the `borderType` parameter to more generic `flags` and introduces the `GAUSS_ALLOW_APPROXIMATIONS` flag to allow a non-bit-exact implementation. With the flag set, the IPP and generic HAL implementations are called first. The flag naming and location are subject to discussion.

Replaces #22073
Possibly related issue: #24135

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

asmorkalov added optimization, category: imgproc, RFC labels

Jun 20, 2024

asmorkalov added this to the 4.11.0 milestone

Jun 20, 2024

asmorkalov requested review from mshabunin, opencv-alalek and vpisarev

June 20, 2024 09:25

asmorkalov force-pushed the as/HAL_fast_GaussianBlur branch 2 times, most recently from 1c62df7 to b5df286

June 20, 2024 10:16

asmorkalov added the pr: Discussion Required label

Jun 20, 2024

asmorkalov changed the title from ~~Added flag to GaussianBlur for faster but not bit-exact implementation.~~ to Added flag to GaussianBlur for faster but not bit-exact implementation

Jun 20, 2024

asmorkalov force-pushed the as/HAL_fast_GaussianBlur branch 2 times, most recently from 671ce14 to ea56bd9

June 20, 2024 12:37

asmorkalov added the pr: needs test (New functionality requires minimal tests set) label

Jun 20, 2024

vpisarev reviewed

Jun 21, 2024

modules/imgproc/include/opencv2/imgproc.hpp (outdated, resolved)

Contributor, Author

asmorkalov commented Jun 21, 2024

Discussion result:

- Move the enum to core.
- Use a dedicated parameter for the performance hint; do not mix it with the border type.
- Check the Python and Java bindings.
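
The second point of the discussion result can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical and made up for illustration: `PerfHintSketch`, `BlurPlanSketch` and `planBlurSketch` are not OpenCV declarations, and treating the default hint as "not approximate" is an assumption. It only shows that, with a dedicated parameter, the border type is forwarded untouched and no bits have to be shared between the two settings.

```cpp
#include <cassert>

// Hypothetical hint values for illustration; not the OpenCV enum.
enum PerfHintSketch {
    PERF_HINT_DEFAULT = 0,
    PERF_HINT_ACCURATE = 1,
    PERF_HINT_APPROX = 2
};

// What a blur call would do for a given combination of arguments.
struct BlurPlanSketch {
    int borderType;   // forwarded unchanged to whichever implementation runs
    bool approximate; // true when a faster, non-bit-exact path may be used
};

// The hint is a separate argument, so borderType keeps its full value range
// and needs no masking. Assumption: the default hint means "stay bit-exact".
BlurPlanSketch planBlurSketch(int borderType,
                              PerfHintSketch hint = PERF_HINT_DEFAULT,
                              bool fastPathAvailable = true)
{
    assert(borderType >= 0);
    return BlurPlanSketch{borderType,
                          hint == PERF_HINT_APPROX && fastPathAvailable};
}
```

Existing callers that pass only a border type keep their behaviour in this sketch, because the hint has a default value.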

asmorkalov added category: core, test and removed pr: Discussion Required, pr: needs test (New functionality requires minimal tests set) labels

Jun 25, 2024

Contributor, Author

asmorkalov commented Jun 26, 2024

@mshabunin @opencv-alalek @vpisarev I reworked the interface as discussed offline and added accuracy tests for the IPP branch. Could you take a look?

mshabunin reviewed

Jun 26, 2024

modules/imgproc/src/smooth.dispatch.cpp (outdated, resolved)

vpisarev reviewed

Jun 28, 2024

modules/core/include/opencv2/core.hpp (resolved)

vpisarev reviewed

Jun 28, 2024

modules/imgproc/include/opencv2/imgproc.hpp (outdated, resolved)

opencv-alalek reviewed

Jun 28, 2024

modules/imgproc/include/opencv2/imgproc.hpp (outdated, resolved)

asmorkalov added 3 commits

July 10, 2024 13:39

Added flag to GaussianBlur for faster but not bit-exact implementation.

f702da6

Implemented alternative interface for implementation hints.

12a5914

Accuracy test for GaussianBlur with IMPL_ALLOW_APPROXIMATION.

771034c

asmorkalov force-pushed the as/HAL_fast_GaussianBlur branch from a753443 to 8fa2e47

July 10, 2024 11:49

asmorkalov added 3 commits

July 10, 2024 15:20

Added Implementation hint to test system reports and CMake flags documentation.

a7ce249

Code review fixes.

bf8a99d

Code review fixes.

a677218

vpisarev self-requested a review

July 10, 2024 19:22

vpisarev approved these changes

Jul 10, 2024

opencv-alalek reviewed

Jul 11, 2024

CMakeLists.txt (outdated, resolved)

modules/imgproc/include/opencv2/imgproc.hpp (outdated, resolved)

modules/imgproc/test/test_smooth_bitexact.cpp

		cv::absdiff(dst, gt, diff);
		cv::Mat flatten_diff = diff.reshape(1, diff.rows);

		int nz = countNonZero(flatten_diff);

Contributor

opencv-alalek commented Jul 11, 2024

`norm` is faster than the `countNonZero` approach.

Use relative NORM_L1/L2 and NORM_INF instead.

Contributor, Author

asmorkalov commented Jul 11, 2024

I intentionally split the check into the min-max deviation and the number of differing pixels.

Contributor

opencv-alalek commented Jul 14, 2024

EXPECT_LE(max_val, 2); // expectes results floating +-1

The comment doesn't match the check anyway.

NORM_INF <= 1 works perfectly.

With NORM_INF limited to 1, we could use NORM_L1 + RELATIVE to define the fraction of pixels with different values.
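
To make the two positions in this thread concrete, here is a small sketch in plain C++ that computes the quantities being discussed. It does not use OpenCV and is not the test code from the PR: `DiffStats` and `compareImages` are hypothetical names, and the 8-bit flat buffers are an assumption. It computes the maximum absolute difference (what an infinity norm of the difference would report), the number of differing samples (what the `countNonZero` approach reports), and a relative L1 value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Summary of how two 8-bit buffers of equal size differ.
struct DiffStats {
    int maxAbsDiff;      // largest per-sample difference (infinity norm)
    std::size_t nonZero; // number of samples whose values differ
    double relativeL1;   // sum|a - b| / sum|b|, or 0 if the reference is all zero
};

DiffStats compareImages(const std::vector<unsigned char>& result,
                        const std::vector<unsigned char>& reference)
{
    assert(result.size() == reference.size());
    DiffStats s{0, 0, 0.0};
    long long sumDiff = 0; // L1 norm of the difference
    long long sumRef = 0;  // L1 norm of the reference
    for (std::size_t i = 0; i < result.size(); ++i) {
        const int d = std::abs(int(result[i]) - int(reference[i]));
        s.maxAbsDiff = std::max(s.maxAbsDiff, d);
        if (d != 0) ++s.nonZero;
        sumDiff += d;
        sumRef += reference[i];
    }
    if (sumRef > 0) s.relativeL1 = double(sumDiff) / double(sumRef);
    return s;
}
```

When `maxAbsDiff` is at most 1, every differing sample contributes exactly 1 to the L1 sum, so the L1 norm of the difference equals `nonZero`. That is why a single L1-based check can stand in for the pixel count once the infinity norm is bounded by 1.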

modules/core/include/opencv2/core.hpp (outdated, resolved)

asmorkalov added feature and removed RFC labels

Jul 11, 2024

Contributor, Author

asmorkalov commented Jul 11, 2024

@opencv-alalek I addressed your review notes. Please take another look.

asmorkalov force-pushed the as/HAL_fast_GaussianBlur branch from a47f442 to ee840b5

July 11, 2024 09:48

Code review fixes.

13b6caa

asmorkalov force-pushed the as/HAL_fast_GaussianBlur branch from ee840b5 to 13b6caa

July 11, 2024 11:04

asmorkalov merged commit 15783d6 into opencv:4.x

Jul 12, 2024

Kumataro mentioned this pull request

Jul 13, 2024

core: hal: avoid to use _tzcnt_u32 for ARM64EC #25903

Merged

6 tasks

opencv-alalek reviewed

Jul 14, 2024

modules/core/include/opencv2/core.hpp

		    ALGO_APPROX = 2, //!< Allow alternative approximations to get faster implementation. Behaviour and result depends on a platform
		};

		/*! @brief Returns ImplementationHint selected by default, a.k.a. `IMPL_DEFAULT` defined during OpenCV compilation.

Contributor

opencv-alalek commented Jul 14, 2024

`ImplementationHint`: not renamed.

`IMPL_DEFAULT`: what is that?

modules/core/include/opencv2/core.hpp

		*/
		enum AlgorithmHint {
		    ALGO_DEFAULT = 0, //!< Default algorithm behaviour defined during OpenCV build
		    ALGO_ACCURATE = 1, //!< Use generic portable implementation

Contributor

opencv-alalek commented Jul 14, 2024

ALGO_HINT_ then.

modules/core/src/system.cpp

		#include <iostream>
		#include <ostream>

		#include <opencv2/core.hpp>

Contributor

opencv-alalek commented Jul 14, 2024

To be removed.

modules/core/include/opencv2/core.hpp


		/*! @brief Returns ImplementationHint selected by default, a.k.a. `IMPL_DEFAULT` defined during OpenCV compilation.
		*/
		CV_EXPORTS_W AlgorithmHint getDefaultAlgorithmHint();

Contributor

opencv-alalek commented Jul 14, 2024

Should go to utility.hpp, somewhere near setUseOptimized().

Contributor

opencv-alalek commented Jul 14, 2024

setUseOptimized() should also control the behavior of that:

setUseOptimized(false) disables these hints and uses the accurate versions.
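
The behaviour proposed in this comment could look roughly like the sketch below. It illustrates the reviewer's suggestion only and makes no claim about what OpenCV actually does: `HintSketch` and `resolveHintSketch` are hypothetical names, the enum values merely mirror those quoted in the excerpt above, and the build-time default and the "use optimized" switch are passed in as plain arguments so the block stays self-contained.

```cpp
#include <cassert>

// Values mirror the enum quoted in the review excerpt; the names are
// hypothetical and are not the OpenCV declarations.
enum HintSketch { HINT_DEFAULT = 0, HINT_ACCURATE = 1, HINT_APPROX = 2 };

// Decide which hint an algorithm should actually follow.
// requested    - hint passed by the caller
// buildDefault - concrete default chosen when the library was built
// useOptimized - stand-in for the global "use optimized code" switch
HintSketch resolveHintSketch(HintSketch requested,
                             HintSketch buildDefault,
                             bool useOptimized)
{
    assert(buildDefault != HINT_DEFAULT); // the build default must be concrete
    if (!useOptimized)
        return HINT_ACCURATE;             // optimizations off: always accurate
    return requested == HINT_DEFAULT ? buildDefault : requested;
}
```

In this sketch an explicit request wins over the build default, and switching optimizations off overrides both.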

This was referenced Jul 15, 2024

Post-merge fixes for algorithm hint API #25911

Merged

(5.x) Merge 4.x #25915

Merged

Added xxxApprox overloads for YUV color conversions in HAL and AlgorithmHint to cvtColor #25932

Merged

Fix ipp_GaussianBlur will not be called with IPP enabled flag(ENABLE_… #22073

Closed

asmorkalov added a commit that referenced this pull request

Aug 6, 2024

Merge pull request #25932 from asmorkalov:as/HAL_cvtColor_aprox

49459d4

Added xxxApprox overloads for YUV color conversions in HAL and AlgorithmHint to cvtColor #25932

The xxxApprox to implement HAL functions with less bits for arithmetic of FP.

The hint was introduced in #25792 and #25911

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

fengyuentau pushed a commit to fengyuentau/opencv that referenced this pull request

Aug 15, 2024

Merge pull request opencv#25792 from asmorkalov:as/HAL_fast_GaussianBlur

14d984a

Added flag to GaussianBlur for faster but not bit-exact implementation opencv#25792

Rationale:
Current implementation of GaussianBlur is almost always bit-exact. It helps to get predictable results according platforms, but prohibits most of approximations and optimization tricks.

The patch converts `borderType` parameter to more generic `flags` and introduces `GAUSS_ALLOW_APPROXIMATIONS` flag to allow not bit-exact implementation. With the flag IPP and generic HAL implementation are called first. The flag naming and location is a subject for discussion.

Replaces opencv#22073
Possibly related issue: opencv#24135

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

fengyuentau pushed a commit to fengyuentau/opencv that referenced this pull request

Aug 15, 2024

Merge pull request opencv#25932 from asmorkalov:as/HAL_cvtColor_aprox

f7f78fb

savuor pushed a commit to savuor/opencv that referenced this pull request

Nov 1, 2024

Merge pull request opencv#25792 from asmorkalov:as/HAL_fast_GaussianBlur

153a50a

savuor mentioned this pull request

Nov 1, 2024

Recent HAL changes ported to 4.9#26395

Closed

6 tasks

savuor pushed a commit to savuor/opencv that referenced this pull request

Nov 5, 2024

Merge pull request opencv#25792 from asmorkalov:as/HAL_fast_GaussianBlur

81eb0cc

savuor pushed a commit to savuor/opencv that referenced this pull request

Nov 8, 2024

Merge pull request opencv#25792 from asmorkalov:as/HAL_fast_GaussianBlur

aa65c71

savuor pushed a commit to savuor/opencv that referenced this pull request

Nov 21, 2024

Merge pull request opencv#25792 from asmorkalov:as/HAL_fast_GaussianBlur

57ce02c

thewoz pushed a commit to CobbsLab/OPENCV that referenced this pull request

Feb 13, 2025

Merge pull request opencv#25932 from asmorkalov:as/HAL_cvtColor_aprox

2d65efa

asmorkalov moved this to Done in OpenCV 4.x HAL improvement

Sep 11, 2025