NotificationsYou must be signed in to change notification settings
Fork56.4k
Star85.3k

Add RISC-V HAL implementation for minMaxIdx#26789

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Merged

asmorkalov merged 6 commits intoopencv:4.xfromamane-ame:minmax_hal_rvv

Jan 31, 2025

Merged

Add RISC-V HAL implementation for minMaxIdx#26789

asmorkalov merged 6 commits intoopencv:4.xfromamane-ame:minmax_hal_rvv

Jan 31, 2025

Conversation

Copy link

Contributor

amane-ame commentedJan 17, 2025•
edited
Loading

On the RISC-V platform,minMaxIdx cannot benefit from Universal Intrinsics because the UI-optimizedminMaxIdx only supportsCV_SIMD128 (and does not acceptCV_SIMD_SCALABLE for RVV).

opencv/modules/core/src/minmax.cpp

Lines 209 to 214 in1d701d1

	staticvoidminMaxIdx_8u(const uchar* src,const uchar* mask,int* minval,int* maxval,
	size_t* minidx,size_t* maxidx,int len,size_t startidx )
	{
	#if CV_SIMD128
	if ( len >= VTraits<v_uint8x16>::vlanes() )
	{

This patch implementsminMaxIdx function in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel.

Tested on MUSE-PI for both gcc 14.2 and clang 20.0.

$ opencv_test_core --gtest_filter="*MinMaxLoc*"$ opencv_perf_core --gtest_filter="*minMaxLoc*"

Pull Request Readiness Checklist

See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

amane-ameand others added2 commits

January 17, 2025 18:39

Vectorize minMaxIdx with HAL RVV.

13ba172

Use a macro generator for all data types of minMaxIdx.Add the mask_step argument for minMaxIdx, check mask.isContinuous() before calling cv_hal_minMaxIdx.Including two algorithms, one read source data twice while the other only read once but with more calculation.EMUL is configurable, use EMUL=4 when EEW>=32.Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>

Add description for parameter mask_step in hal_ni_minMaxIdx.

82903e3

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>

amane-ame force-pushed theminmax_hal_rvv branch fromc697fd0 to82903e3Compare

January 17, 2025 12:21

amane-ame mentioned this pull request

Jan 19, 2025

Add RISC-V HAL implementation for cv::norm and cv::normalize#26804

Merged

6 tasks

asmorkalov requested changes

Jan 20, 2025

View reviewed changes

modules/core/src/hal_replacement.hppShow resolvedHide resolved

asmorkalov added optimization platform: riscv labels

Jan 20, 2025

Create a new HAL interface for minMaxIdx with mask_step.

007a989

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>

amane-ame requested a review fromasmorkalov

January 20, 2025 15:42

asmorkalov requested a review frommshabunin

January 21, 2025 15:00

opencv-alalek approved these changes

Jan 24, 2025

View reviewed changes

Copy link

Contributor

opencv-alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

HAL integration looks good to me 👍

3rdparty/hal_rvv/hal_rvv_1p0/minmax.hpp OutdatedShow resolvedHide resolved

mshabunin approved these changes

Jan 29, 2025

View reviewed changes

Copy link

Contributor

mshabunin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Overall looks good to me.

Though, I agree with@opencv-alalek - it would be better to reimplement macros as templates.

@asmorkalov , did we decide anything on adding new function variabnt with mask_step?

Copy link

Contributor

asmorkalov commentedJan 29, 2025

Let's add. The current version presumes compatibility. I vote for template instead of macros.

Copy link

Contributor

horror-proton commentedJan 29, 2025•
edited
Loading

My solution to similar problems is to add some template specialization. And use implicit (overloaded) intrinsics in other places, however this solution does not work for integer/floating-point and signed/unsigned version yet, we may have to add more member functions

template<typename T,int LOGM>structrvv_helper;#defineRVV_HELPER(T, NAME, SIZE, EEW, EMUL, LOGM)                             \template<>structrvv_helper<T, LOGM> {                                     \using type = v##NAME##EMUL##_t;                                            \staticsize_tsetvl(size_t n) {return __riscv_vsetvl_e##SIZE##EMUL(n); }  \staticsize_tsetvlmax() {return __riscv_vsetvlmax_e##SIZE##EMUL(); }     \static typele(const T *ptr,size_t vl) {                                  \return __riscv_vle##SIZE##_v_##EEW##EMUL(ptr, vl);                       \    }                                                                          \    ...                                                                        \  }...RVV_HELPER(unsignedchar, uint8,8,u8, mf2, -1);RVV_HELPER(unsignedchar, uint8,8,u8, m1,0);RVV_HELPER(unsignedchar, uint8,8,u8, m2,1);......RVV_HELPER(float, float32,32,f32, mf2, -1);RVV_HELPER(float, float32,32,f32, m1,0);RVV_HELPER(float, float32,32,f32, m2,1);...

Move the macro definitions into a struct of function pointers.

01fed8e

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>

Copy link

ContributorAuthor

amane-ame commentedJan 30, 2025

cc@asmorkalov @mshabunin.
I have converted these into templates.

Use real functions instead of function pointers to make clang happy.

1efdf3c

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>

Copy link

Contributor

asmorkalov commentedJan 31, 2025

@amane-ame Thanks a lot for the effort. Please fix conflict and I'll merge it.

Merge branch '4.x' into minmax_hal_rvv

99d1552

Copy link

ContributorAuthor

amane-ame commentedJan 31, 2025

@asmorkalov Ready to be merged.

asmorkalov approved these changes

Jan 31, 2025

View reviewed changes

asmorkalov merged commit13b2caf intoopencv:4.x

Jan 31, 2025

30 of 31 checks passed

asmorkalov self-assigned this

Jan 31, 2025

asmorkalov added this to the4.12.0 milestone

Jan 31, 2025

asmorkalov pushed a commit that referenced this pull request

Feb 6, 2025

Merge pull request#26804from amane-ame:norm_hal_rvv

2e909c3

Add RISC-V HAL implementation for cv::norm and cv::normalize#26804This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`.`cv::normalize` also calls `minMaxIdx`,#26789 implements RVV_HAL for this.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*Norm*"$ opencv_perf_core --gtest_filter="*norm*" --perf_min_samples=300 --perf_force_samples=300```The head of the perf table is shown below since the table is too long.View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf)<img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" />### Pull Request Readiness ChecklistSee details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable      Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake

asmorkalov mentioned this pull request

Feb 19, 2025

5.x merge 4.x#26939

Merged

NanQin555 pushed a commit to NanQin555/opencv that referenced this pull request

Feb 24, 2025

Merge pull requestopencv#26789from amane-ame:minmax_hal_rvv

c1f7b1d

Add RISC-V HAL implementation for minMaxIdxopencv#26789On the RISC-V platform, `minMaxIdx` cannot benefit from Universal Intrinsics because the UI-optimized `minMaxIdx` only supports `CV_SIMD128` (and does not accept `CV_SIMD_SCALABLE` for RVV).https://github.com/opencv/opencv/blob/1d701d1690b8cc9aa6b86744bffd5d9841ac6fd3/modules/core/src/minmax.cpp#L209-L214This patch implements `minMaxIdx` function in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*MinMaxLoc*"$ opencv_perf_core --gtest_filter="*minMaxLoc*"```<img width="1122" alt="Untitled" src="https://github.com/user-attachments/assets/6a246852-87af-42c5-a50b-c349c2765f3f" />See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable      Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake

NanQin555 pushed a commit to NanQin555/opencv that referenced this pull request

Feb 24, 2025

Merge pull requestopencv#26804from amane-ame:norm_hal_rvv

c167af7

Add RISC-V HAL implementation for cv::norm and cv::normalizeopencv#26804This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`.`cv::normalize` also calls `minMaxIdx`,opencv#26789 implements RVV_HAL for this.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*Norm*"$ opencv_perf_core --gtest_filter="*norm*" --perf_min_samples=300 --perf_force_samples=300```The head of the perf table is shown below since the table is too long.View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf)<img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" />See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable      Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake