Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork56.4k
Add RISC-V HAL implementation for minMaxIdx#26789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Use a macro generator for all data types of minMaxIdx.Add the mask_step argument for minMaxIdx, check mask.isContinuous() before calling cv_hal_minMaxIdx.Including two algorithms, one read source data twice while the other only read once but with more calculation.EMUL is configurable, use EMUL=4 when EEW>=32.Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Uh oh!
There was an error while loading.Please reload this page.
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
opencv-alalek left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
HAL integration looks good to me 👍
Uh oh!
There was an error while loading.Please reload this page.
mshabunin left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Overall looks good to me.
Though, I agree with@opencv-alalek - it would be better to reimplement macros as templates.
@asmorkalov , did we decide anything on adding new function variabnt with mask_step?
asmorkalov commentedJan 29, 2025
Let's add. The current version presumes compatibility. I vote for template instead of macros. |
horror-proton commentedJan 29, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
My solution to similar problems is to add some template specialization. And use implicit (overloaded) intrinsics in other places, however this solution does not work for integer/floating-point and signed/unsigned version yet, we may have to add more member functions template<typename T,int LOGM>structrvv_helper;#defineRVV_HELPER(T, NAME, SIZE, EEW, EMUL, LOGM) \template<>structrvv_helper<T, LOGM> { \using type = v##NAME##EMUL##_t; \staticsize_tsetvl(size_t n) {return __riscv_vsetvl_e##SIZE##EMUL(n); } \staticsize_tsetvlmax() {return __riscv_vsetvlmax_e##SIZE##EMUL(); } \static typele(const T *ptr,size_t vl) { \return __riscv_vle##SIZE##_v_##EEW##EMUL(ptr, vl); \ } \ ... \ }...RVV_HELPER(unsignedchar, uint8,8,u8, mf2, -1);RVV_HELPER(unsignedchar, uint8,8,u8, m1,0);RVV_HELPER(unsignedchar, uint8,8,u8, m2,1);......RVV_HELPER(float, float32,32,f32, mf2, -1);RVV_HELPER(float, float32,32,f32, m1,0);RVV_HELPER(float, float32,32,f32, m2,1);... |
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
amane-ame commentedJan 30, 2025
cc@asmorkalov@mshabunin. |
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
asmorkalov commentedJan 31, 2025
@amane-ame Thanks a lot for the effort. Please fix conflict and I'll merge it. |
amane-ame commentedJan 31, 2025
@asmorkalov Ready to be merged. |
13b2caf intoopencv:4.xUh oh!
There was an error while loading.Please reload this page.
Add RISC-V HAL implementation for cv::norm and cv::normalize#26804This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`.`cv::normalize` also calls `minMaxIdx`,#26789 implements RVV_HAL for this.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*Norm*"$ opencv_perf_core --gtest_filter="*norm*" --perf_min_samples=300 --perf_force_samples=300```The head of the perf table is shown below since the table is too long.View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf)<img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" />### Pull Request Readiness ChecklistSee details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake
Add RISC-V HAL implementation for minMaxIdxopencv#26789On the RISC-V platform, `minMaxIdx` cannot benefit from Universal Intrinsics because the UI-optimized `minMaxIdx` only supports `CV_SIMD128` (and does not accept `CV_SIMD_SCALABLE` for RVV).https://github.com/opencv/opencv/blob/1d701d1690b8cc9aa6b86744bffd5d9841ac6fd3/modules/core/src/minmax.cpp#L209-L214This patch implements `minMaxIdx` function in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*MinMaxLoc*"$ opencv_perf_core --gtest_filter="*minMaxLoc*"```<img width="1122" alt="Untitled" src="https://github.com/user-attachments/assets/6a246852-87af-42c5-a50b-c349c2765f3f" />See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake
Add RISC-V HAL implementation for cv::norm and cv::normalizeopencv#26804This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`.`cv::normalize` also calls `minMaxIdx`,opencv#26789 implements RVV_HAL for this.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*Norm*"$ opencv_perf_core --gtest_filter="*norm*" --perf_min_samples=300 --perf_force_samples=300```The head of the perf table is shown below since the table is too long.View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf)<img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" />See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake
Uh oh!
There was an error while loading.Please reload this page.
On the RISC-V platform,
minMaxIdxcannot benefit from Universal Intrinsics because the UI-optimizedminMaxIdxonly supportsCV_SIMD128(and does not acceptCV_SIMD_SCALABLEfor RVV).opencv/modules/core/src/minmax.cpp
Lines 209 to 214 in1d701d1
This patch implements
minMaxIdxfunction in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.
Pull Request Readiness Checklist
See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.