Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Add RISC-V HAL implementation for minMaxIdx#26789

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
asmorkalov merged 6 commits intoopencv:4.xfromamane-ame:minmax_hal_rvv
Jan 31, 2025

Conversation

@amane-ame
Copy link
Contributor

@amane-ameamane-ame commentedJan 17, 2025
edited
Loading

On the RISC-V platform,minMaxIdx cannot benefit from Universal Intrinsics because the UI-optimizedminMaxIdx only supportsCV_SIMD128 (and does not acceptCV_SIMD_SCALABLE for RVV).

staticvoidminMaxIdx_8u(const uchar* src,const uchar* mask,int* minval,int* maxval,
size_t* minidx,size_t* maxidx,int len,size_t startidx )
{
#if CV_SIMD128
if ( len >= VTraits<v_uint8x16>::vlanes() )
{

This patch implementsminMaxIdx function in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel.

Tested on MUSE-PI for both gcc 14.2 and clang 20.0.

$ opencv_test_core --gtest_filter="*MinMaxLoc*"$ opencv_perf_core --gtest_filter="*minMaxLoc*"
Untitled

Pull Request Readiness Checklist

See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

amane-ameand others added2 commitsJanuary 17, 2025 18:39
Use a macro generator for all data types of minMaxIdx.Add the mask_step argument for minMaxIdx, check mask.isContinuous() before calling cv_hal_minMaxIdx.Including two algorithms, one read source data twice while the other only read once but with more calculation.EMUL is configurable, use EMUL=4 when EEW>=32.Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Copy link
Contributor

@opencv-alalekopencv-alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

HAL integration looks good to me 👍

Copy link
Contributor

@mshabuninmshabunin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Overall looks good to me.

Though, I agree with@opencv-alalek - it would be better to reimplement macros as templates.

@asmorkalov , did we decide anything on adding new function variabnt with mask_step?

@asmorkalov
Copy link
Contributor

Let's add. The current version presumes compatibility. I vote for template instead of macros.

@horror-proton
Copy link
Contributor

horror-proton commentedJan 29, 2025
edited
Loading

My solution to similar problems is to add some template specialization. And use implicit (overloaded) intrinsics in other places, however this solution does not work for integer/floating-point and signed/unsigned version yet, we may have to add more member functions

template<typename T,int LOGM>structrvv_helper;#defineRVV_HELPER(T, NAME, SIZE, EEW, EMUL, LOGM)                             \template<>structrvv_helper<T, LOGM> {                                     \using type = v##NAME##EMUL##_t;                                            \staticsize_tsetvl(size_t n) {return __riscv_vsetvl_e##SIZE##EMUL(n); }  \staticsize_tsetvlmax() {return __riscv_vsetvlmax_e##SIZE##EMUL(); }     \static typele(const T *ptr,size_t vl) {                                  \return __riscv_vle##SIZE##_v_##EEW##EMUL(ptr, vl);                       \    }                                                                          \    ...                                                                        \  }...RVV_HELPER(unsignedchar, uint8,8,u8, mf2, -1);RVV_HELPER(unsignedchar, uint8,8,u8, m1,0);RVV_HELPER(unsignedchar, uint8,8,u8, m2,1);......RVV_HELPER(float, float32,32,f32, mf2, -1);RVV_HELPER(float, float32,32,f32, m1,0);RVV_HELPER(float, float32,32,f32, m2,1);...
mshabunin and amane-ame reacted with thumbs up emoji

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
@amane-ame
Copy link
ContributorAuthor

cc@asmorkalov@mshabunin.
I have converted these into templates.

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
@asmorkalov
Copy link
Contributor

@amane-ame Thanks a lot for the effort. Please fix conflict and I'll merge it.

@amane-ame
Copy link
ContributorAuthor

@asmorkalov Ready to be merged.

asmorkalov reacted with thumbs up emoji

@asmorkalovasmorkalov merged commit13b2caf intoopencv:4.xJan 31, 2025
30 of 31 checks passed
@asmorkalovasmorkalov self-assigned thisJan 31, 2025
@asmorkalovasmorkalov added this to the4.12.0 milestoneJan 31, 2025
asmorkalov pushed a commit that referenced this pull requestFeb 6, 2025
Add RISC-V HAL implementation for cv::norm and cv::normalize#26804This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`.`cv::normalize` also calls `minMaxIdx`,#26789 implements RVV_HAL for this.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*Norm*"$ opencv_perf_core --gtest_filter="*norm*" --perf_min_samples=300 --perf_force_samples=300```The head of the perf table is shown below since the table is too long.View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf)<img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" />### Pull Request Readiness ChecklistSee details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable      Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake
@asmorkalovasmorkalov mentioned this pull requestFeb 19, 2025
NanQin555 pushed a commit to NanQin555/opencv that referenced this pull requestFeb 24, 2025
Add RISC-V HAL implementation for minMaxIdxopencv#26789On the RISC-V platform, `minMaxIdx` cannot benefit from Universal Intrinsics because the UI-optimized `minMaxIdx` only supports `CV_SIMD128` (and does not accept `CV_SIMD_SCALABLE` for RVV).https://github.com/opencv/opencv/blob/1d701d1690b8cc9aa6b86744bffd5d9841ac6fd3/modules/core/src/minmax.cpp#L209-L214This patch implements `minMaxIdx` function in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*MinMaxLoc*"$ opencv_perf_core --gtest_filter="*minMaxLoc*"```<img width="1122" alt="Untitled" src="https://github.com/user-attachments/assets/6a246852-87af-42c5-a50b-c349c2765f3f" />See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable      Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake
NanQin555 pushed a commit to NanQin555/opencv that referenced this pull requestFeb 24, 2025
Add RISC-V HAL implementation for cv::norm and cv::normalizeopencv#26804This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`.`cv::normalize` also calls `minMaxIdx`,opencv#26789 implements RVV_HAL for this.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.```$ opencv_test_core --gtest_filter="*Norm*"$ opencv_perf_core --gtest_filter="*norm*" --perf_min_samples=300 --perf_force_samples=300```The head of the perf table is shown below since the table is too long.View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf)<img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" />See details athttps://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request- [x] I agree to contribute to the project under Apache 2 License.- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV- [ ] The PR is proposed to the proper branch- [ ] There is a reference to the original bug report and related work- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable      Patch to opencv_extra has the same branch name.- [ ] The feature is well documented and sample code can be built with the project CMake
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@asmorkalovasmorkalovasmorkalov approved these changes

@mshabuninmshabuninmshabunin approved these changes

@opencv-alalekopencv-alalekopencv-alalek approved these changes

Assignees

@asmorkalovasmorkalov

Projects

None yet

Milestone

4.12.0

Development

Successfully merging this pull request may close these issues.

5 participants

@amane-ame@asmorkalov@horror-proton@mshabunin@opencv-alalek

[8]ページ先頭

©2009-2025 Movatter.jp