Add RISC-V HAL implementation for cv::dft and cv::dct #26865


Merged
asmorkalov merged 12 commits into opencv:4.x from amane-ame:dxt_hal_rvv
Mar 7, 2025

Conversation

@amane-ame
Contributor

This patch implements the static cv::DFT function in RVV_HAL using native intrinsics, improving the performance of cv::dft and cv::dct for the data types 32FC1/64FC1/32FC2/64FC2.

The reason I chose to create a new cv_hal_dftOcv interface is that using the existing interfaces (cv_hal_dftInit1D and cv_hal_dft1D) would require handling and parsing the dft flags within HAL, as well as performing preprocessing operations such as handling unit roots. Since these operations are not performance hotspots and do not require optimization, reusing the existing interfaces would mean copying approximately 300 lines of code from core/src/dxt.cpp into HAL, which I believe is unnecessary.
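To illustrate why unit-root preparation is setup work rather than a hotspot, here is a minimal sketch that separates the two. It is not the code in core/src/dxt.cpp or in this patch: it assumes a power-of-two length and a plain radix-2 scheme as a stand-in, and the names make_unit_roots and fft_radix2 are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cplx = std::complex<double>;

// One-time setup: the n/2 unit roots exp(-2*pi*i*k/n). This costs O(n) and runs once.
std::vector<cplx> make_unit_roots(std::size_t n) {
    const double pi = std::acos(-1.0);
    std::vector<cplx> w(n / 2);
    for (std::size_t k = 0; k < n / 2; ++k)
        w[k] = std::polar(1.0, -2.0 * pi * static_cast<double>(k) / static_cast<double>(n));
    return w;
}

// Hot part: in-place radix-2 butterflies, O(n log n). This is the loop worth vectorizing.
void fft_radix2(std::vector<cplx>& a, const std::vector<cplx>& w) {
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0 && w.size() == n / 2);  // power-of-two sizes only
    // Bit-reversal permutation so the butterflies can run in place.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const std::size_t step = n / len;  // stride into the precomputed root table
        for (std::size_t i = 0; i < n; i += len)
            for (std::size_t k = 0; k < len / 2; ++k) {
                const cplx t = a[i + k + len / 2] * w[k * step];
                a[i + k + len / 2] = a[i + k] - t;
                a[i + k] += t;
            }
    }
}
```

Calling fft_radix2(a, make_unit_roots(a.size())) transforms a in place; the root table can be reused across calls of the same length.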

Moreover, if I insert the new interface into static cv::DFT, both static cv::RealDFT and static cv::DCT can be optimized as well. The processing performed before and after calling static cv::DFT in these functions is also not a performance hotspot.
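The effect of hooking the shared inner kernel can be sketched as follows. This is a toy model under stated assumptions, not the actual cv_hal_dftOcv signature or the OpenCV sources: every name here (HalStatus, hal_core_hook, core_transform, wrapper_a, wrapper_b) is hypothetical, and the "transform" is a placeholder that doubles each element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical status codes standing in for a "handled / not implemented" HAL convention.
enum HalStatus { HAL_OK = 0, HAL_NOT_IMPLEMENTED = 1 };

// Counts accelerated calls so the dispatch is observable.
static int g_hook_calls = 0;

// Hypothetical accelerated hook. It declines odd lengths to show the fallback path.
static HalStatus hal_core_hook(const double* src, double* dst, std::size_t n) {
    if (n % 2 != 0) return HAL_NOT_IMPLEMENTED;
    ++g_hook_calls;
    for (std::size_t i = 0; i < n; ++i) dst[i] = 2.0 * src[i];  // stand-in for a vectorized kernel
    return HAL_OK;
}

// Shared inner kernel: try the hook first, otherwise run the scalar loop.
static void core_transform(const double* src, double* dst, std::size_t n) {
    assert(n == 0 || (src != nullptr && dst != nullptr));
    if (hal_core_hook(src, dst, n) == HAL_OK) return;
    for (std::size_t i = 0; i < n; ++i) dst[i] = 2.0 * src[i];  // scalar fallback
}

// Wrapper A: cheap post-processing around the shared kernel.
std::vector<double> wrapper_a(const std::vector<double>& in) {
    std::vector<double> out(in.size());
    core_transform(in.data(), out.data(), in.size());
    for (double& v : out) v += 1.0;
    return out;
}

// Wrapper B: cheap pre-processing around the same kernel.
std::vector<double> wrapper_b(std::vector<double> in) {
    for (double& v : in) v = -v;
    std::vector<double> out(in.size());
    core_transform(in.data(), out.data(), in.size());
    return out;
}
```

Because both wrappers route through core_transform, a single hook accelerates both of them, and any input the hook declines still produces the same result through the scalar path.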

Tested on MUSE-PI (Spacemit X60) for both gcc 14.2 and clang 20.0.

$ opencv_test_core --gtest_filter="*DFT*"
$ opencv_perf_core --gtest_filter="*dft*:*dct*" --perf_min_samples=30 --perf_force_samples=30

The head of the perf table is shown below since the table is too long.

View the full perf table here: hal_rvv_dxt.pdf

[Image: head of the perf table]

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

amane-ame and others added 4 commits January 29, 2025 02:24
…calar. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
asmorkalov self-assigned this Feb 5, 2025
@vpisarev
Contributor

@fengyuentau, can you please check it, measure performance against the current scalar implementation?

fengyuentau reacted with thumbs up emoji

@amane-ame
Contributor, Author

cc @mshabunin @fengyuentau
Could anyone please review this?

@mshabunin
Contributor

mshabunin left a comment

Looks good to me overall.

amane-ame and others added 2 commits February 19, 2025 12:04
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
@fengyuentau
Member

Quoting @vpisarev: "@fengyuentau, can you please check it, measure performance against the current scalar implementation?"

This patch generally makes sense with some speedup (tested on K1).

dxt-perf.zip

amane-ame reacted with heart emoji

@amane-ame
Contributor, Author

cc @asmorkalov Ready to be merged.

asmorkalov reacted with eyes emoji

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
@amane-ame
Contributor, Author

amane-ame commented Feb 24, 2025 (edited)

Slightly optimized the performance further.

This optimization ran into the same problem as in #26923 (comment). I strongly recommend updating clang to at least 18.1.0, because the vcreate series of functions will be very important for optimizing algorithms on multi-channel images.
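As a rough illustration of where a vcreate intrinsic helps with two-channel data, the sketch below interleaves separate real and imaginary planes into a 32FC2-style layout. It is not taken from this patch: the function name interleave2 is hypothetical, the vector path assumes a toolchain whose riscv_vector.h provides __riscv_vcreate_v_f32m1x2 (per the comment above, clang 18.1.0 or newer), and on other targets only the scalar fallback is compiled.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

// Writes re[0], im[0], re[1], im[1], ... into dst (dst must hold 2 * n floats).
void interleave2(const float* re, const float* im, float* dst, std::size_t n) {
    assert(n == 0 || (re != nullptr && im != nullptr && dst != nullptr));
#if defined(__riscv_vector)
    // Assumption: the compiler supports the vcreate intrinsics; older compilers fail to build this branch.
    for (std::size_t i = 0; i < n;) {
        const std::size_t vl = __riscv_vsetvl_e32m1(n - i);
        const vfloat32m1_t vre = __riscv_vle32_v_f32m1(re + i, vl);
        const vfloat32m1_t vim = __riscv_vle32_v_f32m1(im + i, vl);
        // Build the two-field tuple in one step, then store it as an interleaved segment.
        const vfloat32m1x2_t pair = __riscv_vcreate_v_f32m1x2(vre, vim);
        __riscv_vsseg2e32_v_f32m1x2(dst + 2 * i, pair, vl);
        i += vl;
    }
#else
    // Scalar fallback for targets without the RISC-V vector extension.
    for (std::size_t i = 0; i < n; ++i) {
        dst[2 * i] = re[i];
        dst[2 * i + 1] = im[i];
    }
#endif
}
```

Without vcreate, the tuple has to be assembled field by field before the segment store, which is presumably the kind of workaround needed for older compilers.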

hal_rvv_dxt.pdf


@amane-ame
Contributor, Author

Committed to make clang 17 happy. This should be reverted once clang is updated.

Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
asmorkalov merged commit bb525fe into opencv:4.x Mar 7, 2025
27 of 28 checks passed
asmorkalov mentioned this pull request Mar 11, 2025

Reviewers

fengyuentau left review comments

asmorkalov approved these changes

mshabunin approved these changes

Assignees

asmorkalov

Milestone

4.12.0

5 participants

@amane-ame @vpisarev @fengyuentau @asmorkalov @mshabunin
