Parallel_for in box Filter and support for 32f box filter in Fastcv hal #27182

Merged
asmorkalov merged 6 commits into opencv:4.x from CodeLinaro:boxFilter_hal_changes
Apr 16, 2025


@adsha-quic (Contributor)

Added parallel_for to the box filter HAL and support for the 32f box filter.
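
The PR's diff is not reproduced on this page, so the following is only a minimal sketch of the general idea behind running a box filter under a parallel-for: split the output rows into stripes and filter each stripe as an independent task. It is written in plain Python rather than against OpenCV's `cv::parallel_for_` or the FastCV API, and the names `box_filter_rows`, `box_filter_parallel` and `n_stripes` are hypothetical. It assumes an odd kernel size and replicated borders.

```python
from concurrent.futures import ThreadPoolExecutor


def box_filter_rows(src, row_start, row_end, ksize):
    """Box-filter output rows [row_start, row_end) of a 2-D list of floats."""
    height, width = len(src), len(src[0])
    radius = ksize // 2  # assumes an odd ksize
    out = []
    for y in range(row_start, row_end):
        row = []
        for x in range(width):
            total = 0.0
            for dy in range(-radius, radius + 1):
                yy = min(max(y + dy, 0), height - 1)  # replicate top/bottom border
                for dx in range(-radius, radius + 1):
                    xx = min(max(x + dx, 0), width - 1)  # replicate left/right border
                    total += src[yy][xx]
            row.append(total / (ksize * ksize))
        out.append(row)
    return out


def box_filter_parallel(src, ksize=3, n_stripes=4):
    """Split the output rows into stripes and filter each stripe in its own task."""
    height = len(src)
    n_stripes = max(1, min(n_stripes, height))
    bounds = [(i * height // n_stripes, (i + 1) * height // n_stripes)
              for i in range(n_stripes)]
    with ThreadPoolExecutor(max_workers=n_stripes) as pool:
        # map() yields results in stripe order, so rows are reassembled in order.
        stripes = pool.map(lambda b: box_filter_rows(src, b[0], b[1], ksize), bounds)
        return [row for stripe in stripes for row in stripe]


# Small synthetic image; the striped result should match the single-pass result.
image = [[float((3 * x + 7 * y) % 11) for x in range(8)] for y in range(6)]
serial = box_filter_rows(image, 0, len(image), 3)
parallel = box_filter_parallel(image, ksize=3, n_stripes=4)
print(parallel == serial)
```

Each stripe reads a few source rows beyond its own range (the kernel radius), but it only writes its own output rows, so the tasks do not need to synchronize with one another.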

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

asmorkalov self-requested a review on April 1, 2025 13:10
asmorkalov added the optimization and platform: arm (ARM boards related issues: RPi, NVIDIA TK/TX, etc) labels on Apr 1, 2025
asmorkalov added this to the 4.12.0 milestone on Apr 1, 2025
asmorkalov (Contributor) commented Apr 2, 2025 (edited)

With my Jetson Orin:

./bin/opencv_perf_imgproc --gtest_filter=Size_MatType_BorderType_blur16x16.blur16x16/36
TEST: Skip tests with tags: 'mem_6gb', 'verylong'
CTEST_FULL_OUTPUT
OpenCV version: 4.12.0-dev
OpenCV VCS version: 4.11.0-315-g42de7e6ee8
Build type: Release
Compiler: /usr/bin/c++  (ver 9.4.0)
Algorithm hint: ALGO_HINT_ACCURATE
HAL: YES (carotene (ver 0.0.1) fastcv (ver 0.0.1))
Parallel framework: pthreads (nthreads=12)
CPU features: NEON FP16 *NEON_DOTPROD *NEON_FP16
OpenCL is disabled
Note: Google Test filter = Size_MatType_BorderType_blur16x16.blur16x16/36
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Size_MatType_BorderType_blur16x16
[ RUN      ] Size_MatType_BorderType_blur16x16.blur16x16/36, where GetParam() = (1280x720, 32FC1, BORDER_REPLICATE)
/mnt/flashdrive/opencv/modules/ts/src/ts_perf.cpp:381: Failure
The difference between expect_last and actual_last is 0.001220703125, which exceeds eps, where
expect_last evaluates to -565.76580810546875,
actual_last evaluates to -565.76702880859375, and
eps evaluates to 0.001.
Argument "dst" has unexpected value of the last element
params    = (1280x720, 32FC1, BORDER_REPLICATE)
termination reason:  reached maximum number of iterations
bytesIn   =    3686400
bytesOut  =    3686400
samples   =        100
outliers  =          8
frequency = 1000000000
min       =    1792400 = 1.79ms
median    =    1994482 = 1.99ms
gmean     =    2041647 = 2.04ms
gstddev   = 0.08525334 = 1.06ms for 97% dispersion interval
mean      =    2049113 = 2.05ms
stddev    =     178893 = 0.18ms
[  FAILED  ] Size_MatType_BorderType_blur16x16.blur16x16/36, where GetParam() = (1280x720, 32FC1, BORDER_REPLICATE) (228 ms)
[----------] 1 test from Size_MatType_BorderType_blur16x16 (228 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (228 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Size_MatType_BorderType_blur16x16.blur16x16/36, where GetParam() = (1280x720, 32FC1, BORDER_REPLICATE)

asmorkalov self-assigned this on Apr 2, 2025
adsha-quic (Contributor, Author) commented Apr 2, 2025 (edited)

Hi Alex,

Is it possible to tune the eps a bit?
"The difference between expect_last and actual_last is 0.001220703125, which exceeds eps 0.001"

asmorkalov reacted with eyes emoji
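
To put the reported mismatch in context, the sketch below expresses the absolute difference from the log in float32 units in the last place (ULPs) and as a relative error. The three input values are copied from the failure message; the helper `float32_ulp` is a hypothetical name introduced here and is valid only for normal (non-subnormal) float32 magnitudes. This is an illustration of how one might reason about the tolerance, not the check that `ts_perf.cpp` performs.

```python
import math

expect_last = -565.76580810546875   # expected value from the failure message
actual_last = -565.76702880859375   # value produced by the build under test
eps = 0.001                         # absolute tolerance reported by the test


def float32_ulp(x):
    """Spacing between adjacent float32 values near x (normal range only)."""
    _, exponent = math.frexp(abs(x))  # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 24)     # float32 has a 24-bit significand


abs_diff = abs(expect_last - actual_last)
ulp = float32_ulp(expect_last)
ulps_apart = abs_diff / ulp
rel_diff = abs_diff / abs(expect_last)

print(f"absolute difference: {abs_diff}")
print(f"float32 ULP near {expect_last}: {ulp}")
print(f"difference in ULPs: {ulps_apart}")
print(f"relative difference: {rel_diff:.3e}")
print(f"exceeds eps: {abs_diff > eps}")
```

At this magnitude a float32 ULP is 2^-14, so the two values are 20 ULPs apart, which corresponds to a relative difference of roughly 2e-6. A fixed absolute eps of 0.001 is therefore only about 16 ULPs wide for values near 565.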

asmorkalov (Contributor) commented:

Similar test failure on Android:

umi:/data/local/tmp/fastcv_pr $ ./opencv_perf_imgproc --gtest_filter=*Size_MatType_BorderType_blur16x16.blur16x16/36*
TEST: Skip tests with tags: 'mem_6gb', 'verylong'
CTEST_FULL_OUTPUT
OpenCV version: 4.12.0-dev
OpenCV VCS version: 4.11.0-315-g42de7e6ee8
Build type: Release
Compiler: /mnt/Projects/Android/Sdk/ndk/28.0.12433566/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++  (ver 19.0.0)
Algorithm hint: ALGO_HINT_ACCURATE
HAL: YES (carotene (ver 0.0.1) KleidiCV (ver 0.3.0) fastcv (ver 0.0.1))
Parallel framework: pthreads (nthreads=2)
CPU features: NEON FP16 *NEON_DOTPROD *NEON_FP16 *NEON_BF16?
Note: Google Test filter = *Size_MatType_BorderType_blur16x16.blur16x16/36*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Size_MatType_BorderType_blur16x16
[ RUN      ] Size_MatType_BorderType_blur16x16.blur16x16/36, where GetParam() = (1280x720, 32FC1, BORDER_REPLICATE)
/mnt/Projects/Projects/opencv/modules/ts/src/ts_perf.cpp:381: Failure
The difference between expect_last and actual_last is 0.001220703125, which exceeds eps, where
expect_last evaluates to -565.76580810546875,
actual_last evaluates to -565.76702880859375, and
eps evaluates to 0.001.
Argument "dst" has unexpected value of the last element
params    = (1280x720, 32FC1, BORDER_REPLICATE)
termination reason:  unknown
bytesIn   =    3686400
bytesOut  =    3686400
samples   =         83 of 100
outliers  =          6
frequency = 1000000000
min       =    3481198 = 3.48ms
median    =    3547604 = 3.55ms
gmean     =    3553248 = 3.55ms
gstddev   = 0.02692594 = 0.57ms for 97% dispersion interval
mean      =    3554609 = 3.55ms
stddev    =     106086 = 0.11ms
[  FAILED  ] Size_MatType_BorderType_blur16x16.blur16x16/36, where GetParam() = (1280x720, 32FC1, BORDER_REPLICATE) (321 ms)
[----------] 1 test from Size_MatType_BorderType_blur16x16 (321 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (321 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Size_MatType_BorderType_blur16x16.blur16x16/36, where GetParam() = (1280x720, 32FC1, BORDER_REPLICATE)

adsha-quic (Contributor, Author) commented:

Hey Alex,
I will also add the threshold tuning patch to this PR shortly.

asmorkalov merged commit 6ffc515 into opencv:4.x on Apr 16, 2025
27 of 28 checks passed
asmorkalov mentioned this pull request on Apr 29, 2025

Reviewers

asmorkalov approved these changes

Assignees

asmorkalov

Labels

optimization, platform: arm (ARM boards related issues: RPi, NVIDIA TK/TX, etc)

Projects

None yet

Milestone

4.12.0

2 participants

@adsha-quic, @asmorkalov
