Improve and refactor softmax layer #24466


Merged
vpisarev merged 11 commits into opencv:4.x from WanliZhong:refactor_softmax
Nov 6, 2023

Conversation

@WanliZhong (Member) commented on Oct 28, 2023 (edited):

This PR improves softmax, using the approach from ficus nn.

Performance test results (minimum value, multi-threaded):

macOS M2

| Name of Test | before | after | after vs before (x-factor) |
|---|---|---|---|
| { 16, 50, 50 }, 0 | 0.047 | 0.048 | 0.98 |
| { 16, 50, 50 }, 1 | 0.052 | 0.075 | 0.69 |
| { 16, 50, 50 }, 2 | 0.367 | 0.045 | 8.19 |
| { 16, 197, 197 }, 0 | 0.700 | 0.256 | 2.73 |
| { 16, 197, 197 }, 1 | 0.602 | 0.368 | 1.64 |
| { 16, 197, 197 }, 2 | 5.706 | 0.230 | 24.81 |
| { 16, 1024, 1024 }, 0 | 17.143 | 18.464 | 0.93 |
| { 16, 1024, 1024 }, 1 | 16.001 | 30.027 | 0.53 |
| { 16, 1024, 1024 }, 2 | 162.174 | 3.120 | 51.99 |

Ubuntu, Intel Core i7-12700K: 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads.

| Name of Test | before | after | after vs before (x-factor) |
|---|---|---|---|
| { 16, 50, 50 }, 0 | 0.017 | 0.060 | 0.29 |
| { 16, 50, 50 }, 1 | 0.022 | 0.058 | 0.38 |
| { 16, 50, 50 }, 2 | 0.198 | 0.042 | 4.78 |
| { 16, 197, 197 }, 0 | 0.425 | 0.130 | 3.26 |
| { 16, 197, 197 }, 1 | 0.368 | 0.674 | 0.55 |
| { 16, 197, 197 }, 2 | 3.281 | 0.164 | 20.00 |
| { 16, 1024, 1024 }, 0 | 27.985 | 6.639 | 4.22 |
| { 16, 1024, 1024 }, 1 | 21.230 | 22.219 | 0.96 |
| { 16, 1024, 1024 }, 2 | 91.406 | 4.153 | 22.01 |

Ubuntu Loongnix

| Name of Test | before | after | after vs before (x-factor) |
|---|---|---|---|
| { 16, 50, 50 }, 0 | 0.198 | 0.158 | 1.25 |
| { 16, 50, 50 }, 1 | 0.239 | 0.259 | 0.92 |
| { 16, 50, 50 }, 2 | 1.036 | 0.263 | 3.93 |
| { 16, 197, 197 }, 0 | 3.178 | 0.309 | 10.27 |
| { 16, 197, 197 }, 1 | 3.152 | 1.032 | 3.05 |
| { 16, 197, 197 }, 2 | 15.053 | 0.961 | 15.66 |
| { 16, 1024, 1024 }, 0 | 127.870 | 50.779 | 2.52 |
| { 16, 1024, 1024 }, 1 | 116.085 | 37.200 | 3.12 |
| { 16, 1024, 1024 }, 2 | 405.589 | 19.363 | 20.95 |
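
For reading the tables above: the x-factor column appears to be the "before" time divided by the "after" time, so values above 1 mean the new code is faster and values below 1 mean a regression. A minimal sketch of that arithmetic follows. The helper name `xFactor` is hypothetical, and the published figures were presumably computed from unrounded timings, so recomputing from the rounded columns can differ in the last digit.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of the PR): speedup expressed as before / after.
// A result above 1.0 means "after" is faster; below 1.0 means it is slower.
static double xFactor(double before, double after)
{
    assert(after > 0.0);  // a timing of zero would make the ratio meaningless
    return before / after;
}
```

For example, `xFactor(5.706, 0.230)` reproduces the macOS M2 entry for `{ 16, 197, 197 }, 2` to within rounding.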


@WanliZhong (Member, Author) commented on Oct 29, 2023 (edited):

The performance test results have been updated; the speedup is very obvious. BTW, I am not sure why the Windows CI failed. It seems unrelated to this PR.

@fengyuentau (Member) commented:

Please take a look at the failure log from default Win64:

C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(276): error C2105: '--' needs l-value (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(288): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(289): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(289): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(290): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(290): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(291): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(291): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(292): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(292): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(293): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(293): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(294): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(294): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(295): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(295): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(312): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(312): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(314): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(314): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]

@fengyuentau (Member) commented:

@asmorkalov This build actually failed, but somehow the workflow did not catch the failure signal and continued: https://github.com/opencv/opencv/actions/runs/6682987045/job/18158738007?pr=24466. It seems `if: ${{ always() && steps.build-opencv.outcome == 'success' }}` from the workflow file does not always work?


@asmorkalov (Contributor) commented:

Windows:

C:/GHA-OCV-3/_work/opencv/opencv/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(276): error C2105: '--' needs l-value
C:/GHA-OCV-3/_work/opencv/opencv/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(288): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator
C:/GHA-OCV-3/_work/opencv/opencv/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(289): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator

@asmorkalov (Contributor) commented:

Windows:

C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(276): error C2105: '--' needs l-value (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]

@WanliZhong (Member, Author) commented:

Thanks @asmorkalov. I found that the code throws `error C2105: '--' needs l-value` on Windows, but I don't think I use the `--` operator. Let me try to solve it.

@asmorkalov (Contributor) commented:

I just tried the armv7 configuration locally. It produces the following warnings (Ubuntu 16.04):

In file included from /home/ubuntu/Projects/opencv-build/modules/dnn/layers/cpu_kernels/softmax_kernels.neon.cpp:3:0:
/home/ubuntu/Projects/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp: In function ‘float cv::dnn::opt_NEON::_calculate_axis(float*, size_t, size_t)’:
/home/ubuntu/Projects/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp:247:26: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     float maxVal = vmax[0];
                          ^
/home/ubuntu/Projects/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp:269:19: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     float s = vs[0] + vs[1] + vs[2] + vs[3];
                   ^
[ 95%] Linking CXX shared library ../../lib/libopencv_dnn.so

@WanliZhong (Member, Author) commented on Oct 30, 2023 (edited):

@asmorkalov That's because the operators `[idx]`, `+`, `-`, `*`, `/` are not overloaded for the vector types on some platforms. I can solve it by copying the result to an array and then performing the operation on the array.
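
A minimal sketch of that workaround, under stated assumptions: `Vec4f`, `loadLanes`, `storeLanes`, `horizontalSum`, and `horizontalMax` are hypothetical stand-ins and not the PR's code. In a real kernel the store would be the platform's own store intrinsic (for example `_mm256_storeu_ps` on AVX or `vst1q_f32` on NEON). The point is only that lanes are read from a plain `float` array instead of indexing or type-punning the vector register, which is what MSVC rejected and GCC warned about above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");

// Hypothetical stand-in for an opaque 4-lane SIMD register.
// Like a real register type, it offers no portable operator[].
struct Vec4f { alignas(16) unsigned char bytes[16]; };

// Hypothetical "load": pack four floats into the register.
static Vec4f loadLanes(const float in[4])
{
    Vec4f v;
    std::memcpy(v.bytes, in, sizeof(v.bytes));
    return v;
}

// Hypothetical "store": copy the register into a plain array.
// memcpy avoids the strict-aliasing problem of casting to float*.
static void storeLanes(const Vec4f& v, float out[4])
{
    std::memcpy(out, v.bytes, sizeof(v.bytes));
}

// Sum of all lanes, computed on the array copy (mirrors vs[0] + vs[1] + ...).
static float horizontalSum(const Vec4f& v)
{
    float lanes[4];
    storeLanes(v, lanes);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

// Maximum over all lanes, computed on the array copy (mirrors vmax[0]).
static float horizontalMax(const Vec4f& v)
{
    float lanes[4];
    storeLanes(v, lanes);
    return std::max(std::max(lanes[0], lanes[1]), std::max(lanes[2], lanes[3]));
}
```

The extra copy costs one store per reduction, which is usually negligible because it happens once per axis slice rather than once per element.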

@asmorkalov (Contributor) commented:

Armv7 (Jetson-tk1) perf results with and without NEON:

Geometric mean (ms)

| Name of Test | dnn-baseline-1 | dnn-NEON-1 | dnn-NEON-1 vs dnn-baseline-1 (x-factor) |
|---|---|---|---|
| Softmax_large::Layer_Softmax::OCV/CPU | 4610.644 | 1452.926 | 3.17 |
| Softmax_middle::Layer_Softmax::OCV/CPU | 27.993 | 8.684 | 3.22 |
| Softmax_small::Layer_Softmax::OCV/CPU | 2.483 | 1.013 | 2.45 |

@asmorkalov (Contributor) commented:

Jetson Tk1 with 2 GBs of RAM:

Note: Google Test filter = Layer_Softmax*
[==========] Running 3 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 3 tests from Layer_Softmax
[ RUN      ] Layer_Softmax.Softmax_small/0, where GetParam() = OCV/CPU
[ PERFSTAT ]    (samples=13   mean=1.15   median=1.14   min=1.13   stddev=0.01 (1.1%))
[       OK ] Layer_Softmax.Softmax_small/0 (22 ms)
[ RUN      ] Layer_Softmax.Softmax_middle/0, where GetParam() = OCV/CPU
[ PERFSTAT ]    (samples=100   mean=8.56   median=8.43   min=8.32   stddev=0.52 (6.0%))
[       OK ] Layer_Softmax.Softmax_middle/0 (933 ms)
[ RUN      ] Layer_Softmax.Softmax_large/0, where GetParam() = OCV/CPU
/home/ubuntu/Projects/opencv/modules/ts/src/ts_perf.cpp:1965: Failure
Failed
Expected: PerfTestBody() doesn't throw an exception.
  Actual: it throws cv::Exception:
  OpenCV(4.8.0-dev) /home/ubuntu/Projects/opencv/modules/core/src/alloc.cpp:73: error: (-4:Insufficient memory) Failed to allocate 398131200 bytes in function 'OutOfMemoryError'
params    =     OCV/CPU
termination reason:  unhandled exception
bytesIn   =          0
bytesOut  =          0
samples   =          0 of 100
outliers  =          0
frequency =          0
[  FAILED  ] Layer_Softmax.Softmax_large/0, where GetParam() = OCV/CPU (1760 ms)
[----------] 3 tests from Layer_Softmax (2717 ms total)
[----------] Global test environment tear-down
[==========] 3 tests from 1 test case ran. (2719 ms total)
[  PASSED  ] 2 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Layer_Softmax.Softmax_large/0, where GetParam() = OCV/CPU

@WanliZhong (Member, Author) commented on Oct 30, 2023 (edited):


The "large" performance test uses an input of shape 16x1080x1920x3, which takes 398131200 bytes. That is too large for the Jetson Tk1 with 2 GB of RAM. I think I need to create a smaller input for the "large" case.
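
The allocation size in the log is consistent with a dense float32 tensor of that shape. A small sketch of the arithmetic, assuming 4-byte `float` elements and no padding (`tensorBytes` is a hypothetical helper, not an OpenCV function):

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");

// Hypothetical helper: bytes needed for a dense float tensor with these dimensions.
// Uses 64-bit arithmetic so large shapes do not overflow a 32-bit int.
static std::uint64_t tensorBytes(const int* dims, int ndims)
{
    std::uint64_t total = sizeof(float);
    for (int i = 0; i < ndims; i++) {
        assert(dims[i] > 0);  // every dimension must be positive
        total *= static_cast<std::uint64_t>(dims[i]);
    }
    return total;
}
```

With the shape from the comment, 16 * 1080 * 1920 * 3 = 99532800 elements, and 99532800 * 4 = 398131200 bytes, which matches the failed allocation for a single buffer.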

@WanliZhong (Member, Author) commented on Oct 30, 2023 (edited):

The error on Windows occurs because a macro was defined as `-2.12194440e-4` and then used as `--2.12194440e-4`. Other compilers treat this as a positive number, but VS2019 on Windows treats it as `--variable`, so the error occurred. 😂
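
One common way to make such a macro robust is to parenthesize its value, so that a leading minus at the use site can never sit directly next to the macro's own minus sign. The sketch below assumes this pattern; the macro name `SKETCH_COEF` is hypothetical, and the commit list for this PR suggests the actual fix was to remove the macro altogether.

```cpp
#include <cassert>

// Fragile form, shown only for contrast (not compiled here):
//   #define SKETCH_COEF_RAW -2.12194440e-4f
//   float y = -SKETCH_COEF_RAW;  // reported in this thread to be read as "--..." by VS2019
//
// Parenthesized form: the expansion is always a self-contained expression.
#define SKETCH_COEF (-2.12194440e-4f)

// Negating the parenthesized macro expands to -(-2.12194440e-4f), a positive value.
static float negatedCoef()
{
    const float y = -SKETCH_COEF;
    assert(y > 0.0f);
    return y;
}
```

A `constexpr float` constant would avoid the preprocessor entirely and is usually the simpler choice in C++.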


@vpisarev (Contributor) commented:

@WanliZhong, excellent job, great acceleration numbers! As we discussed, please refactor the code to reduce code duplication. Then we will gladly merge it.


@WanliZhong (Member, Author) commented:

Update: As discussed with Vadim, I now use only the universal intrinsics to accelerate the softmax layer. The results show that this is even faster than implementing it individually for each platform.

Note: Added performance tests for different axes. The results show that some cases are slower than before, especially small softmax sizes along axis 0 or 1.
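
For reference, a scalar sketch of what a softmax along an arbitrary axis computes. This is not the PR's vectorized kernel: `softmaxAxis` is a hypothetical helper that assumes a dense row-major float tensor. It subtracts the per-slice maximum before exponentiating for numerical stability. For any axis other than the last one, consecutive elements of a slice are `inner` floats apart in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference implementation: softmax over `axis` of a dense row-major tensor.
static std::vector<float> softmaxAxis(const std::vector<float>& src,
                                      const std::vector<int>& shape, int axis)
{
    assert(axis >= 0 && axis < static_cast<int>(shape.size()));
    // View the tensor as [outer, len, inner], where len is the softmax axis.
    std::size_t outer = 1, inner = 1;
    const std::size_t len = static_cast<std::size_t>(shape[axis]);
    for (int i = 0; i < axis; i++)
        outer *= static_cast<std::size_t>(shape[i]);
    for (std::size_t i = static_cast<std::size_t>(axis) + 1; i < shape.size(); i++)
        inner *= static_cast<std::size_t>(shape[i]);
    assert(len > 0 && src.size() == outer * len * inner);

    std::vector<float> dst(src.size());
    for (std::size_t o = 0; o < outer; o++) {
        for (std::size_t in = 0; in < inner; in++) {
            const std::size_t base = o * len * inner + in;
            // Pass 1: maximum of the slice (elements are `inner` apart).
            float maxVal = src[base];
            for (std::size_t k = 1; k < len; k++)
                maxVal = std::max(maxVal, src[base + k * inner]);
            // Pass 2: exponentiate the shifted values and accumulate their sum.
            float sum = 0.f;
            for (std::size_t k = 0; k < len; k++) {
                const float e = std::exp(src[base + k * inner] - maxVal);
                dst[base + k * inner] = e;
                sum += e;
            }
            // Pass 3: normalize so the slice sums to 1.
            for (std::size_t k = 0; k < len; k++)
                dst[base + k * inner] /= sum;
        }
    }
    return dst;
}
```

For example, `softmaxAxis({1.f, 2.f, 3.f}, {1, 3, 1}, 1)` normalizes the three values along the middle axis.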

@WanliZhong (Member, Author) commented:

I have no idea why this error occurs on some platforms.

/home/ci/opencv/modules/dnn/src/layers/cpu_kernels/softmax.cpp:78:32: error: 'cv::hal_baseline::v_float32x4::<unnamed enum> cv::hal_baseline::v_float32x4::nlanes' is private within this context
   78 |     size_t nlanes = v_float32::nlanes;
      |                                ^~~~~~
In file included from /home/ci/opencv/modules/core/include/opencv2/core/hal/intrin.hpp:221,
                 from /home/ci/opencv/modules/dnn/src/layers/cpu_kernels/softmax.hpp:15,
                 from /home/ci/opencv/modules/dnn/src/layers/cpu_kernels/softmax.cpp:13:
/home/ci/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:301:12: note: declared private here
  301 |     enum { nlanes = 4 };
      |            ^~~~~~

@asmorkalov (Contributor) commented on Nov 3, 2023 (edited):

OpenCV has migrated to a new Universal Intrinsics approach to support scalable vector extensions such as RISC-V RVV. The vector size is not known at compile time and may differ at run time. You need to replace:

  • `v_float32::nlanes` -> `VTraits<v_float32>::vlanes()` for loops and other places where it is applicable.
  • `v_float32::nlanes` -> `VTraits<v_float32>::max_nlanes` for local arrays. It defines the maximum possible vector size.
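
The two replacements above can be sketched as follows. This is an illustration and not code from the PR: `maxOf` and the `SKETCH_HAVE_OPENCV` guard are hypothetical names, and the block falls back to a plain scalar loop when the OpenCV headers are not on the include path. It assumes an OpenCV version that already provides `VTraits`.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>

#if defined(__has_include)
#  if __has_include(<opencv2/core/hal/intrin.hpp>)
#    include <opencv2/core/hal/intrin.hpp>
#    define SKETCH_HAVE_OPENCV 1
#  endif
#endif

// Hypothetical helper: maximum of n floats (returns -FLT_MAX when n == 0).
static float maxOf(const float* src, int n)
{
    assert(n >= 0);
    float m = -FLT_MAX;
    int i = 0;
#if defined(SKETCH_HAVE_OPENCV) && (CV_SIMD || CV_SIMD_SCALABLE)
    using namespace cv;
    // Run-time lane count: use it for loop strides and bounds.
    const int vlanes = VTraits<v_float32>::vlanes();
    // Compile-time upper bound: use it to size local arrays.
    float lanes[VTraits<v_float32>::max_nlanes];
    v_float32 vmax = vx_setall_f32(-FLT_MAX);
    for (; i + vlanes <= n; i += vlanes)
        vmax = v_max(vmax, vx_load(src + i));
    // Copy the register to an array and reduce only the lanes that exist at run time.
    v_store(lanes, vmax);
    for (int k = 0; k < vlanes; k++)
        m = std::max(m, lanes[k]);
#endif
    // Scalar tail; this is also the whole computation when SIMD is unavailable.
    for (; i < n; i++)
        m = std::max(m, src[i]);
    return m;
}
```

Sizing the local array with `max_nlanes` while iterating only up to `vlanes()` keeps the same source valid for fixed-width targets and for scalable ones such as RVV.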

vpisarev merged commit ed52f7f into opencv:4.x on Nov 6, 2023
asmorkalov added a commit that referenced this pull request on Nov 11, 2023:
Enable softmax layer vectorization on RISC-V RVV #24510 (related: #24466)
IskXCr pushed a commit to Haosonn/opencv that referenced this pull request on Dec 20, 2023:
* improve and refactor softmax layer
* fix building error
* compatible region layer
* fix axisStep when disable SIMD
* fix dynamic array
* try to fix error
* use nlanes from VTraits
* move axisBias to srcOffset
* fix bug caused by axisBias
* remove macro
* replace #ifdef with #if for CV_SIMD
asmorkalov mentioned this pull request on Jan 19, 2024

Reviewers

fengyuentau left review comments

dkurt left review comments

vpisarev approved these changes

Assignees

No one assigned

Labels

category: dnn (onnx) (ONNX support issues in DNN module), category: dnn, optimization

Projects

None yet

Milestone

4.9.0


5 participants

@WanliZhong, @fengyuentau, @asmorkalov, @vpisarev, @dkurt
