
Improve and refactor softmax layer #24466


Merged

vpisarev merged 11 commits into opencv:4.x from WanliZhong:refactor_softmax

Nov 6, 2023


Conversation

WanliZhong (Member) commented Oct 28, 2023 (edited)

This PR improves softmax from ficus nn.

Performance test results (minimum values, multi-threaded):

macOS M2

Name of Test	before	after	after vs before (x-factor)
{ 16, 50, 50 }, 0	0.047	0.048	0.98
{ 16, 50, 50 }, 1	0.052	0.075	0.69
{ 16, 50, 50 }, 2	0.367	0.045	8.19
{ 16, 197, 197 }, 0	0.700	0.256	2.73
{ 16, 197, 197 }, 1	0.602	0.368	1.64
{ 16, 197, 197 }, 2	5.706	0.230	24.81
{ 16, 1024, 1024 }, 0	17.143	18.464	0.93
{ 16, 1024, 1024 }, 1	16.001	30.027	0.53
{ 16, 1024, 1024 }, 2	162.174	3.120	51.99

Ubuntu, Intel Core i7-12700K: 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads.

Name of Test	before	after	after vs before (x-factor)
{ 16, 50, 50 }, 0	0.017	0.060	0.29
{ 16, 50, 50 }, 1	0.022	0.058	0.38
{ 16, 50, 50 }, 2	0.198	0.042	4.78
{ 16, 197, 197 }, 0	0.425	0.130	3.26
{ 16, 197, 197 }, 1	0.368	0.674	0.55
{ 16, 197, 197 }, 2	3.281	0.164	20.00
{ 16, 1024, 1024 }, 0	27.985	6.639	4.22
{ 16, 1024, 1024 }, 1	21.230	22.219	0.96
{ 16, 1024, 1024 }, 2	91.406	4.153	22.01

Ubuntu Loongnix

Name of Test	before	after	after vs before (x-factor)
{ 16, 50, 50 }, 0	0.198	0.158	1.25
{ 16, 50, 50 }, 1	0.239	0.259	0.92
{ 16, 50, 50 }, 2	1.036	0.263	3.93
{ 16, 197, 197 }, 0	3.178	0.309	10.27
{ 16, 197, 197 }, 1	3.152	1.032	3.05
{ 16, 197, 197 }, 2	15.053	0.961	15.66
{ 16, 1024, 1024 }, 0	127.870	50.779	2.52
{ 16, 1024, 1024 }, 1	116.085	37.200	3.12
{ 16, 1024, 1024 }, 2	405.589	19.363	20.95

WanliZhong added the optimization, category: dnn, and category: dnn (onnx) labels

Oct 28, 2023

WanliZhong added this to the 4.9.0 milestone

Oct 28, 2023

WanliZhong requested review from dkurt, fengyuentau and vpisarev

October 28, 2023 17:16

WanliZhong (Member, Author) commented Oct 29, 2023 (edited)

The performance test results have been updated; the speedup is very clear. By the way, I am not sure why the Windows CI failed; it does not seem to be related to this PR.

fengyuentau reviewed

Oct 30, 2023

modules/dnn/src/layers/cpu_kernels/softmax.hpp (outdated, resolved)

modules/dnn/src/layers/cpu_kernels/softmax_kernels.default.hpp (outdated, resolved)

modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp (outdated, resolved)

modules/dnn/src/layers/softmax_layer.cpp (outdated, resolved)

fengyuentau (Member) commented Oct 30, 2023

Please take a look at the failed log from default Win64:

C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(276): error C2105: '--' needs l-value (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(288): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(289): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(289): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(290): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(290): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(291): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(291): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(292): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(292): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(293): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(293): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(294): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(294): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(295): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(295): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(312): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(312): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(314): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]
C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(314): error C2088: '[': illegal for union (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]

fengyuentau (Member) commented Oct 30, 2023

@asmorkalov This build actually failed, but somehow the workflow did not catch the failure signal and continued: https://github.com/opencv/opencv/actions/runs/6682987045/job/18158738007?pr=24466. It seems that if: ${{ always() && steps.build-opencv.outcome == 'success' }} from the workflow file does not always work?

asmorkalov (Contributor) commented Oct 30, 2023

Windows:

C:/GHA-OCV-3/_work/opencv/opencv/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(276): error C2105: '--' needs l-value
C:/GHA-OCV-3/_work/opencv/opencv/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(288): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator
C:/GHA-OCV-3/_work/opencv/opencv/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(289): error C2676: binary '[': '__m256' does not define this operator or a conversion to a type acceptable to the predefined operator

dkurt reviewed

Oct 30, 2023

modules/dnn/perf/perf_layer.cpp (outdated, resolved)

modules/dnn/perf/perf_layer.cpp (resolved)

modules/dnn/src/layers/softmax_layer.cpp (outdated, resolved)

asmorkalov (Contributor) commented Oct 30, 2023

Windows:

C:/build/precommit_windows64/4.x/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp(276): error C2105: '--' needs l-value (compiling source file C:\build\precommit_windows64\build\modules\dnn\layers\cpu_kernels\softmax_kernels.avx.cpp) [C:\build\precommit_windows64\build\modules\dnn\opencv_dnn_AVX.vcxproj]

WanliZhong (Member, Author) commented Oct 30, 2023

Thanks @asmorkalov. I found that the code throws error C2105: '--' needs l-value on Windows, but I don't think I use the -- operator. Let me try to solve it.

asmorkalov (Contributor) commented Oct 30, 2023

I just tried the armv7 configuration locally. It produces the following warning (Ubuntu 16.04):

In file included from /home/ubuntu/Projects/opencv-build/modules/dnn/layers/cpu_kernels/softmax_kernels.neon.cpp:3:0:
/home/ubuntu/Projects/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp: In function ‘float cv::dnn::opt_NEON::_calculate_axis(float*, size_t, size_t)’:
/home/ubuntu/Projects/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp:247:26: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     float maxVal = vmax[0];
                         ^
/home/ubuntu/Projects/opencv/modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp:269:19: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     float s = vs[0] + vs[1] + vs[2] + vs[3];
                  ^
[ 95%] Linking CXX shared library ../../lib/libopencv_dnn.so

WanliZhong (Member, Author) commented Oct 30, 2023 (edited)

@asmorkalov That's because the operators [idx], +, -, *, / are not overloaded for the vector types on some platforms. I can solve it by copying the result to an array and then performing the operation on the array.
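The workaround described above can be sketched as follows. This is only an illustration, not the code from the PR: `Vec4f`, `makeVec4f`, `horizontalSum` and `horizontalMax` are hypothetical names, and `Vec4f` is an opaque stand-in for a platform vector type so that the snippet compiles without any SIMD headers. Real code would copy the lanes out with the platform's store intrinsic instead of `std::memcpy`.

```cpp
#include <cstring>

// Hypothetical stand-in for a platform vector register holding 4 floats.
// It is opaque on purpose: no operator[] and no arithmetic operators.
struct Vec4f { alignas(16) unsigned char raw[16]; };

// Hypothetical helper: pack four floats into the opaque vector.
static Vec4f makeVec4f(float a, float b, float c, float d) {
    const float lanes[4] = {a, b, c, d};
    Vec4f v;
    std::memcpy(v.raw, lanes, sizeof(lanes));
    return v;
}

// Copy the lanes into a plain float array first, then reduce with scalar code.
// This avoids indexing the vector type and avoids type-punned pointer casts.
static float horizontalSum(const Vec4f& v) {
    float lanes[4];
    std::memcpy(lanes, v.raw, sizeof(lanes));
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

static float horizontalMax(const Vec4f& v) {
    float lanes[4];
    std::memcpy(lanes, v.raw, sizeof(lanes));
    float m = lanes[0];
    for (int i = 1; i < 4; ++i)
        if (lanes[i] > m) m = lanes[i];
    return m;
}
```

Copying to a local array costs a store and a few scalar loads, which is usually acceptable for a reduction that runs once per row.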

asmorkalov (Contributor) commented Oct 30, 2023

Armv7 (Jetson-tk1) perf results with and without NEON:

Geometric mean (ms)

Name of Test	dnn-baseline-1	dnn-NEON-1	dnn-NEON-1 vs dnn-baseline-1 (x-factor)
Softmax_large::Layer_Softmax::OCV/CPU	4610.644	1452.926	3.17
Softmax_middle::Layer_Softmax::OCV/CPU	27.993	8.684	3.22
Softmax_small::Layer_Softmax::OCV/CPU	2.483	1.013	2.45

dkurt reviewed

Oct 30, 2023

modules/dnn/src/layers/cpu_kernels/softmax.cpp (outdated, resolved)

asmorkalov (Contributor) commented Oct 30, 2023

Jetson TK1 with 2 GB of RAM:

Note: Google Test filter = Layer_Softmax*
[==========] Running 3 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 3 tests from Layer_Softmax
[ RUN      ] Layer_Softmax.Softmax_small/0, where GetParam() = OCV/CPU
[ PERFSTAT ]    (samples=13   mean=1.15   median=1.14   min=1.13   stddev=0.01 (1.1%))
[       OK ] Layer_Softmax.Softmax_small/0 (22 ms)
[ RUN      ] Layer_Softmax.Softmax_middle/0, where GetParam() = OCV/CPU
[ PERFSTAT ]    (samples=100   mean=8.56   median=8.43   min=8.32   stddev=0.52 (6.0%))
[       OK ] Layer_Softmax.Softmax_middle/0 (933 ms)
[ RUN      ] Layer_Softmax.Softmax_large/0, where GetParam() = OCV/CPU
/home/ubuntu/Projects/opencv/modules/ts/src/ts_perf.cpp:1965: Failure
Failed
Expected: PerfTestBody() doesn't throw an exception.
  Actual: it throws cv::Exception:
  OpenCV(4.8.0-dev) /home/ubuntu/Projects/opencv/modules/core/src/alloc.cpp:73: error: (-4:Insufficient memory) Failed to allocate 398131200 bytes in function 'OutOfMemoryError'
params    =     OCV/CPU
termination reason:  unhandled exception
bytesIn   =          0
bytesOut  =          0
samples   =          0 of 100
outliers  =          0
frequency =          0
[  FAILED  ] Layer_Softmax.Softmax_large/0, where GetParam() = OCV/CPU (1760 ms)
[----------] 3 tests from Layer_Softmax (2717 ms total)
[----------] Global test environment tear-down
[==========] 3 tests from 1 test case ran. (2719 ms total)
[  PASSED  ] 2 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Layer_Softmax.Softmax_large/0, where GetParam() = OCV/CPU

WanliZhong (Member, Author) commented Oct 30, 2023 (edited)

The "large" performance test uses an input of shape 16x1080x1920x3, which takes 398131200 bytes. That is too large for this board, so I think I need to create a smaller input for the "large" case.
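The byte count in the failed allocation can be reproduced from the shape. The sketch below assumes 4-byte elements (32-bit floats), which is the element size that matches the 398131200 bytes in the log; `tensorBytes` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <initializer_list>

// Bytes needed for a dense tensor: product of all dimensions times element size.
static std::uint64_t tensorBytes(std::initializer_list<std::uint64_t> shape,
                                 std::uint64_t elemSize) {
    std::uint64_t total = elemSize;
    for (std::uint64_t dim : shape)
        total *= dim;
    return total;
}

// 16 * 1080 * 1920 * 3 = 99,532,800 elements; at 4 bytes each that is
// 398,131,200 bytes (about 380 MiB) for a single buffer.
static const std::uint64_t kLargeCaseBytes = tensorBytes({16, 1080, 1920, 3}, 4);
```

A softmax test needs at least an input and an output buffer of this size, so the memory actually required is a multiple of this figure.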

WanliZhong (Member, Author) commented Oct 30, 2023 (edited)

The error on Windows occurs because a macro was defined as -2.12194440e-4 and then used with a leading minus sign, which expands to --2.12194440e-4. Other compilers treat this as a positive number, but VS2019 on Windows treats it as a -- (decrement) applied to the operand, so the error occurred. 😂
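A common way to make such a macro safe is to parenthesize its body, so that a unary minus in front of it can never merge with the macro's own minus sign. The sketch below is illustrative only: the macro names are hypothetical and the PR may have fixed the problem differently (a later commit in this thread is titled "remove macro").

```cpp
// Hypothetical macro names; the constant is the one quoted in the comment above.
// Unsafe form (shown only as a comment):  #define COEF_UNSAFE -2.12194440e-4
// Writing -COEF_UNSAFE can end up as the text `--2.12194440e-4`, which the
// compiler reported in this thread rejected with "'--' needs l-value".

// Safe form: the parentheses keep the constant's sign inside the expansion.
#define COEF_SAFE (-2.12194440e-4)

static double negatedCoefficient() {
    return -COEF_SAFE;  // expands to -(-2.12194440e-4)
}
```

Replacing the macro with a typed constant (for example a `constexpr` variable) avoids the textual expansion altogether.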

fengyuentau reviewed

Oct 31, 2023

modules/dnn/test/test_onnx_importer.cpp (outdated, resolved)

modules/dnn/src/layers/softmax_layer.cpp (outdated, resolved)

modules/dnn/src/layers/cpu_kernels/softmax.hpp (outdated, resolved)

modules/dnn/perf/perf_layer.cpp (outdated, resolved)

modules/dnn/src/layers/cpu_kernels/softmax.cpp (outdated, resolved)

modules/dnn/src/layers/cpu_kernels/softmax.cpp (resolved)

modules/dnn/src/layers/cpu_kernels/softmax_kernels.simd.hpp (outdated, resolved)

vpisarev (Contributor) commented Nov 1, 2023

@WanliZhong, excellent job, great acceleration numbers! As we discussed, please, refactor the code to reduce code duplication. Then we will gladly merge it.

improve and refactor softmax layer

790da1b

WanliZhong force-pushed the refactor_softmax branch from cbf0474 to 790da1b

November 2, 2023 06:26

WanliZhong (Member, Author) commented Nov 2, 2023

Update: As discussed with Vadim, I now use only the universal intrinsics to accelerate the softmax layer. The results show that this is even faster than implementing it individually for each platform.

Note: I added performance tests on different axes. The results show that some cases are slower than before, especially small softmax inputs with axis 0 or 1.
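To make the role of the axis concrete, here is a scalar reference softmax over one axis of a dense row-major tensor. It is a minimal sketch and not the PR's kernel: the function name and the `outer`/`axisLen`/`inner` parameters are assumptions, and it uses the usual max-subtraction for numerical stability.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// outer   = product of the dimensions before the softmax axis
// axisLen = size of the softmax axis
// inner   = product of the dimensions after the axis (stride between axis elements)
static void softmaxAxis(const std::vector<float>& src, std::vector<float>& dst,
                        std::size_t outer, std::size_t axisLen, std::size_t inner) {
    dst.resize(src.size());
    for (std::size_t o = 0; o < outer; ++o) {
        for (std::size_t i = 0; i < inner; ++i) {
            const std::size_t base = o * axisLen * inner + i;
            // Pass 1: maximum along the axis (elements are `inner` floats apart).
            float maxVal = src[base];
            for (std::size_t k = 1; k < axisLen; ++k)
                maxVal = std::max(maxVal, src[base + k * inner]);
            // Pass 2: exponentiate the shifted values and accumulate their sum.
            float sum = 0.f;
            for (std::size_t k = 0; k < axisLen; ++k) {
                const float e = std::exp(src[base + k * inner] - maxVal);
                dst[base + k * inner] = e;
                sum += e;
            }
            // Pass 3: normalize so the values along the axis sum to 1.
            for (std::size_t k = 0; k < axisLen; ++k)
                dst[base + k * inner] /= sum;
        }
    }
}
```

Reading the test names in the tables as shape and axis (an assumption), a {16, 50, 50} input with axis 2 gives outer = 800, axisLen = 50, inner = 1, so the axis is contiguous in memory. With axis 0 it gives outer = 1, axisLen = 16, inner = 2500, so consecutive axis elements are 2500 floats apart.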

dkurt reviewed

Nov 2, 2023

modules/dnn/src/layers/softmax_layer.cpp (outdated, resolved)

fix building error

4c729bd

WanliZhong added 3 commits

November 2, 2023 20:44

compatible region layer

928b3f4

fix axisStep when disable SIMD

8ab200d

fix dynamic array

e4b37d3

WanliZhong (Member, Author) commented Nov 2, 2023

I have no idea why this error occurs on some platforms.

/home/ci/opencv/modules/dnn/src/layers/cpu_kernels/softmax.cpp:78:32: error: 'cv::hal_baseline::v_float32x4::<unnamed enum> cv::hal_baseline::v_float32x4::nlanes' is private within this context
   78 |     size_t nlanes = v_float32::nlanes;
      |                                ^~~~~~
In file included from /home/ci/opencv/modules/core/include/opencv2/core/hal/intrin.hpp:221,
                 from /home/ci/opencv/modules/dnn/src/layers/cpu_kernels/softmax.hpp:15,
                 from /home/ci/opencv/modules/dnn/src/layers/cpu_kernels/softmax.cpp:13:
/home/ci/opencv/modules/core/include/opencv2/core/hal/intrin_neon.hpp:301:12: note: declared private here
  301 |     enum { nlanes = 4 };
      |            ^~~~~~

try to fix error

c6a349a

asmorkalov (Contributor) commented Nov 3, 2023 (edited)

OpenCV has migrated to a new Universal Intrinsics approach in order to support scalable vector architectures such as RISC-V RVV. The vector size is not known at compile time and may differ at run time. You need to replace:

v_float32::nlanes -> VTraits<v_float32>::vlanes() in loops and other places where it is applicable
v_float32::nlanes -> VTraits<v_float32>::max_nlanes for local arrays. It defines the maximum possible vector size.
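The two replacements follow one pattern: a run-time lane count for loop steps and a compile-time upper bound for array sizes. The sketch below shows that pattern without depending on OpenCV. Everything inside the `sketch` namespace is a hypothetical stand-in with made-up values, not OpenCV's implementation; only the member names `vlanes()` and `max_nlanes` are taken from the comment above.

```cpp
#include <cstddef>

// Stand-ins so the snippet compiles without OpenCV headers (hypothetical values).
namespace sketch {
struct v_float32 {};                    // placeholder for the real vector type
template <typename T> struct VTraits;
template <> struct VTraits<v_float32> {
    enum { max_nlanes = 16 };           // compile-time upper bound, usable as array size
    static int vlanes() { return 4; }   // actual lane count, queried at run time
};
}  // namespace sketch

// Sum n floats in blocks of vlanes() elements, with a scalar tail.
static float blockedSum(const float* data, std::size_t n) {
    using namespace sketch;
    // Loop step: the run-time lane count.
    const std::size_t step = static_cast<std::size_t>(VTraits<v_float32>::vlanes());
    // Local array: sized by the compile-time maximum, only `step` entries are used.
    float lanes[VTraits<v_float32>::max_nlanes] = {0};
    std::size_t i = 0;
    for (; i + step <= n; i += step)
        for (std::size_t l = 0; l < step; ++l)
            lanes[l] += data[i + l];    // stands in for one vector add
    float sum = 0.f;
    for (std::size_t l = 0; l < step; ++l)
        sum += lanes[l];
    for (; i < n; ++i)                  // scalar tail for the remaining elements
        sum += data[i];
    return sum;
}
```

Sizing the local array with the maximum keeps it a fixed-size stack array even when the real lane count is only known at run time, which is why the two cases use different traits.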

WanliZhong added 3 commits

November 3, 2023 15:05

use nlanes from VTraits

399e92a

move axisBias to srcOffset

e9a8b31

fix bug caused by axisBias

fc77182

fengyuentau reviewed

Nov 3, 2023

modules/dnn/src/layers/cpu_kernels/softmax.cpp (outdated, resolved)

WanliZhong added 2 commits

November 4, 2023 15:00

remove macro

c49a332

replace #ifdef with #if for CV_SIMD

ac9e410

vpisarev approved these changes

Nov 6, 2023

vpisarev merged commit ed52f7f into opencv:4.x

Nov 6, 2023

asmorkalov mentioned this pull request

Nov 8, 2023

Enable softmax layer vectorization on RISC-V RVV #24510

Merged

6 tasks

asmorkalov added a commit that referenced this pull request

Nov 11, 2023

Merge pull request #24510 from asmorkalov:as/softmax_rvv

960a926

Enable softmax layer vectorization on RISC-V RVV #24510

Related: #24466

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

IskXCr pushed a commit to Haosonn/opencv that referenced this pull request

Dec 20, 2023

Improve and refactor softmax layer (opencv#24466)

a9f1f73

* improve and refactor softmax layer
* fix building error
* compatible region layer
* fix axisStep when disable SIMD
* fix dynamic array
* try to fix error
* use nlanes from VTraits
* move axisBias to srcOffset
* fix bug caused by axisBias
* remove macro
* replace #ifdef with #if for CV_SIMD

IskXCr pushed a commit to Haosonn/opencv that referenced this pull request

Dec 20, 2023

Merge pull request opencv#24510 from asmorkalov:as/softmax_rvv

c7ad145

thewoz pushed a commit to thewoz/opencv that referenced this pull request

Jan 4, 2024

Improve and refactor softmax layer (opencv#24466)

17f0573

thewoz pushed a commit to thewoz/opencv that referenced this pull request

Jan 4, 2024

Merge pull request opencv#24510 from asmorkalov:as/softmax_rvv

bdd525b

asmorkalov mentioned this pull request

Jan 19, 2024

5.x merge 4.x #24862

Merged

thewoz pushed a commit to thewoz/opencv that referenced this pull request

May 29, 2024

Improve and refactor softmax layer (opencv#24466)

acb0dcf

thewoz pushed a commit to thewoz/opencv that referenced this pull request

May 29, 2024

Merge pull request opencv#24510 from asmorkalov:as/softmax_rvv

4ea2b5f

Labels

category: dnn (onnx): ONNX support issues in DNN module

category: dnn

optimization
