dnn: add attention layer #24476
Conversation
(force-pushed 304584b to 5671dec, then 32293fd to 816c331)

fengyuentau commented Nov 24, 2023
Benchmark results are added.
dkurt commented Nov 24, 2023

PR is good, thanks a lot. The only concern is a potential regression/fallback in backends due to the transition from separate ops to a new fused layer. However, there is no alternative for this problem right now, and I recommend keeping this PR to the default CPU implementation only. Please also take a look at this comment: opencv/opencv_extra#1128 (comment)
```cpp
net.setInput(bias, input_names[2]);
net.setPreferableBackend(backendId);
net.setPreferableTarget(targetId);
```
- I will benchmark on CUDA and with OpenVINO
The current numbers are a bit confusing. Let's wait for #24476 (comment) because it fails.
input shape: [1, 320, 48]
model from opencv/opencv_extra#1128
CPU: 12th Gen Intel(R) Core(TM) i9-12900
| backend | 4.x | PR |
|---|---|---|
| OpenCV, CPU | 18.46ms | 11.90ms (x0.64) |
| OpenVINO, CPU | 0.25ms | 11.91ms |
Well, the acc test has a reduced scale so that it does not take a long time to run. For benchmarking, I use [1, 197, 768] (I have two attention.onnx models, one with input [1, 197, 768], the other with [1, 320, 48]).

For OpenVINO, since we do not have backend-specific graph fusion, we can recreate this subgraph in initNgraph.
Updated results. Please ignore the table above:
input: 1x197x768
CPU: 12th Gen Intel(R) Core(TM) i9-12900
| backend | 4.x | PR |
|---|---|---|
| OpenCV, CPU | 16.93ms | 2.37ms (x7.14) |
| OpenVINO, CPU | 1.54ms | 2.33ms (x0.66) |
So there is a degradation in OpenVINO performance in the case of a fallback to the OpenCV layer.
fengyuentau commented Nov 24, 2023

I will try to implement different backends for this layer in another pull request, just to reduce the review scope and merge this ASAP.
(force-pushed 669b503 to b3068a3)

asmorkalov commented Dec 1, 2023

@dkurt @WanliZhong Friendly reminder.
modules/dnn/perf/perf_net.cpp Outdated
```
| vit_b_32 | 89.92   | 116.22  | 30.36  |
| vit_l_16 | 1593.32 | 1730.74 | 419.92 |
| vit_l_32 | 468.11  | 577.41  | 134.12 |
| VitTrack | 3.80    | 3.87    | 2.25   |
```
Please remove the results from the source code - it's enough to add them to the PR's description.
Running these models in perf_net.cpp takes too much time for now. How about we comment out PERF_TEST_P_(DNNTestNetwork, VIT_B_16) and the like until we get close to or even better inference speed than ORT? In that case, these performance results should be deleted from the source and kept in the first comment of this PR.
You may add a corresponding skip exception:

```cpp
applyTestTag(CV_TEST_TAG_LONG, CV_TEST_TAG_DEBUG_LONG);
```
There is no issue if we specify 4 inputs for "Slice", so there is no need to add new logic with optional inputs:

Track #24609 and proposal #24609 (comment)
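For context on the optional-inputs discussion: ONNX `Slice` takes `data` plus `starts`/`ends`, while `axes` and `steps` are optional 4th and 5th inputs. A hedged NumPy emulation of these semantics (a simplified sketch, not OpenCV's importer code; `onnx_slice` is a hypothetical helper name):

```python
import numpy as np

def onnx_slice(data, starts, ends, axes=None, steps=None):
    """Simplified emulation of ONNX Slice. When the optional inputs are
    omitted, axes default to 0..len(starts)-1 and steps default to 1."""
    if axes is None:
        axes = list(range(len(starts)))
    if steps is None:
        steps = [1] * len(starts)
    slices = [slice(None)] * data.ndim
    for a, s, e, st in zip(axes, starts, ends, steps):
        # ONNX clamps out-of-range ends (e.g. end=INT64_MAX, which the
        # fusion code must handle) to the dimension size; Python slicing
        # already behaves this way.
        slices[a] = slice(s, e, st)
    return data[tuple(slices)]

x = np.arange(12).reshape(3, 4)
print(onnx_slice(x, [0], [2]))            # rows 0..1, axes/steps omitted
print(onnx_slice(x, [1], [4], [1], [2]))  # columns 1 and 3
```

This is why a pattern matcher written for a fixed number of `Slice` inputs can miss equivalent graphs: exporters may emit 3, 4, or 5 inputs for the same slicing operation.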
fengyuentau commented Dec 1, 2023

I don't think the proposal is good enough. We actually have other models that have
dkurt commented Dec 4, 2023

@fengyuentau, got it. So in this PR, can you please do a simpler workaround, like a parameter with the number of Slice inputs? This problem should be solved separately.

```cpp
class AttentionSubGraph : public Subgraph {
public:
    AttentionSubGraph(int numSliceInps) {
        std::vector<std::string> inps(1 + numSliceInps, att_add);
        for (int i = 0; i < numSliceInps; ++i)
            inps[i + 1] = addNodeToMatch("");
        slice_v = addNodeToMatch("Slice", inps);
    }
};
```
…; add test for attention subgraph fusion
…t is matched; clean comments
(force-pushed 1bc226e to 846237d)

asmorkalov commented Dec 20, 2023

CUDA:
asmorkalov commented Dec 20, 2023

@opencv-alalek Please update test data on Buildbot.
fengyuentau commented Dec 20, 2023

I was also looking into this issue. It is so weird that it only fails on CUDA_FP16 with almost double the threshold value. Could we apply a skip tag?
asmorkalov commented Dec 20, 2023

`CV_TEST_TAG_DNN_SKIP_CUDA_FP16` - sure.
asmorkalov left a comment
👍
Make default axis of softmax in ONNX "-1" without opset option opencv#24613

Try to solve problem: opencv#24476 (comment)

**ONNX**
- `opset <= 11`: use 1
- else: use -1

**TensorFlow**
- TF version = 2.x: use -1
- else: use 1

**Darknet, Caffe, Torch**
- use 1 by definition
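The defaults above matter because, for inputs of rank higher than 2, softmax over axis 1 and softmax over the last axis give different results. A hypothetical NumPy illustration (not OpenCV code) for a [batch, seq, channels] tensor:

```python
import numpy as np

def softmax(x, axis):
    # subtract the max for numerical stability, then normalize along `axis`
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

x = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

s_axis1 = softmax(x, axis=1)    # old ONNX default (opset <= 11)
s_last = softmax(x, axis=-1)    # new ONNX default / TF 2.x behavior

print(np.allclose(s_axis1.sum(axis=1), 1.0))   # normalized over axis 1
print(np.allclose(s_last.sum(axis=-1), 1.0))   # normalized over last axis
print(np.allclose(s_axis1, s_last))            # the two defaults disagree
```

So an importer that picks the wrong default silently changes the model's output rather than failing loudly.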
dnn: add attention layer opencv#24476

Resolves opencv#24609
Merge with: opencv/opencv_extra#1128.

Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention.

TODO:
- [x] benchmark (before this PR vs. with this PR vs. ORT).
- [x] Layer fusion: Take care of Slice with end=INT64_MAX.
- [x] Layer fusion: match more potential attention (ViT) patterns.
  - [x] Single-head attention is supported.
- [x] Test AttentionSubgraph fusion.
- [x] Add acc tests for VIT_B_32 and VitTrack
- [x] Add perf tests for VIT_B_32 and VitTrack

## Benchmarks

Platform: Macbook Air M1.

### Attention Subgraph

Input scale: [1, 197, 768].

| | mean (ms) | median (ms) | min (ms) |
| ---------------------- | --------- | ----------- | -------- |
| w/ Attention (this PR) | 3.75 | 3.68 | 3.22 |
| w/o Attention | 9.06 | 9.01 | 8.24 |
| ORT (python) | 4.32 | 2.63 | 2.50 |

### ViTs

All data in milliseconds (ms).

| ViTs | With Attention | Without Attention | ORT |
| -------- | -------------- | ----------------- | ------ |
| vit_b_16 | 302.77 | 365.35 | 109.70 |
| vit_b_32 | 89.92 | 116.22 | 30.36 |
| vit_l_16 | 1593.32 | 1730.74 | 419.92 |
| vit_l_32 | 468.11 | 577.41 | 134.12 |
| VitTrack | 3.80 | 3.87 | 2.25 |

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
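For reference, a minimal, framework-agnostic sketch of what the fused layer computes in the single-head case (scaled dot-product attention, following the onnxruntime com.microsoft.Attention spec linked above; the shapes and weight names here are illustrative, not OpenCV's implementation, and bias/mask inputs are omitted):

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention.
    x: [batch, seq_len, hidden]; wq/wk/wv: [hidden, head_dim]."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v   # [batch, seq_len, head_dim]

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 197, 768))   # ViT-like input, as in the benchmark
wq, wk, wv = (rng.standard_normal((768, 64)) for _ in range(3))
out = attention(x, wq, wk, wv)
print(out.shape)  # (1, 197, 64)
```

Fusing this chain of MatMul/Transpose/Softmax ops into one layer is what removes the intermediate tensors and gives the speedup reported in the tables above.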