dnn: add attention layer #24476


Merged

asmorkalov merged 26 commits into opencv:4.x from fengyuentau:attention_layer on Dec 20, 2023

Conversation

@fengyuentau
Member

fengyuentau commented on Nov 1, 2023 (edited)

Resolves #24609

Merge with: opencv/opencv_extra#1128.

Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention.
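For reference, a minimal sketch of how such a model could be loaded and timed through the cv::dnn API (assumptions: a single-input ONNX file named attention.onnx and the [1, 197, 768] input shape used for the benchmarks below; this is not the PR's actual benchmark code):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/dnn.hpp>
#include <iostream>

int main()
{
    // Assumption: a single-input model containing the attention subgraph.
    cv::dnn::Net net = cv::dnn::readNetFromONNX("attention.onnx");

    cv::Mat input(std::vector<int>{1, 197, 768}, CV_32F);
    cv::randu(input, cv::Scalar(-1), cv::Scalar(1));  // dummy data, only used for timing
    net.setInput(input);
    net.forward();  // warm-up run; the graph is initialized and fused on the first forward

    cv::TickMeter tm;
    for (int i = 0; i < 100; ++i)
    {
        tm.start();
        net.forward();
        tm.stop();
    }
    std::cout << "mean forward time: " << tm.getAvgTimeMilli() << " ms" << std::endl;
    return 0;
}
```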

TODO:

  • Benchmark (before this PR vs. with this PR vs. ORT).
  • Layer fusion: take care of Slice with end=INT64_MAX.
  • Layer fusion: match more potential attention (ViT) patterns.
    • Single-head attention is supported.
  • Test AttentionSubgraph fusion.
  • Add accuracy tests for VIT_B_32 and VitTrack.
  • Add performance tests for VIT_B_32 and VitTrack.

Benchmarks

Platform: MacBook Air M1.

Attention Subgraph

Input shape: [1, 197, 768].

|                        | mean (ms) | median (ms) | min (ms) |
| ---------------------- | --------- | ----------- | -------- |
| w/ Attention (this PR) | 3.75      | 3.68        | 3.22     |
| w/o Attention          | 9.06      | 9.01        | 8.24     |
| ORT (python)           | 4.32      | 2.63        | 2.50     |

ViTs

All data in milliseconds (ms).

| ViTs     | With Attention | Without Attention | ORT    |
| -------- | -------------- | ----------------- | ------ |
| vit_b_16 | 302.77         | 365.35            | 109.70 |
| vit_b_32 | 89.92          | 116.22            | 30.36  |
| vit_l_16 | 1593.32        | 1730.74           | 419.92 |
| vit_l_32 | 468.11         | 577.41            | 134.12 |
| VitTrack | 3.80           | 3.87              | 2.25   |

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under the Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV.
  • The PR is proposed to the proper branch.
  • There is a reference to the original bug report and related work.
  • There are accuracy tests, performance tests and test data in the opencv_extra repository, if applicable.
    The patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake.

fengyuentau added the feature, category: dnn, and category: dnn (onnx) (ONNX support issues in DNN module) labels on Nov 1, 2023
fengyuentau added this to the 4.9.0 milestone on Nov 1, 2023
@fengyuentau
Member (Author)

Benchmark results are added.

@dkurt
Member

The PR is good, thanks a lot. The only concern is the potential regression/fallback in backends because of the transition from separate ops to a new fused layer.

However, there are no alternatives for this problem for now, and I recommend keeping this PR only for the default CPU implementation.

Please also take a look at this comment: opencv/opencv_extra#1128 (comment)

    net.setInput(bias, input_names[2]);

    net.setPreferableBackend(backendId);
    net.setPreferableTarget(targetId);
Member

  • I will benchmark on CUDA and with OpenVINO

Member

The current numbers are a bit confusing. Let's wait for #24476 (comment) because it fails.

Input shape: [1, 320, 48]
Model from opencv/opencv_extra#1128
CPU: 12th Gen Intel(R) Core(TM) i9-12900

| backend       | 4.x     | PR              |
| ------------- | ------- | --------------- |
| OpenCV, CPU   | 18.46ms | 11.90ms (x0.64) |
| OpenVINO, CPU | 0.25ms  | 11.91ms         |

Member (Author)

Well, the accuracy test has a reduced scale so that it does not take a lot of time to run. For benchmarking, I use [1, 197, 768] (I have two attention.onnx models, one with input [1, 197, 768], the other with [1, 320, 48]).


For OpenVINO, since we do not have backend-specific graph fusion, we can recreate this subgraph in initNgraph.
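For illustration only, a standalone sketch (not OpenCV's actual initNgraph code) of what rebuilding the scaled-dot-product part of the subgraph from stock OpenVINO ops could look like; the pre-split single-head Q/K/V inputs and the shapes are assumptions:

```cpp
#include <openvino/openvino.hpp>
#include <openvino/opsets/opset8.hpp>
#include <cmath>

// Build softmax(Q * K^T / sqrt(dim)) * V as a small ov::Model.
std::shared_ptr<ov::Model> makeAttentionSubgraph(size_t seq, size_t dim)
{
    using namespace ov::opset8;

    auto q = std::make_shared<Parameter>(ov::element::f32, ov::Shape{1, seq, dim});
    auto k = std::make_shared<Parameter>(ov::element::f32, ov::Shape{1, seq, dim});
    auto v = std::make_shared<Parameter>(ov::element::f32, ov::Shape{1, seq, dim});

    auto scores = std::make_shared<MatMul>(q, k, false, true);  // Q * K^T
    auto scale  = Constant::create(ov::element::f32, ov::Shape{},
                                   {1.0f / std::sqrt((float)dim)});
    auto scaled = std::make_shared<Multiply>(scores, scale);
    auto probs  = std::make_shared<Softmax>(scaled, -1);        // softmax over the last axis
    auto out    = std::make_shared<MatMul>(probs, v, false, false);

    return std::make_shared<ov::Model>(ov::OutputVector{out}, ov::ParameterVector{q, k, v});
}
```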

Member

Updated results. Please ignore the table above:

Input: 1x197x768
CPU: 12th Gen Intel(R) Core(TM) i9-12900

| backend       | 4.x     | PR             |
| ------------- | ------- | -------------- |
| OpenCV, CPU   | 16.93ms | 2.37ms (x7.14) |
| OpenVINO, CPU | 1.54ms  | 2.33ms (x0.66) |

So there is a degradation in OpenVINO performance in the case of a fallback to the OpenCV layer.

@fengyuentau
Member (Author)

I will try to implement different backends for this layer in another pull request, just to keep the review small and merge this ASAP.


@asmorkalov
Contributor

@dkurt @WanliZhong Friendly reminder.
@vpisarev Could you join the PR review?

| vit_b_32 | 89.92 | 116.22 | 30.36 |
| vit_l_16 | 1593.32 | 1730.74 | 419.92 |
| vit_l_32 | 468.11 | 577.41 | 134.12 |
| VitTrack | 3.80 | 3.87 | 2.25 |
Member

Please remove the results from the source code - it's enough to add them to the PR's description.

Member (Author)

Running these models in perf_net.cpp takes too much time for now. How about we comment out PERF_TEST_P_(DNNTestNetwork, VIT_B_16) and the like until we get inference speed close to, or even better than, ORT? In that case, these performance results would be deleted from the source and kept in the first comment of this PR.

Member

You may add a corresponding skip exception:

    applyTestTag(CV_TEST_TAG_LONG, CV_TEST_TAG_DEBUG_LONG);
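A possible placement for that tag, assuming the heavy ViT cases live next to the other DNNTestNetwork perf tests (sketch only; the real test body is omitted):

```cpp
PERF_TEST_P_(DNNTestNetwork, VIT_B_16)
{
    // Only run this heavy case when long/debug-long test tags are enabled.
    applyTestTag(CV_TEST_TAG_LONG, CV_TEST_TAG_DEBUG_LONG);
    // ... set up and run the model the same way the other perf cases do.
}
```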

Member

dkurt left a comment (edited)

There is no issue if we specify 4 inputs for "Slice", so there is no need to add new logic with optional inputs:

Track #24609 and proposal #24609 (comment)

@fengyuentau
Member (Author)

I don't think the proposal is good enough. We actually have other models that have Step set to a non-default value. Can we achieve something like a per-Subgraph switch that omits optional inputs when turned on?

@dkurt
Member

@fengyuentau, got it. So in this PR, can you please do a simpler workaround, like a parameter with the number of Slice inputs? This problem should be solved separately.

    class AttentionSubGraph : public Subgraph {
    public:
        AttentionSubGraph(int numSliceInps) {
            std::vector<std::string> inps(1 + numSliceInps, att_add);
            for (int i = 0; i < numSliceInps; ++i)
                inps[i + 1] = addNodeToMatch("");
            slice_v = addNodeToMatch("Slice", inps);
        }
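If that constructor parameter were adopted, registering both Slice variants in the ONNX graph simplifier might look roughly like the following (hypothetical sketch; the merged code may solve this differently):

```cpp
// Hypothetical registration inside simplifySubgraphs() in onnx_graph_simplifier.cpp:
subgraphs.push_back(makePtr<AttentionSubGraph>(3));  // Slice with starts/ends/axes only
subgraphs.push_back(makePtr<AttentionSubGraph>(4));  // Slice with starts/ends/axes/steps
```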

@asmorkalov
Contributor

CUDA:

    [ RUN      ] Test_ONNX_nets.ViT_B_32/1, where GetParam() = CUDA/CUDA_FP16
    /home/ci/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
    Expected: (normL1) <= (l1), actual: 0.00828554 vs 0.004
    ViTB_32  |ref| = 2.8940291404724121
    /home/ci/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
    Expected: (normInf) <= (lInf), actual: 0.0407351 vs 0.02
    ViTB_32  |ref| = 2.8940291404724121
    [  FAILED  ] Test_ONNX_nets.ViT_B_32/1, where GetParam() = CUDA/CUDA_FP16 (1393 ms)

@asmorkalov
Contributor

@opencv-alalek Please update test data on Buildbot.


@fengyuentau
Member (Author)

CUDA:

    [ RUN      ] Test_ONNX_nets.ViT_B_32/1, where GetParam() = CUDA/CUDA_FP16
    /home/ci/opencv/modules/dnn/test/test_common.impl.hpp:76: Failure
    Expected: (normL1) <= (l1), actual: 0.00828554 vs 0.004
    ViTB_32  |ref| = 2.8940291404724121
    /home/ci/opencv/modules/dnn/test/test_common.impl.hpp:79: Failure
    Expected: (normInf) <= (lInf), actual: 0.0407351 vs 0.02
    ViTB_32  |ref| = 2.8940291404724121
    [  FAILED  ] Test_ONNX_nets.ViT_B_32/1, where GetParam() = CUDA/CUDA_FP16 (1393 ms)

I was also looking into this issue. It is strange that it only fails on CUDA_FP16, and with almost double the threshold value. Could we apply the skip tag CV_TEST_TAG_DNN_SKIP_CUDA_FP16 for now?

@asmorkalov
Contributor

CV_TEST_TAG_DNN_SKIP_CUDA_FP16 - sure.
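For reference, the usual shape of such a skip in the dnn accuracy tests (the exact placement in the ViT test is an assumption):

```cpp
// Hypothetical sketch inside the ViT accuracy test body:
if (backend == DNN_BACKEND_CUDA && target == DNN_TARGET_CUDA_FP16)
    applyTestTag(CV_TEST_TAG_DNN_SKIP_CUDA_FP16);
```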


Contributor

asmorkalov left a comment

👍

asmorkalov merged commit 0521a3a into opencv:4.x on Dec 20, 2023
fengyuentau deleted the attention_layer branch on December 21, 2023 09:12
asmorkalov mentioned this pull request on Jan 19, 2024
fengyuentau mentioned this pull request on Feb 21, 2024
thewoz pushed a commit to thewoz/opencv that referenced this pull request on May 29, 2024

Make default axis of softmax in onnx "-1" without opset option (opencv#24613)

Try to solve problem: opencv#24476 (comment)

**ONNX**
`opset <= 11` use 1
`else` use -1

**TensorFlow**
`TF version = 2.x` use -1
`else` use 1

**Darknet, Caffe, Torch**
use 1 by definition
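The axis selection described in that commit message can be summarized as follows (a hypothetical helper, not the actual OpenCV code):

```cpp
#include <string>

// Default softmax axis per importer, as described in opencv#24613 (sketch).
int defaultSoftmaxAxis(const std::string& framework, int version)
{
    if (framework == "onnx")
        return version <= 11 ? 1 : -1;  // ONNX: opset <= 11 uses 1, otherwise -1
    if (framework == "tensorflow")
        return version >= 2 ? -1 : 1;   // TF 2.x uses -1, otherwise 1
    return 1;                           // Darknet, Caffe, Torch use 1 by definition
}
```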
thewoz pushed a commit to thewoz/opencv that referenced this pull request on May 29, 2024

dnn: add attention layer (opencv#24476)

Reviewers

dkurt approved these changes

WanliZhong left review comments

asmorkalov approved these changes

Awaiting requested review from vpisarev

Assignees

@vpisarev

Labels

category: dnn (onnx) (ONNX support issues in DNN module), category: dnn, feature

Projects

None yet

Milestone

4.9.0

Development

Successfully merging this pull request may close these issues.

dnn graph simplifier: support optional constant inputs when match

5 participants

@fengyuentau @dkurt @asmorkalov @vpisarev @WanliZhong
