dnn (cuda): support broadcasting if a.rank() != b.rank() #24834
Conversation
fengyuentau commented Jan 9, 2024

Tried adding yolov8n to test on different backends, but it turns out we may have more problems, especially on the CUDA_FP16 target:
Abdurrahheem commented Jan 9, 2024

@fengyuentau once this PR is complete (currently yolov8 is not supported on CUDA here, AFAIK), does it mean that PR #24786 is going to be obsolete?
fengyuentau commented Jan 9, 2024

Yes.

It's not true. There are some minor differences in the results between CPU and CUDA, which is OK I think, but the differences are much bigger when it comes to the CUDA_FP16 target. I guess we lose some accuracy in
asmorkalov commented Jan 9, 2024

Locally I observe several test failures like this:

Full list:
fengyuentau commented Jan 9, 2024

This was because there are inputs of shape [1] (1-d Mat) in these failed tests, and the CUDA backend asserts on such inputs. It worked previously because it was not actually testing the CUDA backend: if two inputs have different numbers of dimensions, it falls back to the CPU implementation, so these cases tested nothing related to the CUDA backend. See below for the fallback (lines 804-805):

opencv/modules/dnn/src/layers/nary_eltwise_layers.cpp Lines 800 to 811 in 5c9ad9d

With that being said, I propose turning off these tests specifically for the CUDA backend. @asmorkalov What do you think? @WanliZhong Please join this discussion as well.
fengyuentau commented Jan 9, 2024

Or we could still fall back to CPU when the dimension is 1.
WanliZhong commented Jan 9, 2024

I propose falling back when dim is 1, to make sure CUDA runs correctly rather than throwing an error.
fengyuentau commented Jan 9, 2024

That does not work, because the 1-d Mat is actually produced inside the broadcasting implementation in the CUDA backend. Let me find another solution to this.
fengyuentau commented Jan 9, 2024

New commits should resolve this problem.
asmorkalov commented Jan 10, 2024

Tests pass with CUDA locally now.
fengyuentau commented Jan 10, 2024

Sporadic crash in
Inspired by #24786. This PR keeps the fusion of NaryEltwise and Concat while addressing the data missing problem by supporting broadcasting if a.rank() != b.rank().

Resolves #23977
Resolves #24606
Resolves #24635
Resolves #24721
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.