Einsum Layer Performance Test #24445


Merged
asmorkalov merged 6 commits into opencv:4.x from Abdurrahheem:ash/dev_einsum_pref
Nov 8, 2023

Conversation

@Abdurrahheem
Contributor

Abdurrahheem commented Oct 24, 2023 (edited)

This PR adds performance tests for the Einsum layer. The results of the performance tests on different inputs are shown below.

Notation:

  • WX: windows10_x64
  • MX: macos_x64
  • MA: macos_arm64
  • UX: ubuntu_x64
  • UA: ubuntu_arm64

All timings are in milliseconds (ms).
Gemm is the backend used for matrix multiplication.


Benchmarks:

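| Equation                | Inputs Mat Dims                 | UX (ms)      | UA (ms) | MX (ms) | MA (ms) | WX (ms) |
|-------------------------|---------------------------------|--------------|---------|---------|---------|---------|
| "ij, jk -> ik"          | [2, 3], [3, 2]                  | 0.04 ± 0.00  | -       | -       | -       | -       |
| "ij, jk -> ik"          | [20, 30], [30, 20]              | 0.08 ± 0.00  | -       | -       | -       | -       |
| "ij, jk -> ik"          | [113, 127], [127, 113]          | 2.41 ± 0.05  | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 7, 9], [1, 5, 9, 8]      | 0.11 ± 0.00  | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 70, 90], [1, 5, 90, 80]  | 15.49 ± 0.46 | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 73, 91], [1, 5, 91, 57]  | 11.53 ± 0.06 | -       | -       | -       | -       |
| "ij -> i"               | [30, 40]                        | 0.03 ± 0.00  | -       | -       | -       | -       |
| "ij -> i"               | [113, 374]                      | 0.13 ± 0.00  | -       | -       | -       | -       |
| "...ij -> ...i"         | [30, 40]                        | 0.03 ± 0.00  | -       | -       | -       | -       |
| "...ij -> ...i"         | [113, 374]                      | 0.13 ± 0.00  | -       | -       | -       | -       |
| "...ij, ...jk -> ...ik" | [40, 50], [50, 80]              | 0.37 ± 0.01  | -       | -       | -       | -       |
| "...ij, ...jk -> ...ik" | [47, 51], [51, 83]              | 0.43 ± 0.01  | -       | -       | -       | -       |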

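For readers unfamiliar with the equation notation in the table, the following is a minimal sketch of how an output shape follows from an einsum equation and its input shapes. It is plain Python for illustration only, not the OpenCV implementation; the helper name `einsum_output_shape` is hypothetical, and ellipsis (`...`) terms and size-1 broadcasting are deliberately left out.

```python
def einsum_output_shape(equation, shapes):
    """Infer the output shape of an einsum equation (no ellipsis support)."""
    lhs, rhs = equation.replace(" ", "").split("->")
    terms = lhs.split(",")
    if len(terms) != len(shapes):
        raise ValueError("one shape is required per input term")

    sizes = {}  # maps each index label to its dimension size
    for term, shape in zip(terms, shapes):
        if len(term) != len(shape):
            raise ValueError(f"term '{term}' does not match rank {len(shape)}")
        for label, dim in zip(term, shape):
            # A label shared between inputs must have the same size everywhere.
            if sizes.setdefault(label, dim) != dim:
                raise ValueError(f"size mismatch for label '{label}'")

    # Labels missing from the right-hand side are summed over and disappear.
    return [sizes[label] for label in rhs]


print(einsum_output_shape("ij, jk -> ik", [[2, 3], [3, 2]]))                       # [2, 2]
print(einsum_output_shape("imkj, injs -> imnks", [[1, 4, 7, 9], [1, 5, 9, 8]]))  # [1, 4, 5, 7, 8]
print(einsum_output_shape("ij -> i", [[30, 40]]))                                 # [30]
```

In each case the label `j` does not appear after `->`, so it is the reduced dimension.
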
Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • [x] I agree to contribute to the project under Apache 2 License.
  • [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • [x] The PR is proposed to the proper branch
  • [ ] There is a reference to the original bug report and related work
  • [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • [x] The feature is well documented and sample code can be built with the project CMake

@Abdurrahheem self-assigned this Oct 24, 2023
@Abdurrahheem marked this pull request as ready for review October 24, 2023 18:33
@fengyuentau
Member

IIRC, cv::gemm is used in the einsum layer. Could you use fastGemm and make a comparison?

dkurt reacted with thumbs up emoji

@asmorkalov
Contributor

[ RUN      ] Layer_Einsum.einsum/5, where GetParam() = Eqiation: imkj, injs -> imnks
InputSize: 2
OutputSize: 1
InputShape 0: 100 400 700 900
InputShape 1: 100 500 900 800
[ERROR:0@1279.725] global net_impl.cpp:1197 getLayerShapesRecursively OPENCV/DNN: []:(_input): getMemoryShapes() post validation failed. inputs=2 outputs=2/2 blobs=0 inplace=0
[ERROR:0@1279.728] global net_impl.cpp:1204 getLayerShapesRecursively     input[0] = [ 100 400 700 900 ]
[ERROR:0@1279.728] global net_impl.cpp:1204 getLayerShapesRecursively     input[1] = [ 100 500 900 800 ]
[ERROR:0@1279.728] global net_impl.cpp:1208 getLayerShapesRecursively     output[0] = [ 100 400 700 900 ]
[ERROR:0@1279.728] global net_impl.cpp:1208 getLayerShapesRecursively     output[1] = [ 100 500 900 800 ]
[ERROR:0@1279.728] global net_impl.cpp:1214 getLayerShapesRecursively Exception message: OpenCV(4.8.0-dev) /home/ci/opencv/modules/dnn/src/net_impl.cpp:1193: error: (-2:Unspecified error) in function 'void cv::dnn::dnn4_v20230620::Net::Impl::getLayerShapesRecursively(int, cv::dnn::dnn4_v20230620::Net::Impl::LayersShapesMap&)'
>  (expected: 'total(os[i]) > 0'), where
>     'total(os[i])' is -569803776
> must be greater than
>     '0' is 0
/home/ci/opencv/modules/ts/src/ts_perf.cpp:1965: Failure
Failed
Expected: PerfTestBody() doesn't throw an exception.
  Actual: it throws cv::Exception:  OpenCV(4.8.0-dev) /home/ci/opencv/modules/dnn/src/net_impl.cpp:1193: error: (-2:Unspecified error) in function 'void cv::dnn::dnn4_v20230620::Net::Impl::getLayerShapesRecursively(int, cv::dnn::dnn4_v20230620::Net::Impl::LayersShapesMap&)'
>  (expected: 'total(os[i]) > 0'), where
>     'total(os[i])' is -569803776
> must be greater than
>     '0' is 0
params    = Eqiation: imkj, injs -> imnks
InputSize: 2
OutputSize: 1
InputShape 0: 100 400 700 900
InputShape 1: 100 500 900 800
termination reason:  unhandled exception
bytesIn   =          0
bytesOut  =          0
samples   =          0 of 1
outliers  =          0
frequency =          0
[  FAILED  ] Layer_Einsum.einsum/5, where GetParam() = Eqiation: imkj, injs -> imnks
InputSize: 2
OutputSize: 1
InputShape 0: 100 400 700 900
InputShape 1: 100 500 900 800
 (1288349 ms)
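
The negative value `-569803776` in the log above is consistent with the element count of a `[100, 400, 700, 900]` blob wrapping around a 32-bit signed integer. This is an inference from the numbers in the log, not something stated in the thread. A minimal sketch of the arithmetic, with a hypothetical helper `wrap_int32`:

```python
from math import prod


def wrap_int32(value):
    """Reinterpret an integer as a two's-complement 32-bit signed value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


shape = [100, 400, 700, 900]   # output[0] shape reported in the log
true_total = prod(shape)       # exact element count

print(true_total)              # 25200000000
print(wrap_int32(true_total))  # -569803776
```

The exact count, 25,200,000,000 elements, is well above the 32-bit signed maximum of 2,147,483,647, so this test case is too large regardless of run time.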

@asmorkalov
Contributor

Abduragim will add fastGemm with the next iteration.

fengyuentau and dkurt reacted with thumbs up emoji

@asmorkalov added this to the 4.9.0 milestone Oct 26, 2023
Contributor

@asmorkalov left a comment

👍

@asmorkalov
Contributor

@dkurt @fengyuentau I want to merge the PR. fastGemm will be integrated in the next one to simplify performance comparison. Do you have any concerns?

Member

@fengyuentau left a comment

Also, what is the time cost on CI for these tests? Is it tolerable (< 1000 ms, for example)?

@Abdurrahheem
ContributorAuthor

> Also, what is the time cost on CI for these tests? Is it tolerable (< 1000 ms, for example)?

@asmorkalov

@asmorkalov
Contributor

On my old PC without AVX2: 17 tests from 1 test case ran. (15615 ms total)
The longest case is:

[ RUN      ] Layer_Einsum.einsum/7, where GetParam() = Eqiation=imkj, injs -> imnks, InputSize=2, OutputSize=1, InputShape={{1, 4, 700, 900}, {1, 5, 900, 800}}.
[ PERFSTAT ]    (samples=10   mean=1273.04   median=1274.26   min=1261.88   stddev=6.78 (0.5%))

@fengyuentau
Member

@Abdurrahheem Could you collect the perf results from the detail pages and fill in your table in the first comment?

@asmorkalov
Contributor

@fengyuentau I propose to rerun the benchmark locally and update the PR. CI runs perf tests with a single iteration and concurrently with other builds, so its numbers are not reliable.
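
To illustrate why a single-iteration run says little about variability, here is a minimal sketch that summarizes a list of timings as mean and standard deviation. It is not OpenCV's perf framework; `summarize` is a hypothetical helper, and the use of the population standard deviation is an assumption chosen so that a single sample yields 0.0.

```python
import statistics


def summarize(timings_ms):
    """Return (mean, population stddev) of a list of timings in milliseconds."""
    mean = statistics.mean(timings_ms)
    stddev = statistics.pstdev(timings_ms)  # 0.0 when only one sample exists
    return mean, stddev


# A single sample: the spread is always zero, whatever the real noise is.
print(summarize([3578.70]))

# Hypothetical timings, made up for illustration only.
print(summarize([10.0, 12.0, 14.0]))
```

With one sample the reported spread is zero by construction, so only repeated runs on an otherwise idle machine give a usable `mean ± stddev` figure.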

@fengyuentau
Member

ARM64: ~3.5s

[ RUN ] Layer_Einsum.einsum/7, where GetParam() = Eqiation=imkj, injs -> imnks, InputSize=2, OutputSize=1, InputShape={{1, 4, 700, 900}, {1, 5, 900, 800}}
[ PERFSTAT ] (samples=1 mean=3578.70 median=3578.70 min=3578.70 stddev=0.00 (0.0%))

X64: ~1.8s

[ RUN ] Layer_Einsum.einsum/7, where GetParam() = Eqiation=imkj, injs -> imnks, InputSize=2, OutputSize=1, InputShape={{1, 4, 700, 900}, {1, 5, 900, 800}}
[ PERFSTAT ] (samples=1 mean=1797.72 median=1797.72 min=1797.72 stddev=0.00 (0.0%))

Win-X64: ~7.6s

[ RUN ] Layer_Einsum.einsum/7, where GetParam() = Eqiation=imkj, injs -> imnks, InputSize=2, OutputSize=1, InputShape={{1, 4, 700, 900}, {1, 5, 900, 800}}
[ PERFSTAT ] (samples=1 mean=7660.13 median=7660.13 min=7660.13 stddev=0.00 (0.0%))

I propose to reduce it to a smaller scale.
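
As a rough way to compare test sizes, the sketch below counts multiply-adds for an einsum equation, assuming a naive evaluation that performs one multiply-add per combination of index values. The helper `einsum_mac_count` is hypothetical, ignores ellipsis terms, and says nothing about how the OpenCV layer actually schedules the work.

```python
from math import prod


def einsum_mac_count(equation, shapes):
    """Naive multiply-add count: product of the sizes of all distinct labels."""
    lhs = equation.replace(" ", "").split("->")[0]
    sizes = {}
    for term, shape in zip(lhs.split(","), shapes):
        for label, dim in zip(term, shape):
            sizes[label] = dim
    return prod(sizes.values())


eq = "imkj, injs -> imnks"
small = einsum_mac_count(eq, [[1, 4, 70, 90], [1, 5, 90, 80]])
large = einsum_mac_count(eq, [[1, 4, 700, 900], [1, 5, 900, 800]])

print(small)           # 10080000
print(large)           # 10080000000
print(large // small)  # 1000
```

Under this naive count, the `{1, 4, 700, 900}` case does 1000 times the work of the `[1, 4, 70, 90]` case from the table.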

@asmorkalov
Contributor

@Abdurrahheem friendly reminder.

@Abdurrahheem
ContributorAuthor

Abdurrahheem commented Nov 3, 2023 (edited)

> @fengyuentau I propose to rerun the benchmark locally and update the PR. CI runs perf tests with a single iteration and concurrently with other builds, so its numbers are not reliable.

Currently I can only test locally on Ubuntu, as I do not have access to the other platforms.

@Abdurrahheem
ContributorAuthor

Updated the table with performance results.

@Abdurrahheem mentioned this pull request Nov 8, 2023
6 tasks
Member

@fengyuentau left a comment

Thank you!

@asmorkalov merged commit 9d0c8a9 into opencv:4.x Nov 8, 2023
IskXCr pushed a commit to Haosonn/opencv that referenced this pull request Dec 20, 2023
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
@asmorkalov mentioned this pull request Jan 19, 2024
thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024

Reviewers

@asmorkalov approved these changes

@fengyuentau approved these changes

@dkurt Awaiting requested review from dkurt

Assignees

@Abdurrahheem

Projects

None yet

Milestone

4.9.0


3 participants

@Abdurrahheem @fengyuentau @asmorkalov
