
Einsum Layer Performance Test #24445


Merged

asmorkalov merged 6 commits into opencv:4.x from Abdurrahheem:ash/dev_einsum_pref

Nov 8, 2023


Conversation

Contributor

Abdurrahheem commented Oct 24, 2023 (edited)

This PR adds performance tests for the Einsum layer. The results for a range of input shapes are listed below.

Notation:

WX: windows10_x64
MX: macos_x64
MA: macos_arm64
UX: ubuntu_x64
UA: ubuntu_arm64

All timings are in milliseconds (ms). A dash means the platform was not measured.
Gemm is the backend used for matrix multiplication.

Benchmarks:

| Equation                | Input Mat Dims                    | UX (ms)        | UA (ms) | MX (ms) | MA (ms) | WX (ms) |
|-------------------------|-----------------------------------|----------------|---------|---------|---------|---------|
| "ij, jk -> ik"          | [2, 3], [3, 2]                    | 0.04 ± 0.00    | -       | -       | -       | -       |
| "ij, jk -> ik"          | [20, 30], [30, 20]                | 0.08 ± 0.00    | -       | -       | -       | -       |
| "ij, jk -> ik"          | [113, 127], [127, 113]            | 2.41 ± 0.05    | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 7, 9], [1, 5, 9, 8]        | 0.11 ± 0.00    | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 70, 90], [1, 5, 90, 80]    | 15.49 ± 0.46   | -       | -       | -       | -       |
| "imkj, injs -> imnks"   | [1, 4, 73, 91], [1, 5, 91, 57]    | 11.53 ± 0.06   | -       | -       | -       | -       |
| "ij -> i"               | [30, 40]                          | 0.03 ± 0.00    | -       | -       | -       | -       |
| "ij -> i"               | [113, 374]                        | 0.13 ± 0.00    | -       | -       | -       | -       |
| "...ij -> ...i"         | [30, 40]                          | 0.03 ± 0.00    | -       | -       | -       | -       |
| "...ij -> ...i"         | [113, 374]                        | 0.13 ± 0.00    | -       | -       | -       | -       |
| "...ij, ...jk -> ...ik" | [40, 50], [50, 80]                | 0.37 ± 0.01    | -       | -       | -       | -       |
| "...ij, ...jk -> ...ik" | [47, 51], [51, 83]                | 0.43 ± 0.01    | -       | -       | -       | -       |
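For readers less familiar with einsum notation, the following is a minimal Python sketch of how the benchmarked equations map input shapes to an output shape. It is not the OpenCV implementation: the function name `einsum_output_shape` is hypothetical, ellipsis terms are left out, and shared labels are required to match exactly (no size-1 broadcasting), which is a simplifying assumption.

```python
def einsum_output_shape(equation, shapes):
    """Infer the output shape of an explicit einsum equation.

    Hypothetical helper for illustration only. Ellipsis ("...") terms are
    deliberately not handled in this sketch.
    """
    if "..." in equation:
        raise NotImplementedError("ellipsis is outside the scope of this sketch")
    lhs, rhs = equation.replace(" ", "").split("->")
    terms = lhs.split(",")
    if len(terms) != len(shapes):
        raise ValueError("expected one shape per input term")

    sizes = {}  # index label -> dimension size
    for term, shape in zip(terms, shapes):
        if len(term) != len(shape):
            raise ValueError(f"term {term!r} does not match shape {shape}")
        for label, dim in zip(term, shape):
            # A label shared between inputs must have the same size everywhere.
            if sizes.setdefault(label, dim) != dim:
                raise ValueError(f"label {label!r} has conflicting sizes")

    # Labels that do not appear on the right-hand side are summed over.
    return [sizes[label] for label in rhs]


print(einsum_output_shape("ij, jk -> ik", [[2, 3], [3, 2]]))  # [2, 2]
print(einsum_output_shape("imkj, injs -> imnks", [[1, 4, 7, 9], [1, 5, 9, 8]]))  # [1, 4, 5, 7, 8]
print(einsum_output_shape("ij -> i", [[30, 40]]))  # [30]
```

In "imkj, injs -> imnks" the label `j` is absent from the output, so it is the contracted (summed) dimension, which is why the second-to-last dimension of the second input must equal the last dimension of the first.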

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

Abdurrahheem added 2 commits

October 24, 2023 19:18

1st iteration of perf_einsum

65cdc6d

working perf tests

ea84c2b

Abdurrahheem added the category: dnn label

Oct 24, 2023

Abdurrahheem requested review from asmorkalov, dkurt and fengyuentau

October 24, 2023 18:32

Abdurrahheem self-assigned this

Oct 24, 2023

Abdurrahheem marked this pull request as ready for review

October 24, 2023 18:33

Member

fengyuentau commented Oct 25, 2023

IIRC, cv::gemm is used in the einsum layer. Could you use fastGemm and make a comparison?

Contributor

asmorkalov commented Oct 25, 2023

[ RUN      ] Layer_Einsum.einsum/5, where GetParam() = Eqiation: imkj, injs -> imnks InputSize: 2 OutputSize: 1 InputShape 0: 100 400 700 900 InputShape 1: 100 500 900 800
[ERROR:0@1279.725] global net_impl.cpp:1197 getLayerShapesRecursively OPENCV/DNN: []:(_input): getMemoryShapes() post validation failed. inputs=2 outputs=2/2 blobs=0 inplace=0
[ERROR:0@1279.728] global net_impl.cpp:1204 getLayerShapesRecursively     input[0] = [ 100 400 700 900 ]
[ERROR:0@1279.728] global net_impl.cpp:1204 getLayerShapesRecursively     input[1] = [ 100 500 900 800 ]
[ERROR:0@1279.728] global net_impl.cpp:1208 getLayerShapesRecursively     output[0] = [ 100 400 700 900 ]
[ERROR:0@1279.728] global net_impl.cpp:1208 getLayerShapesRecursively     output[1] = [ 100 500 900 800 ]
[ERROR:0@1279.728] global net_impl.cpp:1214 getLayerShapesRecursively Exception message: OpenCV(4.8.0-dev) /home/ci/opencv/modules/dnn/src/net_impl.cpp:1193: error: (-2:Unspecified error) in function 'void cv::dnn::dnn4_v20230620::Net::Impl::getLayerShapesRecursively(int, cv::dnn::dnn4_v20230620::Net::Impl::LayersShapesMap&)'
>  (expected: 'total(os[i]) > 0'), where
>     'total(os[i])' is -569803776
> must be greater than
>     '0' is 0

/home/ci/opencv/modules/ts/src/ts_perf.cpp:1965: Failure
Failed
Expected: PerfTestBody() doesn't throw an exception.
  Actual: it throws cv::Exception:  OpenCV(4.8.0-dev) /home/ci/opencv/modules/dnn/src/net_impl.cpp:1193: error: (-2:Unspecified error) in function 'void cv::dnn::dnn4_v20230620::Net::Impl::getLayerShapesRecursively(int, cv::dnn::dnn4_v20230620::Net::Impl::LayersShapesMap&)'
>  (expected: 'total(os[i]) > 0'), where
>     'total(os[i])' is -569803776
> must be greater than
>     '0' is 0

params    = Eqiation: imkj, injs -> imnks InputSize: 2 OutputSize: 1 InputShape 0: 100 400 700 900 InputShape 1: 100 500 900 800
termination reason:  unhandled exception
bytesIn   =          0
bytesOut  =          0
samples   =          0 of 1
outliers  =          0
frequency =          0
[  FAILED  ] Layer_Einsum.einsum/5, where GetParam() = Eqiation: imkj, injs -> imnks InputSize: 2 OutputSize: 1 InputShape 0: 100 400 700 900 InputShape 1: 100 500 900 800  (1288349 ms)
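The negative element count in the log above is what one would get if the product of the shape [100, 400, 700, 900] were stored in a signed 32-bit integer. The Python sketch below reproduces that arithmetic. That `total()` accumulates in a 32-bit `int` is an assumption inferred from the reported number, not something stated in this thread, and `wrap_int32` is a hypothetical helper.

```python
from math import prod


def wrap_int32(value):
    """Reinterpret an integer as a two's-complement signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


shape = [100, 400, 700, 900]      # input[0] / output[0] from the log
true_total = prod(shape)          # exact element count
wrapped = wrap_int32(true_total)  # value after 32-bit wraparound (assumed storage)

print(true_total)  # 25200000000
print(wrapped)     # -569803776
```

The wrapped value matches the `'total(os[i])' is -569803776` line exactly, which suggests the test shape was simply too large for the shape validation rather than the einsum layer itself being at fault.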

Contributor

asmorkalov commented Oct 25, 2023

Abduragim will add fastGemm in the next iteration.

added new tests

fac3d71

asmorkalov added the test label

Oct 26, 2023

asmorkalov reviewed

Oct 26, 2023

modules/dnn/perf/perf_einsum.cpp (outdated, resolved)

asmorkalov added this to the 4.9.0 milestone

Oct 26, 2023

fix stdout format & more tests

44c322f

asmorkalov approved these changes

Oct 26, 2023

Contributor

asmorkalov left a comment

👍

Contributor

asmorkalov commented Oct 26, 2023

@dkurt @fengyuentau I want to merge the PR. fastGemm will be integrated in the next one to simplify performance comparison. Do you have any concerns?

fengyuentau reviewed

Oct 26, 2023

Member

fengyuentau left a comment

Also, what is the time cost of these tests on CI? Is it tolerable (< 1000 ms, for example)?

modules/dnn/perf/perf_einsum.cpp (outdated, resolved)

Contributor, Author

Abdurrahheem commented Oct 26, 2023

> Also, what is the time cost of these tests on CI? Is it tolerable (< 1000 ms, for example)?

@asmorkalov

PR fix

f6f543f

Contributor

asmorkalov commented Oct 26, 2023

On my old PC without AVX2: 17 tests from 1 test case ran. (15615 ms total)
The longest case is:

[ RUN      ] Layer_Einsum.einsum/7, where GetParam() = Eqiation=imkj, injs -> imnks, InputSize=2, OutputSize=1, InputShape={{1, 4, 700, 900}, {1, 5, 900, 800}}.
[ PERFSTAT ]    (samples=10   mean=1273.04   median=1274.26   min=1261.88   stddev=6.78 (0.5%))

Member

fengyuentau commented Oct 27, 2023

@Abdurrahheem Could you collect the perf results from the detail pages and add them to the table in the first comment?

Contributor

asmorkalov commented Oct 27, 2023

@fengyuentau I propose to rerun the benchmark locally and update the PR. CI runs perf tests with a single iteration and concurrently with other builds, so the numbers are not reliable.
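To illustrate why a single-iteration run says little about variance, here is a small Python sketch that summarizes timing samples in the same shape as a `[ PERFSTAT ]` line. The `summarize` helper is hypothetical, and the use of the population standard deviation is an assumption (the thread does not say which estimator the perf framework uses). The three-sample list is made-up illustration data, not a measurement.

```python
import statistics


def summarize(samples_ms):
    """Summarize timing samples (in ms) in a PERFSTAT-like dictionary."""
    mean = statistics.fmean(samples_ms)
    # Population standard deviation: an assumption made for this sketch.
    stddev = statistics.pstdev(samples_ms)
    return {
        "samples": len(samples_ms),
        "mean": mean,
        "median": statistics.median(samples_ms),
        "min": min(samples_ms),
        "stddev": stddev,
        "stddev_pct": 100.0 * stddev / mean if mean else 0.0,
    }


# One sample, as in a single-iteration CI run: the spread is necessarily zero.
single = summarize([3578.70])
print(single["stddev"])  # 0.0

# Hypothetical samples (not measured) showing a non-zero spread.
several = summarize([10.0, 12.0, 14.0])
print(several["mean"], several["median"], several["min"])  # 12.0 12.0 10.0
```

With `samples=1` the reported `stddev=0.00 (0.0%)` carries no information about run-to-run noise, which is the reason given above for rerunning the benchmark locally with more iterations.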

Member

fengyuentau commented Oct 30, 2023

ARM64: ~3.5s

[ RUN ] Layer_Einsum.einsum/7, where GetParam() = Eqiation=imkj, injs -> imnks, InputSize=2, OutputSize=1, InputShape={{1, 4, 700, 900}, {1, 5, 900, 800}}
[ PERFSTAT ] (samples=1 mean=3578.70 median=3578.70 min=3578.70 stddev=0.00 (0.0%))

X64: ~1.8s

[ RUN ] Layer_Einsum.einsum/7, where GetParam() = Eqiation=imkj, injs -> imnks, InputSize=2, OutputSize=1, InputShape={{1, 4, 700, 900}, {1, 5, 900, 800}}
[ PERFSTAT ] (samples=1 mean=1797.72 median=1797.72 min=1797.72 stddev=0.00 (0.0%))

Win-X64: ~7.6s

[ RUN ] Layer_Einsum.einsum/7, where GetParam() = Eqiation=imkj, injs -> imnks, InputSize=2, OutputSize=1, InputShape={{1, 4, 700, 900}, {1, 5, 900, 800}}
[ PERFSTAT ] (samples=1 mean=7660.13 median=7660.13 min=7660.13 stddev=0.00 (0.0%))

I propose reducing the scale of this test.
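As a rough sense of scale for this case, the Python sketch below counts multiply-adds for "imkj, injs -> imnks" with inputs {1, 4, 700, 900} and {1, 5, 900, 800}, and divides by the mean times quoted in this thread. It assumes a naive count of one multiply-add per output element per value of the contracted index `j`; the work the layer actually performs may differ.

```python
from math import prod

# Label sizes for "imkj, injs -> imnks" with the shapes from the logs above.
sizes = {"i": 1, "m": 4, "n": 5, "k": 700, "j": 900, "s": 800}

output_elements = prod(sizes[label] for label in "imnks")
macs = output_elements * sizes["j"]  # j is summed over (naive count)


def macs_per_second(mac_count, mean_ms):
    """Throughput implied by a mean time in milliseconds."""
    return mac_count / (mean_ms / 1000.0)


# Mean times (ms) as reported in the comments above.
reported_ms = {
    "old PC without AVX2": 1273.04,
    "ARM64": 3578.70,
    "X64": 1797.72,
    "Win-X64": 7660.13,
}

print(output_elements)  # 11200000
print(macs)             # 10080000000
for name, mean_ms in reported_ms.items():
    print(f"{name}: {macs_per_second(macs, mean_ms) / 1e9:.2f} GMAC/s")
```

Under the same naive count, the [1, 4, 70, 90], [1, 5, 90, 80] case from the table needs 1000 times fewer multiply-adds, since `k`, `j` and `s` each shrink by a factor of 10.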

Contributor

asmorkalov commented Nov 3, 2023

@Abdurrahheem friendly reminder.

removed long tests

a0f90f6

Contributor, Author

Abdurrahheem commented Nov 3, 2023 (edited)

> @fengyuentau I propose to rerun the benchmark locally and update the PR. CI runs perf tests with a single iteration and concurrently with other builds, so the numbers are not reliable.

I can currently test only on Ubuntu locally, because I do not have access to the other platforms.

Contributor, Author

Abdurrahheem commented Nov 7, 2023

Updated the table with performance results.

Abdurrahheem mentioned this pull request

Nov 8, 2023

Fast gemm for einsum #24509

Merged

6 tasks

fengyuentau approved these changes

Nov 8, 2023

Member

fengyuentau left a comment

Thank you!

asmorkalov merged commit 9d0c8a9 into opencv:4.x

Nov 8, 2023

IskXCr pushed a commit to Haosonn/opencv that referenced this pull request

Dec 20, 2023

Merge pull request opencv#24445 from Abdurrahheem:ash/dev_einsum_pref

4fc7ecf


thewoz pushed a commit to thewoz/opencv that referenced this pull request

Jan 4, 2024

Merge pull request opencv#24445 from Abdurrahheem:ash/dev_einsum_pref

5914dfd


asmorkalov mentioned this pull request

Jan 19, 2024

5.x merge 4.x #24862

Merged

thewoz pushed a commit to thewoz/opencv that referenced this pull request

May 29, 2024

Merge pull request opencv#24445 from Abdurrahheem:ash/dev_einsum_pref

3563a60


Labels

category: dnn, test
