Improved Blur & Box filter with combined row+column separable filter #28224


Open
madanm3 wants to merge 4 commits into opencv:4.x from amd:fast_boxFilter_simd

@madanm3
Contributor

  • Added a combined row+column separable filter for small kernel sizes (3x3 and 5x5) to avoid storing an intermediate row buffer.
  • Modified FilterEngine to use the combined rowColumnFilter.
  • Added rowColumnFilter to boxFilter as a first step (other filters will be modified soon).
  • Scaling is computed with a shift operation when the kernel size is a power of 2.
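
For readers unfamiliar with the idea, here is a minimal scalar sketch of a fused 3x3 box filter, written under several assumptions and not taken from the patch. The names `boxFilter3x3Fused`, `normalizeSum` and `isPowerOfTwo` are hypothetical, the image is single-channel 8-bit, the border is handled by clamping indices (similar in spirit to BORDER_REPLICATE), and rounding is to nearest. The actual patch uses SIMD and OpenCV's FilterEngine and may order the two passes differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; none of these names come from the patch.

inline bool isPowerOfTwo(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Divide a window sum by the kernel area, rounding to nearest.
// When the area is a power of two, the division is done with a right shift.
inline uint32_t normalizeSum(uint32_t sum, uint32_t area)
{
    assert(area > 0);
    if (isPowerOfTwo(area)) {
        unsigned shift = 0;
        while ((1u << shift) < area) ++shift;   // shift = log2(area)
        return (sum + (area >> 1)) >> shift;
    }
    return (sum + area / 2) / area;
}

// 3x3 normalized box filter on a single-channel 8-bit image.
// The column pass and the row pass are fused: partial sums live in local
// variables, so no intermediate row buffer is written.
std::vector<uint8_t> boxFilter3x3Fused(const std::vector<uint8_t>& src,
                                       int width, int height)
{
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<uint8_t> dst(src.size());
    auto clampIdx = [](int v, int hi) { return v < 0 ? 0 : (v > hi ? hi : v); };

    for (int y = 0; y < height; ++y) {
        const uint8_t* r0 = src.data() + static_cast<std::size_t>(clampIdx(y - 1, height - 1)) * width;
        const uint8_t* r1 = src.data() + static_cast<std::size_t>(y) * width;
        const uint8_t* r2 = src.data() + static_cast<std::size_t>(clampIdx(y + 1, height - 1)) * width;

        // Column pass for one x: sum of the three source rows at that column.
        auto colSum = [&](int x) -> uint32_t {
            const int xc = clampIdx(x, width - 1);
            return uint32_t(r0[xc]) + r1[xc] + r2[xc];
        };

        uint32_t prev = colSum(-1), cur = colSum(0);
        for (int x = 0; x < width; ++x) {
            const uint32_t next = colSum(x + 1);
            // Row pass: add three neighbouring column sums held in locals.
            dst[static_cast<std::size_t>(y) * width + x] =
                static_cast<uint8_t>(normalizeSum(prev + cur + next, 9));
            prev = cur;
            cur = next;
        }
    }
    return dst;
}
```

A 3x3 window has area 9, so it takes the division path in `normalizeSum`; the shift path applies to power-of-two areas such as 256 for a 16x16 kernel. Each column sum is computed once per output row and reused for three output pixels, which is what keeps the filter separable while avoiding the intermediate buffer.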

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@madanm3
Contributor, Author

madanm3 commented Dec 18, 2025 (edited)

Tested on AMD-Turin (median time in ms):

| Name of Test | ref | patch | ref/patch (gain in x) |
|---|---|---|---|
| box::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_CONSTANT, 3) | 0.14 | 0.08 | 1.75x |
| box::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_CONSTANT, 5) | 0.18 | 0.1 | 1.80x |
| box::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REFLECT, 3) | 0.15 | 0.07 | 2.14x |
| box::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REFLECT, 5) | 0.18 | 0.11 | 1.64x |
| box::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REFLECT101, 3) | 0.15 | 0.07 | 2.14x |
| box::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REFLECT101, 5) | 0.18 | 0.11 | 1.64x |
| box::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REPLICATE, 3) | 0.15 | 0.08 | 1.88x |
| box::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REPLICATE, 5) | 0.18 | 0.13 | 1.38x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_CONSTANT, 3) | 0.25 | 0.17 | 1.47x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_CONSTANT, 5) | 0.37 | 0.19 | 1.95x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REFLECT, 3) | 0.25 | 0.18 | 1.39x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REFLECT, 5) | 0.37 | 0.21 | 1.76x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REFLECT101, 3) | 0.25 | 0.17 | 1.47x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REFLECT101, 5) | 0.37 | 0.21 | 1.76x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REPLICATE, 3) | 0.25 | 0.18 | 1.39x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REPLICATE, 5) | 0.37 | 0.2 | 1.85x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_CONSTANT, 3) | 0.73 | 0.55 | 1.33x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_CONSTANT, 5) | 1.08 | 0.63 | 1.71x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REFLECT, 3) | 0.73 | 0.56 | 1.30x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REFLECT, 5) | 1.08 | 0.62 | 1.74x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REFLECT101, 3) | 0.73 | 0.53 | 1.38x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REFLECT101, 5) | 1.08 | 0.62 | 1.74x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REPLICATE, 3) | 0.73 | 0.56 | 1.30x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REPLICATE, 5) | 1.08 | 0.62 | 1.74x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_CONSTANT, 3) | 0.19 | 0.12 | 1.58x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_CONSTANT, 5) | 0.23 | 0.15 | 1.53x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REFLECT, 3) | 0.19 | 0.11 | 1.73x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REFLECT, 5) | 0.24 | 0.15 | 1.60x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REFLECT101, 3) | 0.2 | 0.12 | 1.67x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REFLECT101, 5) | 0.24 | 0.15 | 1.60x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REPLICATE, 3) | 0.19 | 0.12 | 1.58x |
| box::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REPLICATE, 5) | 0.24 | 0.15 | 1.60x |
| box::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_CONSTANT, 3) | 0.11 | 0.04 | 2.75x |
| box::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_CONSTANT, 5) | 0.13 | 0.07 | 1.86x |
| box::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REFLECT, 3) | 0.11 | 0.04 | 2.75x |
| box::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REFLECT, 5) | 0.13 | 0.07 | 1.86x |
| box::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REFLECT101, 3) | 0.11 | 0.04 | 2.75x |
| box::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REFLECT101, 5) | 0.13 | 0.06 | 2.17x |
| box::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REPLICATE, 3) | 0.11 | 0.05 | 2.20x |
| box::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REPLICATE, 5) | 0.13 | 0.07 | 1.86x |
| box::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_CONSTANT, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_CONSTANT, 5) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REFLECT, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REFLECT, 5) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REFLECT101, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REFLECT101, 5) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REPLICATE, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REPLICATE, 5) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_CONSTANT, 3) | 0.02 | 0.02 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_CONSTANT, 5) | 0.04 | 0.02 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REFLECT, 3) | 0.02 | 0.02 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REFLECT, 5) | 0.04 | 0.02 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REFLECT101, 3) | 0.02 | 0.02 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REFLECT101, 5) | 0.04 | 0.02 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REPLICATE, 3) | 0.02 | 0.02 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REPLICATE, 5) | 0.03 | 0.02 | 1.50x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_CONSTANT, 3) | 0.06 | 0.05 | 1.20x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_CONSTANT, 5) | 0.1 | 0.06 | 1.67x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REFLECT, 3) | 0.06 | 0.04 | 1.50x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REFLECT, 5) | 0.1 | 0.05 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REFLECT101, 3) | 0.06 | 0.05 | 1.20x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REFLECT101, 5) | 0.1 | 0.06 | 1.67x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REPLICATE, 3) | 0.06 | 0.05 | 1.20x |
| box::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REPLICATE, 5) | 0.1 | 0.05 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_CONSTANT, 3) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_CONSTANT, 5) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REFLECT, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REFLECT, 5) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REFLECT101, 3) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REFLECT101, 5) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REPLICATE, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REPLICATE, 5) | 0.02 | 0.01 | 2.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_CONSTANT, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_CONSTANT, 5) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REFLECT, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REFLECT, 5) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REFLECT101, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REFLECT101, 5) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REPLICATE, 3) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REPLICATE, 5) | 0.01 | 0.01 | 1.00x |
| box::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_CONSTANT, 3) | 0.04 | 0.03 | 1.33x |
| box::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_CONSTANT, 5) | 0.06 | 0.04 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REFLECT, 3) | 0.05 | 0.03 | 1.67x |
| box::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REFLECT, 5) | 0.06 | 0.04 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REFLECT101, 3) | 0.05 | 0.03 | 1.67x |
| box::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REFLECT101, 5) | 0.06 | 0.04 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REPLICATE, 3) | 0.05 | 0.03 | 1.67x |
| box::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REPLICATE, 5) | 0.06 | 0.04 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_CONSTANT, 3) | 0.09 | 0.06 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_CONSTANT, 5) | 0.13 | 0.07 | 1.86x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REFLECT, 3) | 0.09 | 0.06 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REFLECT, 5) | 0.13 | 0.07 | 1.86x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REFLECT101, 3) | 0.08 | 0.06 | 1.33x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REFLECT101, 5) | 0.13 | 0.07 | 1.86x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REPLICATE, 3) | 0.09 | 0.06 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REPLICATE, 5) | 0.13 | 0.07 | 1.86x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_CONSTANT, 3) | 0.25 | 0.16 | 1.56x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_CONSTANT, 5) | 0.37 | 0.21 | 1.76x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REFLECT, 3) | 0.25 | 0.17 | 1.47x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REFLECT, 5) | 0.37 | 0.21 | 1.76x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REFLECT101, 3) | 0.25 | 0.16 | 1.56x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REFLECT101, 5) | 0.37 | 0.2 | 1.85x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REPLICATE, 3) | 0.25 | 0.17 | 1.47x |
| box::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REPLICATE, 5) | 0.37 | 0.21 | 1.76x |
| box::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_CONSTANT, 3) | 0.06 | 0.04 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_CONSTANT, 5) | 0.07 | 0.05 | 1.40x |
| box::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REFLECT, 3) | 0.06 | 0.04 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REFLECT, 5) | 0.07 | 0.05 | 1.40x |
| box::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REFLECT101, 3) | 0.06 | 0.04 | 1.50x |
| box::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REFLECT101, 5) | 0.07 | 0.05 | 1.40x |
| box::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REPLICATE, 3) | 0.05 | 0.04 | 1.25x |
| box::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REPLICATE, 5) | 0.07 | 0.05 | 1.40x |
| box::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_CONSTANT, 3) | 0.04 | 0.02 | 2.00x |
| box::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_CONSTANT, 5) | 0.05 | 0.03 | 1.67x |
| box::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REFLECT, 3) | 0.04 | 0.02 | 2.00x |
| box::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REFLECT, 5) | 0.05 | 0.03 | 1.67x |
| box::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REFLECT101, 3) | 0.04 | 0.02 | 2.00x |
| box::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REFLECT101, 5) | 0.05 | 0.03 | 1.67x |
| box::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REPLICATE, 3) | 0.04 | 0.02 | 2.00x |
| box::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REPLICATE, 5) | 0.05 | 0.03 | 1.67x |
| box_CV8U_CV16U::Size_ksize_BorderType::(1280x720, 15, BORDER_CONSTANT) | 0.34 | 0.36 | 0.94x |
| box_CV8U_CV16U::Size_ksize_BorderType::(1280x720, 15, BORDER_REPLICATE) | 0.34 | 0.37 | 0.92x |
| box_CV8U_CV16U::Size_ksize_BorderType::(1280x720, 3, BORDER_CONSTANT) | 0.13 | 0.1 | 1.30x |
| box_CV8U_CV16U::Size_ksize_BorderType::(1280x720, 3, BORDER_REPLICATE) | 0.13 | 0.11 | 1.18x |
| box_CV8U_CV16U::Size_ksize_BorderType::(1280x720, 5, BORDER_CONSTANT) | 0.14 | 0.14 | 1.00x |
| box_CV8U_CV16U::Size_ksize_BorderType::(1280x720, 5, BORDER_REPLICATE) | 0.15 | 0.13 | 1.15x |
| box_CV8U_CV16U::Size_ksize_BorderType::(320x240, 15, BORDER_CONSTANT) | 0.04 | 0.04 | 1.00x |
| box_CV8U_CV16U::Size_ksize_BorderType::(320x240, 15, BORDER_REPLICATE) | 0.03 | 0.04 | 0.75x |
| box_CV8U_CV16U::Size_ksize_BorderType::(320x240, 3, BORDER_CONSTANT) | 0.01 | 0.02 | 0.50x |
| box_CV8U_CV16U::Size_ksize_BorderType::(320x240, 3, BORDER_REPLICATE) | 0.01 | 0.01 | 1.00x |
| box_CV8U_CV16U::Size_ksize_BorderType::(320x240, 5, BORDER_CONSTANT) | 0.01 | 0.02 | 0.50x |
| box_CV8U_CV16U::Size_ksize_BorderType::(320x240, 5, BORDER_REPLICATE) | 0.01 | 0.02 | 0.50x |
| box_CV8U_CV16U::Size_ksize_BorderType::(640x480, 15, BORDER_CONSTANT) | 0.12 | 0.12 | 1.00x |
| box_CV8U_CV16U::Size_ksize_BorderType::(640x480, 15, BORDER_REPLICATE) | 0.12 | 0.13 | 0.92x |
| box_CV8U_CV16U::Size_ksize_BorderType::(640x480, 3, BORDER_CONSTANT) | 0.04 | 0.04 | 1.00x |
| box_CV8U_CV16U::Size_ksize_BorderType::(640x480, 3, BORDER_REPLICATE) | 0.04 | 0.04 | 1.00x |
| box_CV8U_CV16U::Size_ksize_BorderType::(640x480, 5, BORDER_CONSTANT) | 0.05 | 0.05 | 1.00x |
| box_CV8U_CV16U::Size_ksize_BorderType::(640x480, 5, BORDER_REPLICATE) | 0.05 | 0.05 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_CONSTANT, 3) | 0.14 | 0.13 | 1.08x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_CONSTANT, 5) | 0.18 | 0.15 | 1.20x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REFLECT, 3) | 0.15 | 0.13 | 1.15x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REFLECT, 5) | 0.18 | 0.16 | 1.13x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REFLECT101, 3) | 0.14 | 0.13 | 1.08x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REFLECT101, 5) | 0.18 | 0.16 | 1.13x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REPLICATE, 3) | 0.15 | 0.13 | 1.15x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 16SC1, BORDER_REPLICATE, 5) | 0.18 | 0.15 | 1.20x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_CONSTANT, 3) | 0.29 | 0.17 | 1.71x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_CONSTANT, 5) | 0.41 | 0.26 | 1.58x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REFLECT, 3) | 0.28 | 0.17 | 1.65x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REFLECT, 5) | 0.41 | 0.26 | 1.58x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REFLECT101, 3) | 0.29 | 0.17 | 1.71x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REFLECT101, 5) | 0.41 | 0.27 | 1.52x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REPLICATE, 3) | 0.29 | 0.17 | 1.71x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC1, BORDER_REPLICATE, 5) | 0.41 | 0.26 | 1.58x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_CONSTANT, 3) | 0.86 | 0.6 | 1.43x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_CONSTANT, 5) | 1.21 | 0.87 | 1.39x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REFLECT, 3) | 0.86 | 0.6 | 1.43x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REFLECT, 5) | 1.22 | 0.86 | 1.42x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REFLECT101, 3) | 0.86 | 0.6 | 1.43x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REFLECT101, 5) | 1.21 | 0.87 | 1.39x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REPLICATE, 3) | 0.86 | 0.59 | 1.46x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32FC3, BORDER_REPLICATE, 5) | 1.22 | 0.87 | 1.40x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_CONSTANT, 3) | 0.18 | 0.17 | 1.06x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_CONSTANT, 5) | 0.22 | 0.19 | 1.16x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REFLECT, 3) | 0.19 | 0.16 | 1.19x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REFLECT, 5) | 0.23 | 0.2 | 1.15x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REFLECT101, 3) | 0.19 | 0.16 | 1.19x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REFLECT101, 5) | 0.23 | 0.19 | 1.21x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REPLICATE, 3) | 0.19 | 0.16 | 1.19x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 32SC1, BORDER_REPLICATE, 5) | 0.23 | 0.2 | 1.15x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_CONSTANT, 3) | 0.11 | 0.07 | 1.57x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_CONSTANT, 5) | 0.13 | 0.09 | 1.44x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REFLECT, 3) | 0.11 | 0.07 | 1.57x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REFLECT, 5) | 0.13 | 0.08 | 1.63x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REFLECT101, 3) | 0.11 | 0.07 | 1.57x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REFLECT101, 5) | 0.13 | 0.09 | 1.44x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REPLICATE, 3) | 0.11 | 0.07 | 1.57x |
| box_inplace::Size_MatType_BorderType_ksize::(1280x720, 8UC1, BORDER_REPLICATE, 5) | 0.13 | 0.08 | 1.63x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_CONSTANT, 3) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_CONSTANT, 5) | 0.02 | 0.02 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REFLECT, 3) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REFLECT, 5) | 0.02 | 0.02 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REFLECT101, 3) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REFLECT101, 5) | 0.02 | 0.02 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REPLICATE, 3) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 16SC1, BORDER_REPLICATE, 5) | 0.02 | 0.02 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_CONSTANT, 3) | 0.02 | 0.01 | 2.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_CONSTANT, 5) | 0.03 | 0.02 | 1.50x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REFLECT, 3) | 0.02 | 0.01 | 2.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REFLECT, 5) | 0.03 | 0.02 | 1.50x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REFLECT101, 3) | 0.02 | 0.01 | 2.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REFLECT101, 5) | 0.03 | 0.02 | 1.50x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REPLICATE, 3) | 0.02 | 0.01 | 2.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC1, BORDER_REPLICATE, 5) | 0.03 | 0.02 | 1.50x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_CONSTANT, 3) | 0.07 | 0.04 | 1.75x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_CONSTANT, 5) | 0.1 | 0.07 | 1.43x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REFLECT, 3) | 0.07 | 0.04 | 1.75x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REFLECT, 5) | 0.1 | 0.07 | 1.43x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REFLECT101, 3) | 0.07 | 0.04 | 1.75x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REFLECT101, 5) | 0.1 | 0.07 | 1.43x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REPLICATE, 3) | 0.07 | 0.04 | 1.75x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32FC3, BORDER_REPLICATE, 5) | 0.1 | 0.07 | 1.43x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_CONSTANT, 3) | 0.01 | 0.02 | 0.50x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_CONSTANT, 5) | 0.02 | 0.02 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REFLECT, 3) | 0.01 | 0.02 | 0.50x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REFLECT, 5) | 0.02 | 0.02 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REFLECT101, 3) | 0.01 | 0.02 | 0.50x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REFLECT101, 5) | 0.02 | 0.02 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REPLICATE, 3) | 0.01 | 0.02 | 0.50x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 32SC1, BORDER_REPLICATE, 5) | 0.02 | 0.02 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_CONSTANT, 3) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_CONSTANT, 5) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REFLECT, 3) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REFLECT, 5) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REFLECT101, 3) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REFLECT101, 5) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REPLICATE, 3) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(320x240, 8UC1, BORDER_REPLICATE, 5) | 0.01 | 0.01 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_CONSTANT, 3) | 0.04 | 0.05 | 0.80x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_CONSTANT, 5) | 0.06 | 0.05 | 1.20x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REFLECT, 3) | 0.04 | 0.05 | 0.80x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REFLECT, 5) | 0.06 | 0.05 | 1.20x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REFLECT101, 3) | 0.05 | 0.05 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REFLECT101, 5) | 0.06 | 0.06 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REPLICATE, 3) | 0.05 | 0.05 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 16SC1, BORDER_REPLICATE, 5) | 0.06 | 0.05 | 1.20x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_CONSTANT, 3) | 0.1 | 0.06 | 1.67x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_CONSTANT, 5) | 0.14 | 0.09 | 1.56x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REFLECT, 3) | 0.09 | 0.06 | 1.50x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REFLECT, 5) | 0.13 | 0.09 | 1.44x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REFLECT101, 3) | 0.1 | 0.06 | 1.67x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REFLECT101, 5) | 0.14 | 0.09 | 1.56x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REPLICATE, 3) | 0.09 | 0.05 | 1.80x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC1, BORDER_REPLICATE, 5) | 0.13 | 0.09 | 1.44x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_CONSTANT, 3) | 0.3 | 0.2 | 1.50x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_CONSTANT, 5) | 0.41 | 0.27 | 1.52x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REFLECT, 3) | 0.29 | 0.2 | 1.45x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REFLECT, 5) | 0.41 | 0.28 | 1.46x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REFLECT101, 3) | 0.29 | 0.19 | 1.53x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REFLECT101, 5) | 0.41 | 0.27 | 1.52x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REPLICATE, 3) | 0.29 | 0.2 | 1.45x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32FC3, BORDER_REPLICATE, 5) | 0.41 | 0.27 | 1.52x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_CONSTANT, 3) | 0.05 | 0.05 | 1.00x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_CONSTANT, 5) | 0.07 | 0.06 | 1.17x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REFLECT, 3) | 0.06 | 0.05 | 1.20x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REFLECT, 5) | 0.07 | 0.06 | 1.17x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REFLECT101, 3) | 0.06 | 0.05 | 1.20x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REFLECT101, 5) | 0.07 | 0.06 | 1.17x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REPLICATE, 3) | 0.06 | 0.05 | 1.20x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 32SC1, BORDER_REPLICATE, 5) | 0.07 | 0.06 | 1.17x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_CONSTANT, 3) | 0.04 | 0.03 | 1.33x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_CONSTANT, 5) | 0.05 | 0.04 | 1.25x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REFLECT, 3) | 0.04 | 0.03 | 1.33x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REFLECT, 5) | 0.05 | 0.04 | 1.25x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REFLECT101, 3) | 0.04 | 0.03 | 1.33x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REFLECT101, 5) | 0.05 | 0.04 | 1.25x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REPLICATE, 3) | 0.04 | 0.03 | 1.33x |
| box_inplace::Size_MatType_BorderType_ksize::(640x480, 8UC1, BORDER_REPLICATE, 5) | 0.05 | 0.04 | 1.25x |

@madanm3
Contributor, Author

madanm3 commented Dec 18, 2025 (edited)

Tested on AMD-Turin (median time in ms):

| Name of Test | ref | patch | ref/patch (gain in x) |
|---|---|---|---|
| blur16x16::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_CONSTANT) | 0.44 | 0.5 | 0.88x |
| blur16x16::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REFLECT) | 0.45 | 0.49 | 0.92x |
| blur16x16::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REFLECT101) | 0.46 | 0.49 | 0.94x |
| blur16x16::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REPLICATE) | 0.45 | 0.5 | 0.90x |
| blur16x16::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_CONSTANT) | 0.36 | 0.38 | 0.95x |
| blur16x16::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REFLECT) | 0.36 | 0.37 | 0.97x |
| blur16x16::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REFLECT101) | 0.37 | 0.37 | 1.00x |
| blur16x16::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REPLICATE) | 0.37 | 0.38 | 0.97x |
| blur16x16::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_CONSTANT) | 0.71 | 0.7 | 1.01x |
| blur16x16::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REFLECT) | 0.73 | 0.7 | 1.04x |
| blur16x16::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REFLECT101) | 0.72 | 0.71 | 1.01x |
| blur16x16::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REPLICATE) | 0.73 | 0.71 | 1.03x |
| blur16x16::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_CONSTANT) | 0.32 | 0.35 | 0.91x |
| blur16x16::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REFLECT) | 0.32 | 0.35 | 0.91x |
| blur16x16::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REFLECT101) | 0.32 | 0.35 | 0.91x |
| blur16x16::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REPLICATE) | 0.32 | 0.35 | 0.91x |
| blur16x16::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_CONSTANT) | 1.11 | 1.41 | 0.79x |
| blur16x16::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REFLECT) | 1.1 | 1.42 | 0.77x |
| blur16x16::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REFLECT101) | 1.11 | 1.42 | 0.78x |
| blur16x16::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REPLICATE) | 1.11 | 1.43 | 0.78x |
| blur16x16::Size_MatType_BorderType::(640x480, 16SC1, BORDER_CONSTANT) | 0.15 | 0.17 | 0.88x |
| blur16x16::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REFLECT) | 0.16 | 0.17 | 0.94x |
| blur16x16::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REFLECT101) | 0.16 | 0.17 | 0.94x |
| blur16x16::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REPLICATE) | 0.16 | 0.17 | 0.94x |
| blur16x16::Size_MatType_BorderType::(640x480, 16UC1, BORDER_CONSTANT) | 0.13 | 0.12 | 1.08x |
| blur16x16::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REFLECT) | 0.13 | 0.13 | 1.00x |
| blur16x16::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REFLECT101) | 0.13 | 0.13 | 1.00x |
| blur16x16::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REPLICATE) | 0.13 | 0.13 | 1.00x |
| blur16x16::Size_MatType_BorderType::(640x480, 32FC1, BORDER_CONSTANT) | 0.23 | 0.23 | 1.00x |
| blur16x16::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REFLECT) | 0.24 | 0.23 | 1.04x |
| blur16x16::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REFLECT101) | 0.24 | 0.23 | 1.04x |
| blur16x16::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REPLICATE) | 0.24 | 0.23 | 1.04x |
| blur16x16::Size_MatType_BorderType::(640x480, 8UC1, BORDER_CONSTANT) | 0.11 | 0.13 | 0.85x |
| blur16x16::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REFLECT) | 0.12 | 0.13 | 0.92x |
| blur16x16::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REFLECT101) | 0.11 | 0.13 | 0.85x |
| blur16x16::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REPLICATE) | 0.12 | 0.13 | 0.92x |
| blur16x16::Size_MatType_BorderType::(640x480, 8UC4, BORDER_CONSTANT) | 0.37 | 0.48 | 0.77x |
| blur16x16::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REFLECT) | 0.38 | 0.49 | 0.78x |
| blur16x16::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REFLECT101) | 0.38 | 0.49 | 0.78x |
| blur16x16::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REPLICATE) | 0.38 | 0.49 | 0.78x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 16SC1, BORDER_CONSTANT) | 0.14 | 0.08 | 1.75x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 16SC1, BORDER_REPLICATE) | 0.14 | 0.08 | 1.75x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 16UC1, BORDER_CONSTANT) | 0.15 | 0.09 | 1.67x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 16UC1, BORDER_REPLICATE) | 0.15 | 0.1 | 1.50x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_CONSTANT) | 0.26 | 0.18 | 1.44x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_REPLICATE) | 0.26 | 0.18 | 1.44x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC1, BORDER_CONSTANT) | 0.1 | 0.07 | 1.43x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC1, BORDER_REPLICATE) | 0.1 | 0.07 | 1.43x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC4, BORDER_CONSTANT) | 0.39 | 0.27 | 1.44x |
| blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC4, BORDER_REPLICATE) | 0.37 | 0.27 | 1.37x |
| blur5x5::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_CONSTANT) | 0.21 | 0.12 | 1.75x |
| blur5x5::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REFLECT) | 0.19 | 0.13 | 1.46x |
| blur5x5::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REFLECT101) | 0.2 | 0.13 | 1.54x |
| blur5x5::Size_MatType_BorderType::(1280x720, 16SC1, BORDER_REPLICATE) | 0.19 | 0.13 | 1.46x |
| blur5x5::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_CONSTANT) | 0.2 | 0.13 | 1.54x |
| blur5x5::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REFLECT) | 0.19 | 0.14 | 1.36x |
| blur5x5::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REFLECT101) | 0.2 | 0.14 | 1.43x |
| blur5x5::Size_MatType_BorderType::(1280x720, 16UC1, BORDER_REPLICATE) | 0.19 | 0.14 | 1.36x |
| blur5x5::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_CONSTANT) | 0.36 | 0.21 | 1.71x |
| blur5x5::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REFLECT) | 0.37 | 0.21 | 1.76x |
| blur5x5::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REFLECT101) | 0.37 | 0.21 | 1.76x |
| blur5x5::Size_MatType_BorderType::(1280x720, 32FC1, BORDER_REPLICATE) | 0.37 | 0.19 | 1.95x |
| blur5x5::Size_MatType_BorderType::(1280x720, 32FC3, BORDER_CONSTANT) | 1.05 | 0.62 | 1.69x |
| blur5x5::Size_MatType_BorderType::(1280x720, 32FC3, BORDER_REFLECT) | 1.06 | 0.62 | 1.71x |
| blur5x5::Size_MatType_BorderType::(1280x720, 32FC3, BORDER_REFLECT101) | 1.06 | 0.59 | 1.80x |
| blur5x5::Size_MatType_BorderType::(1280x720, 32FC3, BORDER_REPLICATE) | 1.06 | 0.63 | 1.68x |
| blur5x5::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_CONSTANT) | 0.12 | 0.1 | 1.20x |
| blur5x5::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REFLECT) | 0.12 | 0.1 | 1.20x |
| blur5x5::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REFLECT101) | 0.12 | 0.1 | 1.20x |
| blur5x5::Size_MatType_BorderType::(1280x720, 8UC1, BORDER_REPLICATE) | 0.12 | 0.1 | 1.20x |
| blur5x5::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_CONSTANT) | 0.46 | 0.41 | 1.12x |
| blur5x5::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REFLECT) | 0.46 | 0.41 | 1.12x |
| blur5x5::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REFLECT101) | 0.47 | 0.41 | 1.15x |
| blur5x5::Size_MatType_BorderType::(1280x720, 8UC4, BORDER_REPLICATE) | 0.46 | 0.41 | 1.12x |
| blur5x5::Size_MatType_BorderType::(640x480, 16SC1, BORDER_CONSTANT) | 0.06 | 0.05 | 1.20x |
| blur5x5::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REFLECT) | 0.06 | 0.05 | 1.20x |
| blur5x5::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REFLECT101) | 0.06 | 0.05 | 1.20x |
| blur5x5::Size_MatType_BorderType::(640x480, 16SC1, BORDER_REPLICATE) | 0.06 | 0.05 | 1.20x |
| blur5x5::Size_MatType_BorderType::(640x480, 16UC1, BORDER_CONSTANT) | 0.06 | 0.05 | 1.20x |
| blur5x5::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REFLECT) | 0.06 | 0.05 | 1.20x |
| blur5x5::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REFLECT101) | 0.06 | 0.05 | 1.20x |
| blur5x5::Size_MatType_BorderType::(640x480, 16UC1, BORDER_REPLICATE) | 0.06 | 0.05 | 1.20x |
| blur5x5::Size_MatType_BorderType::(640x480, 32FC1, BORDER_CONSTANT) | 0.13 | 0.08 | 1.63x |
| blur5x5::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REFLECT) | 0.13 | 0.07 | 1.86x |
| blur5x5::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REFLECT101) | 0.13 | 0.08 | 1.63x |
| blur5x5::Size_MatType_BorderType::(640x480, 32FC1, BORDER_REPLICATE) | 0.13 | 0.07 | 1.86x |
| blur5x5::Size_MatType_BorderType::(640x480, 32FC3, BORDER_CONSTANT) | 0.36 | 0.21 | 1.71x |
| blur5x5::Size_MatType_BorderType::(640x480, 32FC3, BORDER_REFLECT) | 0.37 | 0.2 | 1.85x |
| blur5x5::Size_MatType_BorderType::(640x480, 32FC3, BORDER_REFLECT101) | 0.37 | 0.19 | 1.95x |
| blur5x5::Size_MatType_BorderType::(640x480, 32FC3, BORDER_REPLICATE) | 0.37 | 0.21 | 1.76x |
| blur5x5::Size_MatType_BorderType::(640x480, 8UC1, BORDER_CONSTANT) | 0.04 | 0.04 | 1.00x |
| blur5x5::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REFLECT) | 0.04 | 0.04 | 1.00x |
| blur5x5::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REFLECT101) | 0.04 | 0.04 | 1.00x |
| blur5x5::Size_MatType_BorderType::(640x480, 8UC1, BORDER_REPLICATE) | 0.04 | 0.04 | 1.00x |
| blur5x5::Size_MatType_BorderType::(640x480, 8UC4, BORDER_CONSTANT) | 0.16 | 0.14 | 1.14x |
| blur5x5::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REFLECT) | 0.16 | 0.14 | 1.14x |
| blur5x5::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REFLECT101) | 0.16 | 0.14 | 1.14x |
| blur5x5::Size_MatType_BorderType::(640x480, 8UC4, BORDER_REPLICATE) | 0.16 | 0.15 | 1.07x |
| stackblur101x101::Size_MatType::(1280x720, 16SC1) | 1.01 | 0.94 | 1.07x |
| stackblur101x101::Size_MatType::(1280x720, 16UC1) | 0.87 | 0.86 | 1.01x |
| stackblur101x101::Size_MatType::(1280x720, 32FC1) | 2.94 | 2.9 | 1.01x |
| stackblur101x101::Size_MatType::(1280x720, 8UC1) | 0.96 | 0.97 | 0.99x |
| stackblur101x101::Size_MatType::(1280x720, 8UC4) | 3.46 | 3.46 | 1.00x |
| stackblur101x101::Size_MatType::(1920x1080, 16SC1) | 2.26 | 2.19 | 1.03x |
| stackblur101x101::Size_MatType::(1920x1080, 16UC1) | 1.86 | 1.86 | 1.00x |
| stackblur101x101::Size_MatType::(1920x1080, 32FC1) | 6.55 | 6.48 | 1.01x |
| stackblur101x101::Size_MatType::(1920x1080, 8UC1) | 2.07 | 2.07 | 1.00x |
| stackblur101x101::Size_MatType::(1920x1080, 8UC4) | 7.35 | 7.28 | 1.01x |
| stackblur101x101::Size_MatType::(3840x2160, 16SC1) | 8.92 | 8.75 | 1.02x |
| stackblur101x101::Size_MatType::(3840x2160, 16UC1) | 7.35 | 7.28 | 1.01x |
| stackblur101x101::Size_MatType::(3840x2160, 32FC1) | 27.2 | 27.34 | 0.99x |
| stackblur101x101::Size_MatType::(3840x2160, 8UC1) |
7.937.881.01xstackblur101x101::Size_MatType::(3840x2160, 8UC4)                        29.729.351.01xstackblur3x3::Size_MatType::(1280x720, 16SC1)                            0.230.240.96xstackblur3x3::Size_MatType::(1280x720, 16UC1)                            0.240.241.00xstackblur3x3::Size_MatType::(1280x720, 32FC1)                            0.260.261.00xstackblur3x3::Size_MatType::(1280x720, 8UC1)                             0.270.261.04xstackblur3x3::Size_MatType::(1280x720, 8UC4)                             1   0.991.01xstackblur3x3::Size_MatType::(1920x1080, 16SC1)                           0.530.521.02xstackblur3x3::Size_MatType::(1920x1080, 16UC1)                           0.530.531.00xstackblur3x3::Size_MatType::(1920x1080, 32FC1)                           0.530.531.00xstackblur3x3::Size_MatType::(1920x1080, 8UC1)                            0.580.571.02xstackblur3x3::Size_MatType::(1920x1080, 8UC4)                            2.262.271.00xstackblur3x3::Size_MatType::(3840x2160, 16SC1)                           1.992.010.99xstackblur3x3::Size_MatType::(3840x2160, 16UC1)                           2.332.360.99xstackblur3x3::Size_MatType::(3840x2160, 32FC1)                           3.183.220.99xstackblur3x3::Size_MatType::(3840x2160, 8UC1)                            2.292.291.00xstackblur3x3::Size_MatType::(3840x2160, 8UC4)                            8.6 8.6 1.00x
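
The last column of each row above is consistent with the first timing divided by the second, so a value above 1.00x means the second column is faster. A minimal Python sketch of that arithmetic follows. It assumes the first column is the baseline build and the second the patched build (the column headers are not visible in this part of the page), and `speedup` is a hypothetical helper name, not part of any OpenCV tooling.

```python
def speedup(first, second):
    """Ratio of the first timing to the second; > 1 means the second is faster."""
    if second <= 0:
        raise ValueError("timings must be positive")
    return first / second

# Rows copied from the table above: (test, first timing, second timing).
rows = [
    ("blur3x3 (1280x720, 16SC1, BORDER_CONSTANT)", 0.14, 0.08),
    ("blur5x5 (1280x720, 32FC1, BORDER_REPLICATE)", 0.37, 0.19),
    ("blur16x16 (640x480, 8UC4, BORDER_CONSTANT)", 0.37, 0.48),
]

for name, first, second in rows:
    print(f"{name}: {speedup(first, second):.2f}x")
```

Because the printed timings are rounded, a ratio recomputed from them can differ slightly from the one in the table, which was presumably derived from unrounded measurements.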

@asmorkalov
Contributor

The first version of the patch affects accuracy. I see the following perf test failures:

[  FAILED  ] 8 tests, listed below:
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/0, where GetParam() = (640x480, 0.1, 0.0174533)
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/1, where GetParam() = (640x480, 0.1, 0.1)
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/4, where GetParam() = (1280x720, 0.1, 0.0174533)
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/5, where GetParam() = (1280x720, 0.1, 0.1)
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/8, where GetParam() = (1920x1080, 0.1, 0.0174533)
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/9, where GetParam() = (1920x1080, 0.1, 0.1)
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/12, where GetParam() = (3840x2160, 0.1, 0.0174533)
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/13, where GetParam() = (3840x2160, 0.1, 0.1)

Example:

[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from OCL_HoughLinesFixture_HoughLines
[ RUN      ] OCL_HoughLinesFixture_HoughLines.HoughLines/0, where GetParam() = (640x480, 0.1, 0.0174533)
/home/alexander/opencv/modules/ts/src/ts_perf.cpp:567: Failure
Failed
  Difference (=0.0500030517578125) between argument1 "result" and expected value is greater than 9.9999999999999995e-07
params    = (640x480, 0.1, 0.0174533)
termination reason:  unknown
bytesIn   =     307200
bytesOut  =          8
samples   =         25 of 100
outliers  =          2
frequency = 1000000000
min       =    8067566 = 8.07ms
median    =    8194967 = 8.19ms
gmean     =    8192538 = 8.19ms
gstddev   = 0.01127785 = 0.55ms for 97% dispersion interval
mean      =    8193037 = 8.19ms
stddev    =      92563 = 0.09ms
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/0, where GetParam() = (640x480, 0.1, 0.0174533) (218 ms)
[----------] 1 test from OCL_HoughLinesFixture_HoughLines (218 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (218 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/0, where GetParam() = (640x480, 0.1, 0.0174533)
 1 FAILED TEST
alexander@asmorkalov-pc:~/opencv-build-2$ ./bin/opencv_perf_imgproc --gtest_filter=OCL_HoughLinesFixture_HoughLines.HoughLines/1
TEST: Skip tests with tags: 'mem_6gb', 'verylong'
CTEST_FULL_OUTPUT
OpenCV version: 4.13.0-dev
OpenCV VCS version: 4.12.0-233-g5ed59ee59f
Build type: Release
Compiler: /usr/bin/c++  (ver 11.4.0)
Algorithm hint: ALGO_HINT_ACCURATE
HAL: YES (ipp (ver 0.0.1))
Parallel framework: pthreads (nthreads=16)
CPU features: SSE SSE2 SSE3 *SSE4.1 *SSE4.2 *AVX *FP16 *AVX2 *AVX512-SKX?
Intel(R) IPP version: ippIP AVX2 (l9) 2022.2.0 (-) Jul 30 2025
Intel(R) IPP features code: 0x8000
OpenCL is disabled
Note: Google Test filter = OCL_HoughLinesFixture_HoughLines.HoughLines/1
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from OCL_HoughLinesFixture_HoughLines
[ RUN      ] OCL_HoughLinesFixture_HoughLines.HoughLines/1, where GetParam() = (640x480, 0.1, 0.1)
 Expected: [100, 0; 200, 0; 400, 0]
 Actual:[99.950005, 0; 199.95, 0; 399.95001, 0]
/home/alexander/opencv/modules/ts/src/ts_perf.cpp:567: Failure
Failed
  Difference (=0.0500030517578125) between argument1 "result" and expected value is greater than 9.9999999999999995e-07
params    = (640x480, 0.1, 0.1)
termination reason:  unknown
bytesIn   =     307200
bytesOut  =          8
samples   =         25 of 100
outliers  =          2
frequency = 1000000000
min       =    1385366 = 1.39ms
median    =    1391437 = 1.39ms
gmean     =    1391003 = 1.39ms
gstddev   = 0.00320729 = 0.03ms for 97% dispersion interval
mean      =    1391010 = 1.39ms
stddev    =       4466 = 0.00ms
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/1, where GetParam() = (640x480, 0.1, 0.1) (38 ms)
[----------] 1 test from OCL_HoughLinesFixture_HoughLines (38 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (38 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] OCL_HoughLinesFixture_HoughLines.HoughLines/1, where GetParam() = (640x480, 0.1, 0.1)
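
The failure message reports a deviation between the expected and actual values that must stay below roughly 1e-6. A minimal Python sketch of that kind of check follows, using the Expected and Actual values printed in the HoughLines/1 log. Treating the comparison as a maximum absolute difference over float32 values is an assumption made for illustration, and `to_float32` and `max_abs_diff` are hypothetical helper names rather than functions from the OpenCV test framework.

```python
import struct

EPS = 9.9999999999999995e-07  # threshold printed in the failure message

def to_float32(x):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def max_abs_diff(expected, actual):
    """Largest element-wise absolute difference after rounding to float32."""
    return max(abs(to_float32(e) - to_float32(a)) for e, a in zip(expected, actual))

# Values from the HoughLines/1 log, flattened row by row.
expected = [100, 0, 200, 0, 400, 0]
actual = [99.950005, 0, 199.95, 0, 399.95001, 0]

diff = max_abs_diff(expected, actual)
print(f"max |expected - actual| = {diff:.6f}")
print("within tolerance:", diff <= EPS)
```

With these inputs the deviation is about 0.05, several orders of magnitude above the threshold, so it is far larger than single-precision rounding noise.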

@asmorkalov
Contributor

I also see several performance regressions on my AMD Ryzen 7 2700X, for example:

blur3x3::Size_MatType_BorderType3x3::(127x61, 8UC1, BORDER_CONSTANT) 0.008 0.015 0.50
blur3x3::Size_MatType_BorderType3x3::(127x61, 8UC1, BORDER_REPLICATE) 0.007 0.015 0.49
blur3x3::Size_MatType_BorderType3x3::(127x61, 16UC1, BORDER_CONSTANT) 0.007 0.009 0.74
blur3x3::Size_MatType_BorderType3x3::(127x61, 16UC1, BORDER_REPLICATE) 0.007 0.009 0.75
blur3x3::Size_MatType_BorderType3x3::(127x61, 16SC1, BORDER_CONSTANT) 0.007 0.009 0.75
blur3x3::Size_MatType_BorderType3x3::(127x61, 16SC1, BORDER_REPLICATE) 0.006 0.009 0.73
blur3x3::Size_MatType_BorderType3x3::(127x61, 32FC1, BORDER_CONSTANT) 0.008 0.010 0.85
blur3x3::Size_MatType_BorderType3x3::(127x61, 32FC1, BORDER_REPLICATE) 0.008 0.009 0.86
blur3x3::Size_MatType_BorderType3x3::(127x61, 8UC4, BORDER_CONSTANT) 0.013 0.020 0.66
blur3x3::Size_MatType_BorderType3x3::(127x61, 8UC4, BORDER_REPLICATE) 0.015 0.020 0.75
blur3x3::Size_MatType_BorderType3x3::(320x240, 8UC1, BORDER_CONSTANT) 0.030 0.044 0.70
blur3x3::Size_MatType_BorderType3x3::(320x240, 8UC1, BORDER_REPLICATE) 0.031 0.044 0.70
blur3x3::Size_MatType_BorderType3x3::(320x240, 16UC1, BORDER_CONSTANT) 0.042 0.053 0.79
blur3x3::Size_MatType_BorderType3x3::(320x240, 16UC1, BORDER_REPLICATE) 0.041 0.053 0.79
blur3x3::Size_MatType_BorderType3x3::(320x240, 16SC1, BORDER_CONSTANT) 0.038 0.052 0.74
blur3x3::Size_MatType_BorderType3x3::(320x240, 16SC1, BORDER_REPLICATE) 0.038 0.051 0.74
blur3x3::Size_MatType_BorderType3x3::(320x240, 32FC1, BORDER_CONSTANT) 0.064 0.060 1.05
blur3x3::Size_MatType_BorderType3x3::(320x240, 32FC1, BORDER_REPLICATE) 0.062 0.059 1.05
blur3x3::Size_MatType_BorderType3x3::(320x240, 8UC4, BORDER_CONSTANT) 0.102 0.129 0.79
blur3x3::Size_MatType_BorderType3x3::(320x240, 8UC4, BORDER_REPLICATE) 0.101 0.128 0.79
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC1, BORDER_CONSTANT) 0.106 0.128 0.83
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC1, BORDER_REPLICATE) 0.105 0.128 0.82
blur3x3::Size_MatType_BorderType3x3::(640x480, 16UC1, BORDER_CONSTANT) 0.154 0.186 0.83
blur3x3::Size_MatType_BorderType3x3::(640x480, 16UC1, BORDER_REPLICATE) 0.155 0.186 0.83
blur3x3::Size_MatType_BorderType3x3::(640x480, 16SC1, BORDER_CONSTANT) 0.142 0.178 0.80
blur3x3::Size_MatType_BorderType3x3::(640x480, 16SC1, BORDER_REPLICATE) 0.142 0.178 0.80
blur3x3::Size_MatType_BorderType3x3::(640x480, 32FC1, BORDER_CONSTANT) 0.234 0.201 1.16
blur3x3::Size_MatType_BorderType3x3::(640x480, 32FC1, BORDER_REPLICATE) 0.232 0.200 1.16
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC4, BORDER_CONSTANT) 0.379 0.465 0.81
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC4, BORDER_REPLICATE) 0.378 0.463 0.82
blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC1, BORDER_CONSTANT) 0.300 0.349 0.86
blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC1, BORDER_REPLICATE) 0.299 0.346 0.86
blur3x3::Size_MatType_BorderType3x3::(1280x720, 16UC1, BORDER_CONSTANT) 0.436 0.549 0.79
blur3x3::Size_MatType_BorderType3x3::(1280x720, 16UC1, BORDER_REPLICATE) 0.436 0.551 0.79
blur3x3::Size_MatType_BorderType3x3::(1280x720, 16SC1, BORDER_CONSTANT) 0.400 0.499 0.80
blur3x3::Size_MatType_BorderType3x3::(1280x720, 16SC1, BORDER_REPLICATE) 0.397 0.498 0.80
blur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_CONSTANT) 0.690 0.599 1.15
blur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_REPLICATE) 0.690 0.578 1.19
blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC4, BORDER_CONSTANT) 1.195 1.343 0.89
blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC4, BORDER_REPLICATE) 1.191 1.339 0.89
CornerMinEigenVal::OCL_CornerMinEigenValFixture::(640x480, 8UC1) 2.040 2.502 0.82
CornerMinEigenVal::OCL_CornerMinEigenValFixture::(640x480, 32FC1) 2.169 2.602 0.83
CornerMinEigenVal::OCL_CornerMinEigenValFixture::(1280x720, 8UC1) 7.318 8.776 0.83
CornerMinEigenVal::OCL_CornerMinEigenValFixture::(1280x720, 32FC1) 7.688 8.942 0.86
CornerMinEigenVal::OCL_CornerMinEigenValFixture::(1920x1080, 8UC1) 16.327 19.487 0.84
CornerMinEigenVal::OCL_CornerMinEigenValFixture::(1920x1080, 32FC1) 16.917 19.850 0.85
CornerMinEigenVal::OCL_CornerMinEigenValFixture::(3840x2160, 8UC1) 108.483 120.385 0.90
CornerMinEigenVal::OCL_CornerMinEigenValFixture::(3840x2160, 32FC1) 109.921 120.955 0.91

The regressions are stable; I see them across several invocations.
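
To pick the regressed cases out of a report like the one above, the rows can be parsed and filtered on the ratio column. The following is a minimal Python sketch that assumes each row has the form "name, two timings, ratio" and treats any ratio below 1.0 as a regression. The `ROW` pattern and the `regressions` helper are hypothetical names for illustration and are not part of the OpenCV perf tooling.

```python
import re

# A few rows copied from the report above: name, two timings, ratio.
REPORT = """\
blur3x3::Size_MatType_BorderType3x3::(127x61, 8UC1, BORDER_CONSTANT) 0.008 0.015 0.50
blur3x3::Size_MatType_BorderType3x3::(320x240, 32FC1, BORDER_CONSTANT) 0.064 0.060 1.05
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC4, BORDER_CONSTANT) 0.379 0.465 0.81
blur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_REPLICATE) 0.690 0.578 1.19
"""

# The name ends at the last ')'; the three numeric fields follow it.
ROW = re.compile(
    r"^(?P<name>.+\))\s+(?P<first>[\d.]+)\s+(?P<second>[\d.]+)\s+(?P<ratio>[\d.]+)x?$"
)

def regressions(report, threshold=1.0):
    """Return (name, ratio) for every row whose ratio is below the threshold."""
    found = []
    for line in report.splitlines():
        match = ROW.match(line.strip())
        if match and float(match["ratio"]) < threshold:
            found.append((match["name"], float(match["ratio"])))
    return found

for name, ratio in regressions(REPORT):
    print(f"{ratio:.2f}  {name}")
```

In the sample rows only the 8UC1 and 8UC4 cases fall below 1.0, while the two 32FC1 cases are slightly above it, which mirrors the pattern visible in the full report for sizes from 320x240 upward.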

@madanm3
Contributor, Author

blur3x3::Size_MatType_BorderType3x3::(320x240, 32FC1, BORDER_REPLICATE) 0.062 0.059 1.05
blur3x3::Size_MatType_BorderType3x3::(320x240, 8UC4, BORDER_CONSTANT) 0.102 0.129 0.79
blur3x3::Size_MatType_BorderType3x3::(320x240, 8UC4, BORDER_REPLICATE) 0.101 0.128 0.79
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC1, BORDER_CONSTANT) 0.106 0.128 0.83
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC1, BORDER_REPLICATE) 0.105 0.128 0.82
blur3x3::Size_MatType_BorderType3x3::(640x480, 16UC1, BORDER_CONSTANT) 0.154 0.186 0.83
blur3x3::Size_MatType_BorderType3x3::(640x480, 16UC1, BORDER_REPLICATE) 0.155 0.186 0.83
blur3x3::Size_MatType_BorderType3x3::(640x480, 16SC1, BORDER_CONSTANT) 0.142 0.178 0.80
blur3x3::Size_MatType_BorderType3x3::(640x480, 16SC1, BORDER_REPLICATE) 0.142 0.178 0.80
blur3x3::Size_MatType_BorderType3x3::(640x480, 32FC1, BORDER_CONSTANT) 0.234 0.201 1.16
blur3x3::Size_MatType_BorderType3x3::(640x480, 32FC1, BORDER_REPLICATE) 0.232 0.200 1.16
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC4, BORDER_CONSTANT) 0.379 0.465 0.81
blur3x3::Size_MatType_BorderType3x3::(640x480, 8UC4, BORDER_REPLICATE) 0.378 0.463 0.82
blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC1, BORDER_CONSTANT) 0.300 0.349 0.86
blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC1, BORDER_REPLICATE) 0.299 0.346 0.86
blur3x3::Size_MatType_BorderType3x3::(1280x720, 16UC1, BORDER_CONSTANT) 0.436 0.549 0.79
blur3x3::Size_MatType_BorderType3x3::(1280x720, 16UC1, BORDER_REPLICATE) 0.436 0.551 0.79
blur3x3::Size_MatType_BorderType3x3::(1280x720, 16SC1, BORDER_CONSTANT) 0.400 0.499 0.80
blur3x3::Size_MatType_BorderType3x3::(1280x720, 16SC1, BORDER_REPLICATE) 0.397 0.498 0.80
blur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_CONSTANT) 0.690 0.599 1.15
blur3x3::Size_MatType_BorderType3x3::(1280x720, 32FC1, BORDER_REPLICATE) 0.690 0.578 1.19
blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC4, BORDER_CONSTANT) 1.195 1.343 0.89
blur3x3::Size_MatType_BorderType3x3::(1280x720, 8UC4, BORDER_REPLICATE) 1.191 1.339 0.89

@asmorkalov, thanks for evaluating. I will check whether this can be improved for the Zen+ architecture.

Reviewers

@asmorkalov left review comments

Milestone

4.13.0

2 participants

@madanm3, @asmorkalov
