Commit 601778e

Merge pull request #22750 from zihaomu:improve_blobFromImage
DNN: Add New API blobFromImageParam #22750

The purpose of this PR:

1. Add new API `blobFromImageParam` to extend `blobFromImage` API. It can support the different data layout (NCHW or NHWC), and letter_box.
2. ~~`blobFromImage` can output `CV_16F`~~

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
1 parent 810096c, commit 601778e

6 files changed, +402 -146 lines changed
modules/dnn/include/opencv2/dnn/dnn.hpp

Lines changed: 90 additions & 1 deletion
@@ -108,6 +108,21 @@ CV__DNN_INLINE_NS_BEGIN
         DNN_TARGET_NPU,
     };

+    /**
+     * @brief Enum of data layout for model inference.
+     * @see Image2BlobParams
+     */
+    enum DataLayout
+    {
+        DNN_LAYOUT_UNKNOWN = 0,
+        DNN_LAYOUT_ND = 1, //!< OpenCV data layout for 2D data.
+        DNN_LAYOUT_NCHW = 2, //!< OpenCV data layout for 4D data.
+        DNN_LAYOUT_NCDHW = 3, //!< OpenCV data layout for 5D data.
+        DNN_LAYOUT_NHWC = 4, //!< Tensorflow-like data layout for 4D data.
+        DNN_LAYOUT_NDHWC = 5, //!< Tensorflow-like data layout for 5D data.
+        DNN_LAYOUT_PLANAR = 6, //!< Tensorflow-like data layout, it should only be used at tf or tflite model parsing.
+    };
+
     CV_EXPORTS std::vector< std::pair<Backend, Target> > getAvailableBackends();
     CV_EXPORTS_W std::vector<Target> getAvailableTargets(dnn::Backend be);
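
As a rough illustration of how the two 4D layouts named in `DataLayout` relate, the NumPy sketch below converts an NCHW array to NHWC by permuting axes. The array shape and variable names are illustrative assumptions and are not part of the commit; the sketch does not call OpenCV.

```python
import numpy as np

# Hypothetical blob: batch of 1, 3 channels, height 2, width 4 (NCHW order).
nchw = np.arange(1 * 3 * 2 * 4, dtype=np.float32).reshape(1, 3, 2, 4)

# NHWC holds the same values with the channel axis moved to the end.
nhwc = nchw.transpose(0, 2, 3, 1)

# Element (n, c, h, w) in NCHW is element (n, h, w, c) in NHWC.
assert nhwc.shape == (1, 2, 4, 3)
assert nhwc[0, 1, 2, 0] == nchw[0, 0, 1, 2]
```

Only the axis order changes between the two layouts; the values are the same.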

@@ -1111,10 +1126,10 @@ CV__DNN_INLINE_NS_BEGIN
     /** @brief Creates 4-dimensional blob from image. Optionally resizes and crops @p image from center,
      * subtract @p mean values, scales values by @p scalefactor, swap Blue and Red channels.
      * @param image input image (with 1-, 3- or 4-channels).
+     * @param scalefactor multiplier for @p images values.
      * @param size spatial size for output image
      * @param mean scalar with mean values which are subtracted from channels. Values are intended
      * to be in (mean-R, mean-G, mean-B) order if @p image has BGR ordering and @p swapRB is true.
-     * @param scalefactor multiplier for @p image values.
      * @param swapRB flag which indicates that swap first and last channels
      * in 3-channel image is necessary.
      * @param crop flag which indicates whether image will be cropped after resize or not
@@ -1123,6 +1138,9 @@ CV__DNN_INLINE_NS_BEGIN
      * dimension in @p size and another one is equal or larger. Then, crop from the center is performed.
      * If @p crop is false, direct resize without cropping and preserving aspect ratio is performed.
      * @returns 4-dimensional Mat with NCHW dimensions order.
+     *
+     * @note
+     * The order and usage of `scalefactor` and `mean` are (input - mean) * scalefactor.
      */
     CV_EXPORTS_W Mat blobFromImage(InputArray image, double scalefactor=1.0, const Size& size = Size(),
                                    const Scalar& mean = Scalar(), bool swapRB=false, bool crop=false,
@@ -1153,6 +1171,9 @@ CV__DNN_INLINE_NS_BEGIN
      * dimension in @p size and another one is equal or larger. Then, crop from the center is performed.
      * If @p crop is false, direct resize without cropping and preserving aspect ratio is performed.
      * @returns 4-dimensional Mat with NCHW dimensions order.
+     *
+     * @note
+     * The order and usage of `scalefactor` and `mean` are (input - mean) * scalefactor.
      */
     CV_EXPORTS_W Mat blobFromImages(InputArrayOfArrays images, double scalefactor=1.0,
                                     Size size = Size(), const Scalar& mean = Scalar(), bool swapRB=false, bool crop=false,
@@ -1167,6 +1188,74 @@ CV__DNN_INLINE_NS_BEGIN
                                    const Scalar& mean = Scalar(), bool swapRB=false, bool crop=false,
                                    int ddepth=CV_32F);

+    /**
+     * @brief Enum of image processing mode.
+     * To facilitate the specialization pre-processing requirements of the dnn model.
+     * For example, the `letter box` often used in the Yolo series of models.
+     * @see Image2BlobParams
+     */
+    enum ImagePaddingMode
+    {
+        DNN_PMODE_NULL = 0, // !< Default. Resize to required input size without extra processing.
+        DNN_PMODE_CROP_CENTER = 1, // !< Image will be cropped after resize.
+        DNN_PMODE_LETTERBOX = 2, // !< Resize image to the desired size while preserving the aspect ratio of original image.
+    };
+
+    /** @brief Processing params of image to blob.
+     *
+     * It includes all possible image processing operations and corresponding parameters.
+     *
+     * @see blobFromImageWithParams
+     *
+     * @note
+     * The order and usage of `scalefactor` and `mean` are (input - mean) * scalefactor.
+     * The order and usage of `scalefactor`, `size`, `mean`, `swapRB`, and `ddepth` are consistent
+     * with the function of @ref blobFromImage.
+     */
+    struct CV_EXPORTS_W_SIMPLE Image2BlobParams
+    {
+        CV_WRAP Image2BlobParams();
+        CV_WRAP Image2BlobParams(const Scalar& scalefactor, const Size& size = Size(), const Scalar& mean = Scalar(),
+                                 bool swapRB = false, int ddepth = CV_32F, DataLayout datalayout = DNN_LAYOUT_NCHW,
+                                 ImagePaddingMode mode = DNN_PMODE_NULL);
+
+        CV_PROP_RW Scalar scalefactor; //!< scalefactor multiplier for input image values.
+        CV_PROP_RW Size size; //!< Spatial size for output image.
+        CV_PROP_RW Scalar mean; //!< Scalar with mean values which are subtracted from channels.
+        CV_PROP_RW bool swapRB; //!< Flag which indicates that swap first and last channels
+        CV_PROP_RW int ddepth; //!< Depth of output blob. Choose CV_32F or CV_8U.
+        CV_PROP_RW DataLayout datalayout; //!< Order of output dimensions. Choose DNN_LAYOUT_NCHW or DNN_LAYOUT_NHWC.
+        CV_PROP_RW ImagePaddingMode paddingmode; //!< Image padding mode. @see ImagePaddingMode.
+    };
+
+    /** @brief Creates 4-dimensional blob from image with given params.
+     *
+     * @details This function is an extension of @ref blobFromImage to meet more image preprocess needs.
+     * Given input image and preprocessing parameters, and function outputs the blob.
+     *
+     * @param image input image (all with 1-, 3- or 4-channels).
+     * @param param struct of Image2BlobParams, contains all parameters needed by processing of image to blob.
+     * @return 4-dimensional Mat.
+     */
+    CV_EXPORTS_W Mat blobFromImageWithParams(InputArray image, const Image2BlobParams& param = Image2BlobParams());
+
+    /** @overload */
+    CV_EXPORTS_W void blobFromImageWithParams(InputArray image, OutputArray blob, const Image2BlobParams& param = Image2BlobParams());
+
+    /** @brief Creates 4-dimensional blob from series of images with given params.
+     *
+     * @details This function is an extension of @ref blobFromImages to meet more image preprocess needs.
+     * Given input image and preprocessing parameters, and function outputs the blob.
+     *
+     * @param images input image (all with 1-, 3- or 4-channels).
+     * @param param struct of Image2BlobParams, contains all parameters needed by processing of image to blob.
+     * @returns 4-dimensional Mat.
+     */
+    CV_EXPORTS_W Mat blobFromImagesWithParams(InputArrayOfArrays images, const Image2BlobParams& param = Image2BlobParams());
+
+    /** @overload */
+    CV_EXPORTS_W void blobFromImagesWithParams(InputArrayOfArrays images, OutputArray blob, const Image2BlobParams& param = Image2BlobParams());
+
     /** @brief Parse a 4D blob and output the images it contains as 2D arrays through a simpler data structure
      * (std::vector<cv::Mat>).
      * @param[in] blob_ 4 dimensional array (images, channels, height, width) in floating point precision (CV_32F) from
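
The header describes `DNN_PMODE_LETTERBOX` only as resizing while preserving the aspect ratio of the original image. The sketch below shows one common way to work out letterbox geometry. Centred padding, the rounding rule, and the helper name `letterbox_geometry` are all assumptions made for illustration; the declarations above do not specify how OpenCV distributes the padding.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Hypothetical helper: return (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom)."""
    # One uniform scale keeps the aspect ratio; take the smaller ratio so the
    # resized image fits inside the destination in both dimensions.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = int(round(src_w * scale))
    new_h = int(round(src_h * scale))

    # Assumption: leftover space is split evenly, with any odd pixel going
    # to the right/bottom edge.
    pad_w = dst_w - new_w
    pad_h = dst_h - new_h
    pad_left = pad_w // 2
    pad_top = pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


# A 1280x720 frame fitted into a 640x640 network input.
print(letterbox_geometry(1280, 720, 640, 640))
```

With these assumptions a 1280x720 frame is scaled to 640x360 and padded by 140 rows above and below.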

modules/dnn/misc/python/test/test_dnn.py

Lines changed: 36 additions & 1 deletion
@@ -119,7 +119,7 @@ def checkIETarget(self, backend, target):
             inp = np.random.standard_normal([1, 2, 10, 11]).astype(np.float32)
             net.setInput(inp)
             net.forward()
-        except BaseException as e:
+        except BaseException:
             return False
         return True

@@ -153,6 +153,41 @@ def test_blobFromImage(self):
         target = target.transpose(2, 0, 1).reshape(1, 3, height, width)  # to NCHW
         normAssert(self, blob, target)

+    def test_blobFromImageWithParams(self):
+        np.random.seed(324)
+
+        width = 6
+        height = 7
+        stddev = np.array([0.2, 0.3, 0.4])
+        scalefactor = 1.0 / 127.5 * stddev
+        mean = (10, 20, 30)
+
+        # Test arguments names.
+        img = np.random.randint(0, 255, [4, 5, 3]).astype(np.uint8)
+
+        param = cv.dnn.Image2BlobParams()
+        param.scalefactor = scalefactor
+        param.size = (6, 7)
+        param.mean = mean
+        param.swapRB = True
+        param.datalayout = cv.dnn.DNN_LAYOUT_NHWC
+
+        blob = cv.dnn.blobFromImageWithParams(img, param)
+        blob_args = cv.dnn.blobFromImageWithParams(img, cv.dnn.Image2BlobParams(scalefactor=scalefactor, size=(6, 7), mean=mean,
+                                                                                 swapRB=True, datalayout=cv.dnn.DNN_LAYOUT_NHWC))
+        normAssert(self, blob, blob_args)
+
+        target2 = cv.resize(img, (width, height), interpolation=cv.INTER_LINEAR).astype(np.float32)
+        target2 = target2[:, :, [2, 1, 0]]  # BGR2RGB
+        target2[:, :, 0] -= mean[0]
+        target2[:, :, 1] -= mean[1]
+        target2[:, :, 2] -= mean[2]
+
+        target2[:, :, 0] *= scalefactor[0]
+        target2[:, :, 1] *= scalefactor[1]
+        target2[:, :, 2] *= scalefactor[2]
+        target2 = target2.reshape(1, height, width, 3)  # to NHWC
+        normAssert(self, blob, target2)

     def test_model(self):
         img_path = self.find_dnn_file("dnn/street.png")
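
The reference computation in `test_blobFromImageWithParams` applies `(input - mean) * scalefactor` one channel at a time after swapping B and R. The same arithmetic can be sketched more compactly with NumPy broadcasting. The helper name `reference_blob_nhwc` is hypothetical, the resize step is assumed to have been done already, and nothing here calls OpenCV.

```python
import numpy as np


def reference_blob_nhwc(img_bgr, mean, scalefactor):
    """Hypothetical helper: swap B/R, apply (input - mean) * scalefactor, return NHWC."""
    rgb = img_bgr[:, :, ::-1].astype(np.float32)  # BGR -> RGB
    mean = np.asarray(mean, dtype=np.float32)  # per-channel, broadcast over H and W
    scalefactor = np.asarray(scalefactor, dtype=np.float32)
    out = (rgb - mean) * scalefactor  # subtract first, then scale
    return out[np.newaxis, ...]  # add the batch axis: (1, H, W, 3)


# Constant BGR image whose RGB values equal the mean, so the result is all zeros.
img = np.full((7, 6, 3), (30, 20, 10), dtype=np.uint8)
blob = reference_blob_nhwc(img, mean=(10, 20, 30), scalefactor=(0.5, 0.5, 0.5))
assert blob.shape == (1, 7, 6, 3)
assert np.allclose(blob, 0.0)
```

Subtracting before scaling matches the order stated in the `@note` added to the header: the mean is expressed in the input's value range, not in the scaled range.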
