Embodiments
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present application; they are not to be construed as limiting the present application.
The retrieval method and device for a three-dimensional stereo image according to embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a first flowchart of a retrieval method for a three-dimensional stereo image according to an embodiment of the present application.
As shown in Fig. 1, the retrieval method for a three-dimensional stereo image may include:
S1: determining color information and depth information of a three-dimensional stereo image to be retrieved.
Specifically, a three-dimensional stereo image input by a user may first be received. The three-dimensional stereo image may be captured by a 3D camera such as a Kinect. Then, the color information and the depth information of the three-dimensional stereo image may be obtained.
A three-dimensional stereo image is described by its color information and depth information. The color information may use an RGB color mode or a YUV color mode; in this embodiment, the RGB color mode is mainly used for illustration. The RGB color mode includes an R channel describing red, a G channel describing green and a B channel describing blue. The value of each channel ranges from 0 to 255, so the 256 levels of the three RGB channels can be combined into 256 × 256 × 256 = 16,777,216 (about 16.78 million) colors in total. Therefore, the color of any point in the image can be described by the values of the above three channels.
The depth information describes, for each point in the three-dimensional stereo image, the distance between that point and the lens plane.
S2: inputting the color information and the depth information of the three-dimensional stereo image into a pre-trained convolutional neural network model.
The convolutional neural network model is established according to color information and depth information of three-dimensional stereo image samples.
S3: outputting image features of the three-dimensional stereo image by the convolutional neural network model.
S4: obtaining a retrieval result according to the image features.
Specifically, the distance between the image features and the data features of each candidate image in a database may be calculated. The candidate images may then be sorted by distance in ascending order, and the top N candidate images in the sorted order may be taken as the retrieval result. The database is a pre-established database for storing three-dimensional stereo images. The distance may be a Euclidean distance or a cosine distance.
It should be appreciated that a smaller distance indicates a higher similarity between images; sorting the candidate images by distance therefore yields a more accurate retrieval result.
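The ranking in step S4 can be sketched as follows. The function name `retrieve_top_n` and its parameters are illustrative assumptions, not part of the original disclosure; a minimal NumPy implementation of both candidate distances is assumed:

```python
import numpy as np

def retrieve_top_n(query_feat, db_feats, n=5, metric="euclidean"):
    """Rank database candidates by distance to the query feature.

    query_feat: (d,) feature vector output by the CNN for the query image.
    db_feats:   (m, d) matrix of data features of the m candidate images.
    Returns the indices of the n nearest candidates, ascending by distance.
    """
    if metric == "euclidean":
        dists = np.linalg.norm(db_feats - query_feat, axis=1)
    else:  # cosine distance = 1 - cosine similarity
        num = db_feats @ query_feat
        denom = np.linalg.norm(db_feats, axis=1) * np.linalg.norm(query_feat)
        dists = 1.0 - num / denom
    order = np.argsort(dists)  # ascending: smaller distance = more similar
    return order[:n]
```

Either metric produces the same kind of ranking; cosine distance ignores feature magnitude, which can matter if feature vectors are not normalized.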
In addition, as shown in Fig. 2, the embodiment of the present application may further include step S5.
S5: before the color information and the depth information are input into the pre-trained convolutional neural network model, normalizing the color information and the depth information of the three-dimensional stereo image.
First, the color information of the three-dimensional stereo image may be normalized.
Specifically, for each point in the three-dimensional stereo image, the R-channel value j, G-channel value k and B-channel value l may be obtained, and j, k and l may each be divided by 255 to obtain the normalized R-channel value j', G-channel value k' and B-channel value l'. Since j, k and l range from 0 to 255, the corresponding j', k' and l' range from 0 to 1.
Then, the depth information of the three-dimensional stereo image is normalized.
Specifically, for each point in the three-dimensional stereo image, the depth value h, the minimum depth value a (the distance of the point nearest the lens plane) and the maximum depth value b (the distance of the point farthest from the lens plane) may be obtained, where a ≤ h ≤ b. A first difference between the depth value h and the minimum depth value a is obtained, a second difference between the maximum depth value b and the minimum depth value a is obtained, and the first difference is divided by the second difference to obtain the normalized depth value (h − a)/(b − a). The normalized depth value ranges from 0 to 1.
After normalization, the color information and the depth information are typically represented as two-dimensional images of a fixed size, for example 256 × 256 pixels.
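The two normalization steps of S5 can be sketched as follows; the function name `normalize_rgbd` and the convention of taking the minimum and maximum depth over the image itself are illustrative assumptions:

```python
import numpy as np

def normalize_rgbd(rgb, depth):
    """Normalize an RGB-D observation as described in step S5.

    rgb:   (H, W, 3) uint8 array with channel values in [0, 255].
    depth: (H, W) array of distances from the lens plane.
    Returns float arrays with all values scaled into [0, 1].
    """
    rgb_norm = rgb.astype(np.float64) / 255.0  # j' = j / 255, etc.
    a, b = depth.min(), depth.max()            # nearest / farthest point
    depth_norm = (depth - a) / (b - a)         # (h - a) / (b - a), in [0, 1]
    return rgb_norm, depth_norm
```

In practice the normalized arrays would then be resized to the fixed input size (e.g. 256 × 256 pixels) expected by the convolutional neural network.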
The process of establishing the convolutional neural network model is described in detail below.
Specifically, as shown in Fig. 3, the process may include the following steps:
S31: extracting color information and depth information of a three-dimensional stereo image sample.
S32: normalizing the color information and the depth information of the three-dimensional stereo image sample to generate a corresponding normalized image sample.
First, the color information of the three-dimensional stereo image sample may be normalized.
Specifically, for each point in the three-dimensional stereo image sample, the R-channel, G-channel and B-channel values may be obtained and each divided by 255 to obtain the normalized R-channel, G-channel and B-channel values. Each normalized channel value ranges from 0 to 1.
Then, the depth information of the three-dimensional stereo image sample may be normalized.
Specifically, for each point in the three-dimensional stereo image sample, the depth value, the minimum depth value (the distance of the point nearest the lens plane) and the maximum depth value (the distance of the point farthest from the lens plane) may be obtained. A first difference between the depth value and the minimum depth value is obtained, a second difference between the maximum depth value and the minimum depth value is obtained, and the first difference is divided by the second difference to obtain the normalized depth value, which ranges from 0 to 1.
After this, the normalized image sample may be generated from the normalized color information and depth information. For ease of computation, the normalized image sample is typically scaled to a fixed size, for example 256 × 256 pixels.
S33: training on the normalized image samples to establish the convolutional neural network model.
Specifically, the parameters of the convolutional neural network model may be trained based on a multi-task learning method to improve the recognition accuracy of the convolutional neural network model. A task may be a classification task on the image samples, a ranking task on the image samples, or the like.
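The multi-task idea in S33 — one shared feature extractor with per-task heads whose losses are combined — can be sketched with a toy NumPy forward pass. The layer sizes, the loss weighting `alpha`, and the linear trunk standing in for the convolutional layers are illustrative assumptions, not the architecture of the present application:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared trunk: one linear layer standing in for the convolutional feature extractor.
W_shared = rng.normal(size=(8, 4))
# Two task heads over the shared features, as in multi-task learning:
W_cls = rng.normal(size=(4, 3))   # classification head (3 classes)
W_rank = rng.normal(size=(4, 1))  # ranking head (scalar relevance score)

def forward(x):
    feat = np.maximum(x @ W_shared, 0.0)  # ReLU shared features
    logits = feat @ W_cls                 # class scores for the classification task
    score = feat @ W_rank                 # relevance score for the ranking task
    return feat, logits, score

def multitask_loss(logits, label, score, target, alpha=0.5):
    # Cross-entropy loss for the classification task (numerically stabilized)
    z = logits - logits.max()
    ce = -np.log(np.exp(z[label]) / np.exp(z).sum())
    # Squared-error loss for the ranking task
    mse = (score[0] - target) ** 2
    # Training minimizes the weighted sum, so both tasks shape the shared features
    return alpha * ce + (1 - alpha) * mse

x = rng.normal(size=(8,))
feat, logits, score = forward(x)
loss = multitask_loss(logits, label=1, score=score, target=0.7)
```

Because both heads share the trunk, gradients from both losses update the same feature extractor, which is what improves the discriminative quality of the features used later for retrieval.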
With the retrieval method for a three-dimensional stereo image of the embodiment of the present application, color information and depth information of a three-dimensional stereo image to be retrieved are determined, the color information and the depth information of the three-dimensional stereo image are input into a pre-trained convolutional neural network model, image features of the three-dimensional stereo image are output by the convolutional neural network model, and a retrieval result is finally obtained according to the image features. This can effectively improve the accuracy of the retrieval result obtained for the three-dimensional stereo image, thereby improving the user experience.
To achieve the above object, the present application further proposes a retrieval device for a three-dimensional stereo image.
Fig. 4 is a first structural schematic diagram of a retrieval device for a three-dimensional stereo image according to an embodiment of the present application.
As shown in Fig. 4, the retrieval device for a three-dimensional stereo image may include a determining module 110, an input module 120, an output module 130 and an acquisition module 140.
The determining module 110 is configured to determine color information and depth information of a three-dimensional stereo image to be retrieved. Specifically, a three-dimensional stereo image input by a user may first be received. The three-dimensional stereo image may be captured by a 3D camera such as a Kinect. Then, the color information and the depth information of the three-dimensional stereo image may be obtained.
A three-dimensional stereo image is described by its color information and depth information. The color information may use an RGB color mode or a YUV color mode; in this embodiment, the RGB color mode is mainly used for illustration. The RGB color mode includes an R channel describing red, a G channel describing green and a B channel describing blue. The value of each channel ranges from 0 to 255, so the 256 levels of the three RGB channels can be combined into 256 × 256 × 256 = 16,777,216 (about 16.78 million) colors in total. Therefore, the color of any point in the image can be described by the values of the above three channels.
The depth information describes, for each point in the three-dimensional stereo image, the distance between that point and the lens plane.
The input module 120 is configured to input the color information and the depth information of the three-dimensional stereo image into a pre-trained convolutional neural network model. The convolutional neural network model is established according to color information and depth information of three-dimensional stereo image samples.
The output module 130 is configured to output image features of the three-dimensional stereo image by the convolutional neural network model.
The acquisition module 140 is configured to obtain a retrieval result according to the image features. Specifically, the distance between the image features and the data features of each candidate image in a database may be calculated. The candidate images may then be sorted by distance in ascending order, and the top N candidate images in the sorted order may be taken as the retrieval result. The database is a pre-established database for storing three-dimensional stereo images. The distance may be a Euclidean distance or a cosine distance.
It should be appreciated that a smaller distance indicates a higher similarity between images; sorting the candidate images by distance therefore yields a more accurate retrieval result.
In addition, as shown in Fig. 5, the retrieval device for a three-dimensional stereo image may further include a normalization module 150.
The normalization module 150 is configured to normalize the color information and the depth information of the three-dimensional stereo image before the color information and the depth information are input into the pre-trained convolutional neural network model.
First, the color information of the three-dimensional stereo image may be normalized.
Specifically, for each point in the three-dimensional stereo image, the R-channel value j, G-channel value k and B-channel value l may be obtained, and j, k and l may each be divided by 255 to obtain the normalized R-channel value j', G-channel value k' and B-channel value l'. Since j, k and l range from 0 to 255, the corresponding j', k' and l' range from 0 to 1.
Then, the depth information of the three-dimensional stereo image is normalized.
Specifically, for each point in the three-dimensional stereo image, the depth value h, the minimum depth value a (the distance of the point nearest the lens plane) and the maximum depth value b (the distance of the point farthest from the lens plane) may be obtained, where a ≤ h ≤ b. A first difference between the depth value h and the minimum depth value a is obtained, a second difference between the maximum depth value b and the minimum depth value a is obtained, and the first difference is divided by the second difference to obtain the normalized depth value (h − a)/(b − a). The normalized depth value ranges from 0 to 1.
After normalization, the color information and the depth information are typically represented as two-dimensional images of a fixed size, for example 256 × 256 pixels.
In addition, as shown in Fig. 6, the retrieval device for a three-dimensional stereo image may further include an extraction module 160, a generation module 170 and an establishing module 180.
The extraction module 160 is configured to extract color information and depth information of a three-dimensional stereo image sample.
The generation module 170 is configured to normalize the color information and the depth information of the three-dimensional stereo image sample to generate a corresponding normalized image sample.
First, the color information of the three-dimensional stereo image sample may be normalized.
Specifically, for each point in the three-dimensional stereo image sample, the R-channel, G-channel and B-channel values may be obtained and each divided by 255 to obtain the normalized R-channel, G-channel and B-channel values. Each normalized channel value ranges from 0 to 1.
Then, the depth information of the three-dimensional stereo image sample may be normalized.
Specifically, for each point in the three-dimensional stereo image sample, the depth value, the minimum depth value (the distance of the point nearest the lens plane) and the maximum depth value (the distance of the point farthest from the lens plane) may be obtained. A first difference between the depth value and the minimum depth value is obtained, a second difference between the maximum depth value and the minimum depth value is obtained, and the first difference is divided by the second difference to obtain the normalized depth value, which ranges from 0 to 1.
After this, the normalized image sample may be generated from the normalized color information and depth information. For ease of computation, the normalized image sample is typically scaled to a fixed size, for example 256 × 256 pixels.
The establishing module 180 is configured to train on the normalized image samples to establish the convolutional neural network model. Specifically, the parameters of the convolutional neural network model may be trained based on a multi-task learning method to improve the recognition accuracy of the convolutional neural network model. A task may be a classification task on the image samples, a ranking task on the image samples, or the like.
With the retrieval device for a three-dimensional stereo image of the embodiment of the present application, color information and depth information of a three-dimensional stereo image to be retrieved are determined, the color information and the depth information of the three-dimensional stereo image are input into a pre-trained convolutional neural network model, image features of the three-dimensional stereo image are output by the convolutional neural network model, and a retrieval result is finally obtained according to the image features. This can effectively improve the accuracy of the retrieval result obtained for the three-dimensional stereo image, thereby improving the user experience.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a particular feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic references to the above terms do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, where no conflict arises, those skilled in the art may combine the different embodiments or examples described in this specification and the features of the different embodiments or examples.
Although embodiments of the present application have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the present application. Within the scope of the present application, those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments.