Image shape descriptor generation method

Technical Field
The invention relates to the field of target feature representation of image information, in particular to an image shape descriptor generation method.
Background
Image recognition and retrieval rely on feature description of image information to form normalized descriptors. Common features include color, texture, and shape. For practical use, descriptors are usually required to be invariant to translation, rotation, and scale. Color and texture features place high demands on imaging conditions, making the invariance of such descriptors difficult to guarantee. Shape features, which process the contour information of the target, have low requirements on imaging quality and are a more reliable basis for feature description.
"A method based on geometric feature point shape descriptors" (patent No. CN103208003) describes a shape descriptor based on new geometric invariants, combining global and local features. "A floating-point triangle feature description method" (patent No. CN105184786A) uses a single triangle as the feature unit and combines the triangle angle relations with local triangle-region information to construct a 38-dimensional feature descriptor. The methods described in these two patents are susceptible to image quality and fail to form effective descriptors for non-uniformly illuminated or low-pixel-count images. "An image connected-region description method and its application to image registration" (patent No. CN104156938A) completes the feature description of an image by characterizing the local shape features and the global relative position information of each connected region of the image.
Although deep learning methods based on neural networks have achieved great success in the field of image recognition, they fall short in application scenarios with scarce training samples and extremely high requirements on recognition success rate. Hand-designed shape descriptors therefore still play an irreplaceable role.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing an image shape descriptor generation method whose generated descriptor is invariant to translation, rotation, scale, and illumination.
In order to achieve the purpose, the invention is implemented according to the following technical scheme:
an image shape descriptor generation method, comprising the steps of:
s1, firstly, collecting an image, carrying out binarization processing on the image, and distinguishing a target area and a background area;
s2, extracting the contour of the target area;
s3, performing distance transformation on the target area, and calculating the central point of the target area;
S4, calculating a skeleton length map from the distance path of each target pixel point, forming a radial distance histogram from the skeleton length map, and taking the radial distance histogram as the image shape descriptor.
Further, the image acquisition in step S1 includes acquiring a sample image and an identification image, and the target area and the background area should be well distinguished when the image is acquired.
Further, the binarization of the image in step S1 specifically includes: performing histogram analysis on the acquired image and completing binarization with the classical OTSU algorithm, forming the image I(m, n), where m is the row number of an image pixel and n is the column number; the high-brightness region of the binarized image is the target region.
Further, the step S2 of extracting the contour of the target area specifically includes:
S21, describing the target shape by one or more closed contours C_{i,j}(s, t), where the symbol C represents the set of contour pixels; i is the serial number of the contour, starting from 0; j is the pixel serial number within a contour, starting from 0; s is the row coordinate of the j-th contour pixel in the image, starting from 0; t is the column coordinate of the j-th contour pixel in the image, starting from 0; each pixel in the set C can find two pixels in the same set as its neighbors, such that their row and column coordinates each differ by at most 1; the contours are extracted with the classical watershed algorithm;
S22, after contour extraction, dividing the contours of the target into outer contours and inner contours; an outer contour surrounds pixels of the target area, i.e., pixels with high brightness values in the binarized image I(m, n), while the pixels surrounded by an inner contour all have low brightness values in I(m, n).
Further, the distance transformation of the target area in step S3, and the calculating the center point of the target area specifically includes:
S31, taking the binarized image I(m, n) and the contours C_{i,j}(s, t) as input, performing the distance transform of the target area with a fast transform algorithm or a pruning-simplified algorithm, and calculating for each target pixel the length of the shortest path to any contour point, where a path is a set of mutually adjacent pixels, thereby obtaining the distance map D(m, n);
S32, after obtaining the distance map D(m, n), counting the maximum distance value D_max; the target pixel point having the maximum distance value forms the shape center point (m_0, n_0), where m_0 is the row coordinate of the shape center point and n_0 is its column coordinate.
Further, the specific step of calculating the skeleton length map according to the distance path of each pixel point of the target in step S4 is as follows:
taking the distance map D(m, n) and the shape center point (m_0, n_0) as input, calculating the skeleton length map S(m, n) for each target pixel point:
1) initializing the skeleton length map S(m, n) to 0, and putting the shape center point (m_0, n_0) into the skeleton set SK;
2) expanding neighboring points from the skeleton set SK according to the eight-neighborhood, and performing the following operation for each neighboring point: take the expanded pixel point (m, n), which does not yet belong to the skeleton set SK, as the circle center and the distance-transform value D(m, n) as the radius to draw a circle; if, among all contours C_{i,j}(s, t), two or more contour points lie on the circle, the pixel point (m, n) belongs to the skeleton set SK: insert (m, n) into SK and update the skeleton length map S(m, n) according to the adjacency-plus-one relation;
3) if a skeleton point was added in step 2), performing step 2) again; otherwise proceeding to step 4);
4) for all target pixels not belonging to the skeleton set SK, searching the shortest path to the skeleton set, requiring that all points on the shortest path belong to the target pixel set;
5) updating the skeleton length map S(m, n) by supplementing the skeleton length of non-skeleton points, which equals the shortest path length to the skeleton set plus the skeleton length of the skeleton point reached.
Further, the specific step of forming the radial distance histogram from the skeleton length map in step S4 is as follows:
searching S(m, n) for its maximum value S_max, dividing the value range into 20 intervals to form a histogram of the values of the skeleton length map S(m, n), with each interval normalized to a width of S_max/20; since the skeleton length map S(m, n) gives the radial distance from each target pixel to the shape center point, this histogram is called the radial distance histogram, and the radial distance histogram is the image shape descriptor.
Preferably, in step S1 the target is placed against a uniform background before image acquisition.
Compared with the prior art, the invention designs a descriptor with translation, rotation, scale, and illumination invariance; on this basis, a reliable feature of the target can be formed even from a single sample and used in image target recognition and retrieval applications. The shape descriptor described in this invention also has considerable invariance to articulated (hinged) deformation of the target.
Drawings
FIG. 1 is a flow chart of the image shape descriptor generation of the present invention.
Fig. 2 is a result diagram of a target image after binarization.
Fig. 3 is a distance map of an object.
FIG. 4 is a skeletal assembly diagram of a target.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. The specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
As shown in fig. 1, the present embodiment provides an image shape descriptor generating method, which includes the following specific steps:
1. image acquisition
The acquisition includes acquisition of a sample image and acquisition of an identification image. Because the shape of the target is to be described, the target area in the acquired image must be well distinguished from the background area so that the target area can be reliably segmented by an algorithm; segmentation then preserves the complete shape of the target. To ensure the distinction between foreground and background, the target can be placed against a uniform, consistent background before image acquisition.
In a scene with a consistent background area, histogram analysis is performed on the acquired image and binarization is completed with the classical OTSU algorithm, forming the image I(m, n), where m is the row number of an image pixel and n is the column number. The high-brightness region of the binarized image is the target region. The binarization operation masks brightness changes of the target area caused by uneven illumination; as long as the imaging brightness of the target area is sufficiently separated from the brightness of the background area, no misjudgment is formed. These imaging conditions are easily satisfied by adjusting the color, material, or brightness of the background region. FIG. 2 shows a target image after binarization, in which the white area is the target area.
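As a concrete illustration of this binarization step, the following Python sketch implements the classical OTSU threshold directly in NumPy; in practice a library call such as OpenCV's cv2.threshold with the THRESH_OTSU flag would normally be used. The synthetic bright-square test image is an assumption for illustration, not part of the invention:

```python
import numpy as np

def otsu_binarize(gray):
    """Binarize an 8-bit grayscale image with the classical OTSU threshold:
    choose the threshold that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, 0.0
    w0, sum0 = 0.0, 0.0
    for t in range(256):
        w0 += hist[t]                      # weight of class [0..t]
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        w1 = total - w0
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return (gray > best_t).astype(np.uint8)  # 1 = high-brightness target region

# bright 32x32 square target on a dark, uniform background
img = np.full((64, 64), 30, dtype=np.uint8)
img[16:48, 16:48] = 200
I = otsu_binarize(img)
```

Because the threshold adapts to the actual histogram, a uniform brightness shift of the scene leaves the segmentation unchanged, which is the illumination-masking property described above.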
2. Contour extraction of target region
The binarized image I(m, n) contains the complete shape information of the target. Owing to the variety of target types, the target shape may be described by one or more closed contours C_{i,j}(s, t), where the symbol C represents the set of contour pixels; i is the serial number of the contour, starting from 0; j is the pixel serial number within a contour, starting from 0; s is the row coordinate of the j-th contour pixel in the image, starting from 0; and t is the column coordinate of the j-th contour pixel in the image, starting from 0. Each pixel in set C must be able to find two pixels in the same set as its neighbors, such that their row and column coordinates each differ by at most 1. According to this contour definition, the classical watershed algorithm is used to extract contours; other algorithms, such as the chain-code method or boundary following, are also feasible.
After contour extraction, the contours of the target can be divided into outer contours and inner contours. An outer contour surrounds pixels of the target area, i.e., pixels with high brightness values in the binarized image I(m, n), while the pixels surrounded by an inner contour all have low brightness values in I(m, n). In describing the target shape in the present invention, inner and outer contours play equivalent roles, so the following description does not distinguish whether C_{i,j}(s, t) is an outer or an inner contour.
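The contour definition above can be illustrated with a minimal sketch. As an assumed stand-in for the watershed algorithm, the function below simply marks target pixels that have at least one background 4-neighbour as contour pixels; ordering them into the sets C_{i,j}(s, t) by boundary following (or via OpenCV's findContours with RETR_CCOMP, which also separates outer from inner contours) is omitted. The square-with-hole test shape is chosen so that both an outer and an inner contour exist:

```python
import numpy as np

def contour_pixels(binary):
    """Boolean mask of contour pixels: target pixels (value 1) that have
    at least one background pixel (value 0) among their 4-neighbours.
    Pixels outside the image border count as background."""
    pad = np.pad(binary, 1)          # zero padding = background outside
    core = pad[1:-1, 1:-1] == 1
    has_bg = ((pad[:-2, 1:-1] == 0) | (pad[2:, 1:-1] == 0) |
              (pad[1:-1, :-2] == 0) | (pad[1:-1, 2:] == 0))
    return core & has_bg

# 7x7 square target with a 3x3 hole: outer and inner contour both present
I = np.zeros((9, 9), dtype=np.uint8)
I[1:8, 1:8] = 1
I[3:6, 3:6] = 0
C = contour_pixels(I)
```

For this shape the mask contains the 24 pixels of the outer boundary ring and the 12 target pixels bordering the hole, matching the outer/inner contour distinction made above.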
3. Calculating a center point of a target area
Taking the binarized image I(m, n) and the contours C_{i,j}(s, t) as input, the distance transform of the target area is performed with a fast transform algorithm or a pruning-simplified algorithm, calculating the minimum distance from each pixel in the target area to the contour and obtaining the distance map D(m, n), i.e., the length of the shortest path from a target pixel to any contour point, where a path is a set of mutually adjacent pixels. Obviously, all pixels on the shortest path belong to the target area. FIG. 3 is the distance map of the target in FIG. 2, in which the dot marks the shape center point of the target and brightness encodes the distance from a target pixel to the target contour: the higher the brightness, the larger the distance.
After the distance map D(m, n) is obtained, the maximum distance value D_max is counted. The target pixel point having the maximum distance value forms the shape center point (m_0, n_0), where m_0 is the row coordinate of the shape center point and n_0 is its column coordinate. Tests show that the shape center point is unique for a specific target shape and invariant under translation, rotation, and scale transformations. In particular, under scaling the distance map is scaled as a whole, so the lengths of the minimum-distance paths grow or shrink in proportion, and the relative position of the shape center point does not change.
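A minimal sketch of this step, assuming a unit-step 4-neighbourhood (city-block) distance instead of the fast transform or pruning-simplified algorithm named above: breadth-first search grows the distance inward from the contour-adjacent pixels, and the arg-max of the resulting map gives the shape center point (m_0, n_0):

```python
import numpy as np
from collections import deque

def distance_map_and_center(binary):
    """BFS distance transform of the target area (city-block metric).
    Background pixels keep distance 0; target pixels adjacent to the
    background are seeded with distance 1, then grown inward."""
    m, n = binary.shape
    D = np.where(binary == 1, -1, 0).astype(int)   # -1 = not yet reached
    q = deque()
    for r in range(m):
        for c in range(n):
            if binary[r, c] == 1:
                nb = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                if any(not (0 <= rr < m and 0 <= cc < n) or binary[rr, cc] == 0
                       for rr, cc in nb):
                    D[r, c] = 1
                    q.append((r, c))
    while q:
        r, c = q.popleft()
        for rr, cc in [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]:
            if 0 <= rr < m and 0 <= cc < n and D[rr, cc] == -1:
                D[rr, cc] = D[r, c] + 1
                q.append((rr, cc))
    center = np.unravel_index(np.argmax(D), D.shape)  # (m0, n0)
    return D, center

I = np.zeros((11, 11), dtype=np.uint8)
I[1:10, 1:10] = 1                 # 9x9 square target
D, (m0, n0) = distance_map_and_center(I)
```

For the square target the map peaks at the geometric center, illustrating how the arg-max of D(m, n) yields a stable shape center point; a production implementation would use, e.g., OpenCV's cv2.distanceTransform.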
4. Calculating the skeleton length map from the distance path of each target pixel point
Taking the distance map D(m, n) and the shape center point (m_0, n_0) as input, the skeleton length map S(m, n) of each target pixel point is calculated. The specific steps are as follows:
a) initialize the skeleton length map S(m, n) to 0, and put the shape center point (m_0, n_0) into the skeleton set SK;
b) expand neighboring points from the skeleton set SK according to the eight-neighborhood, and perform the following operation for each neighboring point: take the expanded pixel point (m, n), which does not yet belong to the skeleton set SK, as the circle center and the distance-transform value D(m, n) as the radius to draw a circle; if, among all contours C_{i,j}(s, t), two or more contour points lie on the circle, the pixel point (m, n) belongs to the skeleton set SK: insert (m, n) into SK and update the skeleton length map S(m, n) according to the adjacency-plus-one relation;
c) if a skeleton point was added in step b), perform step b) again; otherwise proceed to step d);
d) for all target pixels not belonging to the skeleton set SK, search the shortest path to the skeleton set, requiring that all points on the shortest path belong to the target pixel set;
e) update the skeleton length map S(m, n) by supplementing the skeleton length of non-skeleton points, which equals the shortest path length to the skeleton set plus the skeleton length of the skeleton point reached. The result is shown in FIG. 4.
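The following sketch condenses steps a) through e) into a single eight-neighbourhood breadth-first search that grows the geodesic (in-shape) distance outward from the shape center point, updating by the adjacency-plus-one relation. The circle test of step b), which distinguishes skeleton points from non-skeleton points before filling in the latter, is deliberately omitted here, so this is only a simplified approximation of the full procedure, not the invention's exact algorithm:

```python
import numpy as np
from collections import deque

def skeleton_length_map(binary, center):
    """Geodesic distance from the shape center to every target pixel,
    grown with an eight-neighbourhood BFS confined to the target area.
    -1 marks background / unreachable pixels (the full method instead
    initializes S to 0 and treats skeleton and non-skeleton points
    separately)."""
    m, n = binary.shape
    S = np.full((m, n), -1, dtype=int)
    m0, n0 = center
    S[m0, n0] = 0
    q = deque([(m0, n0)])
    while q:
        r, c = q.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < m and 0 <= cc < n and binary[rr, cc] == 1
                        and S[rr, cc] == -1):
                    S[rr, cc] = S[r, c] + 1   # adjacency-plus-one update
                    q.append((rr, cc))
    return S

I = np.zeros((11, 11), dtype=np.uint8)
I[1:10, 1:10] = 1                 # 9x9 square target
S = skeleton_length_map(I, (5, 5))
```

Because growth is confined to target pixels, the resulting lengths follow in-shape paths rather than straight lines, which is the geodesic property the next section relies on.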
5. Forming a radial distance histogram
The skeleton length map S(m, n) calculated in the above steps contains the distances of both skeleton points and non-skeleton points to the shape center point (m_0, n_0). This length map is not a Euclidean distance map but a geodesic distance map that incorporates the target's shape information and represents its intrinsic geometry.
S(m, n) is searched for its maximum value S_max. The value range is divided into 20 intervals to form a histogram of the values of the skeleton length map S(m, n), each interval being normalized to a width of S_max/20; the number of intervals can be chosen according to the actual problem. Following the steps of this embodiment, the skeleton length map S(m, n) gives the radial distance from each target pixel to the shape center point, so this histogram is called the radial distance histogram. The radial distance histogram is the image shape descriptor of this embodiment.
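A minimal sketch of forming the radial distance histogram: values of the skeleton length map S(m, n) are binned into 20 intervals of width S_max/20 and normalized into a distribution. The convention that background pixels are marked -1 and excluded, and the toy 4x4 block of length values, are assumptions for illustration:

```python
import numpy as np

def radial_distance_histogram(S, bins=20):
    """Normalized histogram of skeleton-length values. The bin width
    S_max / bins adapts to the shape's size, which is what makes the
    descriptor scale-invariant."""
    vals = S[S >= 0].astype(float)     # exclude background (-1)
    s_max = vals.max()
    hist, _ = np.histogram(vals, bins=bins, range=(0.0, s_max))
    return hist / hist.sum()           # normalize to a distribution

S = np.full((8, 8), -1, dtype=int)
S[2:6, 2:6] = np.arange(16).reshape(4, 4)   # toy skeleton-length values 0..15
h = radial_distance_histogram(S, bins=20)
```

Because the bin width scales with S_max, doubling every skeleton length leaves the histogram unchanged, in line with the scale-invariance property of the descriptor.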
As follows from the construction of the shape descriptor, the use of geodesic distance provides invariance to translation and rotation of the image. In calculating the radial distance histogram, a normalized bin-width operation is employed: when the image undergoes a scale transformation, the width of each interval scales in the same proportion automatically, so the resulting shape descriptor is invariant to scale. For scenes with uneven illumination, the background constraint and image binarization of step 1 eliminate the influence of illumination, giving the shape descriptor illumination invariance.
The technical solution of the present invention is not limited to the limitations of the above specific embodiments, and all technical modifications made according to the technical solution of the present invention fall within the protection scope of the present invention.