Disclosure of Invention
Aiming at the technical problems that infrared and visible light images are difficult to register and that the SIFT algorithm is computationally expensive, the invention provides a feature-point-based registration method for infrared and optical images that combines image entropy with an improved SIFT algorithm. The method optimizes the flow of the SIFT algorithm and improves its speed and accuracy to a certain extent. The invention particularly relates to a registration method and a registration device between visible light images and infrared images of a circuit board.
The invention adopts the following technical scheme that the method for registering the visible light image and the infrared image of the circuit board comprises the following steps:
Step one, taking a visible light image and an infrared image of the circuit board as input images;
Step two, traversing the two input images with non-overlapping sliding windows, dividing each image into window areas, and calculating the information entropy of each divided window area. According to the histogram formed by the acquired information entropy values, a local image area whose entropy is higher than a given preset information entropy threshold is defined as a high entropy area, and one whose entropy is lower than the threshold as a low entropy area. The high entropy areas are used for subsequent feature extraction and participate in feature point detection; the low entropy areas do not participate in feature point detection;
Step three, detecting characteristic points in the high-entropy areas screened out of the infrared image and in the high-entropy areas screened out of the visible light image by adopting a SIFT+FAST algorithm, and screening out representative points as respective SIFT characteristic points;
Step four, respectively constructing annular descriptors for SIFT feature points detected by the two images, performing PCA dimension reduction processing, and respectively acquiring 64-dimensional feature vector descriptors of the visible light images and 64-dimensional feature vector descriptors of the infrared images;
And fifthly, taking Euclidean distance and cosine similarity as similarity measurement indexes of the two images, calculating Euclidean distance and cosine similarity of feature point feature vectors on the two images, adopting a nearest neighbor/secondary neighbor FLANN algorithm to perform initial matching on the reference image and the image to be matched, adopting a RANSAC algorithm to remove incorrect matching, and finally realizing the precise matching between the visible light image and the infrared image.
As a further improvement of the above solution, in the second step, the method for screening the high entropy area and the low entropy area by using the information entropy threshold includes the following steps:
Firstly, dividing a visible light image and an infrared image by adopting non-overlapping sliding windows, traversing each image by using a plurality of non-overlapping sliding windows, dividing each image according to the size of the window, and calculating the information entropy of each window area;
Secondly, according to the histogram formed by the acquired information entropy values, a segmentation threshold, namely the information entropy threshold, is set and the window areas whose information entropy has been calculated are screened: window areas larger than the set information entropy threshold are reserved for the subsequent SIFT+FAST feature point extraction, while window areas smaller than the information entropy threshold are excluded from feature point detection.
As a further improvement of the above scheme, for a two-dimensional image in discrete form, the two-dimensional gray entropy H is calculated from the tuple probability Pi,j:
Pi,j = f(i,j)/(W·h)
H = −Σi Σj Pi,j log2 Pi,j
Wherein W and h are the width and height of the picture respectively, (i, j) is a binary tuple in which i represents the gray value of the center pixel in a certain sliding window and j is the gray average value of the pixels except the center in the window, f(i, j) represents the number of times that the tuple appears in the whole image, and H is the two-dimensional gray entropy of the image.
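As one possible illustration, the window screening of step two can be sketched as follows. The window size (5) and entropy threshold (3.0 bits) are illustrative choices, not values fixed by the invention, and the per-window entropy is approximated by a one-dimensional gray histogram rather than the full (i, j) tuple statistic for brevity:

```python
import numpy as np

def window_entropy(win):
    """Gray entropy of one window, approximated from the 1-D gray
    histogram (a simplification of the (i, j) tuple definition)."""
    hist, _ = np.histogram(win, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def screen_high_entropy(img, win=5, thresh=3.0):
    """Slide a non-overlapping win x win window over `img` and return
    the top-left corners of windows whose entropy exceeds `thresh`.
    `win` and `thresh` are illustrative values."""
    h, w = img.shape
    keep = []
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            if window_entropy(img[y:y + win, x:x + win]) > thresh:
                keep.append((y, x))
    return keep
```

Only the retained corners are handed to the subsequent SIFT+FAST detection; flat (low-entropy) windows never enter the detector.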
As a further improvement of the above solution, in step three, a detection method for detecting feature points in a high entropy area screened out from each image by using sift+fast algorithm includes the following steps:
firstly, constructing a Gaussian scale space;
the gaussian scale space of an image is defined as a function L (x, y, σ):
L(x,y,σ)=G(x,y,σ)*I(x,y)
Wherein I (x, y) is an input image, G (x, y, sigma) is a variable-scale Gaussian function, (x, y) is a point coordinate on the image, sigma is a Gaussian blur coefficient, adjacent layers in each group are subtracted to obtain a Gaussian differential pyramid DOG, the subsequent extraction of characteristic points is carried out on the DOG pyramid, and the formula of a DOG operator D (x, y, sigma) is as follows:
D(x,y,σ)=(G(x,y,kσ)-G(x,y,σ))*I(x,y)=L(x,y,kσ)-L(x,y,σ)
wherein k is a proportionality coefficient;
secondly, detecting and accurately positioning Gaussian scale space feature points;
Searching all scales and image positions in the Gaussian scale space and locating extreme points on each layer of every scale. A circle of radius 3 is drawn with the candidate point as center; when at least 12 of the 16 pixel points on its edge are all larger than Ix+T1 or all smaller than Ix−T1, the point is regarded as a key point. The position and scale of the key point are then accurately determined by fitting a three-dimensional quadratic function, wherein Ix is the pixel value of the detection point and T1 is a pixel range threshold;
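The circle test described above can be sketched as a count-based segment test; the circle offsets are the standard radius-3 Bresenham circle, and the threshold t and count n are illustrative defaults standing in for T1 and 12:

```python
import numpy as np

# The 16 offsets (dx, dy) of a Bresenham circle of radius 3.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2),
          (1, 3), (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1),
          (-2, -2), (-1, -3)]

def is_fast_keypoint(img, y, x, t=20, n=12):
    """Count-based segment test: the candidate at (y, x) is a key
    point when at least `n` of the 16 circle pixels are all brighter
    than I+t or all darker than I-t."""
    c = int(img[y, x])
    vals = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    brighter = sum(v > c + t for v in vals)
    darker = sum(v < c - t for v in vals)
    return brighter >= n or darker >= n
```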
Then, removing the points with low contrast and the points positioned at the edges of the image;
Removing the two unstable points by setting a contrast threshold and a Hessian matrix;
finally, calculating the direction of the feature points;
The gradient direction characteristics of the pixels in the neighborhood of a key point are used to achieve rotation invariance of the image. Sampling is performed in a neighborhood window centered on the feature point, and the gradient directions of the neighborhood pixels are counted with a histogram. The gradient histogram covers 0-360 degrees and is divided into 8 directions of 45 degrees each, i.e. each feature point carries 8 gradient direction values. The peak of the histogram represents the main direction of the neighborhood gradient at the feature point, i.e. the main direction of the feature point. The histogram is smoothed with a Gaussian function to reduce the influence of abrupt changes. When the gradient direction histogram contains another peak reaching 80% of the energy of the main peak, that direction is regarded as an auxiliary direction of the feature point. A feature point may therefore be assigned several directions, one main direction and more than one auxiliary direction, so as to enhance matching robustness.
Further, when removing points with low contrast and points located at the edges of the image, the extreme points are refined to sub-pixel accuracy by fitting a three-dimensional quadratic function: the DOG function is substituted into its Taylor expansion and only the first two terms are taken:
D(X̂) = D + (1/2)·(∂Dᵀ/∂X)·X̂
wherein X̂ = −(∂²D/∂X²)⁻¹·(∂D/∂X) represents the offset relative to the interpolation center coordinates (x, y);
A first contrast threshold is preset; the contrast of each extreme point is compared with the first contrast threshold, and extreme points with contrast larger than the first contrast threshold are taken as feature points to be selected. A second contrast threshold, larger than the first contrast threshold, is also preset, and extreme points with contrast larger than the second contrast threshold continue to be stored as feature points to be selected;
Acquiring the Hessian matrix H(x, y) of a feature point to be selected:
H(x, y) = [Dxx(x,y), Dxy(x,y); Dxy(x,y), Dyy(x,y)]
Tr(H(x,y)) = Dxx(x,y)+Dyy(x,y) represents the sum of the eigenvalues of the matrix H(x, y), and Det(H(x,y)) = Dxx(x,y)Dyy(x,y)−(Dxy(x,y))² represents its determinant, wherein the values Dxx(x,y), Dxy(x,y), Dyy(x,y) are obtained by differencing the corresponding positions in the neighborhood of the candidate point. The principal curvatures of D are proportional to the eigenvalues of H(x, y). Let r = α/β represent the ratio of the maximum eigenvalue to the minimum eigenvalue of H(x, y); then Tr(H(x,y))²/Det(H(x,y)) = (r+1)²/r. In order to detect whether the principal curvature ratio is below a certain threshold T2, it is only necessary to check whether Tr(H(x,y))²/Det(H(x,y)) ≥ (T2+1)²/T2; if this inequality holds, the feature point is rejected, otherwise the feature point is retained.
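A minimal sketch of this edge-response check, assuming the Dxx, Dyy, Dxy differences have already been computed; r plays the role of the threshold T2, and r = 10 is a conventional choice rather than a value fixed by the invention:

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a candidate only when Tr(H)^2 / Det(H) < (r+1)^2 / r,
    where H = [[dxx, dxy], [dxy, dyy]] is the 2x2 Hessian of the
    DOG response around the candidate point."""
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures of opposite sign: not a stable extremum
        return False
    return tr * tr / det < (r + 1.0) ** 2 / r
```

An isotropic, blob-like response passes, while a strongly elongated (edge-like) response is rejected.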
Further, the calculation method for calculating the direction of the feature point includes the steps of:
For the key points detected in the DOG pyramid, the gradient and direction distribution characteristics of the pixels in a 3σ neighborhood window of the Gaussian pyramid image are collected; the modulus and direction of the gradient are:
m(x,y) = sqrt((L(x+1,y)−L(x−1,y))² + (L(x,y+1)−L(x,y−1))²)
θ(x,y) = tan⁻¹((L(x,y+1)−L(x,y−1))/(L(x+1,y)−L(x−1,y)))
Wherein L(x, y), L(x+1, y), L(x−1, y), L(x, y+1) and L(x, y−1) are the scale space values at the corresponding coordinates of the layer where the key point is located, m(x, y) is the gradient modulus, and θ(x, y) is the gradient direction.
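The modulus and direction formulas can be computed directly by finite differences; this sketch assumes L is the Gaussian-smoothed layer at the key point's scale:

```python
import math

def gradient(L, x, y):
    """Gradient modulus and direction (degrees, 0..360) at (x, y)
    from central differences on the smoothed image L, matching the
    m(x, y) and theta(x, y) formulas above."""
    dx = float(L[y][x + 1]) - float(L[y][x - 1])
    dy = float(L[y + 1][x]) - float(L[y - 1][x])
    m = math.hypot(dx, dy)
    theta = math.degrees(math.atan2(dy, dx)) % 360.0
    return m, theta
```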
As a further improvement of the above solution, in step four, the method for acquiring the 64-dimensional annular feature vector descriptor of the two images includes the steps of:
For any feature key point, a circle of radius 13 is drawn in the scale space with the key point as center. Because pixels farther from the center receive smaller gradient distribution weights, the region is divided into 8 concentric rings with radii of 2, 3, 4, 5, 6, 8, 10 and 13 pixels, forming 8 sub-regions. Each sub-region accumulates 8 gradient directions, so 8×8 = 64 data are obtained in total, i.e. a 64-dimensional SIFT feature vector.
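A sketch of accumulating the 8-ring × 8-direction descriptor; `mag` and `ori` are assumed precomputed gradient magnitude and orientation (degrees) maps, and the rotation into the key point's main direction and the distance-dependent weighting are omitted for brevity:

```python
import math
import numpy as np

RING_RADII = [2, 3, 4, 5, 6, 8, 10, 13]  # ring boundaries from the text

def ring_descriptor(mag, ori, cy, cx):
    """64-dimensional ring descriptor: accumulate gradient magnitude
    into 8 rings x 8 orientation bins (45 degrees each) around the
    key point at (cy, cx), then L2-normalize."""
    desc = np.zeros((8, 8))
    for dy in range(-13, 14):
        for dx in range(-13, 14):
            r = math.hypot(dx, dy)
            if r == 0 or r > 13:
                continue
            ring = next(i for i, rr in enumerate(RING_RADII) if r <= rr)
            y, x = cy + dy, cx + dx
            if 0 <= y < mag.shape[0] and 0 <= x < mag.shape[1]:
                b = int(ori[y, x] // 45) % 8
                desc[ring, b] += mag[y, x]
    v = desc.ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

The ring layout makes the descriptor inherently rotation-tolerant, since a rotation moves mass within a ring rather than across rings.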
As a further improvement of the above solution, in step five, the matching method for performing initial matching by using the FLANN algorithm combining euclidean distance and cosine similarity includes the following steps:
After the SIFT feature vectors of the two images are generated, the Euclidean distance and cosine similarity between the feature vectors of feature points on the two images are calculated, using both the distance and the direction between the feature vectors as similarity criteria. Feature points with the minimum distance and a cosine similarity above a given threshold are selected as initial matching points. A pair is judged a correct match when the ratio of the Euclidean distance to the nearest neighbor over that to the next-nearest neighbor is smaller than the ratio threshold T3 = 0.77; erroneous matching points are removed, and the matching points in the visible light image and the infrared image are then connected with lines, thereby achieving image registration.
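The nearest/next-nearest matching with the 0.77 ratio and a cosine gate can be sketched with a brute-force search standing in for FLANN's approximate index; the value of `cos_thresh` is illustrative, since the text leaves the cosine threshold open:

```python
import numpy as np

def match(desc1, desc2, ratio=0.77, cos_thresh=0.8):
    """For each descriptor in desc1, find the two nearest descriptors
    in desc2 by Euclidean distance; accept the nearest as a match when
    it passes both the ratio test and the cosine-similarity gate."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j, k = np.argsort(dists)[:2]
        cos = d @ desc2[j] / (np.linalg.norm(d) * np.linalg.norm(desc2[j]) + 1e-12)
        if dists[j] < ratio * dists[k] and cos > cos_thresh:
            matches.append((i, int(j)))
    return matches
```

In a full pipeline, the surviving pairs would then be passed to RANSAC to estimate the transform and discard the remaining outliers.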
The invention also provides a registration device between the visible light image and the infrared image of the circuit board, which comprises:
The acquisition module is used for acquiring an infrared image and a visible light image of the circuit board;
The entropy region distinguishing module is used for respectively removing low-entropy regions according to the respective image information entropy in the visible light image and the infrared image, and reserving high-entropy regions for subsequent feature point detection;
the construction module is used for constructing a Gaussian scale space for the high-entropy area and establishing an image Gaussian pyramid and a Gaussian differential pyramid;
the characteristic point screening module is used for acquiring extreme points in different scale spaces in the Gaussian differential pyramid by using a FAST+SIFT combination algorithm, and accurately positioning and screening the characteristic points according to the extreme points;
The removing module is used for screening and removing unstable points by adopting a threshold value method and a Hessian matrix method, and comprises points with low contrast and points positioned at the edge of an image;
The characteristic point direction calculation module is used for calculating and determining the characteristic point direction and constructing a key point 64-dimensional annular descriptor;
and the key point matching module is used for carrying out key point matching by using the Euclidean distance and cosine similarity between vectors as measurement indexes and applying a quick approximate nearest neighbor search FLANN, and eliminating mismatching by using a RANSAC random sampling consistency algorithm.
As a further improvement of the above scheme, the registration device is also used for carrying out the registration method between the visible light image and the infrared image of a circuit board according to any of the above embodiments.
Compared with the prior art, the invention has the following beneficial effects:
1. the image area blocking statistical information entropy is adopted, and the high entropy information area is extracted as a detection target image according to the threshold value, so that the accuracy of the matching point pair is improved, and the matching accuracy is higher than that of the traditional SIFT matching point pair.
2. By using the combined SIFT and FAST method, the problems of low efficiency and weak response strength when the traditional algorithm extracts feature points are solved, the accuracy of the matching point pairs is improved, and the matching accuracy is higher than that of the traditional SIFT matching point pairs.
3. The improved SIFT feature point annular descriptor is adopted, and the overall operation speed of the algorithm is improved on the premise of ensuring the registration quality.
According to the invention, when images are matched, the points that best reflect the image characteristics are selected for matching, so that while the matching precision is ensured, the problems of low efficiency and weak response strength of the traditional SIFT algorithm when extracting feature points are solved, and the registration efficiency and precision are obviously improved compared with the traditional algorithm. The invention therefore selects visible light and infrared images as experimental data and compares the results with the traditional SIFT algorithm; the registration efficiency and precision are obviously improved, and the method has wide application prospects in image fusion, remote sensing image processing, computer vision and power equipment diagnosis.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that when a component is referred to as being "mounted on" another component, it can be on the other component or intervening components may also be present. When an element is referred to as being "disposed on" another element, it can be disposed on the other element or intervening elements may also be present. When an element is referred to as being "fixed to" another element, it can be fixed to the other element or intervening elements may also be present.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "or/and" as used herein includes any and all combinations of one or more of the associated listed items.
The registration method between the optical image and the infrared image of the circuit board mainly comprises 5 steps.
And step one, taking the optical image and the infrared image of the circuit board as input images to be registered.
Step two, traversing the optical image and the infrared image with non-overlapping sliding windows, dividing each image into window areas, and calculating the information entropy of each divided window area. A local image area whose information entropy is higher than a preset information entropy threshold is defined as a high entropy area, and one whose information entropy is lower than the threshold as a low entropy area; the high entropy areas participate in subsequent feature point detection, while the low entropy areas are not used for feature point detection.
And thirdly, respectively detecting characteristic points of the high-entropy area screened out of the optical image and the high-entropy area screened out of the infrared image by adopting a SIFT+FAST algorithm, and respectively screening representative points as respective SIFT characteristic points.
And fourthly, respectively constructing annular descriptors and performing dimension reduction processing on SIFT feature points of the optical image and SIFT feature points of the infrared image to respectively acquire 64-dimensional feature vector descriptors of the optical image and 64-dimensional feature vector descriptors of the infrared image.
And fifthly, taking Euclidean distance and cosine similarity as similarity measurement indexes of the optical image and the infrared image, calculating Euclidean distance and cosine similarity of feature vectors of feature points on the two images, adopting a nearest neighbor/next neighbor FLANN algorithm to perform initial matching on the optical image and the infrared image, adopting a RANSAC algorithm to remove mismatching in the optical image and the infrared image, and finally realizing fine matching between the optical image and the infrared image.
Referring to fig. 1, the method can be summarized in seven aspects:
1) Acquiring an infrared image and a visible light image of a PCB;
2) Removing a low entropy region according to the size of the image information entropy, and reserving a high entropy region for subsequent feature point detection;
3) Constructing a Gaussian scale space for the image, and constructing an image Gaussian pyramid and a Gaussian differential pyramid;
4) Acquiring extreme points in different scale spaces in the Gaussian differential pyramid by using a FAST+SIFT combination algorithm, and accurately positioning and screening out characteristic points according to the extreme points;
5) The unstable points are removed by screening through a threshold method and a Hessian matrix method, wherein the unstable points comprise points with low contrast and points positioned at the edge of an image;
6) Calculating and determining the direction of the characteristic points, and constructing a 64-dimensional annular descriptor of the key points;
7) And (3) using the Euclidean distance between vectors and cosine similarity as measurement indexes, performing key point matching by using a quick approximate nearest neighbor search (FLANN), and eliminating mismatching by using a RANSAC random sampling consistency algorithm.
Each step is then analyzed in detail.
Aiming at the first step, acquiring an infrared image and a visible light image of a PCB (printed circuit board);
Aiming at the second step, the specific method for screening the high-entropy window area and the low-entropy window area by utilizing the image entropy threshold value is as follows:
Firstly, the reference image and the image to be registered are divided using non-overlapping sliding windows: each image is traversed by a number of non-overlapping sliding windows (for example, 5×5), divided according to the window size, and the information entropy of each small window area is calculated. For a two-dimensional image in discrete form, the information entropy is calculated according to the following formulas:
Pi,j = f(i,j)/(W·h)
H = −Σi Σj Pi,j log2 Pi,j
Wherein W and h are the width and height of the picture respectively, (i, j) is a binary tuple in which i represents the gray value of the center pixel in a certain sliding window and j is the gray average value of the pixels except the center in the window, f(i, j) represents the number of times that the tuple appears in the whole image, and H is the two-dimensional gray entropy of the image.
Secondly, according to the histogram (shown in fig. 2 and 3) formed by the acquired information entropy, a segmentation threshold is set, window areas for calculating the information entropy are screened, window areas larger than the set threshold are reserved for subsequent extraction of characteristic points of a SIFT+FAST algorithm, and window areas smaller than the threshold are not subjected to subsequent detection of the characteristic points.
For the third step, please refer to fig. 4, the specific method for detecting the feature points of the image high entropy region by adopting the sift+fast algorithm is as follows:
(1) Constructing a Gaussian scale space. The Gaussian scale space of an image is defined as the function L(x, y, σ) = G(x, y, σ) * I(x, y), wherein I(x, y) is the input image and G(x, y, σ) = (1/(2πσ²))·exp(−(x²+y²)/(2σ²)) is the variable-scale Gaussian function; the Gaussian convolution kernel is the only linear kernel that realizes the scale transformation. (x, y) are the coordinates of a point on the image and σ is the Gaussian blur coefficient; the size of σ determines the smoothness of the image, with large scales corresponding to the profile features of the image and small scales to its detail features. After the image Gaussian pyramid is created, in order to effectively detect stable key points in the scale space, adjacent layers in each group are subtracted to obtain the Gaussian differential pyramid (DOG), and the subsequent feature point extraction is carried out on the DOG pyramid.
(2) Searching all scales and image positions in the Gaussian scale space and locating extreme points on each layer of every scale. A circle of radius 3 is drawn with the candidate point as center; when at least 12 of the 16 pixel points on its edge are all larger than Ix+T1 or all smaller than Ix−T1, the point is regarded as a key point, and the position and scale of the key point are then accurately determined by fitting a three-dimensional quadratic function.
(3) The points of low contrast and the points at the edges of the image are removed by setting a contrast threshold and a Hessian matrix.
(4) Calculating the direction of the feature points. The gradient direction characteristics of the pixels in the neighborhood of a key point are used to achieve rotation invariance of the image. Sampling is performed in a neighborhood window, for example 4×4, centered on the feature point, and the gradient directions of the neighborhood pixels are counted with a histogram. The gradient histogram covers 0-360 degrees and is divided into 8 directions, i.e. each feature point carries 8 gradient direction values. The peak of the histogram represents the main direction of the neighborhood gradient at the feature point, i.e. the direction of the feature point. The histogram is smoothed with a Gaussian function to reduce the influence of abrupt changes; when the gradient direction histogram contains another peak reaching 80% of the energy of the main peak, that direction is regarded as the auxiliary direction of the feature point. A feature point may be assigned multiple directions, a primary direction and more than one secondary direction, for enhanced robustness of the match.
For step four, please refer to fig. 5, the specific method for obtaining the 64-dimensional annular feature vector descriptors of the reference image and the image to be matched is as follows:
For any feature key point, a circle of radius 13 is drawn in the scale space with the key point as center. Because pixels farther from the center receive smaller gradient distribution weights, the region is divided into 8 concentric rings with radii of 2, 3, 4, 5, 6, 8, 10 and 13, forming 8 sub-regions. Each sub-region accumulates 8 gradient directions, so 8×8 = 64 data are obtained in total, i.e. a 64-dimensional SIFT feature vector.
Aiming at the fifth step, the specific method for carrying out initial matching by utilizing a FLANN algorithm combining the Euclidean distance and the cosine similarity is as follows:
After the SIFT feature vectors of the two images are generated, the Euclidean distance and cosine similarity between the feature vectors of feature points on the two images are calculated, with the distance and direction between the vectors used as similarity criteria. Feature points with the smallest distance and a cosine similarity above a given threshold serve as initial matching points; a pair is judged a correct match when the ratio of the Euclidean distance to the nearest neighbor over that to the next-nearest neighbor is smaller than the ratio threshold T = 0.77. The erroneous matching points are removed, and the matching points in the reference image and the image to be registered are then connected with lines, thereby achieving image registration. Please refer to figs. 6, 7 and 8, wherein fig. 6 is a registration chart of the traditional SIFT algorithm, fig. 7 is a registration chart of the improved SIFT algorithm of fig. 1, and fig. 8 is a comparison chart of the results.
The registration method between the optical image and the infrared image of the circuit board can be designed into embedded software or non-embedded software when in application, but the registration device between the optical image and the infrared image of the circuit board can be designed independently.
The registration device comprises an acquisition module, an entropy region distinguishing module, a construction module, a characteristic point screening module, a removal module, a characteristic point direction calculation module and a key point matching module.
The acquisition module is used for acquiring an infrared image and a visible light image of the circuit board, taking the visible light image, namely the optical image, as a reference image and taking the infrared image as an image to be matched. The entropy region distinguishing module is used for respectively removing low-entropy regions according to the respective image information entropy of the reference image and the image to be matched, and reserving high-entropy regions for subsequent feature point detection. The construction module is used for constructing a Gaussian scale space for the high-entropy region and establishing an image Gaussian pyramid and a Gaussian differential pyramid. The feature point screening module is used for acquiring extreme points in different scale spaces in the Gaussian differential pyramid by using a FAST+SIFT combination algorithm, and accurately positioning and screening the feature points according to the extreme points. The removing module is used for screening and removing unstable points by adopting a threshold method and a Hessian matrix method, and comprises points with low contrast and points positioned at the edge of an image. The characteristic point direction calculation module is used for calculating and determining the characteristic point direction and constructing a key point 64-dimensional annular descriptor. The key point matching module is used for carrying out key point matching by using the Euclidean distance and cosine similarity between vectors as measurement indexes and applying a quick approximate nearest neighbor search FLANN, and eliminating mismatching by using a RANSAC random sampling consistency algorithm.
The image entropy is an estimate of how "busy" an image is, expressed as the average number of bits in the image gray level set, and also describes the average information content of the image source. The entropy of an image is a statistical form of characteristics, which reflects the quantity of average information in the image, and represents the aggregation characteristics of gray distribution of the image, and the larger the entropy of the image information is, the more characteristic points with high contrast and high quality are indicated, and vice versa.
The method first traverses the visible light image and the infrared image of the circuit board with non-overlapping sliding windows, divides them into window areas, and calculates the information entropy of each divided window area. A threshold is then set according to the information entropy of the local image areas: a suitable threshold is chosen from the histogram formed by the acquired entropy values so that the local image areas with high information entropy are retained while the low-entropy image areas are removed, and feature points are extracted from the retained image areas with the improved SIFT algorithm.
The specific flow of establishing the Gaussian scale space is as follows. After the visible light and infrared images are converted to gray scale, each is doubled in size and used as layer 1 of group 1 of the Gaussian pyramid, located at the bottom of the pyramid, with sampling proceeding upward in turn. The layer-1 image of group 1 after Gaussian convolution serves as layer 2 of the group-1 pyramid, the Gaussian convolution function being G(x, y, σ) = (1/(2πσ²))·exp(−(x²+y²)/(2σ²)). σ is then multiplied by the proportionality coefficient k to obtain a new smoothing factor σ′ = kσ, with which the group-1 layer-2 image is smoothed; the resulting image is taken as group 1 layer 3, and the operation is repeated to obtain all L layers of group 1. For group 2, the third-from-last image of group 1 is downsampled by the scale factor 2 and the obtained image is taken as group 2 layer 1, which is then smoothed with the smoothing factor σ to obtain group 2 layer 2, and so on. Images within the same group have the same size but different smoothing scales, the corresponding smoothing coefficients being σ, kσ, k²σ, …, k^(L−1)σ respectively.
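The octave/layer construction above can be sketched as follows; the initial image-doubling step is omitted, and each layer is blurred directly from the octave base rather than incrementally, which is a simplification of the described flow:

```python
import numpy as np

def gauss_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma, normalized to sum 1."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian convolution with reflect padding, standing
    in for G(x, y, sigma) * I(x, y)."""
    k = gauss_kernel(sigma)
    r = len(k) // 2
    pad = np.pad(img, r, mode='reflect')
    tmp = np.apply_along_axis(lambda row: np.convolve(row, k, 'valid'), 1, pad)
    return np.apply_along_axis(lambda col: np.convolve(col, k, 'valid'), 0, tmp)

def build_pyramid(img, octaves=3, layers=5, sigma0=1.6):
    """Gaussian and DOG pyramids: within an octave the scale grows by
    the proportionality coefficient k; each new octave starts from the
    third-from-last layer downsampled by 2.  `octaves`, `layers` and
    `sigma0` are illustrative parameters."""
    k = 2 ** (1.0 / (layers - 2))
    gauss, dog = [], []
    base = img.astype(float)
    for _ in range(octaves):
        octave = [blur(base, sigma0 * k ** i) for i in range(layers)]
        gauss.append(octave)
        dog.append([octave[i + 1] - octave[i] for i in range(layers - 1)])
        base = octave[-3][::2, ::2]  # downsample third-from-last layer
    return gauss, dog
```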
A Gaussian differential pyramid is constructed, and feature point detection and accurate positioning are carried out. Specifically, on the basis of the Gaussian pyramid of the image constructed in the previous step, the difference-of-Gaussian (DOG) pyramid is obtained by subtracting adjacent layers within each group, and subsequent SIFT feature point extraction is performed on the DOG pyramid: layer 1 of group 1 of the DOG pyramid is obtained by subtracting layer 1 of group 1 from layer 2 of group 1 of the Gaussian pyramid, and this step is repeated to form the complete Gaussian differential pyramid. The Gaussian differential scale space is then simplified by removing the layer-1 scale space of group 1, and extreme points are detected in the simplified Gaussian differential scale space. To judge whether a point K in a given layer of the image is a feature point, a circle is drawn with K as the center and a radius of 3 pixels, giving 16 pixel points on the circular arc. These 16 pixels are compared with the intensity I_K of the center point to be measured: K is judged to be a feature point if at least 12 consecutive pixels among the 16 arranged on the circumference are all greater than I_K + t or all smaller than I_K − t, where t is a given threshold. To reduce feature point detection time, for each point only the pixels at positions 1, 5, 9 and 13 (the four positions at 90° intervals: up, down, left and right) are examined first; if at least 3 of these 4 points satisfy the condition, the full 16-pixel neighborhood test is applied to the point; otherwise the point is judged to be a non-feature point and is rejected directly.
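The FAST segment test with its four-point quick rejection can be sketched as follows; the threshold t = 10 is an assumed example value, and the circle offsets follow the standard radius-3 Bresenham circle used by FAST.

```python
import numpy as np

# Radius-3 Bresenham circle: 16 (dy, dx) offsets, positions 1..16 clockwise from top
CIRCLE = [(-3, 0), (-3, 1), (-2, 2), (-1, 3), (0, 3), (1, 3), (2, 2), (3, 1),
          (3, 0), (3, -1), (2, -2), (1, -3), (0, -3), (-1, -3), (-2, -2), (-3, -1)]

def is_fast_corner(img, y, x, t=10, n=12):
    """Segment test: K = (y, x) is a feature point if at least n contiguous
    circle pixels are all brighter than I_K + t or all darker than I_K - t."""
    center = int(img[y, x])
    # Quick rejection using positions 1, 5, 9, 13 (indices 0, 4, 8, 12)
    quick = [int(img[y + dy, x + dx]) for dy, dx in
             (CIRCLE[0], CIRCLE[4], CIRCLE[8], CIRCLE[12])]
    brighter = sum(v > center + t for v in quick)
    darker = sum(v < center - t for v in quick)
    if brighter < 3 and darker < 3:
        return False                   # fewer than 3 of 4 pass: reject directly
    ring = [int(img[y + dy, x + dx]) for dy, dx in CIRCLE]
    for sign in (+1, -1):              # +1: darker arc test, -1: brighter arc test
        flags = [sign * (center - v) > t for v in ring]
        run = 0
        for f in flags + flags:        # list doubled to catch wrap-around runs
            run = run + 1 if f else 0
            if run >= n:
                return True
    return False
```

An isolated bright pixel on a dark background passes (all 16 ring pixels are darker than I_K − t), while a flat patch is rejected by the quick test.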
The specific process for removing low-contrast points and points on image edges is as follows. To locate extreme points accurately to sub-pixel level, a three-dimensional quadratic function is fitted by substituting the Taylor expansion of the DOG function D and keeping terms up to second order: D(X) = D + (∂D/∂X)^T X + (1/2) X^T (∂²D/∂X²) X, where X = (x, y, σ)^T. A first contrast threshold is preset and the contrast of each extreme point is compared with it; extreme points whose contrast is greater than the first contrast threshold are taken as feature points to be selected. A second contrast threshold, greater than the first, is also preset, and extreme points whose contrast is greater than the second contrast threshold continue to be stored as feature points to be selected. Because the DOG operator produces a strong edge response, some unstable edge response points must also be removed. The Hessian matrix of each feature point to be selected is acquired: H = [[Dxx, Dxy], [Dxy, Dyy]], where the D values are obtained from differences between adjacent pixel points, and the eigenvalues of H are proportional to the principal curvatures of D. The unstable edge response points are then removed; thus the two kinds of unstable points are eliminated by setting the contrast thresholds and examining the Hessian matrix.
In this embodiment, the Hessian matrix H(x, y) of the feature points to be selected is obtained as H(x, y) = [[Dxx(x, y), Dxy(x, y)], [Dxy(x, y), Dyy(x, y)]].
Tr(H(x, y)) = Dxx(x, y) + Dyy(x, y) represents the sum of the eigenvalues (the trace) of the matrix H(x, y), and Det(H(x, y)) = Dxx(x, y)Dyy(x, y) − (Dxy(x, y))² represents the determinant of the matrix H(x, y), where the values of Dxx(x, y), Dxy(x, y) and Dyy(x, y) are obtained by differencing the corresponding positions in the neighborhood of the candidate point, and the eigenvalues of H(x, y) are proportional to the principal curvatures of D. Let γ = α/β represent the ratio of the maximum eigenvalue α to the minimum eigenvalue β of H(x, y); then Tr(H(x, y))²/Det(H(x, y)) = (α + β)²/(αβ) = (γ + 1)²/γ. To detect whether the principal-curvature ratio is below a certain threshold T2, it is only necessary to check whether Tr(H(x, y))²/Det(H(x, y)) ≥ (T2 + 1)²/T2. If this inequality holds, the feature point is rejected; otherwise it is retained.
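The edge-response check above reduces to one inequality on the trace and determinant; the value T2 = 10 below is an assumed example (the common SIFT default), not a value fixed by the patent.

```python
def passes_edge_test(dxx, dyy, dxy, t2=10.0):
    """Keep a candidate point only if Tr(H)^2 / Det(H) < (T2 + 1)^2 / T2,
    i.e. the ratio of principal curvatures is below T2 (not edge-like)."""
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:                  # curvatures of opposite sign: unstable, reject
        return False
    return tr * tr / det < (t2 + 1) ** 2 / t2
```

An isotropic blob (Dxx ≈ Dyy) passes, while a strong edge (one curvature much larger than the other) is rejected.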
The direction of the key points is then calculated by collecting the gradient and direction distribution characteristics of the pixels within a 3σ neighborhood window of the Gaussian pyramid image in which the key points were detected in the DOG pyramid. The modulus and direction of the gradient are as follows:
m(x, y) = sqrt((L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²)
θ(x, y) = tan⁻¹((L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)))
Wherein L (x, y) is a scale space value at (x, y) where the key point is located, L (x+1, y) is a scale space value at (x+1, y) where the key point is located, L (x-1, y) is a scale space value at (x-1, y) where the key point is located, L (x, y+1) is a scale space value at (x, y+1) where the key point is located, L (x, y-1) is a scale space value at (x, y-1) where the key point is located, m (x, y) is a gradient modulus value, and θ (x, y) is a gradient direction.
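The two formulas above can be checked with a direct central-difference implementation; atan2 is used here so that the direction is resolved over the full 0–360° range.

```python
import numpy as np

def gradient_mag_ori(L, y, x):
    """m(x, y) and theta(x, y) at one pixel of the smoothed image L,
    via central differences on the four axis-aligned neighbors."""
    dx = float(L[y, x + 1]) - float(L[y, x - 1])   # L(x+1, y) - L(x-1, y)
    dy = float(L[y + 1, x]) - float(L[y - 1, x])   # L(x, y+1) - L(x, y-1)
    m = np.hypot(dx, dy)                           # gradient modulus
    theta = np.degrees(np.arctan2(dy, dx)) % 360.0 # gradient direction in degrees
    return m, theta
```

On a horizontal intensity ramp the gradient points along x with modulus equal to twice the per-pixel step.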
A direction parameter is assigned to each key point by using the gradient direction distribution characteristics of the key point's neighborhood pixels, so that the operator has rotation invariance. A gradient histogram statistical method is adopted: taking the key point as the origin, the gradients and directions of the image pixels within the 3σ neighborhood window of the Gaussian pyramid image are accumulated in a histogram to determine the direction of the key point. The gradient histogram divides the direction range of 0 to 360 degrees into 36 bins of 10 degrees each. The peak direction of the histogram represents the main direction of the key point; the contribution of a neighborhood pixel to the histogram decreases with its distance from the center point, and the histogram is smoothed with a Gaussian function to reduce the influence of abrupt changes. When another peak equivalent to at least 80% of the energy of the main peak exists in the gradient direction histogram, that direction is regarded as an auxiliary direction of the feature point; to enhance matching robustness, a feature point may thus be assigned multiple directions, one main direction and one or more auxiliary directions.
The key point descriptors are constructed, with an annular descriptor used to form the feature vectors; because a ring is rotation invariant, the main direction of the feature point does not need to be determined. Taking the key point as the center, a circular window of radius 13 is used as the neighborhood range of the feature point, and the neighborhood is divided into 8 concentric rings, i.e., 8 sub-areas, with radii of 2, 3, 4, 5, 6, 8, 10 and 13 pixels respectively. On each annular sub-area, the pixel gradients and directions of all pixel points are accumulated into 8 direction bins (one bin every 45 degrees), giving a total of 8 × 8 = 64 dimensions. The feature vectors are ordered and weighted by a Gaussian window, and normalization is applied to the feature vectors in order to reduce the negative impact of illumination changes on the matching result.
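The 8-ring, 8-direction accumulation can be sketched as follows. This is a simplified illustration: the Gaussian window weighting mentioned in the text is omitted, and only the ring partition and L2 normalization are shown.

```python
import numpy as np

RING_RADII = [2, 3, 4, 5, 6, 8, 10, 13]   # 8 concentric rings, outer radius 13

def annular_descriptor(mag, theta, cy, cx):
    """64-D descriptor at keypoint (cy, cx): for each of 8 rings, accumulate
    gradient magnitude into 8 direction bins of 45 degrees each, then
    L2-normalize to reduce the effect of illumination changes."""
    desc = np.zeros((8, 8))                # rows: rings, cols: direction bins
    for dy in range(-13, 14):
        for dx in range(-13, 14):
            r = np.hypot(dy, dx)
            if r > 13 or (dy == 0 and dx == 0):
                continue                   # outside the window / center pixel
            ring = next(i for i, R in enumerate(RING_RADII) if r <= R)
            d = int(theta[cy + dy, cx + dx] // 45) % 8
            desc[ring, d] += mag[cy + dy, cx + dx]
    v = desc.ravel()                       # 8 x 8 = 64 dimensions
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

The result is a unit-length 64-dimensional vector, matching the descriptor size stated in the text.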
Key point matching is performed with the FLANN algorithm: after the SIFT 64-dimensional feature vectors of the two images are generated, the Euclidean distance and the cosine similarity between the feature vectors of the two images are calculated as similarity measures. For each feature point, the two feature points with the smallest Euclidean distance are found in the reference image, called the nearest neighbor and the next nearest neighbor. If the distance to the nearest neighbor divided by the distance to the next nearest neighbor is smaller than a preset ratio threshold, and the cosine similarity is higher than a given threshold, the pair of feature points is considered successfully matched; otherwise the match is considered failed, i.e., there is no matching point. The matching points in the reference image and the image to be registered are then connected by lines to realize image registration. After the initial matching, some mismatches may remain in the images; to eliminate them, the RANSAC algorithm is adopted to remove mismatched point pairs and realize fine matching of the images.
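The ratio test combined with the cosine-similarity gate can be sketched as follows. This brute-force NumPy version stands in for FLANN purely for clarity; the ratio 0.7 and cosine threshold 0.9 are assumed example values, and the subsequent RANSAC rejection (e.g. via a homography fit) is not shown.

```python
import numpy as np

def match_keypoints(desc_a, desc_b, ratio=0.7, cos_thresh=0.9):
    """Nearest / next-nearest Euclidean ratio test plus a cosine-similarity
    gate. Descriptors are L2-normalized rows; returns index pairs (i, j)."""
    matches = []
    for i, d in enumerate(desc_a):
        dist = np.linalg.norm(desc_b - d, axis=1)   # Euclidean distances
        j, j2 = np.argsort(dist)[:2]                # nearest and next nearest
        cos_sim = float(desc_b[j] @ d)              # unit vectors: dot = cosine
        if dist[j] < ratio * dist[j2] and cos_sim > cos_thresh:
            matches.append((i, j))
    return matches
```

A query descriptor identical to one reference descriptor matches it, while a query equidistant from two references fails the ratio test and is discarded as ambiguous.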
Compared with the prior art, the invention has the following beneficial effects:
1. Block-wise statistical information entropy of image areas is adopted, and the high-entropy information areas are extracted as the detection target image according to the threshold, so that the accuracy of the matching point pairs is improved; the matching accuracy is higher than that of the traditional SIFT algorithm.
2. By using the combined SIFT and FAST method, the problems of low efficiency and weak response strength of the traditional algorithm when extracting feature points are solved, the accuracy of the matching point pairs is improved, and the matching accuracy is higher than that of the traditional SIFT algorithm.
3. The improved SIFT feature point annular descriptor is adopted, and the overall operation speed of the algorithm is improved on the premise of ensuring the registration quality.
The invention selects visible light and infrared images as experimental data and compares the method with the traditional SIFT algorithm; compared with the traditional algorithm, both the registration efficiency and the accuracy are significantly improved. The method has wide application prospects in image fusion, remote sensing image processing, computer vision, and power equipment diagnosis.
The technical features of the above-described embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this description.
The above examples illustrate only a few embodiments of the invention, which are described in detail and are not to be construed as limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.