Disclosure of Invention
In order to solve the defects in the prior art, the invention provides a characteristic matching method, a system, a medium, a product and equipment based on K clustering, which adopts a dynamic image pyramid to extract characteristic points, adaptively adjusts the layer number of the image pyramid and the sampling scale of each layer aiming at different scenes and image contents, clusters the characteristic descriptors through the similarity of the characteristic descriptors to reduce the search space, performs geometric inspection on the screened matching point pairs, eliminates mismatching, and realizes efficient and accurate characteristic matching.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
in a first aspect, the invention provides a feature matching method based on K clustering.
A characteristic matching method based on K clustering comprises the following steps:
performing edge extraction on the preprocessed image to obtain an edge image;
extracting feature points on edge images of different scales through dynamic image pyramid self-adaption, and generating feature descriptors corresponding to the feature points;
K-means clustering is carried out on the feature descriptors, the feature descriptors are divided into K clusters according to similarity, feature points are matched in each cluster, euclidean distance between the feature descriptors is calculated, and optimal matching point pairs are obtained through screening;
And (3) carrying out secondary screening on the screened matching point pairs, and removing the error matching points inconsistent in the overall situation to obtain a final characteristic matching result.
As a further definition of the first aspect of the present invention, the preprocessing includes image denoising processing and image enhancement processing;
performing edge extraction on the preprocessed image, including:
calculating gradients of the image in horizontal and vertical directions by using a Sobel operator, performing non-maximum suppression and double-threshold screening on the basis of calculating the gradients by using an improved Canny algorithm, removing insignificant edges to obtain a preliminary edge image, and setting two thresholds for the preliminary edge imageAndObtaining the pixel value of the final edge imageIs when (when)Or when G is connected to a strong edge pixel,Is a number of 1, and is not limited by the specification,In the time-course of which the first and second contact surfaces,Is 0, wherein,Is the gradient magnitude.
As a further limitation of the first aspect of the present invention, extracting feature points on edge images of different scales by dynamic image pyramid adaptation includes:
acquiring image complexity, increasing the number of layers of the pyramid when the image complexity is larger than a set threshold value, and reducing the number of layers of the pyramid when the image complexity is lower than or equal to the set threshold value;
and increasing the scaling when the feature area included in the edge image is larger than a set threshold value, and reducing the scaling for the area containing complex or fine features.
As a further limitation of the first aspect of the present invention, the extraction result of the feature points in the current image layer tends to be stable, and no new significant features appear at a smaller scale, so that pyramid processing is stopped in advance;
Analyzing gradient distribution conditions of the image on different scales, judging density change of the feature points, and stopping pyramid processing in advance if the distribution of the feature points tends to be sparse under smaller scales.
As a further limitation of the first aspect of the present invention, the extracted feature descriptors are clustered, and the feature descriptors are divided into K clusters according to their similarity;
For each cluster, a corresponding cluster with the closest feature description is found in the first image and the second image, the feature points in the cluster K1 form a cluster in the first image, and a cluster is formed in the second image, so that the feature points in the corresponding cluster K1 are matched without traversing all the feature points, and in each pair of corresponding clusters, the feature matching is performed by using the similarity measurement between feature descriptors, and only the feature points in the same cluster are considered without cross-cluster matching.
As a further limitation of the first aspect of the present invention, the RANSAC algorithm is used to perform geometric verification on the screened matching point pairs, and mismatching points that are locally matched but not globally consistent with geometric consistency are removed.
In a second aspect, the invention provides a feature matching system based on K clustering.
A K-cluster based feature matching system, comprising:
The image preprocessing unit is configured to perform edge extraction on the preprocessed image to obtain an edge image;
The dynamic feature extraction unit is configured to extract feature points on edge images of different scales through dynamic image pyramid self-adaption and generate feature descriptors corresponding to the feature points;
the clustering unit is configured to perform K-means clustering on the feature descriptors, divide the feature descriptors into K clusters according to similarity, match feature points in each cluster, and screen to obtain optimal matching point pairs by calculating Euclidean distances among the feature descriptors;
And the secondary screening unit is configured to perform secondary screening on the screened matching point pairs, and remove the mismatching points inconsistent in the overall situation to obtain a final characteristic matching result.
In a third aspect, the present invention provides a computer device comprising a processor and a computer readable storage medium;
a processor adapted to execute a computer program;
a computer readable storage medium having stored therein a computer program which, when executed by the processor, implements the K-cluster based feature matching method according to the first aspect of the present invention.
In a fourth aspect, the present invention provides a computer readable storage medium storing a computer program adapted to be loaded by a processor and to perform the K-cluster based feature matching method according to the first aspect of the present invention.
In a fifth aspect, the present invention provides a computer program product comprising a computer program which, when executed by a processor, implements the K-cluster based feature matching method according to the first aspect of the present invention.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention innovatively provides a dynamic image pyramid processing method, which is an improved multi-scale image processing technology, the number of layers of an image pyramid and the sampling scale of each layer can be adaptively adjusted according to different scenes and image contents, and more abundant characteristic points can be obtained in a complex scene by adaptively adjusting the number of layers and the proportion through analyzing the image contents.
2. The method creatively provides a characteristic matching method based on K-means clustering, which is characterized in that the K-means clustering is carried out on descriptors with higher similarity, the descriptors are divided into different clusters, the calculation amount in the subsequent matching is reduced by reducing the number of characteristic points of each cluster, then the characteristic matching is carried out in the clusters, the screened characteristic points are subjected to secondary screening by utilizing RANSAN algorithm, the searching range is greatly reduced, the mismatching is eliminated, and the matching accuracy is improved.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Example 1:
The conventional feature matching algorithm has a certain limitation in terms of instantaneity and precision, and therefore, the implementation manner provides a feature matching method based on K-means clustering, as shown in fig. 1, which comprises the following processes:
Preprocessing an input original image, including denoising and image enhancement, and then extracting edges to generate an edge image;
the dynamic image pyramid is constructed, so that characteristic points can be adaptively extracted on images with different scales according to different scenes and image contents, and corresponding characteristic descriptors are generated;
k-means clustering is carried out on the extracted feature descriptors, K cluster centers are initialized, and the feature descriptors are divided into corresponding K nearest clusters according to the similarity (calculated distance) of the feature descriptors;
Updating the cluster center, ending the algorithm when the iteration times are reached or the convergence condition is reached, continuing iteration when the iteration times are not reached or the convergence condition is not reached, and obtaining the characteristic point clusters in each image frame after the algorithm is ended;
Searching and matching the feature points in each corresponding cluster, and screening to obtain an optimal matching point pair for accurate matching by calculating Euclidean distance between feature descriptors;
And (3) carrying out secondary screening verification on the screened matching point pairs by using a RANSAC algorithm (random sampling consistency algorithm), removing the error matching points which are inconsistent in the whole, and further improving the matching accuracy.
In the implementation mode, K clustering (K-means clustering) is an unsupervised machine learning algorithm which is widely used for data classification and cluster analysis, can effectively group data and reduce calculation complexity, and has the core targets of dividing the data into K clusters (clusters), wherein each cluster consists of similar data points, the data points are divided into nearest cluster centers (centroids) through an iterative optimization method, and finally, the data points in the clusters are similar to each other, and the difference of the data points among the clusters is larger.
In the implementation mode, in order to improve the accuracy of feature extraction and matching, the algorithm is ensured to work stably under various complex conditions. The original image often contains noise, uneven illumination and other problems, which may lead to inaccurate or lost feature point extraction. Interference in the image can be reduced through denoising processing, key details are reserved, and feature points are ensured to be clearer; the image enhancement is to enhance the contrast and highlight the details so that the characteristic points can still be effectively detected by the images with different brightness and contrast. The edge detection can highlight the structural information of the edges and the outlines in the image, so that the feature points are easier to distribute around the areas, the effectiveness of the feature points and the rationality of distribution are improved, the image quality is improved through the preprocessing steps, the follow-up feature extraction and matching are more reliable and accurate, and the preprocessing process is shown in fig. 2.
The denoising process specifically comprises the following steps:
carrying out Gaussian filtering on an input image, carrying out smoothing processing on the image by applying a Gaussian filter, removing high-frequency noise, adjusting the kernel size of the filter, dynamically selecting a proper smoothing level according to the noise intensity, and expressing the two-dimensional form of the Gaussian filter as:
(1);
Wherein,AndIs the coordinates of the pixels in the image,Is the standard deviation of the gaussian kernel, and the filtering process is to convolve this gaussian function with the image:
(2);
Wherein,Is the pixel value of the original image,The size of the filter kernel is represented, u and v respectively represent the coordinate offset values of the pixel coordinates x and y in the Gaussian filter window relative to the current pixel, the value ranges are between-k and k,Representing the convolution result;
After Gaussian filtering, a bilateral filter is applied, edge details are reserved when the bilateral filtering is used for denoising, and edge blurring is avoided by combining the difference of the spatial distance and the pixel intensity, and the bilateral filtering formula is as follows:
(3);
Wherein,AndIs the location of a pixel in the image,Is a neighborhood around the pixel and,Is a spatial distance weight (typically a gaussian function),Is the pixel intensity difference weight,Is a normalization factor, ensuring that the sum of the weights is 1.
In this implementation manner, the image enhancement processing specifically includes the following steps:
Calculating gray distribution of the denoised image, generating a cumulative histogram, applying histogram equalization to enhance the contrast of bright and dark areas of the image, and adjusting the brightness and contrast of the whole image:
(4);
Wherein: Is the pixel value after the enhancement and,Is the number of gray levels of the image (typically 256),Is the total number of pixels in the image,Is a pixel value ofIs a frequency of (2);
And then, on the basis of histogram equalization, local self-adaptive equalization is applied, so that local contrast is further improved, enhancement is performed in a local area, excessive enhancement is avoided, and local details of an image are protected.
After the first two steps, the image at the moment can be used for extracting the feature points, and in order to improve the accuracy and stability of feature extraction, edge detection of the image is performed again so as to extract the features of the image better.
Firstly, calculating gradients in the horizontal direction and the vertical direction of an image by using a Sobel operator, generating an edge map according to gradient information, and calculating the gradients in the horizontal direction and the vertical direction of the image by using the Sobel operator, wherein the formula is as follows:
(5);
(6);
Wherein,AndRepresenting the gradient in the horizontal direction and the vertical direction respectively,Representing convolution operations, gradient magnitudesThe calculation formula is as follows:
(7);
and (3) performing non-maximum suppression and double-threshold screening on the basis of calculating the gradient by using an improved Canny algorithm, removing insignificant edges, and performing edge tracking after edge detection is completed to ensure the continuity of the edges. By setting two thresholdsAndThe gradient amplitude is calculatedThe classification is strong, weak and non-edge. Pixel values of final output edge imageIs determined by the following conditions whenOr when G is connected to a strong edge pixel,Is a number of 1, and is not limited by the specification,In the time-course of which the first and second contact surfaces,Is 0, wherein,Is the gradient magnitude.
The fixed scaling of the conventional image pyramid (as shown in fig. 3) may result in that suitable feature points cannot be fully extracted in some scenes, and the global unified scaling manner cannot adapt to the feature complexity of different areas in the image. For complex detail regions, the scaling may result in loss of fine features, while in simple regions, excessive refinement may be caused, and unnecessary computation is increased, and the conventional pyramid generally uses the same layer number in all image regions, for example, the layer number is uniformly used as 8 in ORB-SLAM2, so that the layer number cannot be adjusted according to the complexity of different regions of the image, for simple regions, redundant feature extraction and computation may be caused, and for complex regions, the layer number is insufficient and all important feature information may not be captured.
In view of the above problems, the present implementation manner proposes a dynamic pyramid processing method, as shown in fig. 4, which is an improved multi-scale image processing technology, and aims to adaptively adjust the number of layers of an image pyramid and the sampling scale of each layer, and adaptively select the number of pyramid layers and the sampling scale of each layer according to different scenes and image contents, such as the quality, texture complexity and illumination condition of an image, so that more abundant feature points can be obtained in complex scenes.
1) The method for processing the dynamic pyramid can adaptively adjust the layer number of the pyramid according to the complexity of the image content and the distribution condition of the feature points, and particularly dynamically adjust the layer level of the pyramid through analysis of the local complexity (such as gradient change, texture concentration and the like) of the image;
For complex image scenes (i.e., images with complexity greater than a set threshold, such as texture-dense), the number of pyramid layers is increased so that features are extracted at more scales to ensure that fine features in the complex scene are not ignored, and for simple image scenes (i.e., areas with complexity less than or equal to a set threshold, such as flat or simpler geometry), the number of pyramid layers is reduced, avoiding unnecessary feature extraction and computation, thereby improving efficiency.
2) The adaptive scaling is that the dynamic pyramid process selects the most suitable scaling according to the characteristics of the local area of the image, the scaling is not globally fixed but can be flexibly changed for different areas, the scaling can be larger for those areas containing large-size features (i.e. feature areas are larger than a set threshold, such as the outline or structure of a large object), redundancy of detail capture is reduced, meanwhile, the characteristics in a large range can be effectively identified, and smaller scaling is used for areas containing complex and fine features, so that the key features are captured in a finer scale.
3) The pyramid stopping condition based on the content comprises the steps of dynamically judging whether scaling is needed to be continued according to the image content by a dynamic pyramid method to avoid unnecessary processing, terminating processing when the situation that the pyramid is encountered, stopping pyramid processing in advance if the extraction result of feature points in a current image layer tends to be stable and no new significant features appear at smaller scales, saving calculation resources, judging density change of the feature points by analyzing gradient distribution conditions of the image at different scales, and eliminating the need of continuously shrinking the image scale if the distribution of the feature points tends to be sparse at the smaller scales.
In this implementation manner, the feature matching is performed by using K-means clustering, specifically, as shown in fig. 5, including:
1) Clustering the extracted feature descriptors, dividing the feature descriptors into K clusters according to the similarity, and when a convergence condition is reached, considering that the clusters are converged, namely the descriptors are divided into K clusters according to the similarity, terminating the algorithm, wherein the convergence condition has the following formula:
(8);
wherein, the term "euclidean distance" refers to the vector,Is a preset convergence threshold, t represents the current iteration round, t-1 represents the last iteration round,Representing feature descriptors in the kth cluster of the t iteration round,Representing feature descriptors in the kth cluster of the t-1 iteration round.
This clustering is based on the similarity of descriptors, but the spatial positions of the feature points are unchanged, as shown in fig. 6 and 7 (where x and y represent the pixel abscissa and the pixel ordinate in the image, respectively), after the feature points of the two images are clustered, K clusters in each image are obtained (fig. 7 only shows 3 clusters to illustrate, and only includes K1, K2, and K3), and the feature points in each cluster represent feature points similar in some feature description.
2) For each cluster, the corresponding cluster with the closest feature description is found in image a (i.e., the first image) and image B (i.e., the second image). For example, the feature points in the cluster K1 form a cluster in the image A and similar clusters in the image B, so that the matching of the feature points is only needed in the corresponding cluster K1 without traversing all the feature points;
3) And (3) performing geometric verification on the screened matching point pairs by using a RANSAC algorithm, removing mismatching points which are seemingly matched locally but not in accordance with geometric consistency globally, ensuring that the final matching result has global consistency, and further improving the matching accuracy.
Example 2:
as shown in fig. 8, the present implementation provides a feature matching system based on K clustering, including:
The image preprocessing unit is configured to perform edge extraction on the preprocessed image to obtain an edge image;
The dynamic feature extraction unit is configured to extract feature points on edge images of different scales through dynamic image pyramid self-adaption and generate feature descriptors corresponding to the feature points;
the clustering unit is configured to perform K-means clustering on the feature descriptors, divide the feature descriptors into K clusters according to similarity, match feature points in each cluster, and screen to obtain optimal matching point pairs by calculating Euclidean distances among the feature descriptors;
And the secondary screening unit is configured to perform secondary screening on the screened matching point pairs, and remove the mismatching points inconsistent in the overall situation to obtain a final characteristic matching result.
The specific operation method of each unit is described in embodiment 1, and will not be described here.
It will be understood that each of the above units may be combined into one or several other units separately or in total, or some (some) of the units may be further split into a plurality of units with smaller functions, which may achieve the same operation without affecting the implementation of the technical effects of the embodiments of the present application. The above units are divided based on logic functions, and in practical applications, the functions of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the application, the system may also include other units, and in actual practice, these functions may be aided by other units and may be implemented by multiple units in concert.
According to another embodiment of the present application, the system described in this embodiment may be constructed by running a computer program (including a program code) capable of executing the steps involved in the corresponding method described in embodiment 1 on a general-purpose computing device such as a computer including a central processing unit (Central Processing Unit, CPU), an access storage medium (Random Access Memory, RAM), a Read Only Memory (ROM), and the like, and the storage element, and the method of embodiment 1 of the present application may be implemented, and the computer program may be recorded on, for example, a computer-readable recording medium, and loaded and run in the above-described computing device via the computer-readable recording medium.
Example 3:
As shown in fig. 9, the present implementation provides an electronic device that includes a processor 1001, a communication interface 1002, and a computer-readable storage medium 1003. Wherein the processor 1001, the communication interface 1002, and the computer-readable storage medium 1003 may be connected by a bus or other means.
Wherein the communication interface 1002 is for receiving and transmitting data, the computer readable storage medium 1003 may be stored in a memory of the electronic device, the computer readable storage medium 1003 is for storing a computer program comprising program instructions, and the processor 1001 is for executing the program instructions stored in the computer readable storage medium 1003.
The processor 1001, or CPU (Central Processing Unit )), is a computing core and a control core of the electronic device, which are adapted to implement one or more instructions, in particular to load and execute one or more instructions to implement a corresponding method flow or a corresponding function.
The processor 1001 is configured to perform the following:
performing edge extraction on the preprocessed image to obtain an edge image;
extracting feature points on edge images of different scales through dynamic image pyramid self-adaption, and generating feature descriptors corresponding to the feature points;
K-means clustering is carried out on the feature descriptors, the feature descriptors are divided into K clusters according to similarity, feature points are matched in each cluster, euclidean distance between the feature descriptors is calculated, and optimal matching point pairs are obtained through screening;
And (3) carrying out secondary screening on the screened matching point pairs, and removing the error matching points inconsistent in the overall situation to obtain a final characteristic matching result.
The specific working method is described in embodiment 1, and is not described here again.
Example 4:
The present implementation provides a computer-readable storage medium (Memory) that is a Memory device in an electronic device for storing programs and data. It is understood that the computer readable storage medium herein may include both built-in storage media in an electronic device and extended storage media supported by the electronic device. The computer readable storage medium provides a memory space that stores a processing system of the electronic device.
Also stored in the memory space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor. It should be noted that the computer readable storage medium may be a high speed RAM memory, or may be a non-volatile memory, such as at least one magnetic disk memory, or may alternatively be at least one computer readable storage medium located remotely from the foregoing processor.
In one embodiment, the computer readable storage medium has one or more instructions stored therein, and the processor loads and executes the one or more instructions stored in the computer readable storage medium to implement the following:
performing edge extraction on the preprocessed image to obtain an edge image;
extracting feature points on edge images of different scales through dynamic image pyramid self-adaption, and generating feature descriptors corresponding to the feature points;
K-means clustering is carried out on the feature descriptors, the feature descriptors are divided into K clusters according to similarity, feature points are matched in each cluster, euclidean distance between the feature descriptors is calculated, and optimal matching point pairs are obtained through screening;
And (3) carrying out secondary screening on the screened matching point pairs, and removing the error matching points inconsistent in the overall situation to obtain a final characteristic matching result.
The specific working method is described in embodiment 1, and is not described here again.
Example 5:
The present implementation provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the electronic device to perform the following:
performing edge extraction on the preprocessed image to obtain an edge image;
extracting feature points on edge images of different scales through dynamic image pyramid self-adaption, and generating feature descriptors corresponding to the feature points;
K-means clustering is carried out on the feature descriptors, the feature descriptors are divided into K clusters according to similarity, feature points are matched in each cluster, euclidean distance between the feature descriptors is calculated, and optimal matching point pairs are obtained through screening;
And (3) carrying out secondary screening on the screened matching point pairs, and removing the error matching points inconsistent in the overall situation to obtain a final characteristic matching result.
The specific working method is described in embodiment 1, and is not described here again.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable devices. The computer instructions may be stored in or transmitted across a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital line (DSL)), or wireless (e.g., infrared, wireless, microwave, etc.). Computer readable storage media can be any available media that can be accessed by a computer or data processing device, such as a server, data center, or the like, that contains an integration of one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (Solid STATE DISK, SSD)), or the like.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.