Disclosure of Invention
To solve the problems in the prior art, the invention provides an unsupervised polarized SAR image change detection method based on regional relative peak values and the Vision Transformer (VIT). The technical problem is addressed by the following technical solution:
The invention provides an unsupervised polarized SAR image change detection method based on regional relative peak values and VIT, comprising the following steps:
Step 1, acquiring bi-temporal original polarized SAR images of the same region, and registering them to obtain a registered polarized SAR image pair;
Step 2, extracting the edge information and the regional relative peak values of the polarized SAR image pair;
Step 3, correcting the relative peak value of each polarized SAR image by using its edge information, to obtain the edge-corrected relative peak values of the polarized SAR image pair;
Step 4, performing reconstruction with the edge-corrected relative peak values to obtain a multi-scale super-pixel reconstruction difference map of the polarized SAR image pair;
Step 5, performing double-threshold segmentation and labeling on the multi-scale super-pixel reconstruction difference map to obtain a pseudo-label map;
the pseudo-label map comprises a changed region, an unchanged region, and an unknown region;
Step 6, constructing labeled training samples from the changed and unchanged regions of the pseudo-label map, and constructing unlabeled prediction samples from the unknown region of the pseudo-label map;
Step 7, taking the training samples as the input of a preset VIT network and the changed and unchanged regions of the pseudo-label map as pixel class labels, and iteratively training the preset VIT network until a training cut-off condition is reached, to obtain a trained VIT network;
Step 8, inputting the prediction samples into the trained VIT network to obtain a detection result predicting whether the bi-temporal original polarized SAR images have changed.
Optionally, extracting the edge information of the polarized SAR image pair in step 2 includes:
Step 2-11, calculating the polarization covariance matrices corresponding to all pixels of each polarized SAR image in the polarized SAR image pair;
Step 2-12, extracting the amplitudes of the diagonal elements and the upper-triangular elements of each polarization covariance matrix to obtain the channel images of each polarized SAR image in the polarized SAR image pair;
Step 2-13, calculating the edge value of each channel image with an edge detection method to obtain the multi-channel edge information of each polarized SAR image;
Step 2-14, averaging the multi-channel edge information over the channels to obtain the channel-mean edge information of each polarized SAR image;
Step 2-15, averaging the channel-mean edge information of the two polarized SAR images to obtain the edge information of the polarized SAR image pair.
Optionally, extracting the regional relative peak values of the polarized SAR image pair in step 2 includes:
Step 2-21, calculating the regional relative peak values of each polarized SAR image in the polarized SAR image pair under the assumption that the polarized SAR image obeys the K-Wishart distribution, to obtain a relative peak value set for each polarized SAR image;
Step 2-22, averaging the relative peak value sets to obtain the regional relative peak values of the polarized SAR image pair.
Optionally, step 3 includes:
Step 3-1, performing binarization segmentation on the edge information of the polarized SAR image pair with a threshold binarization method to obtain the threshold corresponding to the edge information of the polarized SAR image pair;
Step 3-2, correcting the relative peak values of the polarized SAR image pair by using the edge information of each pixel in the polarized SAR image pair, to obtain the edge-corrected relative peak values of the polarized SAR image pair.
Optionally, step 4 includes:
Step 4-1, calculating an edge-corrected relative peak difference map from the polarization covariance matrix of each polarized SAR image and the edge-corrected relative peak values;
Step 4-2, performing super-pixel segmentation on the edge-corrected relative peak difference map at a plurality of scales with a super-pixel segmentation method to obtain a super-pixel set;
Step 4-3, computing the mean value and the median value of all pixels of the edge-corrected relative peak difference map within each super-pixel of the super-pixel set, to obtain the mean value set and the median value set of the edge-corrected relative peak difference map over the super-pixel set;
Step 4-4, reconstructing the multi-scale super-pixel reconstruction difference map of the polarized SAR image pair from the edge-corrected relative peak values of the polarized SAR image pair, the edge-corrected relative peak difference map, the mean value set, and the median value set.
Optionally, step 6 includes:
Step 6-1, stacking, over a pixel-by-pixel neighborhood, the values of the polarization covariance matrices at the labeled regions of the pseudo-label map and the difference values of the multi-scale super-pixel reconstruction difference map at the labeled regions of the pseudo-label map, and taking the stacking result as the training samples;
Step 6-2, stacking, over a pixel-by-pixel neighborhood, the values of the polarization covariance matrices at the unknown region of the pseudo-label map and the difference values of the multi-scale super-pixel reconstruction difference map at the unknown region of the pseudo-label map, and taking the stacking result as the prediction samples.
1. The invention constructs the difference map between any two time phases from the edge-corrected regional relative peak values and multi-scale super-pixel segmentation of the polarized SAR images. This avoids two defects of the prior art: the loss of change information caused by measuring the difference between bi-temporal polarized SAR images with only partial polarization features rather than the complete polarization scattering matrix, and the inaccurate extraction of changed regions caused by thresholding the difference map with the global Otsu method. Boundary localization and the retention of change information are thereby improved, which effectively improves the overall accuracy and the intra-region consistency of polarized SAR change detection.
2. The invention uses the VIT network to automatically extract deep semantic features from the multi-temporal polarization data and the difference map for recognition. This avoids the defect that the CNN used in the existing TCD-Net technique captures only short-range context in a polarized SAR image and cannot fully explore the data correlation between different slices, and therefore cannot effectively extract change-aware features from multi-temporal polarized SAR images with complex backscattering characteristics. The accuracy and intra-region consistency of polarized SAR change detection are thereby effectively improved. Experimental results show that the method achieves higher overall accuracy and intra-region consistency in polarized SAR change detection.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but embodiments of the present invention are not limited thereto.
Before describing the present invention, the origin of the idea of the present invention will be described first.
The attention-based Transformer model proposed by Google has been applied in natural language processing, image processing, and other fields, where it has achieved results surpassing classical deep learning networks such as RNNs and CNNs; the Vision Transformer (VIT) in particular is suited to the field of computer vision.
Referring to FIG. 1, the unsupervised polarized SAR image change detection method based on regional relative peak values and VIT provided by the invention comprises the following steps:
Step 1, acquiring bi-temporal original polarized SAR images of the same region, and registering them to obtain a registered polarized SAR image pair;
It should be noted that the polarized SAR image T1 of the first time phase is taken as the reference image, and the polarized SAR image T2 of the second time phase of the same region is registered to it, giving the registered polarized SAR image pair T1', T2'. Both images have size H × L, with H ≥ 200 and L ≥ 200;
Step 2, extracting the edge information and the relative peak values of the polarized SAR image pair;
As an optional embodiment of the present invention, extracting edge information of the polarized SAR image pair in the step 2 includes:
Step 2-11, calculating the polarization covariance matrices corresponding to all pixels of each polarized SAR image in the polarized SAR image pair;
The polarization covariance matrices C1, C2 of all pixels in the registered T1', T2' are calculated, and the amplitudes of their diagonal and upper-triangular elements are extracted to obtain the 6-channel images I1, I2 corresponding to T1', T2'.
In the above step, the polarization covariance matrix C may be expressed as:
$$C = \begin{bmatrix} \langle |S_{hh}|^2 \rangle & \sqrt{2}\,\langle S_{hh} S_{hv}^* \rangle & \langle S_{hh} S_{vv}^* \rangle \\ \sqrt{2}\,\langle S_{hv} S_{hh}^* \rangle & 2\,\langle |S_{hv}|^2 \rangle & \sqrt{2}\,\langle S_{hv} S_{vv}^* \rangle \\ \langle S_{vv} S_{hh}^* \rangle & \sqrt{2}\,\langle S_{vv} S_{hv}^* \rangle & \langle |S_{vv}|^2 \rangle \end{bmatrix}$$
where (·)* denotes the complex conjugate, ⟨·⟩ denotes ensemble averaging, Shh denotes the horizontally transmitted and horizontally received echo data, Shv denotes the horizontally transmitted and vertically received echo data, and Svv denotes the vertically transmitted and vertically received echo data;
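As an illustration of steps 2-11 and 2-12, the following minimal sketch averages the outer product k k^H over a few scattering samples, assuming the common reciprocal-case scattering vector k = [Shh, √2·Shv, Svv], and then takes the amplitudes of the three diagonal and three upper-triangular elements. The function names and the sample values are hypothetical and serve only as an example.

```python
import math

def covariance_matrix(samples):
    """Average k k^H over samples (Shh, Shv, Svv), with k = [Shh, sqrt(2)*Shv, Svv] (assumed convention)."""
    n = len(samples)
    c = [[0j] * 3 for _ in range(3)]
    for shh, shv, svv in samples:
        k = (shh, math.sqrt(2) * shv, svv)
        for i in range(3):
            for j in range(3):
                c[i][j] += k[i] * k[j].conjugate()
    return [[v / n for v in row] for row in c]

def six_channel_amplitudes(c):
    """Amplitudes of the 3 diagonal and 3 upper-triangular elements of a 3x3 covariance matrix."""
    return [abs(c[0][0]), abs(c[1][1]), abs(c[2][2]),
            abs(c[0][1]), abs(c[0][2]), abs(c[1][2])]

# Two illustrative looks for a single pixel (made-up values).
looks = [(1 + 1j, 0.5j, 1 - 1j), (1 - 1j, -0.5j, 1 + 1j)]
C = covariance_matrix(looks)
channels = six_channel_amplitudes(C)
```

Applying `six_channel_amplitudes` at every pixel gives one 6-channel image per time phase.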
Step 2-12, extracting the amplitudes of the diagonal elements and the upper-triangular elements of each polarization covariance matrix to obtain the channel images of each polarized SAR image in the polarized SAR image pair;
Step 2-13, calculating the edge value of each channel image with an edge detection method to obtain the multi-channel edge information of each polarized SAR image;
Step 2-14, averaging the multi-channel edge information over the channels to obtain the channel-mean edge information of each polarized SAR image;
Step 2-15, averaging the channel-mean edge information of the two polarized SAR images to obtain the edge information of the polarized SAR image pair.
It should be noted that the edge value of each channel in I1, I2 is calculated by an edge detection method to obtain the 6-channel edge information E1, E2 corresponding to I1, I2. The edge information of the channels in E1 and in E2 is averaged to obtain the channel-mean edge information of each image, and the two channel-mean edge maps are then averaged to obtain the edge information of the pair T1', T2'. The ROEWA operator, which is suited to polarized SAR images, may be selected as the edge detection method.
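The averaging in steps 2-13 to 2-15 can be sketched as follows. ROEWA itself is not implemented here; a simpler box-window ratio-of-averages detector is swapped in as a stand-in, and all function names and the window size are assumptions made for illustration.

```python
def window_mean(img, r0, r1, c0, c1):
    """Mean of img over rows [r0, r1) and columns [c0, c1), clipped to the image; None if empty."""
    h, w = len(img), len(img[0])
    vals = [img[r][c]
            for r in range(max(r0, 0), min(r1, h))
            for c in range(max(c0, 0), min(c1, w))]
    return sum(vals) / len(vals) if vals else None

def roa_edge_strength(img, half=1):
    """Ratio-of-averages edge strength in [0, 1); a box-window stand-in for ROEWA."""
    h, w = len(img), len(img[0])
    eps = 1e-12
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            pairs = [
                # left window versus right window
                (window_mean(img, y - half, y + half + 1, x - half, x),
                 window_mean(img, y - half, y + half + 1, x + 1, x + half + 1)),
                # top window versus bottom window
                (window_mean(img, y - half, y, x - half, x + half + 1),
                 window_mean(img, y + 1, y + half + 1, x - half, x + half + 1)),
            ]
            strength = 0.0
            for a, b in pairs:
                if a is None or b is None:
                    continue
                ratio = min((a + eps) / (b + eps), (b + eps) / (a + eps))
                strength = max(strength, 1.0 - ratio)
            out[y][x] = strength
    return out

def average_maps(maps):
    """Pixel-wise mean of equally sized 2-D maps."""
    n = len(maps)
    return [[sum(m[y][x] for m in maps) / n for x in range(len(maps[0][0]))]
            for y in range(len(maps[0]))]

def pair_edge_info(channels_1, channels_2):
    """Channel-mean edge map per image, then the mean of the two images."""
    e1 = average_maps([roa_edge_strength(ch) for ch in channels_1])
    e2 = average_maps([roa_edge_strength(ch) for ch in channels_2])
    return average_maps([e1, e2])

step = [[1.0, 1.0, 10.0, 10.0] for _ in range(4)]   # vertical step edge
flat = [[5.0] * 4 for _ in range(4)]                # no edge
edge = pair_edge_info([step] * 6, [flat] * 6)
```

A ratio-based detector is used because speckle in SAR amplitude images is multiplicative, so ratios of local means are less sensitive to brightness than differences.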
As an alternative embodiment of the present invention, extracting the relative peak values of the polarized SAR image pair in step 2 includes:
Step 2-21, calculating the relative peak values of each polarized SAR image in the polarized SAR image pair under the assumption that the polarized SAR image obeys the K-Wishart distribution, to obtain a relative peak value set for each polarized SAR image;
Step 2-22, averaging the relative peak value sets to obtain the relative peak values of the polarized SAR image pair.
Under the assumption that the polarized SAR image obeys the K-Wishart distribution, the relative peak values R1(x, y), R2(x, y) of the region delimited around the pixel at position (x, y) in T1', T2' are calculated, giving the relative peak value sets R1, R2 corresponding to T1', T2'. R1 and R2 are then averaged to obtain the relative peak values of the image pair. For the m-th time phase image Tm', the relative peak value Rm(x, y) of the region delimited around the pixel at (x, y) is calculated as follows:
where S^m_hh(x, y) denotes the horizontally transmitted and horizontally received echo data at position (x, y) in the m-th time phase image, S^m_hv(x, y) denotes the horizontally transmitted and vertically received echo data at position (x, y) in the m-th time phase image, S^m_vv(x, y) denotes the vertically transmitted and vertically received echo data at position (x, y) in the m-th time phase image, |·| denotes the modulus, and E{·} denotes the mean. In the present invention, the window size of the delimited region may be 3.
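The sketch below assumes one commonly cited form of the relative peak, R = (1/3)·(E{|Shh|²}/E{|Shh|}² + E{|Shv|²}/E{|Shv|}² + E{|Svv|²}/E{|Svv|}²), evaluated over a 3 × 3 window. That form, the row-major indexing `chan[y][x]`, the border clipping, and the function name are assumptions for illustration and may differ from the formula intended here.

```python
def relative_peak(shh, shv, svv, x, y, win=3):
    """Assumed relative peak: mean over channels of E{|S|^2} / E{|S|}^2 in a win x win window."""
    half = win // 2
    h, w = len(shh), len(shh[0])
    total = 0.0
    for chan in (shh, shv, svv):
        mags = [abs(chan[r][c])
                for r in range(max(y - half, 0), min(y + half + 1, h))
                for c in range(max(x - half, 0), min(x + half + 1, w))]
        mean_mag = sum(mags) / len(mags)
        mean_pow = sum(m * m for m in mags) / len(mags)
        total += mean_pow / (mean_mag ** 2 + 1e-12)
    return total / 3.0

uniform = [[1 + 1j] * 3 for _ in range(3)]
spiky = [row[:] for row in uniform]
spiky[1][1] = 10 + 0j                      # one bright scatterer in the window

rk_uniform = relative_peak(uniform, uniform, uniform, 1, 1)
rk_spiky = relative_peak(spiky, uniform, uniform, 1, 1)
rk_pair = (rk_uniform + rk_spiky) / 2.0    # step 2-22: average over the two time phases
```

With this form a homogeneous window gives a value close to 1, and heterogeneous windows give larger values.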
Step 3, correcting the relative peak value of each polarized SAR image by using its edge information, to obtain the edge-corrected relative peak values of the polarized SAR image pair;
As an alternative embodiment of the present invention, step 3 includes:
Step 3-1, performing binarization segmentation on the edge information of the polarized SAR image pair with a threshold binarization method to obtain the threshold corresponding to the edge information of the polarized SAR image pair;
Step 3-2, correcting the relative peak values of the polarized SAR image pair by using the edge information of each pixel in the polarized SAR image pair, to obtain the edge-corrected relative peak values of the polarized SAR image pair.
The edge information of the polarized SAR image pair is binarized with a threshold binarization method to obtain the corresponding threshold; the Otsu binarization method may be adopted. The edge information of the pixel at each position (x, y) is then used to correct the relative peak value of the pixel at (x, y), giving the edge-corrected relative peak values. The correction formula for the edge-corrected relative peak value of the pixel at position (x, y) is:
where α is an empirical constant; in practice, α = 100 may be used.
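Step 3-1 can be illustrated with a minimal histogram-based Otsu threshold. The sketch covers only the thresholding, not the correction formula with α; the bin count and the function name are assumptions.

```python
def otsu_threshold(values, bins=256):
    """Threshold maximizing the between-class variance of a 1-D sample (Otsu's criterion)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * hist[i] for i in range(bins))
    w0, sum0 = 0, 0.0
    best_var, best_bin = -1.0, 0
    for i in range(bins):
        w0 += hist[i]
        sum0 += i * hist[i]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, i
    # Upper edge of the last bin assigned to the low class.
    return lo + (best_bin + 1) * width

edge_values = [0.1] * 50 + [0.9] * 50       # illustrative edge strengths
threshold = otsu_threshold(edge_values)
edge_mask = [v >= threshold for v in edge_values]
```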
Step 4, performing reconstruction with the edge-corrected relative peak values to obtain a multi-scale super-pixel reconstruction difference map of the polarized SAR image pair;
As an alternative embodiment of the present invention, step 4 includes:
Step 4-1, calculating an edge-corrected relative peak difference map from the polarization covariance matrix of each polarized SAR image and the edge-corrected relative peak values;
Step 4-2, performing super-pixel segmentation on the edge-corrected relative peak difference map at a plurality of scales with a super-pixel segmentation method to obtain a super-pixel set;
Step 4-3, computing the mean value and the median value of all pixels of the edge-corrected relative peak difference map within each super-pixel of the super-pixel set, to obtain the mean value set and the median value set of the edge-corrected relative peak difference map over the super-pixel set;
Step 4-4, reconstructing the multi-scale super-pixel reconstruction difference map of the polarized SAR image pair from the edge-corrected relative peak values of the polarized SAR image pair, the edge-corrected relative peak difference map, the mean value set, and the median value set.
The specific process for obtaining the multi-scale super-pixel reconstruction difference map is as follows:
(1) An edge-corrected relative peak difference map DI is obtained from the bi-temporal polarized SAR covariance matrices C1, C2 and the edge-corrected relative peak values, where the difference value of the pixel at position (x, y) in DI is calculated as:
where dW{·} denotes the symmetric revised Wishart distance, and \bar{C}_m(x, y) denotes the average over the pixel neighborhood Ω(x,y) at (x, y); in this embodiment, the window size of the pixel neighborhood is 3;
In the above step, the symmetric revised Wishart distance dW{·} between the polarization covariance matrices C1(x, y), C2(x, y) is calculated as:
where p = 3 for fully polarimetric data;
In the above step, the average of the polarization covariance matrix Cm of the m-th time phase over the neighborhood Ω(x,y) at position (x, y) is:
$$\bar{C}_m(x, y) = \frac{1}{N} \sum_{(i,j) \in \Omega_{(x,y)}} C_m(i, j)$$
where N is the number of pixels in the neighborhood Ω(x,y);
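The following sketch assumes the widely used symmetric revised Wishart distance, dW(C1, C2) = ½·[Tr(C1⁻¹C2) + Tr(C2⁻¹C1)] − p with p = 3, together with the element-wise neighborhood mean. It shows only the distance term and not the full DI formula; the helper names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (transposed cofactor matrix)."""
    d = det3(m)
    inv = [[0j] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [k for k in range(3) if k != i]
            c = [k for k in range(3) if k != j]
            minor = m[r[0]][c[0]] * m[r[1]][c[1]] - m[r[0]][c[1]] * m[r[1]][c[0]]
            inv[j][i] = (-1) ** (i + j) * minor / d
    return inv

def trace_product(a, b):
    """Tr(a b) for 3x3 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(3) for k in range(3))

def wishart_distance(c1, c2, p=3):
    """Assumed symmetric revised Wishart distance; zero when c1 == c2."""
    t = trace_product(inv3(c1), c2) + trace_product(inv3(c2), c1)
    return 0.5 * t.real - p

def mean_matrix(mats):
    """Element-wise mean of the covariance matrices in a pixel neighborhood."""
    n = len(mats)
    return [[sum(m[i][j] for m in mats) / n for j in range(3)] for i in range(3)]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
scaled = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
sample = [[2, 0.5 + 0.5j, 0], [0.5 - 0.5j, 1, 0.2j], [0, -0.2j, 1.5]]
d_same = wishart_distance(sample, sample)
d_scaled = wishart_distance(identity, scaled)
```

The distance is symmetric in its two arguments, which is why it suits a difference map that should not depend on the order of the two time phases.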
(2) Super-pixel segmentation is performed on DI at L scales with a super-pixel segmentation method to obtain a super-pixel set; in this embodiment, the SLIC method is adopted as the super-pixel segmentation method;
(3) The mean value mean_l of all DI pixels within each super-pixel S_l is computed to obtain the mean value set of DI corresponding to the super-pixel set;
(4) The median value median_l of all DI pixels within each super-pixel S_l is computed to obtain the median value set of DI corresponding to the super-pixel set;
(5) The multi-scale super-pixel reconstruction difference map DI' is obtained from the edge-corrected relative peak values, DI, the mean value set, and the median value set, where the difference value of the pixel at position (x, y) in DI' is calculated as:
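Steps (3) and (4) can be sketched as below, assuming a super-pixel label map is already available for one scale (for example from SLIC). The sketch computes the per-super-pixel mean and median and paints them back onto the pixel grid; it does not reproduce the DI' formula of step (5). The function names and toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

def superpixel_stats(di, labels):
    """Per-super-pixel mean and median of the difference map `di` under one label map."""
    groups = defaultdict(list)
    for row_values, row_labels in zip(di, labels):
        for value, label in zip(row_values, row_labels):
            groups[label].append(value)
    means = {label: mean(vals) for label, vals in groups.items()}
    medians = {label: median(vals) for label, vals in groups.items()}
    return means, medians

def paint(labels, stats):
    """Replace every pixel's label with the statistic of its super-pixel."""
    return [[stats[label] for label in row] for row in labels]

di = [[1, 2, 9],
      [4, 4, 4]]
labels = [[0, 0, 0],
          [1, 1, 1]]          # one scale; repeat per scale for the multi-scale case
means, medians = superpixel_stats(di, labels)
mean_map = paint(labels, means)
median_map = paint(labels, medians)
```

The median is less affected than the mean by isolated outliers inside a super-pixel, such as the value 9 in the first row.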
Step 5, performing double-threshold segmentation and labeling on the multi-scale super-pixel reconstruction difference map to obtain a pseudo-label map;
the pseudo-label map comprises a changed region, an unchanged region, and an unknown region;
DI' may be segmented with a double-threshold method to obtain the pseudo-label map PI containing the changed region, the unchanged region, and the unknown region; in this embodiment, the Otsu double-threshold method is adopted.
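A double-threshold Otsu segmentation can be sketched as an exhaustive search for the two histogram cut points that maximize the three-class between-class variance. The mapping of the lowest class to "unchanged", the highest to "changed", and the middle to "unknown" is an assumption, as are the label codes, the bin count, and the function names.

```python
UNCHANGED, CHANGED, UNKNOWN = 0, 1, -1   # hypothetical label codes

def otsu_two_thresholds(values, bins=64):
    """Two thresholds maximizing the three-class between-class variance."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    cum_n, cum_s = [0], [0.0]            # prefix sums of counts and first moments
    for i, n in enumerate(hist):
        cum_n.append(cum_n[-1] + n)
        cum_s.append(cum_s[-1] + i * n)
    total_mean = cum_s[-1] / cum_n[-1]

    def class_term(a, b):
        """Weighted squared deviation of the class covering bins a..b-1; None if empty."""
        n = cum_n[b] - cum_n[a]
        if n == 0:
            return None
        mu = (cum_s[b] - cum_s[a]) / n
        return n * (mu - total_mean) ** 2

    best, best_pair = -1.0, (1, 2)
    for i in range(1, bins - 1):
        for j in range(i + 1, bins):
            terms = [class_term(0, i), class_term(i, j), class_term(j, bins)]
            if any(t is None for t in terms):
                continue
            score = sum(terms)
            if score > best:
                best, best_pair = score, (i, j)
    return lo + best_pair[0] * width, lo + best_pair[1] * width

def pseudo_label(di_prime):
    """Label each pixel of a 2-D difference map as unchanged, changed, or unknown."""
    t1, t2 = otsu_two_thresholds([v for row in di_prime for v in row])

    def label(v):
        if v < t1:
            return UNCHANGED
        if v >= t2:
            return CHANGED
        return UNKNOWN

    return [[label(v) for v in row] for row in di_prime]

row = [0.0] * 40 + [0.5] * 20 + [1.0] * 40
labels = pseudo_label([row])[0]
```

Only the confidently low and confidently high pixels receive labels, so the ambiguous middle band is left for the network to decide.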
Step 6, constructing labeled training samples from the changed and unchanged regions of the pseudo-label map, and constructing unlabeled prediction samples from the unknown region of the pseudo-label map;
As an alternative embodiment of the present invention, step 6 includes:
Step 6-1, stacking, over a pixel-by-pixel neighborhood, the values of the polarization covariance matrices at the labeled regions of the pseudo-label map and the difference values of the multi-scale super-pixel reconstruction difference map at the labeled regions of the pseudo-label map, and taking the stacking result as the training samples;
Step 6-2, stacking, over a pixel-by-pixel neighborhood, the values of the polarization covariance matrices at the unknown region of the pseudo-label map and the difference values of the multi-scale super-pixel reconstruction difference map at the unknown region of the pseudo-label map, and taking the stacking result as the prediction samples.
It should be noted that the training samples and the prediction samples are constructed in the same way and differ only in the pixels they are drawn from (the labeled regions versus the unknown region): the polarization data and the difference values at the same position (x, y) in the bi-temporal polarized SAR covariance matrices C1, C2 and in DI' are stacked over a pixel-by-pixel neighborhood and used as the input patches of the preset VIT network, where the input patches are expressed as:
patch={vec1(x,y)||vec2(x,y)||DI'(x,y),1≤x≤H,1≤y≤L}
where the real part and the imaginary part of the polarization data are taken to form the vectors; vec1(x, y), vec2(x, y), and DI'(x, y) denote the pixel-by-pixel neighborhoods at position (x, y) of the vectors vec1, vec2 corresponding to the polarized SAR images of the two time phases and of DI', respectively. This yields the training samples and the prediction samples; both types of samples are input pixel blocks, H × L in number in total, and the input size in this step may be set to 13 × 13.
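Patch construction can be sketched as follows. The 9-element layout of each vector (three real diagonal entries followed by the real and imaginary parts of the three upper-triangular entries) and the replicate padding at image borders are assumptions made for illustration, as are the function names.

```python
def covariance_to_vec(c):
    """Assumed layout: [C11, C22, C33, Re C12, Im C12, Re C13, Im C13, Re C23, Im C23]."""
    upper = (c[0][1], c[0][2], c[1][2])
    diag = [c[0][0].real, c[1][1].real, c[2][2].real]
    return diag + [part for z in upper for part in (z.real, z.imag)]

def build_patch(vec1, vec2, di_prime, x, y, size=13):
    """size x size neighborhood around (x, y); each cell is vec1 || vec2 || DI' (19 values)."""
    half = size // 2
    h, w = len(di_prime), len(di_prime[0])
    patch = []
    for dy in range(-half, half + 1):
        r = min(max(y + dy, 0), h - 1)           # replicate padding (assumption)
        row = []
        for dx in range(-half, half + 1):
            c = min(max(x + dx, 0), w - 1)
            row.append(vec1[r][c] + vec2[r][c] + [di_prime[r][c]])
        patch.append(row)
    return patch

C = [[2, 1 + 1j, 0], [1 - 1j, 1, 0.5j], [0, -0.5j, 3]]       # illustrative matrix
vec_grid = [[covariance_to_vec(C) for _ in range(4)] for _ in range(4)]
di_prime = [[r * 10 + c for c in range(4)] for r in range(4)]
patch = build_patch(vec_grid, vec_grid, di_prime, x=2, y=1)
```

Under this layout each input block has shape 13 × 13 × 19; the same function serves both labeled and unknown pixels.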
Step 7, taking the training sample as input of a preset VIT network, taking a changed area and an unchanged area in the pseudo-marker graph as pixel class labels, and iteratively training the preset VIT network until reaching a training cut-off condition to obtain a trained VIT network;
It should be noted that, referring to FIG. 2 and FIG. 3, the input patches are used to train the VIT network. Taking the training of one tensor as an example, the process is as follows: each patch in the tensor is flattened and reduced in dimension by a Linear layer; the position of the tensor is encoded with a position encoding (pos_embedding) to obtain the training tensor; a classification token class_token is created and concatenated with the tensor, and the result is added to the position encoding, followed by a Dropout layer; a Transformer structure extracts features; and the classification result is obtained by passing through a Pool layer, a Layer Norm layer, and a Linear layer in sequence. Training continues until the network converges, giving a classifier that extracts change-aware features. In this embodiment, the network is trained with the Adam optimizer at a learning rate of 0.001 for 50 epochs, and the loss function is the cross-entropy loss.
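To show why a Transformer can relate every token to every other token, the following sketch implements single-head scaled dot-product self-attention on a short token sequence with a prepended class token. It is a deliberately reduced illustration: the query, key, and value projections are taken as the identity, and position encoding, dropout, and the classification head are omitted; the token values are made up.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(wt * v[i] for wt, v in zip(weights, tokens)) for i in range(d)])
    return out

class_token = [0.0, 0.0]                               # stands in for class_token
patch_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # flattened, projected patches
sequence = [class_token] + patch_tokens
attended = self_attention(sequence)
```

Each output token is a weighted sum over the whole sequence, so the class token can aggregate information from all patches in a single layer.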
The training cut-off condition is that the VIT network converges or the preset number of iterations is reached.
Step 8, inputting the prediction samples into the trained VIT network to obtain a detection result predicting whether the bi-temporal original polarized SAR images have changed.
The prediction samples corresponding to the unknown region of the pseudo-label map PI are input into the trained network, and prediction yields the final change detection result.
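One plausible way to assemble the final map, sketched under the assumption that the confident pseudo-labels are kept and only the unknown pixels take the network's prediction, is shown below. The label code -1 for "unknown" and the function name are hypothetical.

```python
def merge_result(pseudo_labels, predictions, unknown=-1):
    """Keep confident pseudo-labels; fill unknown pixels with the network's prediction."""
    return [[pred if lab == unknown else lab for lab, pred in zip(label_row, pred_row)]
            for label_row, pred_row in zip(pseudo_labels, predictions)]

pseudo = [[0, -1], [1, -1]]
predicted = [[1, 1], [0, 0]]       # illustrative network outputs for every pixel
final_map = merge_result(pseudo, predicted)
```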
The effect of the invention can be further confirmed by the following experiments:
1. Simulation conditions and content:
The simulation experiment environment is: MATLAB R2016a, Intel(R) Core(TM) i7-8700 CPU 3.20 GHz, Windows 10, PyTorch 1.5.1, NVIDIA GeForce RTX 1660Ti GPU.
Using two different sets of bi-temporal data, the overall accuracy and internal consistency of change detection are compared among the present invention, the existing unsupervised polarized SAR change detection method based on local-region texture features of the image (PDI), and the adaptive end-to-end three-channel deep neural network based on transfer learning (TCD-Net). The results are shown in Table 1 in the simulation result analysis, where internal consistency is evaluated by the Kappa coefficient.
Simulation 1: the present invention and the prior art are applied to the bi-temporal Airport 1 data and their polarized SAR change detection results are compared; the results are shown in FIG. 4.
Simulation 2: the present invention and the prior art are applied to the bi-temporal Airport 2 data and their polarized SAR change detection results are compared; the results are shown in FIG. 5.
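For reference, overall accuracy and the Kappa coefficient for a binary change map can be computed from the confusion counts as in the following sketch. The label coding (1 for changed, 0 for unchanged), the function name, and the toy data are assumptions; the values do not come from the experiments.

```python
def overall_accuracy_and_kappa(pred, ref):
    """Overall accuracy and Cohen's kappa for binary labels (1 = changed, 0 = unchanged)."""
    tp = tn = fp = fn = 0
    for p, r in zip(pred, ref):
        if p == 1 and r == 1:
            tp += 1
        elif p == 0 and r == 0:
            tn += 1
        elif p == 1:
            fp += 1
        else:
            fn += 1
    n = tp + tn + fp + fn
    oa = (tp + tn) / n
    # Expected agreement by chance, from the marginal totals.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 1.0
    return oa, kappa

pred = [1, 1, 0, 0, 1, 0, 0, 0]    # toy detection result
ref = [1, 0, 0, 0, 1, 1, 0, 0]     # toy reference map
oa, kappa = overall_accuracy_and_kappa(pred, ref)
```

Kappa discounts the agreement expected by chance, which matters in change detection because unchanged pixels usually dominate the scene.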
2. Simulation result analysis:
Referring to FIG. 4, (a) is the detection result of the existing PDI method on the Airport 1 data, (b) is the detection result of the existing TCD-Net method on the Airport 1 data, (c) is the change detection result of the method of the present invention on the Airport 1 data, and (d) is the reference image of the change detection result for the Airport 1 data; white areas represent changed regions and black areas represent unchanged regions. Comparing (a), (b), and (c) with (d) shows that (c) retains local edge detail and suppresses local fine noise better than (a) and (b), and is closer to the result given by the reference image;
Referring to FIG. 5, (a) is the detection result of the existing PDI method on the Airport 2 data, (b) is the detection result of the existing TCD-Net method on the Airport 2 data, (c) is the change detection result of the method of the present invention on the Airport 2 data, and (d) is the reference image of the change detection result for the Airport 2 data; white areas represent changed regions and black areas represent unchanged regions. Comparing (a), (b), and (c) with (d) shows that (c) retains local edge detail and suppresses local fine noise better than (a) and (b), and is closer to the result given by the reference image.
TABLE 1
Referring to Table 1, the present invention is superior to the prior art in detection accuracy. The invention reconstructs the difference map between any two time phases from the edge-corrected regional relative peak values and multi-scale super-pixel segmentation of the polarized SAR images. This avoids the defects of the existing PDI technique, in which change information is lost because the difference between bi-temporal polarized SAR images is measured with only partial polarization features rather than the complete polarization scattering matrix, and changed regions cannot be obtained accurately because the difference map is thresholded with the global Otsu method. In addition, the invention uses the Vision Transformer (VIT) to construct a VIT network based on difference-guided global change features, which automatically extracts deep semantic features from the multi-temporal polarization data and the difference map. This avoids the problem that the CNN adopted in the prior art captures only short-range context in a polarized SAR image and cannot fully explore the data correlation between different slices, and therefore cannot effectively extract change-aware features from multi-temporal polarized SAR images with complex backscattering characteristics. As a result, the change detection results are clearly better than those of the prior art.
The invention realizes unsupervised change detection for polarized SAR images, solves the technical problem of low overall detection accuracy and intra-region consistency in the prior art, and can be applied to fields such as land use, urban construction, disaster assessment, forest monitoring, and ice field monitoring.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
Although the application is described herein in connection with various embodiments, other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed application, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the "a" or "an" does not exclude a plurality.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.