Disclosure of Invention
The present invention has been made to solve the above-mentioned problems in the prior art. An automatic identification method, device and medium for sea areas for cultivation are therefore needed. An improved SegNet network model is constructed by introducing a pyramid-type scale-aware convolution module and a channel-space attention dual-focusing module, so as to improve the accuracy of automatic identification of sea areas for cultivation and to provide technical support for scientific and reasonable monitoring and management of such sea areas.
According to a first aspect of the present invention, there is provided an automatic identification method for a sea area for cultivation, the method comprising:
acquiring remote sensing image data, wherein the remote sensing image data comprises a multispectral image and a panchromatic image;
preprocessing the remote sensing image data to obtain a data set;
constructing an automatic identification model, wherein the automatic identification model takes the remote sensing image data as an input feature map and outputs an identification result for the sea area for cultivation; the automatic identification model comprises a pyramid-type scale-aware convolution module and a channel-space attention dual-focusing module; the input feature map is first adapted to a first channel depth by a first standard convolution module and enters the scale-aware convolution module for grouped convolution, generating a plurality of first feature maps; the plurality of first feature maps pass through the attention dual-focusing module, which sharpens the attention paid to the information of the extracted features, to obtain a plurality of second feature maps; the plurality of second feature maps are adapted back to the initial channel depth by a second convolution module, a normalization layer and an activation function being added after each convolution; and finally the input feature map and the second feature maps adapted to the initial channel depth are added together to give the identification result for the sea area for cultivation;
and training the automatic identification model by using the data set, and realizing automatic identification of the sea area for cultivation by the trained automatic identification model.
Further, preprocessing the remote sensing image data to obtain a data set specifically includes:
performing atmospheric correction and orthorectification on the multispectral image and the panchromatic image;
performing image fusion on the corrected multispectral image and panchromatic image to obtain a fused image in which the spectral characteristics are preserved, and extracting the near-infrared, green and blue bands of the fused image so that they correspond in sequence to the red, green and blue channels;
labeling the fused image to obtain a label map, wherein the labels comprise a cultivation sea area label and a non-cultivation sea area label;
cropping the fused image and the corresponding label map, and performing sample amplification on the cropped images to obtain a plurality of groups of sample pairs, wherein each group of sample pairs consists of a fused image and the corresponding label map, and the sample pairs are divided into a training data set and a validation data set;
and extracting a plurality of test areas from the non-training image area based on the fused image and the corresponding label map, cropping each test area to obtain test area comparison images, and drawing label maps corresponding to the test area comparison images to form a plurality of test area comparison sample pairs.
Further, labeling the fused image to obtain a label map specifically includes:
assigning, based on visual interpretation and manual annotation, field values for two classes, cultivation sea area and non-cultivation sea area, to the vector labels of the annotated sample areas, and converting them into gray values of a raster image;
marking the cultivation sea area and the non-cultivation sea area in different colors;
and converting the vector labels into raster data to complete production of the label map.
Further, cropping the fused image and the corresponding label map specifically includes:
performing overlapped sliding-window cropping on the fused image and the corresponding label map, uniformly cropping both into images of the same pixel size.
Further, the sample amplification performed on the cropped images to obtain a plurality of groups of sample pairs specifically includes:
enhancing the data of the plurality of groups of sample pairs by horizontal and vertical flipping, diagonal mirroring, and the addition of salt-and-pepper or Gaussian noise, and further introducing a generative adversarial network for large-scale automatic sample amplification, the simulated sample pairs having a very high similarity to real samples, thereby widening sample diversity.
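A minimal Python sketch of the traditional augmentation operations named above follows (flips, diagonal mirroring, salt-and-pepper and Gaussian noise). The function names and noise parameters are illustrative assumptions rather than the code of the embodiment, and the GAN-based amplification stage is not shown.

```python
# Illustrative sketch of the traditional augmentation operations; parameter
# values (noise amount, sigma) are assumptions, not the embodiment's settings.
import numpy as np

def flip_pair(image, label, axis):
    """Flip an image/label pair vertically (axis=0) or horizontally (axis=1)."""
    return np.flip(image, axis=axis), np.flip(label, axis=axis)

def diagonal_mirror_pair(image, label):
    """Mirror along the main diagonal by swapping the two spatial axes."""
    return np.swapaxes(image, 0, 1), np.swapaxes(label, 0, 1)

def add_salt_pepper(image, amount=0.01):
    """Set a random fraction of pixels to the extreme values 0 and 255."""
    noisy = image.copy()
    mask = np.random.rand(*image.shape[:2])
    noisy[mask < amount / 2] = 0        # pepper
    noisy[mask > 1 - amount / 2] = 255  # salt
    return noisy

def add_gaussian(image, sigma=5.0):
    """Add zero-mean Gaussian noise and clip back to the 8-bit range."""
    noisy = image.astype(np.float32) + np.random.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(image.dtype)
```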
Further, the scale-aware convolution module comprises four convolution layers with different convolution kernel sizes, and the four convolution layers perform grouped convolution on the input feature map adapted to the first channel depth to generate a plurality of first feature maps.
Further, the convolution kernel sizes of the four convolution layers are 9×9, 7×7, 5×5 and 3×3, respectively.
Further, the attention dual-focusing module comprises a channel attention module and a spatial attention module;
the channel attention module extracts information on the features of each channel of the first feature map through global max pooling and global average pooling operations, and learns the channel attention weights using a shared fully connected layer and a Sigmoid activation function;
the spatial attention module computes spatial attention by max pooling and average pooling the features at each spatial position of the first feature map, stacks the pooled results, combines them with a standard convolution, and obtains the spatial attention weights through a Sigmoid activation function;
and the channel attention weights and the spatial attention weights are multiplied and used to weight the first feature map to obtain a second feature map.
According to a second aspect of the present invention, there is provided an automatic identification device for a sea area for cultivation, the device comprising:
The data acquisition module is configured to acquire remote sensing image data, wherein the remote sensing image data comprises a multispectral image and a panchromatic image;
the data preprocessing module is configured to preprocess the remote sensing image data to obtain a data set;
The model construction module is configured to construct an automatic identification model, wherein the automatic identification model takes the remote sensing image data as an input feature map and outputs an identification result for the sea area for cultivation; the automatic identification model comprises a pyramid-type scale-aware convolution module and a channel-space attention dual-focusing module; the input feature map is first adapted to a first channel depth by a first standard convolution module and enters the pyramid convolution module for grouped convolution, generating a plurality of first feature maps; the plurality of first feature maps pass through the attention dual-focusing module, which sharpens the attention paid to the information of the extracted features, to obtain a plurality of second feature maps; the plurality of second feature maps are adapted back to the initial channel depth by a second convolution module, a normalization layer and an activation function being added after each convolution; and finally the input feature map and the second feature maps adapted to the initial channel depth are added together to give the identification result for the sea area for cultivation;
The automatic identification module is configured to train the automatic identification model using the data set, and to realize automatic identification of the sea area for cultivation by the trained automatic identification model.
According to a third aspect of the present invention, there is provided a readable storage medium storing one or more programs executable by one or more processors to implement the method as described above.
The invention has at least the following beneficial effects:
The invention obtains multi-scale information by adding the pyramid-type scale-aware convolution module, without adding extra network parameters, and enhances the utilization of effective information by adding the channel-space attention dual-focusing module. The effectiveness of the modules is verified by an ablation experiment, their combined use contributing to the accuracy improvement: the overall accuracy, mean intersection-over-union (mIoU) and F1-score of the proposed model in identifying the sea area for cultivation over 3 test areas are improved by 2.83%, 6.67% and 1.8% respectively compared with the original model. Comparison experiments with the UNet, SegNet and DenseNet models and the traditional machine learning SVM and RF methods verify the effectiveness of the proposed model, whose overall accuracy, mean intersection-over-union and F1-score in identifying the sea area for cultivation over the 3 test areas reach 94.86%, 87.23% and 96.59% respectively, improvements over the comparison methods. The experimental results show that the proposed network model can automatically and accurately identify the sea area for cultivation and can provide technical support for its monitoring and management.
Detailed Description
The present invention will be described in detail below with reference to the drawings and specific embodiments, to enable those skilled in the art to better understand its technical scheme; the embodiments are given by way of illustration, not limitation. Where the described steps have no necessary dependence on one another, the order in which they are described by way of example should not be construed as limiting, and those skilled in the art will understand that the steps may be reordered provided this does not disrupt their mutual logic or prevent realization of the overall process.
The embodiment of the invention provides an automatic identification method for a sea area for cultivation. First, an application scenario of the method is presented; fig. 1 is a schematic diagram of the geographical position of the selected application scenario and a remote sensing image of the region. The scenario lies in Lianyungang City, Jiangsu Province, within the geographical coordinate range 34°18′21″–35°1′3″ N, 118°41′38″–119°33′44″ E. Lianyungang is an important marine fishery area of Jiangsu Province, with a developed marine economy and abundant marine resources: its sea area is 6,677 square kilometres, its shallow-sea tidal flats cover 110,000 hectares, 17 large rivers flow into the sea along its coast, and the sea water is rich in nutrients; the Haizhou Bay fishing ground is one of the eight major national fishing grounds, and the local salt field is one of the four major national sea-salt production areas. The main cultivated crop is laver; the largest national laver cultivation and processing base is located there, and in 2021 the national fisheries association approved the city's "Chinese laver" designation. The sea areas for cultivation are mainly distributed along the coastal counties and districts of the east coast, such as Guanyun, Donghai and Haizhou, with Guanyun County having the largest cultivation sea area.
As shown in fig. 2, which is a flow chart of the method, the method comprises the following steps:
Step S100, remote sensing image data is acquired, wherein the remote sensing image data comprises a multispectral image and a panchromatic image.
In this embodiment, the experimental data are GF-1D satellite L1A remote sensing image data of Lianyungang City, acquired at 11:39 on 24 May 2022. The Gaofen-1D (GF-1D) satellite was launched in 2018 together with the GF-1B and GF-1C satellites on a single rocket; its panchromatic images have a spatial resolution of 2 m, its multispectral images a spatial resolution better than 8 m, and its single-satellite imaging swath exceeds 60 km, greatly improving the capability for all-weather, full-coverage, real-time monitoring of natural resources.
Step S200, preprocessing the remote sensing image data to obtain a data set.
In this embodiment, ENVI 5.3 software is used to perform atmospheric correction and orthorectification on the GF-1D multispectral and panchromatic images of the study area; the corrected images are then fused to improve the resolution of the multispectral image while preserving its spectral characteristics, and band 4 (near-infrared), band 2 (green) and band 1 (blue) of the fused high-resolution multispectral image are extracted to correspond in sequence to the red, green and blue channels. Next, by visual interpretation and manual annotation in ArcGIS 10.8 software, the vector labels of the annotated sample areas are assigned field values for two classes, cultivation sea area and other ground objects, and converted into gray values of a raster image, the cultivation sea area being 1 and the other ground objects 0. The cultivation sea area is marked white, with RGB (255, 255, 255); the other ground object areas are marked black, with RGB (0, 0, 0). Finally, the vector labels are converted into raster data to complete production of the cultivation sea area label data set.
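As an illustration of the band re-mapping just described, the following Python sketch writes band 4 (near-infrared), band 2 (green) and band 1 (blue) of the fused image to the red, green and blue channels of a false-colour composite. The file names are hypothetical, and rasterio is merely one convenient library choice; the embodiment itself performs these steps in ENVI and ArcGIS.

```python
# Sketch of the NIR/green/blue -> R/G/B band re-mapping; file names are
# illustrative placeholders.
import rasterio

with rasterio.open("fused_gf1d.tif") as src:   # hypothetical fused image
    nir, green, blue = src.read(4), src.read(2), src.read(1)
    profile = src.profile
    profile.update(count=3)                    # 3-band output

with rasterio.open("false_colour.tif", "w", **profile) as dst:
    dst.write(nir, 1)    # NIR   -> red channel
    dst.write(green, 2)  # green -> green channel
    dst.write(blue, 3)   # blue  -> blue channel
```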
After labeling is completed, the whole image and label are cropped with overlapped sliding windows using Python code, uniformly into tiles of 256 × 256 pixels. Sample amplification is performed after cropping to avoid over-fitting during network training; the data set is enhanced by traditional amplification and by GAN-based amplification, finally yielding 10,000 groups of 256 × 256 pixel sample pairs, of which 7,500 groups form the training data set and 2,500 groups the validation data set. Each group of sample pairs in the cultivation sea area data set consists of an image and the corresponding label map, as shown in fig. 3. In addition, three test areas outside the training image area are selected and cropped into remote sensing images of 1000 × 1000 pixels, and the corresponding cultivation sea area labels are drawn to form three groups of test area comparison sample pairs.
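A minimal sketch of the overlapped sliding-window cropping follows. The stride of 128 pixels (50% overlap) and the placeholder arrays are assumptions, since the embodiment does not state the overlap size; the 256 × 256 tile size and the 75%/25% train/validation split match the figures above.

```python
# Sketch of overlapped sliding-window cropping into 256x256 image/label tiles.
# The stride (50% overlap) is an assumed value; the embodiment states only
# that the windows overlap.
import numpy as np

def sliding_window_crop(image, label, size=256, stride=128):
    """Yield aligned (image tile, label tile) pairs from overlapping windows."""
    h, w = image.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield (image[y:y + size, x:x + size],
                   label[y:y + size, x:x + size])

# Placeholder arrays standing in for the fused image and its label map.
fused_image = np.zeros((2048, 2048, 3), dtype=np.uint8)
label_map = np.zeros((2048, 2048), dtype=np.uint8)

pairs = list(sliding_window_crop(fused_image, label_map))
split = int(0.75 * len(pairs))   # 75%/25%, as in the 7500/2500 data sets
train_pairs, val_pairs = pairs[:split], pairs[split:]
```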
Step S300, an automatic identification model is constructed; the automatic identification model takes the remote sensing image data as an input feature map and outputs an identification result for the sea area for cultivation. The automatic identification model comprises a pyramid-type scale-aware convolution module and a channel-space attention dual-focusing module; the input feature map is first adapted to a first channel depth by a first standard convolution module and enters the pyramid convolution module for grouped convolution, generating a plurality of first feature maps; the plurality of first feature maps pass through the attention dual-focusing module, which sharpens the attention paid to the information of the extracted features, to obtain a plurality of second feature maps; the plurality of second feature maps are adapted back to the initial channel depth by a second convolution module, a normalization layer and an activation function being added after each convolution; and finally the input feature map and the second feature maps adapted to the initial channel depth are added together to give the identification result for the sea area for cultivation.
This embodiment improves on the basic SegNet network. SegNet is a convolutional neural network with the classical encoder-decoder architecture, published by Vijay Badrinarayanan, Alex Kendall et al. in 2017 in IEEE Transactions on Pattern Analysis and Machine Intelligence. Its structure is clear, and it can be rapidly applied in real-time applications with a small storage footprint. This study redesigns the classical SegNet structure, adding a PyConv module (pyramid-type scale-aware convolution module) and a CBAM module (channel-space attention dual-focusing module), as shown in fig. 4, to strengthen the utilization of feature information.
First, the first standard convolution of each layer of the SegNet network is replaced by a scale-aware convolution, designed as part of the pyramid-type scale-aware convolution, as shown in fig. 4. Taking an input feature map depth of 256 as an example, the input feature map is first adapted to a channel depth of 64 by a 1×1 standard convolution and then passes through a feature pyramid convolution with 4 layers of different kernel sizes. Based on conventional experience and parameter tuning, the four pyramid kernels are set to 9×9, 7×7, 5×5 and 3×3, with groups of 16, 8, 4 and 1 respectively for grouped convolution; each layer generates 16 feature maps, and the four layers together output 64 feature maps, which are then adapted to a channel depth of 256 by a 1×1 standard convolution. BN normalization and a ReLU activation function are added after each convolution. Finally, the output feature map and the input feature map are added through a shortcut connection to obtain the final output.
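The following Keras sketch is one reading of the pyramid-type scale-aware convolution block just described, for an input depth of 256: a 1×1 convolution adapts the depth to 64, four grouped convolutions with 9×9/7×7/5×5/3×3 kernels and groups of 16/8/4/1 each produce 16 feature maps, a 1×1 convolution restores the depth to 256, and a shortcut adds the block input. It is an interpretation of the text, not the inventors' exact code, and the 32 × 32 spatial size in the usage example is arbitrary.

```python
# Sketch of the PyConv block; grouped convolutions may require a GPU build of
# TensorFlow to execute.
import tensorflow as tf
from tensorflow.keras import layers

def conv_bn_relu(x, filters, kernel_size, groups=1):
    """Convolution followed by BN and ReLU, as specified after every convolution."""
    x = layers.Conv2D(filters, kernel_size, padding="same", groups=groups)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

def pyconv_block(inputs):
    shortcut = inputs
    x = conv_bn_relu(inputs, 64, 1)                    # adapt to first channel depth
    branches = [conv_bn_relu(x, 16, k, g)              # four pyramid levels
                for k, g in [(9, 16), (7, 8), (5, 4), (3, 1)]]
    x = layers.Concatenate()(branches)                 # 4 x 16 = 64 feature maps
    x = conv_bn_relu(x, 256, 1)                        # restore initial depth
    return layers.Add()([shortcut, x])                 # shortcut connection

# Usage on a 256-channel feature map:
inp = tf.keras.Input(shape=(32, 32, 256))
model = tf.keras.Model(inp, pyconv_block(inp))
```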
Second, the encoder part of the SegNet network is improved: a channel-space attention dual-focusing module is added before the pooling operation of the fifth-layer network, and the input feature map passes through the channel attention module and the spatial attention module in turn, sharpening the attention paid to the information of the features extracted by the encoder. According to empirical principles and tuning, adding the dual-focusing module at the fifth layer, which belongs to the encoder, does not introduce excessive semantic information; it can reduce over-fitting to a certain extent, focus on the currently more critical information while reducing attention to other information, acquire more information related to the identification target, and improve the utilization efficiency of the feature information.
It should be noted that the core idea of the pyramid-type scale-aware convolution module described herein is to process the input image with convolution kernels of different levels, that is, of different sizes and depths, so as to better capture details at different levels and scales, addressing the lack of multi-scale processing capability in standard convolution. As shown in fig. 5, the pyramid-type scale-aware convolution comprises an n-level pyramid of different convolution kernels whose size increases from the bottom (level 1) to the top (level n) of the pyramid while their depth decreases.
Thanks to this, the biggest advantage of scale-aware convolution is that multi-scale processing can be achieved through diversified combinations: different convolution kernels provide both larger and smaller receptive fields, attending to large objects as well as to details. At the same time, no additional network parameters are added; compared with standard convolution, a similar number of model parameters and similar computing resource requirements are maintained by default.
The channel-space attention dual-focusing module is a module often added to convolutional neural networks; its core idea is to focus the network on the more important information, and it generally comprises two mechanisms, spatial attention and channel attention. As shown in fig. 6, the feature map input to the network is processed by the channel attention module and the spatial attention module in turn. The channel attention module extracts information on the features of each channel through global max pooling and global average pooling operations, and learns the channel attention weights using a shared fully connected layer and a Sigmoid activation function. The spatial attention module computes spatial attention by max pooling and average pooling the features at each spatial position, stacks the results together, combines them with a standard convolution of channel number 1, and obtains the spatial attention weights through a Sigmoid activation function. Finally, the outputs of the two modules are multiplied with the feature map to weight it and obtain the final output.
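A minimal Keras sketch of this dual-focusing (CBAM-style) module follows. The reduction ratio of 8 in the shared fully connected layers and the 7×7 kernel of the single-channel spatial convolution are conventional CBAM defaults assumed here; the text specifies only the pooling operations, the shared fully connected layers, the single-channel convolution and the Sigmoid activations, and the sequential channel-then-spatial application follows fig. 6.

```python
# Sketch of the channel-space attention module; ratio=8 and the 7x7 kernel
# are assumed CBAM defaults, not values stated in the embodiment.
import tensorflow as tf
from tensorflow.keras import layers

def cbam_block(inputs, ratio=8):
    channels = inputs.shape[-1]

    # Channel attention: global max/avg pooling through shared dense layers.
    shared_1 = layers.Dense(channels // ratio, activation="relu")
    shared_2 = layers.Dense(channels)
    avg = shared_2(shared_1(layers.GlobalAveragePooling2D()(inputs)))
    mx = shared_2(shared_1(layers.GlobalMaxPooling2D()(inputs)))
    ca = layers.Activation("sigmoid")(layers.Add()([avg, mx]))
    ca = layers.Reshape((1, 1, channels))(ca)
    x = layers.Multiply()([inputs, ca])          # weight each channel

    # Spatial attention: per-position max/avg pooling + 1-channel convolution.
    mx_s = tf.reduce_max(x, axis=-1, keepdims=True)
    avg_s = tf.reduce_mean(x, axis=-1, keepdims=True)
    sa = layers.Concatenate()([mx_s, avg_s])     # stack the two pooled maps
    sa = layers.Conv2D(1, 7, padding="same", activation="sigmoid")(sa)
    return layers.Multiply()([x, sa])            # weight each spatial position
```

In the improved network this block would sit before the pooling operation of the fifth encoder layer, i.e. the fifth-layer feature map is passed through cbam_block before being pooled.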
Finally, in step S400, the automatic identification model is trained using the data set, and automatic identification of the sea area for cultivation is realized by the trained automatic identification model.
Following the above steps, the proposed method is run on an experimental platform. The platform selected in this embodiment uses a 64-bit Windows 10 Professional system, an Intel 12th-generation Core i7-12700 twelve-core processor, 48 GB of memory (DDR4 3200 MHz), and an NVIDIA GeForce RTX 3060 graphics card. The experimental environment is configured with Anaconda3 software as the carrier, a virtual environment with Python 3.6 is created for the experiments, and TensorFlow 2.4.4 with its integrated Keras interface is selected as the deep learning framework. CUDA 11.1 is configured as the computing platform and cuDNN 8.0 is used as the neural network acceleration library, improving the GPU's capability for solving complex computation problems. Finally, PyCharm 2022 is installed as the integrated development environment for writing, debugging and developing programs, ensuring that the experiments can proceed smoothly.
After optimization through multiple parameter-adjustment experiments, the hyper-parameter settings used to train the improved network are shown in table 1.
Table 1 network training parameter settings
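Table 1 itself lists the tuned settings; the sketch below only illustrates how such hyper-parameters are applied in the TensorFlow/Keras environment described above. The placeholder model, data and every numerical value (optimizer, learning rate, batch size, epochs) are assumptions, not the settings of table 1.

```python
# Illustrative application of training hyper-parameters in Keras; all values
# are placeholders, not the tuned settings of table 1.
import numpy as np
import tensorflow as tf

# Stand-in model and data for the improved SegNet and the 256x256 sample pairs.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(1, 3, padding="same", activation="sigmoid",
                           input_shape=(256, 256, 3)),
])
images = np.zeros((8, 256, 256, 3), dtype="float32")
labels = np.zeros((8, 256, 256, 1), dtype="float32")

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # assumed
              loss="binary_crossentropy",                              # assumed
              metrics=["accuracy"])
model.fit(images, labels, batch_size=4, epochs=1)  # placeholder batch size/epochs
```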
Accuracy evaluation compares the cultivation sea area identified in the three test areas with the corresponding ground-truth labels to judge the effectiveness and accuracy of the method. The evaluation indices quantitatively analyze the cultivation sea area based on the confusion matrix. The predicted image is divided into two classes, cultivation sea area and other ground objects, so the confusion matrix takes the form of a 2 × 2 matrix. Combining the predicted and real results yields four cases: TP, FP, FN and TN, where T and F denote correct and incorrect and P and N denote 1 and 0. TP denotes pixels correctly identified as cultivation sea area, FP pixels wrongly identified as cultivation sea area, TN pixels correctly identified as other ground objects, and FN pixels wrongly identified as other ground objects; n is the number of classes, here 2.
Five evaluation factors are selected: precision, recall, overall accuracy (OA), F1-score and mean intersection-over-union (mIoU); their calculation formulas are given in formulas (1) to (5). Precision represents the probability that samples predicted as cultivation sea area are correctly identified; recall represents the probability that samples that are truly cultivation sea area are correctly identified; overall accuracy OA represents the probability that the predicted class of a random sample matches its real class; F1-score is the balanced value of the model's precision and recall; the intersection-over-union is the ratio of the intersection to the union of the predicted and actual samples of a class, and the mean intersection-over-union averages this over all classes.
The calculation formulas of the five evaluation factors are as follows:

precision = TP / (TP + FP)   (1)

recall = TP / (TP + FN)   (2)

OA = (TP + TN) / (TP + TN + FP + FN)   (3)

F1-score = 2 × precision × recall / (precision + recall)   (4)

mIoU = (1/n) × Σ_i [ TP_i / (TP_i + FP_i + FN_i) ]   (5)

where, in formula (5), the index i runs over the n classes (here n = 2) and TP_i, FP_i and FN_i are counted with class i taken as the positive class.
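The following Python sketch shows one way of computing the five evaluation factors of formulas (1) to (5) from a binary predicted map and its ground-truth label, using the TP/FP/TN/FN definitions above; it is an illustration, not the evaluation code of the embodiment.

```python
# Compute precision, recall, OA, F1-score and mIoU for the two-class case
# (1 = sea area for cultivation, 0 = other ground objects).
import numpy as np

def evaluate(pred, truth):
    tp = np.sum((pred == 1) & (truth == 1))
    fp = np.sum((pred == 1) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    tn = np.sum((pred == 0) & (truth == 0))
    precision = tp / (tp + fp)                           # formula (1)
    recall = tp / (tp + fn)                              # formula (2)
    oa = (tp + tn) / (tp + tn + fp + fn)                 # formula (3)
    f1 = 2 * precision * recall / (precision + recall)   # formula (4)
    iou_pos = tp / (tp + fp + fn)    # IoU of the cultivation class
    iou_neg = tn / (tn + fn + fp)    # IoU of the other-ground-object class
    miou = (iou_pos + iou_neg) / 2                       # formula (5), n = 2
    return precision, recall, oa, f1, miou
```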
To verify the performance of the improved model and demonstrate its superiority, the best trained improved SegNet model is saved, the three selected test areas are predicted and identified, and visual analysis and quantitative accuracy evaluation are carried out on the predicted images against the real label maps of the test areas. The proposed improved method is compared with the classification results of the classical SegNet, UNet and DenseNet networks and the traditional machine learning support vector machine and random forest methods. Fig. 7 compares the original images, the real labels and the identification results produced by each method for the three test areas, where white denotes the cultivation sea area and black denotes other ground objects.
Visual comparison shows that the proposed model gives the best visual effect, with misclassification and omission greatly reduced compared with the other models. The green boxes mark omissions of cultivation sea area by the other models, which are severe over large sea areas; the yellow boxes mark cases where other ground objects are wrongly identified as cultivation sea area, which the proposed model likewise avoids best; and the red boxes mark missed drainage channels of the cultivation sea area, where the proposed model identifies finer channels better than the other models. In addition, the traditional machine learning methods clearly produce fragmented patches: their identification results for the cultivation sea area are scattered and finely divided, the extraction effect is poor, and regional integrity and completeness are poor. Analysis suggests that traditional machine learning relies only on single cues such as spectral information, judging pixel by pixel from features of a single dimension; it lacks a grasp of the global information of the image, cannot link global information to strengthen feature utilization as deep learning can, and cannot continuously learn and exploit other feature information during classification, hence the fragmented patches. Overall, the proposed model identifies the cultivation sea area accurately, distinguishes other ground objects correctly, and achieves the highest completeness of identification over large areas.
In addition, model performance is quantitatively analyzed using the evaluation indices; the results are shown in table 2, with optimal values in bold. The proposed method achieves the best value on every index: precision reaches 96.17%, 1.41% higher than the best of the other methods; recall reaches 97.02%, 0.2% higher; overall accuracy reaches 94.86%, 2.77% higher; mean intersection-over-union reaches 87.23%, 6.67% higher; and the F1-score reaches 96.56%, 1.66% higher. These data fully demonstrate the superior performance of the method in automatic identification of the cultivation sea area. Taking the visual comparison and the accuracy evaluation together, the improved SegNet model proposed herein can accurately identify the cultivation sea area, has a strong capability for mining information on it, and identifies well both where the cultivation sea areas are large and dense and where they are small and dispersed.
Table 2 Accuracy evaluation results for the test areas (comparison experiments)
To verify the effectiveness of the different modules in the proposed improved network, a series of ablation experiments were performed on the cultivation sea area data set, covering four models: the original SegNet, SegNet with only the PyConv module added, SegNet with only the CBAM module added, and the improved SegNet with both. The performance of the four models on the three test areas can be judged from the comparison images in fig. 8. Visual comparison shows that combining CBAM and PyConv gives the best visual effect: compared with the original SegNet or with adding a single module, the ability to identify whole regions of the cultivation sea area is improved, as shown by the green boxes; the ability to identify finer drainage channels is markedly improved, as shown by the red boxes; and misclassification is reduced, as shown by the yellow boxes.
Analysis of the quantitative accuracy indices is shown in table 3, with optimal values in bold. The data strongly demonstrate that the CBAM and PyConv modules each improve accuracy over the original model when used alone, and that adding them in combination yields the best performance, markedly improving the accuracy of automatic identification of the cultivation sea area: compared with the original model, the model with only the CBAM module and the model with only the PyConv module, the improved model raises overall accuracy by 2.83%, 2% and 0.46% respectively, mean intersection-over-union by 6.67%, 6.48% and 1.33%, and F1-score by 1.8%, 1.07% and 0.26%.
Table 3 Accuracy evaluation results for the test areas (ablation experiments)
In summary, the proposed method achieves rapid, accurate, large-area automatic extraction of the cultivation sea area: multi-scale information is obtained by adding the PyConv pyramid convolution module without extra network parameters, and the utilization of effective information is enhanced by adding the CBAM attention module. The effectiveness of the modules is verified by the ablation experiments, their combined use contributing to the accuracy improvement; the overall accuracy, mean intersection-over-union and F1-score of the proposed model in identifying the cultivation sea area over the 3 test areas improve on the original model by 2.83%, 6.67% and 1.8% respectively. Comparison experiments with the UNet, SegNet and DenseNet models and the traditional machine learning SVM and RF methods verify the effectiveness of the proposed model, whose overall accuracy, mean intersection-over-union and F1-score over the 3 test areas reach 94.86%, 87.23% and 96.59% respectively, improvements over the comparison methods. The experimental results show that the improved SegNet model proposed herein can automatically and accurately identify the cultivation sea area and can provide technical support for its monitoring and management.
An embodiment of the present invention provides an automatic identification device for a sea area for cultivation; as shown in fig. 9, the device 900 comprises:
a data acquisition module 901 configured to acquire remote sensing image data, the remote sensing image data comprising a multispectral image and a panchromatic image;
the data preprocessing module 902 is configured to preprocess the remote sensing image data to obtain a data set;
the model construction module 903 is configured to construct an automatic identification model, wherein the automatic identification model takes the remote sensing image data as an input feature map and outputs an identification result for the sea area for cultivation; the automatic identification model comprises a pyramid-type scale-aware convolution module and a channel-space attention dual-focusing module; the input feature map is first adapted to a first channel depth by a first standard convolution module and enters the pyramid convolution module for grouped convolution, generating a plurality of first feature maps; the plurality of first feature maps pass through the attention dual-focusing module, which sharpens the attention paid to the information of the extracted features, to obtain a plurality of second feature maps; the plurality of second feature maps are adapted back to the initial channel depth by a second convolution module, a normalization layer and an activation function being added after each convolution; and finally the input feature map and the second feature maps adapted to the initial channel depth are added together to give the identification result for the sea area for cultivation;
and the automatic identification module 904 is configured to train the automatic identification model using the data set, and to realize automatic identification of the sea area for cultivation by the trained automatic identification model.
In some embodiments, the data preprocessing module is further configured to:
performing atmospheric correction and orthorectification on the multispectral image and the panchromatic image;
performing image fusion on the corrected multispectral image and panchromatic image to obtain a fused image in which the spectral characteristics are preserved, and extracting the near-infrared, green and blue bands of the fused image so that they correspond in sequence to the red, green and blue channels;
labeling the fused image to obtain a label map, wherein the labels comprise a cultivation sea area label and a non-cultivation sea area label;
cropping the fused image and the corresponding label map, and performing sample amplification on the cropped images to obtain a plurality of groups of sample pairs, wherein each group of sample pairs consists of a fused image and the corresponding label map, and the sample pairs are divided into a training data set and a validation data set;
and extracting a plurality of test areas from the non-training image area based on the fused image and the corresponding label map, cropping each test area to obtain test area comparison images, and drawing label maps corresponding to the test area comparison images to form a plurality of test area comparison sample pairs.
In some embodiments, the data preprocessing module is further configured to:
assigning, based on visual interpretation and manual annotation, field values for two classes, cultivation sea area and non-cultivation sea area, to the vector labels of the annotated sample areas, and converting them into gray values of a raster image;
marking the cultivation sea area and the non-cultivation sea area in different colors;
and converting the vector labels into raster data to complete production of the label map.
In some embodiments, the data preprocessing module is further configured to:
performing overlapped sliding-window cropping on the fused image and the corresponding label map, uniformly cropping both into images of the same pixel size.
In some embodiments, the data preprocessing module is further configured to:
the plurality of groups of sample pairs are data-enhanced by combining traditional amplification with large-scale automatic sample amplification using a generative adversarial network.
In some embodiments, the pyramid convolution module includes four convolution layers with different convolution kernel sizes, and the four convolution layers perform grouped convolution on the input feature map adapted to the first channel depth to generate a plurality of first feature maps.
In some embodiments, the convolution kernel sizes of the four convolution layers are 9×9, 7×7, 5×5 and 3×3, respectively.
In some embodiments, the convolution attention module includes a channel attention module and a spatial attention module;
The channel attention module extracts information on the features of each channel of the first feature map through global max pooling and global average pooling operations, and learns the channel attention weights using a shared fully connected layer and a Sigmoid activation function;
the spatial attention module computes spatial attention by max pooling and average pooling the features at each spatial position of the first feature map, stacks the pooled results, combines them with a standard convolution, and obtains the spatial attention weights through a Sigmoid activation function;
and the channel attention weights and the spatial attention weights are multiplied and used to weight the first feature map to obtain a second feature map.
It should be noted that the device of this embodiment shares the same technical concept as the method described above and can achieve the same technical effects, which are not repeated here.
Embodiments of the present invention provide a readable storage medium storing one or more programs executable by one or more processors to implement the methods described in the above embodiments.
The above description is intended to be illustrative and not restrictive. For example, the above-described examples (or one or more aspects thereof) may be used in combination with each other, and other embodiments may be devised by those of ordinary skill in the art upon reading the above description. In addition, in the above detailed description, various features may be grouped together to streamline the invention; this is not to be interpreted as intending that an unclaimed disclosed feature is essential to any claim. Rather, inventive subject matter may lie in less than all features of a particular disclosed embodiment. Thus, the following claims are hereby incorporated into the detailed description as examples or embodiments, with each claim standing on its own as a separate embodiment, and it is contemplated that these embodiments may be combined with one another in various combinations or permutations. The scope of the invention should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.