Article

Sonar Image Target Detection Based on Style Transfer Learning and Random Shape of Noise under Zero Shot Target

College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
* Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(24), 6260; https://doi.org/10.3390/rs14246260
Submission received: 22 October 2022 / Revised: 1 December 2022 / Accepted: 6 December 2022 / Published: 10 December 2022
(This article belongs to the Special Issue Advancement in Undersea Remote Sensing)

Abstract

With the development of sonar technology, sonar images have been widely used for target detection. However, sonar images pose many challenges for object detection. For example, detectable targets in sonar data are sparser than those in optical images, real underwater scanning experiments are complicated, and different types of sonar equipment produce images with inconsistent styles owing to their different characteristics, which makes the images difficult to use in sonar object detection and recognition algorithms. To solve these problems, we propose a novel sonar image object-detection method based on style learning and random noise of various shapes. Sonar-style target sample images are generated through style transfer, which compensates for the shortage of sonar target images. By introducing noise of various shapes, including points, lines, and rectangles, we address mud and sand obstruction and mutilated targets in the real environment, and the single pose of the sonar image target is enriched by fusing multiple poses of optical image targets. In addition, a feature-enhancement method is proposed to solve the issue of key features being lost when style transfer is applied to optical images directly. The experimental results show that our method achieves better precision.


    1. Introduction

With the improvement of sonar equipment, sonar images have achieved a high level of success in underwater exploration [1] and target detection [2,3]. Compared with optical sensors, whose target detection is limited by short detection distance, poor underwater visibility, and so on, side-scan sonar-based target detection methods are widely used and more effective in terms of distance and visibility.
In practice, however, the situation is often more complicated, as when images must be detected and no training data are available [4]. At present, deep convolutional neural networks (DCNNs) are widely used in sonar target detection [5,6]. Many scholars have studied side-scan sonar-based object detection with DCNNs [7,8]. Meanwhile, sonar image detection based on noise analysis has also developed [9,10]. These approaches have greatly improved sonar detection accuracy compared with traditional recognition methods.
Nevertheless, the high cost of underwater experiments [11], including deploying targets underwater, the diversity of sonar devices, and the search for suitable experimental areas, has caused a lack of samples. Applicable results of sonar image detection have received much attention; however, given the minimal training data, few systems are widely used in real applications [12,13]. The complexity of the underwater environment and the lack of samples limit the generalization ability and precision of sonar object detection.
Therefore, this paper comprehensively considers the complex underwater situation and the lack of samples. To begin with, we use style transfer [14] on optical images to generate pseudo samples [4]. However, style transfer confined to a small area of an image leads to low object-detection performance. To improve sonar target detection, we add shape noise to images to simulate mud and sand obstruction and mutilated targets in the real environment. Furthermore, we combine various optical image datasets to enrich target poses and thus solve the single-state issue of targets. Finally, considering that the key reflector and shadow features are lost when style transfer is applied directly, we use binary and gamma methods [15,16] to enhance object features, guided by frequency analysis of real sonar images. The remainder of this paper is organized as follows.
Section 2 introduces related work on existing methods and their shortcomings.
Section 3 introduces our methods, including the data augmentation and simulation methods based on feature enhancement and the addition of random shape noise.
Section 4 compares existing methods and their training results with our experiments and analyzes the results of our designed experiments.

    2. Related Works

To overcome the shortage of samples, scholars have been working on zero-shot methods [2,3,4] to augment samples. Fine-tuning a pretrained CNN is a useful method in sonar image detection [2,3]. Lee et al. [4] adopted StyleBankNet [14] to perform style transfer simulations of optical images of the human body, further improving sonar object detection and attaining 86% precision. Their samples were generated with computer-aided design (CAD) software, which still requires substantial simulation work. Li et al. [3] made full use of the whitening and coloring transform (WCT) style transfer method and simulated sonar images from remote sensing images for target style transfer; it was effectively applied to underwater sonar image object detection and obtained 87.5% precision. This method applied a large number of remote sensing images, transferred to the sonar style, as sample data for training a DCNN model. However, it cannot express target features properly without considering the image environment (the state of the target, such as target damage, target postures, etc.). Yu et al. [17] attained an accuracy of 85.6% using transformer-YOLOv5. Huang et al. [2] combined a 3D model, amplified data, equipment noise, and the imaging mechanism to extract target features and simulate target damage and postures via a DCNN and a fine-tuned style transfer method; this achieved 85.3% precision and 94.5% recall. Song et al. [18] proposed an efficient sonar segmentation method based on speckle noise analysis, which facilitated pixel-wise classification, and a single-stream deep neural network (DNN) with multiple side-outputs to optimize edge segmentation.
Most of these studies focus on amplifying samples and the imaging mechanism without paying much attention to the target's real environment, such as mud and sand obstruction, missing target parts, the multiple states of the target, and the shadow and reflector in real sonar data. To solve this problem, this paper proposes a method based on fast style learning [19] and a random shape noise model, and simultaneously integrates multiple morphologies of optical targets to reduce the uncertainty of target detection. In addition, to get closer to real data and enhance object features, an image-processing method [15,16] based on binarization and gamma transformation simulates the shadow and reflector of objects in the training data. By adding random shape noise to the target image, the optical image simulates sediment cover and mutilated targets. The multiple states of optical image targets are integrated to deal with the problem of multiple forms of underwater targets, further strengthening the detection rate on sonar images.

    3. Our Method

    3.1. Problems Definition and Our Framework

As described in Section 2, a lack of samples is a common problem in detecting sonar images, which leads to low model performance. Most existing methods transfer optical data to raise sonar detection performance [2,3,4], with less consideration of the real underwater environment. Given this issue, the crux of the deep learning work is preparing datasets. To extract complex features from the dataset under zero-shot conditions in target detection, in this section we present the main contributions of the paper. We consider three major aspects in dataset design. (1) We define a dataset and augmentation from optical images to extend the target's poses. (2) We transfer the optical images to the sonar style. (3) We design random shape noise on targets to simulate mud and sand obstruction.
In our experiment, the data processing to generate training data is shown in Figure 1.
The overall framework of our method is shown in Figure 2. It includes three parts: data preprocessing, style transfer, and detection model training. Data preprocessing integrates different optical image datasets into target categories, such as airplanes, boats, cars, etc., and sediment occlusion on the seabed is simulated by adding random noise.
A style model trained with sonar background images stylizes the preprocessed and integrated datasets. Different sonar datasets have different image textures, and the style transfer learning method pays attention to image texture; to reduce manual work across sonar datasets, the proposed method can also be applied to different sonar datasets. The stylized dataset is then rotated and stretched by data preprocessing to further enhance the final target attitude dataset. We enhance the key features via primary simulation.
The processed images are used to train a YOLOv5 model, yielding a sonar detection model. The real sonar data are styled before the scene is restored and detected.

    3.2. Improved Methods of Data Combination and Augmentation

Most existing methods focus on amplifying samples and combine few target poses. The final state (upright, inverted, inclined, etc.) of real targets is always uncertain. Examples of real sonar images with different poses are shown in Figure 3.
Because the final state of an object to be detected on the seabed is uncertain, we propose using a large amount of optical data with different poses to help enrich the final form of the object. For example, with regard to aircraft data, there are different kinds of postures as interpreted by our senses, but cognitively they are all considered airplanes, as shown in Figure 4.
To improve target morphology diversity, we adopt the data augmentation method of [20], extracting image features by stretching, scaling, and rotation for sonar object detection. The target image sizes in the training data were set to (64, 64), (64, 32), (32, 32), (32, 64), and (128, 128). The target size can be adjusted to be larger so that it matches the background image. Each target is merged into the sonar background image at a randomly generated position. An example of image expansion is shown in Figure 5.
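To make the merging step concrete, the following is a minimal sketch (not the authors' released code; file names and the helper name are placeholders) of resizing an optical target crop and pasting it into a sonar background at a random position:

```python
# Illustrative sketch: resize an optical target crop and merge it into a
# sonar background at a random position, as described above.
import random
import cv2

TARGET_SIZES = [(64, 64), (64, 32), (32, 32), (32, 64), (128, 128)]

def paste_target(background, target):
    """Resize a target crop, paste it at a random position, and return the
    composited image plus the bounding box used as the detection label."""
    w, h = random.choice(TARGET_SIZES)
    target = cv2.resize(target, (w, h))          # dsize is (width, height)
    bh, bw = background.shape[:2]
    x = random.randint(0, bw - w)
    y = random.randint(0, bh - h)
    out = background.copy()
    out[y:y + h, x:x + w] = target
    return out, (x, y, w, h)

background = cv2.imread("sonar_background.png")  # placeholder path
target = cv2.imread("optical_airplane.png")      # placeholder path
augmented, bbox = paste_target(background, target)
```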
The amounts of original and augmented data are shown in Table 1, where the org dataset column gives the number of images integrated from the original optical datasets and the augmentation dataset column gives the number of enhanced images.

    3.3. Improved Methods of Style Transfer on Image Dataset

Zero-shot samples are a universal issue in underwater target detection. Many scholars have studied transfer learning and data enhancement, as described in Section 2. The performance of style transfer-based models in sonar target detection has improved significantly, and style transfer has become a technical trend in sonar target detection. The original image and transfer result are shown in Figure 6.
Generally, style transfer proceeds in two steps. First, the style transfer network generates a style model from the style image and content images. Second, we input an image into the generated model and output the styled image.
When the style transfer method is used directly, the key shadow and reflector features in sonar images are lost, and most scholars have paid little attention to these features when using style transfer. A reference example is shown in Figure 7.
The network of the object-detection model extracts high-frequency signals as "key" features in the sonar image. The Fourier transform (FT) and inverse Fourier transform (IFT) are the main analysis methods for image frequency. An example of the frequency-domain feature distribution of an original sonar image is shown in Figure 8b, and the Gaussian filter of the frequency features is shown in Figure 8c.
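As an illustration of this analysis, the sketch below (the Gaussian sigma and file path are our assumptions, not the paper's settings) computes the centered spectrum of a grayscale sonar image, applies a Gaussian low-pass filter in the frequency domain, and inverts the transform:

```python
# Minimal sketch of the frequency analysis above: 2-D FFT, Gaussian
# low-pass filtering in the frequency domain, and the inverse transform.
import cv2
import numpy as np

img = cv2.imread("sonar_target.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
spectrum = np.fft.fftshift(np.fft.fft2(img))                # centered spectrum

rows, cols = img.shape
u = np.arange(rows) - rows // 2
v = np.arange(cols) - cols // 2
V, U = np.meshgrid(v, u)
sigma = 30.0                                                # assumed cutoff
gaussian = np.exp(-(U ** 2 + V ** 2) / (2 * sigma ** 2))    # low-pass mask

filtered = spectrum * gaussian                              # suppress high frequencies
restored = np.abs(np.fft.ifft2(np.fft.ifftshift(filtered))) # filtered image
magnitude = np.log1p(np.abs(spectrum))                      # for visualization
```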
An example that uses the style transfer method on an optical target directly is shown in Figure 9.
Apparently, the distribution of the high-frequency signal is not smooth according to Figure 9b, and the object features are unclear according to Figure 9d.
To enhance the features, we propose a simulation method based on fast style transfer [19], as shown in Figure 10; an analysis example of our method is shown in Figure 11.
The general process of style transfer is shown in Figure 10a. Here, $x$ is the content image, $f_w$ is the style transfer network, $y_c$ is $x$, and $y_s$ is the style image. Then, $\hat{y}$ is the image generated from the input image $x$ by the style transfer network $f_w$; the content of $\hat{y}$ is similar to $y_c$, and the style of $\hat{y}$ is similar to $y_s$. The mathematical principle is explained as follows. We define $p$ as the style image and $a$ as the content image to be style transferred; for example, $p$ is a sonar background image, $a$ is an optical image, and $f$ is the transferred image with the sonar style. We define two loss functions, $\mathcal{L}_{style}$ and $\mathcal{L}_{content}$: $\mathcal{L}_{style}$ expects $f$ to be more similar to $p$ in style, and $\mathcal{L}_{content}$ expects $f$ to be more similar to $a$ in content, as shown in the formula [19]:
$$\mathcal{L}(a, f, p) = \alpha \mathcal{L}_{style}(p, f) + \beta \mathcal{L}_{content}(a, f).$$
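To connect this formula to an implementation, the following is a hedged sketch in the spirit of Johnson et al. [19]: content loss on VGG-16 feature maps and style loss on Gram matrices. The layer indices and the weights alpha and beta are illustrative assumptions, not the paper's reported settings.

```python
# Hedged sketch of the loss above: alpha * L_style + beta * L_content.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

vgg = vgg16(pretrained=True).features.eval()
CONTENT_LAYER = 8               # relu2_2 (assumed choice)
STYLE_LAYERS = {3, 8, 15, 22}   # relu1_2 .. relu4_3 (assumed choices)

def extract(x, wanted):
    """Collect activations of the wanted VGG layers for input x."""
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in wanted:
            feats[i] = x
    return feats

def gram(feat):
    # Gram matrix of a feature map, normalized by its size.
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def total_loss(generated, content, style, alpha=1e5, beta=1.0):
    g = extract(generated, STYLE_LAYERS | {CONTENT_LAYER})
    l_content = F.mse_loss(g[CONTENT_LAYER],
                           extract(content, {CONTENT_LAYER})[CONTENT_LAYER])
    s = extract(style, STYLE_LAYERS)
    l_style = sum(F.mse_loss(gram(g[i]), gram(s[i])) for i in STYLE_LAYERS)
    return alpha * l_style + beta * l_content
```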
Based on the general style transfer process shown in Figure 10a, we define a function $f(g, x, \hat{y})$ to enhance the target's shadow and reflector in Figure 10b. Here, $x$ is the original image, $\hat{y}$ is the result generated in Figure 10a, and $g$ is the enhancement function, implemented by binarization and gamma transformation [15,16]. It can be expressed as follows:
$$\tilde{y} = g(B(x, \theta_1, \theta_2), \gamma), \quad \theta_1 < \theta_2 \in [0, 255], \quad \gamma \in [0, 15].$$
Here, $B$ is a binarization function, $\theta_1$ and $\theta_2$ are its threshold values, $\gamma$ is the threshold value of the gamma function, and $\tilde{y}$ is the final result. The enhancement result is shown in Figure 6d. Our method can also be applied to other types of sonar images, as shown in Table 2. In our experiments, $\theta_1 = 50$, $\gamma = 10$ are the shadow threshold values, and $\theta_2 = 180$, $\gamma = 0.5$ are the reflector threshold values.
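A minimal sketch of one way to implement $g$, assuming OpenCV thresholding and a gamma look-up table; the threshold and gamma values follow the text above, while the mask-compositing details are our assumption:

```python
# Hedged sketch of g(B(x, θ1, θ2), γ): double thresholding isolates the
# shadow (dark) and reflector (bright) regions, then gamma correction
# deepens the shadow and strengthens the highlight.
import cv2
import numpy as np

def gamma_correct(img, gamma):
    lut = ((np.arange(256) / 255.0) ** gamma * 255).astype(np.uint8)
    return cv2.LUT(img, lut)

def enhance(styled, theta1=50, theta2=180):
    gray = cv2.cvtColor(styled, cv2.COLOR_BGR2GRAY)
    # B(x, θ1, θ2): binarize into a shadow mask and a reflector mask.
    _, shadow_mask = cv2.threshold(gray, theta1, 255, cv2.THRESH_BINARY_INV)
    _, reflector_mask = cv2.threshold(gray, theta2, 255, cv2.THRESH_BINARY)
    dark = gamma_correct(gray, 10.0)    # γ = 10: deepen the shadow region
    bright = gamma_correct(gray, 0.5)   # γ = 0.5: brighten the reflector
    out = gray.copy()
    out[shadow_mask > 0] = dark[shadow_mask > 0]
    out[reflector_mask > 0] = bright[reflector_mask > 0]
    return out
```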

    3.4. Improved Methods of Designing Random Shape Noise on Target

In actual sonar image applications, it is not difficult to see that many targets to be detected are incomplete or defective; examples of incomplete targets are shown in Figure 12.
On the other hand, benefiting from the rapid development of DCNNs, a network can extract object features from data easily. From zero-shot samples, however, the network extracts features under conditions far from reality, because the samples we use for training are too perfect to be close to real data. Our goal is to extract the key features from zero-shot samples and reduce excess features. We propose a method that generates random shape noise on targets to simulate the real environment. We defined three types of shapes, points, lines, and rectangles, and integrated the classified optical image data to add random noise in our experiments. Other shape types could also be used.
We define $P$ as the probability of generating random noise on targets: $P_1$ is the probability of random lines on targets and $N_1$ is the number of lines; $P_2$ is the probability of random points on targets and $N_2$ is the number of points; $P_3$ is the probability of random rectangles on targets and $N_3$ is the number of rectangles; $P_n$ is the probability of a random shape on targets and $N_n$ is the number of shapes. $X$ is the total number of optical image samples, $Z$ is the total number of training data, and $Y$ is the total number of noises. The process of generating noise can be expressed as follows:
$$Z = X\left(1 + \sum_{i=1}^{n} P_i\right), \quad Y = \sum_{i=1}^{n} P_i N_i, \quad N_i \in \{1, 2, 3, \ldots, N\}.$$
Examples of randomly selected noise are shown in Table 3 and Table 4.
The area of noise on the target can be expressed as follows:

$$y = \begin{cases} \dfrac{\max(w, h)}{16}\, l, \quad l = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}, & \text{line noise}; \\[6pt] \left(\dfrac{w}{5}\right)^2, & \text{rectangle noise}; \\[6pt] \pi r^2, & \text{point noise}. \end{cases}$$

Here, $y$ is the shape noise area in the image, $(w, h)$ are the image's width and height, $l$ is the length of the noise line, $(x_1, y_1)$ and $(x_2, y_2)$ are the two endpoints of the noise line in the image, and $r$ is the radius of the point noise, taking a value between $w/16$ and $w/10$. All the constant parameters in this expression are fine-tuned results from our experiments.
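The following sketch illustrates one possible implementation of the shape noise generator; the geometric constants follow the text above, while the function name and the noise color are illustrative assumptions:

```python
# Illustrative sketch of random shape noise: lines, points (filled
# circles), and rectangles drawn onto a target crop of width w, height h.
import random
import cv2

def add_shape_noise(target, noise_type, n, color=(0, 0, 0)):
    h, w = target.shape[:2]
    out = target.copy()
    for _ in range(n):
        if noise_type == "line":
            p1 = (random.randint(0, w - 1), random.randint(0, h - 1))
            p2 = (random.randint(0, w - 1), random.randint(0, h - 1))
            cv2.line(out, p1, p2, color, thickness=max(1, max(w, h) // 16))
        elif noise_type == "point":
            center = (random.randint(0, w - 1), random.randint(0, h - 1))
            r = random.randint(w // 16, w // 10)       # radius in [w/16, w/10]
            cv2.circle(out, center, r, color, thickness=-1)
        elif noise_type == "rectangle":
            side = w // 5                              # square of side w/5
            x = random.randint(0, max(0, w - side))
            y = random.randint(0, max(0, h - side))
            cv2.rectangle(out, (x, y), (x + side, y + side), color, thickness=-1)
    return out

noisy = add_shape_noise(cv2.imread("optical_target.png"), "line", n=8)
```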
On the flip side, overusing shape noise degrades detection performance. Because a single noise shape covers a relatively large area of the target, too many noise shapes must be avoided to prevent excessive coverage; as shown in Figure 13, excessive noise almost completely covers the target, so the original image loses its target features and the detection rate decreases.

    4. Experiment and Analysis

    In this section, we perform a series of experiments to compare the performance of existing methods and our method.

    4.1. Experiment Data

To enrich the diversity of target forms, our experiments use part of the VOC2007 dataset [21] and a remote-sensing image dataset [22] for training. We conducted comparative experiments on the same batch of real sonar data: a total of 29 real sonar aircraft-wreck images and 43 real shipwreck sonar images were compared and verified under three indicators. Examples of the sonar images are shown in Figure 14.

    4.2. Experiment Details of Training on Style Transfer Learning and Target Detection Model

We used Python 3.6 and the YOLOv5-large pretrained model for training. Our training platform employs an i7-10700F CPU and an NVIDIA GeForce RTX 2060 GPU. The content images are 40,504 images from COCO 2014 [23] used in style transfer training. The style training time is 8 h, and the average time to transfer one picture with the style model is < 0.2 s. Training the detection model also takes 8 h; detecting a 540 × 480 pixel image containing 3 targets takes < 0.1 s on average.
Part of the training process is shown in Figure 15; after training for 80 batches, the results tend to be stable.
Currently, there are two major DCNN-based target detection methods, Faster R-CNN [24] and YOLOv5. Faster R-CNN has higher accuracy and lower speed than YOLOv5, while YOLOv5 is easier to deploy in engineering practice.
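As a small illustration of how a trained YOLOv5 detector is typically invoked (a hedged sketch using the public torch.hub interface of the ultralytics/yolov5 repository; the weight file and image path are placeholders, not the authors' artifacts):

```python
# Load custom-trained YOLOv5 weights through torch.hub and run detection.
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")
results = model("styled_sonar_image.png")  # accepts paths, URLs, or arrays
results.print()                            # summary of detections
boxes = results.xyxy[0]                    # tensor: (x1, y1, x2, y2, conf, class)
```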

    4.3. Experiment Results on Comparison with Existing Methods

The proposed method transfers optical images from various datasets into sonar-style images, providing an effective way to enrich target poses. The feature enhancement on samples makes the fake targets closer to real targets, and the random shape noise on targets simulates the actual environment to enhance the model's detection ability.
We use precision and recall as the common criteria for judging target detection models. Precision is the fraction of predicted results that are correct, and recall is the fraction of positive samples that are detected correctly among the predictions. The existing methods were run on our experimental data according to their published papers and code. To further evaluate the performance of our method, a comparison with existing methods is shown in Table 5.
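For reference, these criteria follow the standard definitions, where TP, FP, and FN denote true positives, false positives, and false negatives:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}.$$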
From the results, we can see that if the engineer or researcher focuses on precision, the proposed method gives better results. From the comparison table, precision increases by 0.044 over the best existing precision when using Faster R-CNN, and by 0.024 when using YOLOv5.
We combine the style model and the two major DCNN methods in experiments on real data, as shown in Table 6. The results show that the model with added interference has a higher detection rate and can be used for detection in sonar images of the real environment.
From the experimental results in Table 6, we can clearly conclude that after adding interference noise to the YOLOv5 model, precision increases by 0.026 and recall increases by 0.097; after adding interference noise to the Faster R-CNN model, precision increases by 0.11 and recall increases by 0.028. Note that the sonar detection models in Table 6 and Table 7 do not use real data in the training phase.
In addition, we fine-tuned the model by mixing real data into the optical image samples under the YOLOv5 model: a total of 72 real sonar images were mixed into 1310 optical images, attaining 0.957 precision and 0.944 recall. This result is also shown in Table 6. Considering the complexity of the underwater environment, most existing methods detect objects and train models without real data, so we cannot use this result for comparison with existing methods.

    4.4. Further Analysis on Our Methods

4.4.1. Experiment on Multiple Poses and Shape Noises

The experimental results for combined multiple poses under the same detection model are shown in Table 7.
After combining different poses in the YOLOv5 model, precision increases by 0.058 and recall increases by 0.028. Experiments show that the model trained with random noise can detect incomplete targets; the detection result on validation data is shown in Figure 16. By contrast, the model without random noise cannot detect the incomplete target.
The detection results by number and type of noise are shown in Table 8, which gives the recognition rate of the same target. For the aircraft-wreck recognition rate under different noise conditions, the ratio of noisy to noiseless samples is 1:1, and the ratio of mixed line, point, rectangle, and noiseless samples is 1:1:1:1.
It is not difficult to find from Table 8 that the detection accuracy with mixed noise types is higher than with a single noise type, and the accuracy is similar when the number of noises is 6 and 8. Examples of randomly selected noise are shown in Table 3 and Table 4. In this noise design, we define the target image height as h pixels and width as w pixels, with w > h: (a) rectangular noise is a square with side length w/5 pixels; (b) the radius of point noise is a random value between w/16 and w/10 pixels; (c) line noise has a random length between 5 pixels and w pixels and a random width between 2 and 10 pixels. In our experiments, the detection confidence peaked when the noise quantity equaled 8. The problem of excessive noise on targets was introduced at the end of Section 3.4.
In addition to random shape noise, we also analyzed how other types of noise affect detection performance, including Gaussian noise and salt and pepper noise [25]. The noisy images and their style transferred results are shown in Table 9.
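For completeness, a brief sketch of how these two baseline noise types are commonly generated; the variance and density values here are assumptions, not the paper's exact settings:

```python
# Hedged sketch of the two baseline noise types used for comparison.
import numpy as np

def gaussian_noise(img, sigma=25.0):
    noisy = img.astype(np.float32) + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def salt_and_pepper(img, density=0.05):
    out = img.copy()
    mask = np.random.random(img.shape[:2])
    out[mask < density / 2] = 0            # pepper
    out[mask > 1 - density / 2] = 255      # salt
    return out
```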
The detection comparison results on real data are shown in Table 10. Apparently, shape noise performs better than the other types of noise.

    4.4.2. Experiment on Two Style Models

As described in Section 2, and benefiting from the rapid development of the sonar image detection field, style transfer has become a common technique in sonar target detection. Fast neural style is the basic style transfer model in this paper due to its high performance in our experiments. We compared the fast neural style and StyleBank transfer models in our detection work. Figure 17 shows a comparison of the training set and real data under the two style models.
From the image comparison in Figure 17, we see that fast neural style yields a clearer target than the StyleBank model. The performance comparison between fast neural style and StyleBank is shown in Table 11.
The average confidence of all detected targets under the fast neural style model is 0.970, whereas that under the StyleBank model is 0.936. From these results, we know that fast neural style performs better.

    4.4.3. Experiment on Real Data

To better approach real sonar data in our experiments, we perform three types of experiments.
    (1)
Detect the transferred original image, which fuses the background style and enhances object features.
First, we perform style transfer on the original image data. Second, to simulate the target shadow and reflector, we combine the original data with binarization and gamma transformation to simulate real sonar images, as introduced in Section 3.3. Third, we detect the object; the process applied to images of real sonar aircraft wreckage is shown in Figure 18.
From the results, we know that the method can be applied to real object detection.
    (2)
Detect wreckage data simulated from real sonar data.
We carried out a wreckage simulation on the real sonar data; the simulation process is shown in Figure 19, which indicates that wreckage detection can be better simulated after adding noise.
At the same time, the real sonar image data are analyzed, and the enhancement methods are added to further restore the real scene for primary simulation.
    (3)
Detect real data and different target sizes.
Our model can fit customized target sizes, which can be defined in the training data. We adjusted the target size to (64, 64), (64, 32), (32, 32), (32, 64), and (128, 128) in the training data. From the results, we know that the detection model has similar performance across sizes. An example of different sizes of the same target is shown in Figure 20.
It is worth mentioning that public data of an unprocessed (without style transfer) target and the detection result of our proposed method are shown in Figure 21. The detection result of our proposed method has a confidence value of 0.93.
It should be noted that this detection result uses the model trained without real data (only optical images in the training dataset).

    5. Conclusions

In this paper, we applied YOLOv5 and Faster R-CNN to improve underwater detection performance under a lack of training data. We introduced design considerations for the complex underwater situation and the lack of samples, noting that the limitation of style transfer to a small image area causes low object-detection performance. In addition, the designed shape noise on images solves the problem of mud and sand obstruction in the real environment, and the combination of various optical image datasets enriches target poses to solve the single-state issue of targets. Furthermore, we use binary and gamma methods to enhance object features, which solves the loss of the key reflector and shadow features when style transfer is used directly.
Through the detailed comparison of experimental results, we know that training data with combined multiple poses performs better than data without them; training data with shape noise performs better than with Gaussian noise or salt and pepper noise; fast style transfer performs better than StyleBank; and Faster R-CNN achieves higher precision than YOLOv5. In addition, the selected model can be applied to simulated wreckage data and different target sizes. We selected the top two configurations with the highest detection performance across all our experiments and compared them with existing methods, showing that the proposed method achieves better target detection performance than methods without shape noise fusion or key feature enhancement in the training data.

    Author Contributions

    Conceptualization, J.X. and X.Y.; methodology, J.X.; software, J.X.; validation, J.X., X.Y. and C.L.; formal analysis, J.X.; investigation, J.X. and C.L.; resources, J.X. and X.Y.; data curation, J.X. and X.Y.; writing—original draft preparation, J.X.; writing—review and editing, J.X., X.Y. and C.L.; visualization, J.X.; supervision, X.Y.; project administration, J.X.; funding acquisition, X.Y. All authors have read and agreed to the published version of the manuscript.

    Funding

    This work was supported by the National Natural Science Foundation of China (Grant No. 42276187 and 41876100) and the Fundamental Research Funds for the Central Universities (Grant No. 3072022FSC0401).

    Data Availability Statement

All the experimental data and code can be found at https://github.com/xijier/SonarDetection (accessed on 22 October 2022).

    Acknowledgments

We would like to thank the editor and the anonymous reviewers for their valuable comments and suggestions, which greatly improved the quality of this paper. Thanks to the researchers who have published datasets of sonar and optical images: https://www.edgetech.com/underwater-technology-gallery/ (accessed on 18 July 2022), https://sandysea.org/projects/visualisierung (accessed on 18 July 2022).

    Conflicts of Interest

    The authors declare no conflict of interest.

    References

1. Sahoo, A.; Dwivedy, S.K.; Robi, P.S. Advancements in the field of autonomous underwater vehicle. Ocean Eng. 2019, 181, 145–160.
2. Huang, C.; Zhao, J.; Yu, Y.; Zhang, H. Comprehensive Sample Augmentation by Fully Considering SSS Imaging Mechanism and Environment for Shipwreck Detection Under Zero Real Samples. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5906814.
3. Li, C.; Ye, X.; Cao, D.; Hou, J.; Yang, H. Zero shot objects classification method of side scan sonar image based on synthesis of pseudo samples. Appl. Acoust. 2021, 173, 107691.
4. Lee, S.; Park, B.; Kim, A. Deep learning based object detection via style-transferred underwater sonar images. IFAC-Pap. 2019, 52, 152–155.
5. Zhu, P.; Isaacs, J.; Fu, B.; Ferrari, S. Deep learning feature extraction for target recognition and classification in underwater sonar images. In Proceedings of the 2017 IEEE 56th Annual Conference on Decision and Control (CDC), Melbourne, Australia, 12–15 December 2017; pp. 2724–2731.
6. Neupane, D.; Seok, J. A review on deep learning-based approaches for automatic sonar target recognition. Electronics 2020, 9, 1972.
7. Nayak, N.; Nara, M.; Gambin, T.; Wood, Z.; Clark, C.M. Machine learning techniques for AUV side-scan sonar data feature extraction as applied to intelligent search for underwater archaeological sites. In Field and Service Robotics; Springer: Singapore, 2021; pp. 219–233.
8. Einsidler, D.; Dhanak, M.; Beaujean, P.P. A deep learning approach to target recognition in side-scan sonar imagery. In Proceedings of the OCEANS 2018 MTS/IEEE Charleston, Charleston, SC, USA, 22–25 October 2018; pp. 1–4.
9. Huang, Y.; Li, W.; Yuan, F. Speckle noise reduction in sonar image based on adaptive redundant dictionary. J. Mar. Sci. Eng. 2020, 8, 761.
10. Yuan, F.; Xiao, F.; Zhang, K.; Huang, Y.; Chen, E. Noise reduction for sonar images by statistical analysis and fields of experts. J. Vis. Commun. Image Represent. 2021, 74, 102995.
11. Greene, A.; Rahman, A.F.; Kline, R.; Rahman, M.S. Side scan sonar: A cost-efficient alternative method for measuring seagrass cover in shallow environments. Estuar. Coast. Shelf Sci. 2018, 207, 250–258.
12. Vasan, D.; Alazab, M.; Wassan, S.; Naeem, H.; Safaei, B.; Zheng, Q. IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture. Comput. Netw. 2020, 171, 107138.
13. Li, Q.; Xiong, Q.; Ji, S.; Wen, J.; Gao, M.; Yu, Y.; Xu, Y. Using fine-tuned conditional probabilities for data transformation of nominal attributes. Pattern Recognit. Lett. 2019, 128, 107–114.
14. Chen, D.; Yuan, L.; Liao, J.; Yu, N.; Hua, G. StyleBank: An explicit representation for neural image style transfer. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1897–1906.
15. Chaki, N.; Shaikh, S.H.; Saeed, K. A comprehensive survey on image binarization techniques. Explor. Image Bin. Tech. 2014, 560, 5–15.
16. Rahman, S.; Rahman, M.M.; Abdullah-Al-Wadud, M.; Al-Quaderi, G.D.; Shoyaib, M. An adaptive gamma correction for image enhancement. EURASIP J. Image Video Process. 2016, 2016, 35.
17. Yu, Y.; Zhao, J.; Gong, Q.; Huang, C.; Zheng, G.; Ma, J. Real-time underwater maritime object detection in side-scan sonar images based on transformer-YOLOv5. Remote Sens. 2021, 13, 3555.
18. Song, Y.; Liu, P. Segmentation of sonar images with intensity inhomogeneity based on improved MRF. Appl. Acoust. 2020, 158, 107051.
19. Johnson, J.; Alahi, A.; Li, F.-F. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2016; pp. 694–711.
20. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 60.
21. Everingham, M.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 2010, 88, 303–338.
22. Wang, M.; Sun, Z.; Xu, G.; Ma, H.; Yang, S.; Wang, W. Deep Hash Assisted Network for Object Detection in Remote Sensing Images. IEEE Access 2020, 8, 180370–180378.
23. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2014; pp. 740–755.
24. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28.
25. Boyat, A.K.; Joshi, B.K. A review paper: Noise models in digital image processing. arXiv 2015, arXiv:1505.03489.
Figure 1. Data processing to generate training data.
Figure 2. Our process framework.
Figure 3. Real sonar images. (a) Airplane with incline pose. (b) Airplane with right pose. (c) Shipwreck with lateral pose. (d) Shipwreck with frontage pose.
Figure 4. Multiple attitudes of the optical image of the same target.
Figure 5. Data expansion for one target.
Figure 6. Original sonar image with optical target transferred to sonar-style target. (a) Original sonar image. (b) Optical image. (c) Traditional style transfer image. (d) Feature-enhancement style transfer image.
Figure 7. Example of shadow and reflector on a real sonar target.
Figure 8. Frequency analysis of the original sonar image. (a) Gray image of the original sonar target. (b) Fourier transform of the sonar image. (c) Gaussian filter in the frequency image. (d) Gaussian-filtered image.
Figure 9. Frequency analysis of the style transfer sample image. (a) Gray image using style transfer directly. (b) Fourier transform of the sonar image. (c) Gaussian filter in the frequency image. (d) Gaussian-filtered image.
Figure 10. Style transfer process. (a) General style transfer process. (b) Feature enhancement based on style transfer.
Figure 11. Frequency analysis of the enhanced style transfer sample image. (a) Gray image using style transfer directly. (b) Fourier transform of the sonar image. (c) Gaussian filter in the frequency image. (d) Gaussian-filtered image.
Figure 12. Examples of incomplete targets. (a) Sediment on target. (b) Defective target.
Figure 13. Over-noised target. (a) Over-noise on target. (b) Style transfer on target.
Figure 14. Examples of real sonar images.
Figure 15. Training indicators. (a) Precision trend in training. (b) Recall trend in training. (c) Object loss: the error caused by confidence. (d) Class loss: the error caused by the target's type.
Figure 16. Detection result on validation data. (a) Simulated incomplete target. (b) Style transfer target. (c) Detection result on sonar background.
Figure 17. Comparison of style models on the training and real data.
Figure 18. Real data verification process.
Figure 19. Wreckage simulation process and detection result.
Figure 20. Different sizes of the same target. (a) Target size (32, 32). (b) Target size (64, 64). (c) Target size (128, 128).
Figure 21. Detection on real data (without style transfer).
Table 1. Amount of original and augmentation data by target type.

| Segment No. | Class | Org Dataset | Augmentation Dataset |
|---|---|---|---|
| 1 | Aeroplane | 739 | 1591 |
| 2 | Bicycle | 254 | 557 |
| 3 | Car | 781 | 1698 |
| 4 | Person | 650 | 1390 |
| 5 | Ship | 583 | 1289 |

Table 2. Three sonar styles applied in our style transfer process. (Image grid: three sonar style images and an optical image with its three transferred results; images omitted.)

Table 3. Noise quantities and type mapping. (Image grid: line, point, and rectangle noise at quantities of 4, 6, and 8; images omitted.)

Table 4. Style transfer results for the images in Table 3. (Image grid; images omitted.)

Table 5. Comparison of existing methods' performance.

| Model | Precision | Recall | mAP (IOU = 0.5) |
|---|---|---|---|
| StyleBank + Faster R-CNN [4] | 0.860 | 0.705 | 0.786 |
| Whitening and Coloring Transform [3] | 0.875 | 0.836 | 0.75 |
| Improved style transfer + YOLOv5 [2] | 0.853 | 0.945 | 0.876 |
| Our method 1: fast style + YOLOv5 + shape noise | 0.899 | 0.861 | 0.865 |
| Our method 2: fast style + Faster R-CNN + shape noise | 0.919 | 0.792 | 0.882 |

Table 6. Comparison of models in precision and recall.

| Model | Precision | Recall |
|---|---|---|
| fast style + YOLOv5 + shape noise | 0.899 | 0.861 |
| fast style + Faster R-CNN + shape noise | 0.919 | 0.792 |
| fast style + YOLOv5 + shape noise and mixed real data | 0.957 | 0.944 |
| fast style + YOLOv5 + Gaussian noise | 0.868 | 0.819 |
| fast style + YOLOv5 + salt and pepper noise | 0.870 | 0.833 |
| fast style + YOLOv5 | 0.873 | 0.764 |
| StyleBank + YOLOv5 | 0.755 | 0.563 |
| fast style + Faster R-CNN | 0.809 | 0.764 |

Table 7. Detection results with two datasets.

| Model | Dataset | Precision | Recall |
|---|---|---|---|
| style transfer + YOLOv5 | remote sensing images + VOC2007 | 0.873 | 0.764 |
| style transfer + YOLOv5 | remote sensing images | 0.815 | 0.736 |

Table 8. Detection confidence for 4 types of noise (tested on validation data).

| Noise Quantity | Line | Point | Rectangle | Mixed |
|---|---|---|---|---|
| 0 (no noise) | 0.954 | – | – | – |
| 4 | 0.945 | 0.921 | 0.946 | 0.959 |
| 6 | 0.954 | 0.931 | 0.956 | 0.972 |
| 8 | 0.967 | 0.954 | 0.961 | 0.978 |

Table 9. Noise and style transfer images. (Image grid: optical and styled images with no noise, Gaussian noise, and salt and pepper noise; images omitted.)

Table 10. Detection results for 3 types of noise.

| Model | Precision | Recall |
|---|---|---|
| fast style + YOLOv5 + shape noise | 0.899 | 0.861 |
| fast style + YOLOv5 + Gaussian noise | 0.868 | 0.819 |
| fast style + YOLOv5 + salt and pepper noise | 0.870 | 0.833 |

Table 11. Detection results for 2 style models.

| Model | Precision | Recall |
|---|---|---|
| fast style + YOLOv5 | 0.873 | 0.764 |
| StyleBank + YOLOv5 | 0.755 | 0.563 |
    Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

    © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
