Maritime vessel detection from remote sensing SAR data, based on the architectures of the Faster-RCNN and YOLOv5 networks.

The present project was conducted as part of my diploma thesis, which focuses on the investigation of methods for the effective detection of ships in synthetic aperture radar (SAR) satellite imagery using deep learning techniques. These methods use the Faster-RCNN and YOLOv5 network architectures to create three different detectors. More specifically, the first two models are based on the Faster-RCNN architecture and use normal and rotated bounding boxes, respectively, for the detection process. The one-stage detection network is based on the YOLOv5 architecture and uses normal bounding boxes to delimit the estimated targets. The produced models are trained and evaluated on the HRSID dataset. The highest accuracy is achieved by the models that use normal bounding boxes, while the model with rotated bounding boxes shows the largest localization errors and an increased number of false negative detections.

HRSID Properties.

  • The High-Resolution SAR Images Dataset (HRSID) contains 116 co-polarized and 20 cross-polarized SAR images.
  • The source images used to construct HRSID are 99 Sentinel-1B, 36 TerraSAR-X, and 1 TanDEM-X images.
  • The above 136 panoramic SAR images were cropped into 5604 high-resolution SAR images.
  • These 5604 images have dimensions of 800 × 800 pixels, a resolution of 96 dpi, and are stored in .jpeg format.
  • The colour depth of the images is 8 bits (one channel).
  • The extracted 5604 high-resolution SAR images contain 16951 ship instances.
  • The spatial resolutions of the SAR images are 0.5, 1, and 3 meters per pixel.
  • The annotations of each instance are the corresponding bounding box and the ship's outline.
  • The annotations of each SAR image constitute a .json file in MS COCO dataset format (see the loading sketch after this list).
  • Paper Link: https://ieeexplore.ieee.org/abstract/document/9127939
  • Dataset Link: https://github.com/chaozhong2010/HRSID
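
Since the annotations follow the MS COCO format, a single annotation .json file can be inspected directly. The sketch below is only a minimal illustration; the file path is an assumption, not the actual HRSID folder layout:

```python
# Minimal sketch of reading an HRSID annotation file in MS COCO format.
# The file path below is a placeholder - adjust it to your local copy of the dataset.
import json

with open("HRSID/annotations/train2017.json") as f:   # assumed path
    coco = json.load(f)

print(coco.keys())                        # typically: "images", "annotations", "categories"
ann = coco["annotations"][0]
x, y, w, h = ann["bbox"]                  # COCO boxes are [x_min, y_min, width, height]
outline = ann["segmentation"][0]          # polygon vertices describing the ship's outline
print(f"image_id={ann['image_id']}  bbox=({x}, {y}, {w}, {h})  polygon_points={len(outline) // 2}")
```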

Proposed architectures of Faster-RCNN.

Faster-RCNN is a two-stage detection architecture that contains 3 different submodules: a) the Backbone Network, b) the Region Proposal Network (RPN) and c) Fast-RCNN. In the proposed model, a Feature Pyramid Network (FPN) with a ResNet backbone is used for the creation of the P2-P6 spatial levels. The Region Proposal Network receives the P2-P6 feature maps sequentially and, for every Pi level, creates a hidden representation which is shared between the regression and classification layers and produces two output tensors with predicted objectness logits and anchor deltas for every anchor in Pi. Next, the predicted anchor deltas are applied to the corresponding anchors and the resulting boxes are sorted by their predicted objectness scores at each Pi level. Then, after the application of a confidence threshold and the NMS algorithm, the RPN retains a subset of the anchor boxes from which k ROIs are extracted. Finally, the ROI (Box) Head takes the outputs of the FPN and RPN networks, which are the multiscale feature maps and the ROIs respectively, and uses the latter to crop the regions of interest from the feature maps. The cropped regions are then pooled (transformed into the same dimensions) and fed as flattened feature vectors into a pair of fully connected layers that extract the class probabilities and the corresponding coordinates for a predefined number of boxes.

(Figure: Faster-RCNN architecture diagram)

Image source: https://medium.com/@hirotoschwert/digging-into-detectron-2-part-4-3d1436f91266
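
The repository's Faster-RCNN models are built with Detectron2 (see Requirements). The following is only a minimal configuration sketch of such an FPN-based detector with a single "ship" class; the weights path and threshold values are assumptions, not the exact settings used in the thesis:

```python
# Hedged sketch: a Faster-RCNN R50-FPN ship detector configured with Detectron2,
# mirroring the Backbone (FPN) -> RPN -> ROI (Box) Head pipeline described above.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = "model_final.pth"        # assumed path to the trained HRSID weights
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1          # single "ship" class
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # confidence threshold (assumed value)
cfg.MODEL.ROI_HEADS.NMS_THRESH_TEST = 0.5    # NMS IoU threshold (assumed value)

predictor = DefaultPredictor(cfg)            # wraps preprocessing + the forward pass
# outputs = predictor(sar_image_bgr)         # outputs["instances"] holds pred_boxes,
#                                            # scores and pred_classes for the image
```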

Proposed architecture of YOLOv5.

YOLOv5 is a one-stage detector which contains 2 different networks: a) the Feature Extraction Network (Backbone Network) and b) PANet. The backbone network is used for feature extraction and its main modules are C3 (VGP+FLOPS↓) and SPPF (multiscale feature fusion). The PANet network creates a set of feature maps at 3 different spatial scales (P3-P5), with 3 different anchors at every spatial location. The above tensors (P3-P5) are then fed into the corresponding layers of the "Head" network and, after the application of a confidence threshold and the NMS algorithm, the final bounding box predictions (class_id, x1, y1, x2, y2, confidence_score) are extracted.

(Figure: YOLOv5 architecture diagram)
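
As a rough illustration of this inference pipeline, a trained YOLOv5 model can be loaded through torch.hub, with the confidence threshold and NMS applied internally; the weights and image file names below are assumptions:

```python
# Hedged sketch: running a trained YOLOv5 ship detector via torch.hub.
# The returned rows follow the (x1, y1, x2, y2, confidence, class_id) column order.
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")  # assumed weights file
model.conf = 0.25   # confidence threshold (assumed value)
model.iou = 0.45    # NMS IoU threshold (assumed value)

results = model("sar_chip.jpeg")   # a single 800 x 800 HRSID chip (assumed file)
for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
    print(f"ship  conf={conf:.2f}  box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```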

Quantitative Evaluation

Mean Average Precision

| Metric | Faster-RCNN (Normal Bboxes) | Faster-RCNN (Rotated Bboxes) | YOLOv5 | STANet¹ | DB-YOLO² |
|---|---|---|---|---|---|
| AP@[0.50:0.05:0.95] | 68.1 | 42.9 | 71.1 | 69.5 | 72.0 |
| AP@0.50 | 91.4 | 75.3 | 94.2 | 92.4 | 94.4 |
| AP@0.75 | 79.3 | 45.5 | 82.0 | 81.1 | - |
| APsmall | 69.3 | 41.3 | 62.9 | 70.9 | - |
| APmedium | 68.5 | 51.1 | 80.7 | 68.6 | - |
| APlarge | 44.1 | 20.9 | 55.1 | 37.8 | - |

Mean Average Recall

| Metric | Faster-RCNN (Normal Bboxes) | Faster-RCNN (Rotated Bboxes) | YOLOv5 | STANet¹ | DB-YOLO² |
|---|---|---|---|---|---|
| AR (max=1) | 27.8 | 21.9 | 28.2 | - | - |
| AR (max=10) | 61.6 | 44.9 | 63.5 | - | - |
| AR (max=100) | 74.0 | 48.3 | 75.9 | - | - |
| ARsmall | 73.5 | 46.4 | 69.5 | - | - |
| ARmedium | 79.1 | 57.9 | 84.5 | - | - |
| ARlarge | 64.3 | 29.7 | 65.1 | - | - |

¹ SOTA two-stage detector (Wang et al.). See paper.
² SOTA one-stage detector (Zhu et al.). See paper.
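
The AP and AR values above follow the standard MS COCO evaluation protocol. A minimal sketch of how such numbers can be computed with pycocotools is shown below; the ground-truth and detection file names are assumptions:

```python
# Hedged sketch: computing the COCO-style AP/AR metrics reported above with pycocotools.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("HRSID/annotations/test2017.json")   # ground-truth annotations (assumed path)
coco_dt = coco_gt.loadRes("detections_test.json")   # detections in COCO results format (assumed path)

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()   # prints AP@[0.50:0.05:0.95], AP50, AP75, AP/AR by size, AR@1/10/100
```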

Qualitative Evaluation

I created a short video from the large ALOS-2 scene which is provided in the official repository of the HRSID dataset and ran the Faster-RCNN and YOLOv5 models with normal bounding boxes on it. Rotated bounding boxes are not supported by the Detectron2 framework for video inference, so the corresponding Faster-RCNN model that uses this bounding box type is not included.
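
The frame-by-frame inference can be sketched roughly as follows, reusing the Detectron2 predictor from the Faster-RCNN section above; the input and output file names are assumptions:

```python
# Hedged sketch: drawing the predicted ship boxes on every frame of the scene video.
# Assumes `predictor` is the Detectron2 DefaultPredictor built in the earlier sketch.
import cv2

cap = cv2.VideoCapture("alos2_scene.mp4")            # assumed input video
fps = cap.get(cv2.CAP_PROP_FPS)
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
writer = cv2.VideoWriter("detections.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    outputs = predictor(frame)                        # per-frame Faster-RCNN inference
    for box in outputs["instances"].pred_boxes.tensor.cpu().numpy():
        x1, y1, x2, y2 = box.astype(int)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    writer.write(frame)

cap.release()
writer.release()
```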

Faster RCNN with normal bounding boxes

Faster-RCNN-Normal.Bounding.Boxes.mp4

YOLOv5

YOLOv5.mp4

Requirements

```
torch == 1.7.1+cu110
torchvision == 0.8.2+cu110
pyyaml == 5.1
detectron2 == 0.5
cv2 == 4.1.2
wandb == 0.12.11
```
