
Note: This release has yet to be submitted to CRAN.
This package provides tools for semantic segmentation of geospatial data using convolutional neural network-based deep learning. Utility functions allow for creating masks, image chips, data frames listing image chips in a directory, and DataSets for use within DataLoaders. Additional functions are provided to serve as checks during the data preparation and training process. Training can also be conducted using dynamically generated chips (still experimental). The package relies on torch for implementing deep learning, which does not require the installation of a Python environment. Raster geospatial data are handled with terra. Models can be trained using a CUDA-enabled GPU; however, multi-GPU training is not supported by torch in R. Both binary and multiclass models can be trained.
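As a quick check before training, the following minimal sketch selects a single CUDA device when one is available and otherwise falls back to the CPU. It uses only functions from the torch package; the variable names are illustrative and are not part of geodl.

```r
library(torch)

# Use one CUDA device if torch can see it, otherwise the CPU.
# (Multi-GPU training is not supported by torch in R.)
device <- if (cuda_is_available()) torch_device("cuda") else torch_device("cpu")

# Tensors (and models) must be placed on the chosen device explicitly.
x <- torch_ones(2, 3, device = device)
print(x$device)
```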
Full details about the package are documented in a PLOS One article:
Maxwell, A.E., Farhadpour, S., Das, S. and Yang, Y., 2024. geodl: An R package for geospatial deep learning semantic segmentation using torch and terra. PLoS One, 19(12), p.e0315127.
A UNet architecture can be defined with 4 blocks in the encoder, a bottleneck block, and 4 blocks in the decoder. The UNet can accept a variable number of input channels, and the user can define the number of feature maps produced in each encoder and decoder block and the bottleneck. Users can also choose to (1) replace all ReLU activation functions with leaky ReLU or swish, (2) implement attention gates along the skip connections, (3) implement squeeze and excitation modules within the encoder blocks, (4) add residual connections within all blocks, (5) replace the bottleneck with a modified atrous spatial pyramid pooling (ASPP) module, and/or (6) implement deep supervision using predictions generated at each stage in the decoder.
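To make the idea of a configurable block concrete, here is a minimal sketch of a generic UNet-style double-convolution block written with torch for R. It is not geodl's implementation: the module name `double_conv`, its arguments, and the choice of batch normalization are assumptions made for illustration. It only shows how the number of input channels and output feature maps can be exposed as parameters.

```r
library(torch)

# Hypothetical block: two 3x3 convolutions, each followed by
# batch normalization and ReLU. Not taken from geodl.
double_conv <- nn_module(
  "double_conv",
  initialize = function(in_channels, out_channels) {
    self$block <- nn_sequential(
      nn_conv2d(in_channels, out_channels, kernel_size = 3, padding = 1),
      nn_batch_norm2d(out_channels),
      nn_relu(),
      nn_conv2d(out_channels, out_channels, kernel_size = 3, padding = 1),
      nn_batch_norm2d(out_channels),
      nn_relu()
    )
  },
  forward = function(x) {
    self$block(x)
  }
)

# A 4-band input chip mapped to 16 feature maps.
block <- double_conv(in_channels = 4, out_channels = 16)
x <- torch_randn(2, 4, 64, 64)  # (batch, channels, height, width)
y <- block(x)
print(y$shape)  # 2 16 64 64: padding = 1 preserves height and width
```

Swapping `nn_relu()` for `nn_leaky_relu()` in a block like this is one way an option such as (1) could be realized, though the package may handle it differently.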
A second UNet architecture is implemented with a MobileNet-v2 backbone. This model can be initialized using ImageNet weights for the encoder. The encoder can be frozen or trained during the training loop. If the number of input predictor variables or channels is not three, ImageNet weights are averaged for all input channels in the first layer. If three channels or predictor variables are provided, then the user can choose to use the ImageNet weights or average the weights in the first layer.
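The weight-averaging step can be sketched as follows. This is one plausible reading of the description above, not the package's actual code: the three-channel weights are averaged across the channel dimension and the mean filter is copied to every input channel of a new first layer. The random tensor `w_rgb` is a stand-in for pretrained ImageNet weights, and all names are hypothetical.

```r
library(torch)

n_channels <- 5  # e.g., a five-band image instead of RGB

# Stand-in for pretrained first-layer weights with shape
# (out_channels, 3, kernel_h, kernel_w). Real ImageNet weights
# would be loaded from a pretrained model instead.
w_rgb <- torch_randn(32, 3, 3, 3)

# Average over the channel dimension (dimensions are 1-based in R torch).
w_mean <- torch_mean(w_rgb, dim = 2, keepdim = TRUE)  # (32, 1, 3, 3)

# Repeat the averaged filter for every input channel.
w_new <- w_mean$expand(c(32, n_channels, 3, 3))$contiguous()

# New first layer that accepts n_channels inputs.
first_conv <- nn_conv2d(n_channels, 32, kernel_size = 3, stride = 2,
                        padding = 1, bias = FALSE)

# Copy the weights in place without tracking gradients.
with_no_grad({
  first_conv$weight$copy_(w_new)
})

print(first_conv$weight$shape)
```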
Two additional models are provided: UNet3+ and a modified version of HRNet.
A unified focal loss framework is implemented after:
Yeung, M., Sala, E., Schönlieb, C.B. and Rundo, L., 2022. Unified focal loss: Generalising dice and cross entropy-based losses to handle class imbalanced medical image segmentation. Computerized Medical Imaging and Graphics, 95, p.102026.
We have also implemented assessment metrics using the luz package including overall accuracy, F1-score, recall, and precision.
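For reference, the four metrics named above can be computed from a confusion matrix as in the base R sketch below. This does not use the package's luz metrics; the class labels and the small example vectors are made up for illustration, with "fg" treated as the positive class.

```r
# Made-up reference and predicted labels for ten pixels.
lev <- c("bg", "fg")
reference <- factor(c("bg", "bg", "bg", "bg", "fg", "fg", "fg", "bg", "fg", "bg"),
                    levels = lev)
predicted <- factor(c("bg", "bg", "fg", "bg", "fg", "fg", "bg", "bg", "fg", "bg"),
                    levels = lev)

# Rows are predictions, columns are reference labels.
cm <- table(predicted = predicted, reference = reference)

tp <- cm["fg", "fg"]  # predicted fg, reference fg
fp <- cm["fg", "bg"]  # predicted fg, reference bg
fn <- cm["bg", "fg"]  # predicted bg, reference fg

overall_accuracy <- sum(diag(cm)) / sum(cm)
precision <- tp / (tp + fp)
recall <- tp / (tp + fn)
f1 <- 2 * precision * recall / (precision + recall)

print(c(OA = overall_accuracy, precision = precision, recall = recall, F1 = f1))
# OA = 0.80, precision = 0.75, recall = 0.75, F1 = 0.75
```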
Trained models can be used to make predictions over spatial data without the need to generate chips from larger spatial extents. Functions are available for performing accuracy assessment.
Utility functions are provided to generate a variety of land surface parameters (LSPs) from a digital terrain model (DTM).
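As a rough illustration of what land surface parameters look like in practice, the sketch below derives a few common ones from a synthetic DTM using `terra::terrain()` directly. It does not call geodl's own utility functions, and the synthetic surface, extent, and CRS are assumptions chosen only so the example is self-contained.

```r
library(terra)

# Synthetic 10 m DTM on a projected grid (UTM zone 17N is an arbitrary choice).
template <- rast(nrows = 100, ncols = 100,
                 xmin = 500000, xmax = 501000,
                 ymin = 4400000, ymax = 4401000,
                 crs = "EPSG:32617")

# Build a smooth artificial surface from the cell coordinates.
x <- init(template, "x") - 500000
y <- init(template, "y") - 4400000
dtm <- 300 + 50 * sin(x / 150) + 30 * cos(y / 200)

# Derive several land surface parameters as a multi-layer raster.
lsp <- terrain(dtm, v = c("slope", "aspect", "TPI", "TRI", "roughness"),
               unit = "degrees")

print(names(lsp))
```

A multi-layer raster like `lsp` could then serve as the predictor variables for a model, in the same way multispectral bands would.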
This package is still experimental and is a work in progress. We are interested in finding additional contributors/collaborators.
You can install the development version of geodl from GitHub with:

    # install.packages("devtools")
    devtools::install_github("maxwell-geospatial/geodl")

Chapter 15 and Chapter 16 in the free and openly available online text Geospatial Supervised Learning using R serve as the documentation for this package.