Executing with Docker
Summary
Here, we describe how to run NiPreps with Docker containers. To illustrate the process, we will show the execution of fMRIPrep, but these guidelines extend to any other end-user NiPrep.
Before you start: install Docker
Probably the most popular framework to execute containers is Docker. If you are to run a NiPrep on your PC/laptop, this is the RECOMMENDED way of execution. Please make sure you follow the Docker Engine's installation instructions. You can check your installation by running their hello-world image:

```shell
$ docker run --rm hello-world
```
If you have a functional installation, then you should obtain the following output:
```
Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/
```
After checking that your Docker Engine is capable of running Docker images, you are ready to pull your first NiPreps container image.
Troubleshooting
If you encounter issues while executing a containerized application, it is critical to identify where the fault originates. For issues emerging from the Docker Engine, please read the corresponding troubleshooting guidelines. Once you have verified that the problem is not related to the container system, follow the specific application's debugging guidelines.
Fix Docker Desktop startup issue on macOS
Due to a recent issue affecting Docker Desktop versions 4.29 through 4.36, the application may fail to start. If affected, please follow the official guidelines to resolve this issue.
Docker images
For every new version of the particular NiPreps application that is released, a corresponding Docker image is generated. The Docker image becomes a container when the execution engine loads the image and adds an extra layer that makes it runnable. In order to run NiPreps' Docker images, the Docker Engine must be installed.
Taking fMRIPrep to illustrate the usage, first you might want to make sure of the exact version of the tool to be used:

```shell
$ docker pull nipreps/fmriprep:<latest-version>
```
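Once pulled, you can confirm which fMRIPrep images (and tags) are present on your machine; `docker images` accepts a repository name as a filter:

```shell
$ docker images nipreps/fmriprep
```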
You can run NiPreps by interacting directly with the Docker Engine via the `docker run` interface.
Running a NiPrep with a lightweight wrapper
Some NiPreps include a lightweight wrapper script for convenience. That is the case of fMRIPrep and its `fmriprep-docker` wrapper. Before starting, make sure you have the wrapper installed. When you run `fmriprep-docker`, it will generate a Docker command line for you, print it out for reporting purposes, and then execute it without further action needed, e.g.:

```shell
$ fmriprep-docker /path/to/data/dir /path/to/output/dir participant
RUNNING: docker run --rm -it -v /path/to/data/dir:/data:ro \
    -v /path/to_output/dir:/out nipreps/fmriprep:20.2.2 \
    /data /out participant
...
```
`fmriprep-docker` implements the unified command-line interface of BIDS Apps, and automatically translates directories into Docker mount points for you.
We have published a step-by-step tutorial illustrating how to run `fmriprep-docker`. This tutorial also provides valuable troubleshooting insights and advice on what to do after fMRIPrep has run.
Running a NiPrep directly interacting with the Docker Engine
If you need finer control over the container execution, or you feel comfortable with the Docker Engine, avoiding the extra software layer of the wrapper might be a good decision.
Accessing filesystems in the host within the container
Containers are confined in a sandbox, so they can't access the data on the host unless explicitly enabled. The Docker Engine provides mounting filesystems into the container with the `-v` argument and the following syntax: `-v some/path/in/host:/absolute/path/within/container:ro`, where the trailing `:ro` specifies that the mount is read-only. The mount permissions modifiers can be omitted, which means the mount will have read-write permissions.
Docker for Windows requires enabling Shared Drives
On Windows installations, the `-v` argument will not work by default because it is necessary to enable shared drives. Please check this Stackoverflow post on how to enable them.
In general, you'll want to provide at least two mount-points: one set in read-only mode for the input data and one read/write to store the outputs:
```shell
$ docker run -ti --rm \
    -v path/to/data:/data:ro \   # read-only, for data
    -v path/to/output:/out \     # read-write, for outputs
    nipreps/fmriprep:<latest-version> \
    /data /out/out \
    participant
```
We recommend mounting a work (or scratch) directory for intermediate workflow results. This is particularly useful for debugging or reusing pre-cached intermediate results, but can also be useful to control where these (large) directories get created, as the default location for files created inside a Docker container may not have sufficient space. In the case of NiPreps, we typically inform the BIDS Apps to override the work directory by setting the `-w`/`--work-dir` argument (please note that this is not defined by the BIDS Apps specifications and it may change across applications):
```shell
$ docker run -ti --rm \
    -v path/to/data:/data:ro \
    -v path/to/output:/out \
    -v path/to/work:/work \   # mount from host
    nipreps/fmriprep:<latest-version> \
    /data /out/out \
    participant \
    -w /work                  # override default directory
```
Best practices
The ReproNim initiative distributes materials and documentation of best practices for containerized execution of neuroimaging workflows. Most of these are organized within the YODA (Yoda's Organigram on Data Analysis) principles.
For example, mounting `$PWD` into `$PWD` and setting that path as the current working directory can effectively resolve many issues. This strategy may be combined with the above suggestion about the application's work directory as follows:
```shell
$ docker run -ti --rm \
    -v $PWD:$PWD \   # Mount the current directory with its own name
    -w $PWD \        # DO NOT confuse with the application's work directory
    nipreps/fmriprep:<latest-version> \
    inputs/raw-data outputs/fmriprep \   # With YODA, the inputs are inside the working directory
    participant -w $PWD/work
```
Mounting `$PWD` may be used with YODA so that all the necessary parts of the execution are reachable from under `$PWD`. This effectively (i) makes it easy to transfer configurations from outside the container to the inside execution runtime; (ii) keeps the outside/inside filesystem trees homologous, which makes post-processing and orchestration easier; and (iii) eases execution in shared systems, as everything is more or less self-contained.
In addition to mounting `$PWD`, other advanced practices include mounting specific configuration files (for example, a Nipype configuration file) into the appropriate paths within the container.
BIDS Apps relying on TemplateFlow for atlas and template management may require the TemplateFlow Archive to be mounted from the host. Mounting the Archive from the host is an effective way to prevent multiple downloads of templates that are not bundled in the image:
```shell
$ docker run -ti --rm \
    -v path/to/data:/data:ro \
    -v path/to/output:/out \
    -v path/to/work:/work \
    -v path/to/tf-cache:/opt/templateflow \   # mount from host
    -e TEMPLATEFLOW_HOME=/opt/templateflow \  # override TF home
    nipreps/fmriprep:<latest-version> \
    /data /out/out \
    participant -w /work
```
Sharing the TemplateFlow cache can cause race conditions in parallel execution
When sharing the TemplateFlow HOME folder across several parallel executions against a single filesystem, these instances will likely attempt to fetch unavailable templates without sufficient time between actions for the data to be fully downloaded (in other words, data downloads will be racing each other).
To resolve this issue, you will need to make sure all necessary templates are already downloaded within the cache folder. If the TemplateFlow Client is properly installed in your system, this is possible with the following command line (the example shows how to fully download `MNI152NLin2009cAsym`):

```shell
$ templateflow get MNI152NLin2009cAsym
```
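If a workflow needs several templates, the same command can be repeated for each of them before launching the parallel jobs; the identifiers below are only examples and should be replaced with the templates your pipeline actually uses:

```shell
$ for tpl in MNI152NLin2009cAsym MNI152NLin6Asym OASIS30ANTs; do
>     templateflow get "$tpl"
> done
```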
Running containers as a user
By default, Docker will run the container with the user id (uid) 0, which is reserved for the default root account in Linux. In other words, by default Docker will use the superuser account to execute the container and will write files with the corresponding uid=0 unless configured otherwise. Executing as superuser may result in permissions and security issues, for example with DataLad (discussed later). One paramount example of permissions issues that beginners typically run into is deleting files after a containerized execution. If the uid is not overridden, the outputs of a containerized execution will be owned by root and group root. Therefore, normal users will not be able to modify the output, and superuser permissions will be required to delete data generated by the containerized application. Some shared systems only allow running containers as a normal user because the user would not be able to operate on the outputs otherwise.
Whether the container is available with default settings, or the execution has been customized to normal users, running as a normal user avoids these permissions issues. This can be achieved with Docker's `-u`/`--user` option:

```shell
--user=[ user | user:group | uid | uid:gid | user:gid | uid:group ]
```
We can combine this option with Bash's `id` command to ensure the current user's uid and group id (gid) are being set:
```shell
$ docker run -ti --rm \
    -v path/to/data:/data:ro \
    -v path/to/output:/out \
    -u $(id -u):$(id -g) \                    # set execution uid:gid
    -v path/to/tf-cache:/opt/templateflow \   # mount from host
    -e TEMPLATEFLOW_HOME=/opt/templateflow \  # override TF home
    nipreps/fmriprep:<latest-version> \
    /data /out/out \
    participant
```
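If you are unsure what those substitutions produce, you can expand them on their own before launching the container; this snippet only uses the standard `id` utility:

```shell
# Show the numeric uid and gid that $(id -u) and $(id -g) expand to
uid=$(id -u)
gid=$(id -g)
echo "The container will run as ${uid}:${gid}"
```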
For example:
```shell
$ docker run -ti --rm \
    -v $HOME/ds005:/data:ro \
    -v $HOME/ds005/derivatives:/out \
    -v $HOME/tmp/ds005-workdir:/work \
    -u $(id -u):$(id -g) \
    -v $HOME/.cache/templateflow:/opt/templateflow \
    -e TEMPLATEFLOW_HOME=/opt/templateflow \
    nipreps/fmriprep:<latest-version> \
    /data /out/fmriprep-<latest-version> \
    participant \
    -w /work
```
Application-specific options
Once the Docker Engine arguments are written, the remainder of the command line follows the interface defined by the specific BIDS App (for instance, fMRIPrep or MRIQC). The first section of a call consists of arguments specific to Docker, which configure the execution of the container:
```shell
$ docker run -ti --rm \
    -v $HOME/ds005:/data:ro \
    -v $HOME/ds005/derivatives:/out \
    -v $HOME/tmp/ds005-workdir:/work \
    -u $(id -u):$(id -g) \
    -v $HOME/.cache/templateflow:/opt/templateflow \
    -e TEMPLATEFLOW_HOME=/opt/templateflow \
    nipreps/fmriprep:<latest-version> \
    /data /out/fmriprep-<latest-version> \
    participant \
    -w /work
```
Then, we specify the container image that we execute:
```shell
$ docker run -ti --rm \
    -v $HOME/ds005:/data:ro \
    -v $HOME/ds005/derivatives:/out \
    -v $HOME/tmp/ds005-workdir:/work \
    -u $(id -u):$(id -g) \
    -v $HOME/.cache/templateflow:/opt/templateflow \
    -e TEMPLATEFLOW_HOME=/opt/templateflow \
    nipreps/fmriprep:<latest-version> \
    /data /out/fmriprep-<latest-version> \
    participant \
    -w /work
```
Finally, the application-specific options can be added. We already described the work directory setting before, in the case of NiPreps such as MRIQC and fMRIPrep. Some options are BIDS Apps standard, such as the analysis level (`participant` or `group`) and specific participant identifier(s) (`--participant-label`):
```shell
$ docker run -ti --rm \
    -v $HOME/ds005:/data:ro \
    -v $HOME/ds005/derivatives:/out \
    -v $HOME/tmp/ds005-workdir:/work \
    -u $(id -u):$(id -g) \
    -v $HOME/.cache/templateflow:/opt/templateflow \
    -e TEMPLATEFLOW_HOME=/opt/templateflow \
    nipreps/fmriprep:<latest-version> \
    /data /out/fmriprep-<latest-version> \
    participant \
    --participant-label 001 002 \
    -w /work
```
Resource constraints
Docker may be executed with limited resources. Please read the documentation to learn how to limit resources such as memory, memory policies, number of CPUs, etc.
Memory will be a common culprit when working with large datasets (+10GB). However, the Docker Engine is limited to 2GB of RAM by default for some installations of Docker for macOS and Windows. The general resource settings can also be modified through the Docker Desktop graphical user interface. On a shell, the default storage size available to containers (with the devicemapper storage driver) can be overridden when restarting the daemon:
```shell
$ service docker stop
$ dockerd --storage-opt dm.basesize=30G
```
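For per-run limits on memory and CPUs (rather than daemon-wide settings), the `docker run` command line accepts the standard `--memory` and `--cpus` flags; the values below are only placeholders to adapt to your machine:

```shell
$ docker run -ti --rm \
    --memory=16g \   # cap the RAM available to this container
    --cpus=8 \       # cap the number of CPUs it may use
    -v path/to/data:/data:ro \
    -v path/to/output:/out \
    nipreps/fmriprep:<latest-version> \
    /data /out/out participant
```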