Adding test data

  1. We really, really like test images, but

  2. We are rather conservative about the size of our code repository.

So, we have two different ways of adding test data.

  1. Small, open licensed files can go in the nibabel/tests/data directory (see below);

  2. Larger files or files with extra licensing terms can go in their own git repositories and be added as submodules to the nibabel-data directory.

Small files

Small files are around 50K or less when compressed. By “compressed”, we mean compressed with zlib, which is what git uses when storing the file in the repository. You can check the exact length directly with Python and a script like:

    import sys
    import zlib

    for fname in sys.argv[1:]:
        with open(fname, 'rb') as fobj:
            contents = fobj.read()
        compressed = zlib.compress(contents)
        print(fname, len(compressed) / 1024.)

One way of making files smaller when compressed is to set uninteresting values to zero or some other number so that the compression algorithm can be more effective.
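To illustrate the effect of zeroing values, here is a minimal sketch. The data is a hypothetical stand-in for image values (seeded pseudo-random bytes), and compressed_kb is a helper name invented for this example, not part of nibabel.

```python
import random
import zlib


def compressed_kb(data):
    """Return the zlib-compressed size of ``data`` in kilobytes."""
    return len(zlib.compress(data)) / 1024.


# Hypothetical stand-in for image data: 64K of noisy byte values
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(64 * 1024))

# Set the "uninteresting" second half of the data to zero
half = len(noisy) // 2
zeroed = noisy[:half] + bytes(len(noisy) - half)

print('noisy ', compressed_kb(noisy))
print('zeroed', compressed_kb(zeroed))
```

The noisy data barely compresses at all, while the zeroed half compresses to almost nothing, so the second figure should come out at roughly half the first.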

Please don’t compress the file yourself before committing to a git repo unless there’s a really good reason; git will do this for you when adding to the repository, and it’s a shame to make git compress a compressed file.

Files with open licenses

We very much prefer files with completely open licenses such as the PDDL 1.0 or the CC0 license.

The files in the nibabel/tests/data directory will get distributed with the nibabel source code, and this can easily get installed without the user having an opportunity to review the full license. We don’t think this is compatible with extra license terms like agreeing to cite the people who provided the data or agreeing not to try and work out the identity of the person who has been scanned, because it would be too easy to miss these requirements when using nibabel. It is fine to use files with these kinds of licenses, but they should go in their own repository to be used as a submodule, so they do not need to be distributed with nibabel.

Adding the file to nibabel/tests/data

If the file is less than about 50K compressed, and the license is open, then you might want to commit the file under nibabel/tests/data.
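The two conditions above could be combined into a small helper. This is only a sketch: suggest_location and SMALL_LIMIT_KB are hypothetical names, not part of nibabel, and the 50K figure is the approximate threshold from the text rather than a hard limit.

```python
import zlib

SMALL_LIMIT_KB = 50  # approximate threshold: "around 50K or less when compressed"


def suggest_location(contents, open_license):
    """Suggest where test data with these byte contents might live."""
    compressed_kb = len(zlib.compress(contents)) / 1024.
    if open_license and compressed_kb <= SMALL_LIMIT_KB:
        return 'nibabel/tests/data'
    # Too large, or extra license terms: use a separate data repository
    return 'nibabel-data submodule'


# 1K of zeros stands in for the contents of a small file
print(suggest_location(bytes(1024), open_license=True))
print(suggest_location(bytes(1024), open_license=False))
```

In practice you would read the contents from the candidate file in binary mode, as in the size-checking script earlier in this document.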

Put the license for any new files in the COPYING file at the top level of the nibabel repo. You’ll see some examples in that file already.

Adding as a submodule to nibabel-data

Make a new git repository with the data.

There are example repos at

Although both the examples are on GitHub, Bitbucket is good for repos like this because it doesn’t enforce repository size limits.

Don’t forget to include a LICENSE and README file in the repo.

When all is done, and the repository is safely on the internet and accessible, add the repo as a submodule to the nibabel-data directory, with something like this:

    git submodule add https://bitbucket.org/nipy/rosetta-samples.git nibabel-data/rosetta-samples

You should now have a checked out copy of the rosetta-samples repository in the nibabel-data/rosetta-samples directory. Commit the submodule that is now in your git staging area.

If you are writing tests using files from this repository, you should use the needs_nibabel_data decorator to skip the tests if the data has not been checked out into the submodules. See nibabel/tests/test_parrec_data.py for an example. For our example repository above it might look something like:

    from os.path import join as pjoin

    from .nibabel_data import get_nibabel_data, needs_nibabel_data

    ROSETTA_DATA = pjoin(get_nibabel_data(), 'rosetta-samples')

    @needs_nibabel_data('rosetta-samples')
    def test_something():
        # Some test using the data
        ...
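To give a feel for what a skip-if-missing decorator does, here is a standalone sketch built on the standard library. It is not nibabel's implementation of needs_nibabel_data; the name needs_data and its two-argument signature are assumptions made for this example. It relies on the fact that a submodule directory is empty until the submodule has been checked out.

```python
import os
import unittest
from os.path import isdir, join as pjoin


def needs_data(data_root, subdir):
    """Hypothetical decorator: skip a test unless data_root/subdir has files."""
    path = pjoin(data_root, subdir)
    # An uninitialized submodule leaves a missing or empty directory
    have_files = isdir(path) and len(os.listdir(path)) > 0
    return unittest.skipUnless(have_files, 'Need files in {0}'.format(path))


@needs_data('nibabel-data', 'rosetta-samples')
def test_something():
    # Some test using the data
    pass
```

If the directory is missing or empty, calling the decorated test raises unittest.SkipTest, which test runners report as a skip rather than a failure.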

Using submodules for tests

Tests run via nibabel on travis start with an automatic checkout of all submodules in the project, so all test data submodules get checked out by default.

If you are running the tests locally, you may well want to do:

    git submodule update --init

from the root nibabel directory. This will check out all the test data repositories.

How much data should go in a single submodule?

The limiting factor is how long it takes travis-ci to check out the data for the tests. Up to a hundred megabytes in one repository should be OK. The joy of submodules is we can always drop a submodule, split the repository into two and add only one back, so you aren’t committing us to anything awful if you accidentally put some very large files into your own data repository.
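A rough way to see where a data repository stands against that figure is to add up the file sizes in its working tree. This is a sketch only: tree_size_mb and LIMIT_MB are hypothetical names, and the working tree size is just an approximation of what a checkout has to transfer.

```python
import os
from os.path import getsize, join as pjoin

LIMIT_MB = 100  # "Up to a hundred megabytes in one repository should be OK"


def tree_size_mb(root):
    """Total size in megabytes of files under ``root``, skipping ``.git``."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if '.git' in dirnames:
            dirnames.remove('.git')  # do not descend into git's own storage
        for fname in filenames:
            total += getsize(pjoin(dirpath, fname))
    return total / (1024. * 1024.)
```

For example, tree_size_mb('nibabel-data/rosetta-samples') <= LIMIT_MB would give a quick yes or no for the example submodule above.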

If in doubt

If you are not sure, try us with a pull request to nibabel github, or on the nipy mailing list; we will try to help.