Author's implementation of 'Unsupervised Visual Representation Learning by Context Prediction'
Created by Carl Doersch (Carnegie Mellon / UC Berkeley)
This code is designed to train a visual representation from a raw, unlabeled image collection. The resulting representation seems to be useful for standard vision tasks like object detection, surface normal estimation, and visual data mining.
This algorithm was originally described in Unsupervised Visual Representation Learning by Context Prediction, which was presented at ICCV 2015.
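For orientation, the pretext task from the paper works as follows: sample a patch and one of its eight grid neighbors (separated by a gap, so the task can't be solved from trivial low-level cues), and train the network to classify the relative position. Below is a minimal numpy sketch of this sampling; the patch and gap sizes are illustrative, not necessarily the exact values used by train.py:
```python
import numpy as np

def sample_patch_pair(img, patch_size=96, gap=48):
    # Hypothetical illustration of the pretext task (not the repository's
    # actual sampling code): crop a patch, crop a second patch at one of
    # the 8 neighboring grid positions, and return the position as a label
    # (0-7) for the network to predict.
    h, w = img.shape[:2]
    step = patch_size + gap
    # Keep the center patch far enough from the border that all 8 neighbors
    # fit inside the image (requires h, w >= 2*step + patch_size).
    cy = np.random.randint(step, h - step - patch_size + 1)
    cx = np.random.randint(step, w - step - patch_size + 1)
    offsets = [(-step, -step), (-step, 0), (-step, step),
               (0, -step),                 (0, step),
               (step, -step),  (step, 0),  (step, step)]
    label = np.random.randint(8)
    dy, dx = offsets[label]
    patch = img[cy:cy + patch_size, cx:cx + patch_size]
    neighbor = img[cy + dy:cy + dy + patch_size, cx + dx:cx + dx + patch_size]
    return patch, neighbor, label
```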
This code is significantly refactored from what was used to produce the results in the paper, and minor modifications have been made. While I do not expect these modifications to significantly impact results, I have not yet fully tested the new codebase, and will need a few more weeks to do so.
Qualitative behavior early in training appears to be equivalent, but you should still use this code with caution.
If you find this code useful, please consider citing:
```
@inproceedings{doersch2015unsupervised,
  Author = {Doersch, Carl and Gupta, Abhinav and Efros, Alexei A.},
  Title = {Unsupervised Visual Representation Learning by Context Prediction},
  Booktitle = {International Conference on Computer Vision ({ICCV})},
  Year = {2015}
}
```
- Clone the deepcontext repository
```
# Make sure to clone with --recursive
git clone --recursive https://github.com/cdoersch/deepcontext.git
```
- Build Caffe and pycaffe:
```
cd $DEEPCONTEXT_ROOT/caffe_ext
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
```
External Caffe installations should work as well, but they need to be downloaded from GitHub later than November 22, 2015 to support all required features.
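A quick way to confirm that pycaffe built correctly is to try importing it from python; the path below assumes you are in $DEEPCONTEXT_ROOT, and will differ for an external Caffe checkout:
```python
import sys
sys.path.insert(0, 'caffe_ext/python')  # assumed location of pycaffe in this repo
import caffe                            # should import without errors
print caffe.__file__                    # shows which caffe build was picked up
```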
- Copy `deepcontext_config.py.example` to `deepcontext_config.py` and edit it to supply your path to ImageNet, and provide an output directory that the code can use for temporary files, including snapshots.
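As a sketch, the edited config might look something like the following; the variable names here are illustrative assumptions, so mirror whatever names `deepcontext_config.py.example` actually defines:
```python
# deepcontext_config.py -- hypothetical sketch; use the variable names from
# deepcontext_config.py.example rather than these illustrative ones.
imagenet_dir = '/data/imagenet/train'   # root of your unlabeled image collection
out_dir = '/data/deepcontext_out'       # scratch space for snapshots and temp files
```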
- Execute train.py inside python. This will begin an infinite training loop, which snapshots every 2000 iterations. The results in the paper used a model that trained for about 1M iterations.
By default, the code will run on GPU 0; you can use the environment variable CUDA_VISIBLE_DEVICES to change the GPU.
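If you prefer to set this from inside the recommended ipython session rather than from the shell, one option (assuming the variable is read before caffe first opens the GPU, which happens when train.py initializes caffe) is:
```python
import os
# Must be set before caffe initializes the device; CUDA_VISIBLE_DEVICES is
# only consulted when the GPU is first opened.
os.environ['CUDA_VISIBLE_DEVICES'] = '1'  # use GPU 1 instead of the default 0
execfile('train.py')
```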
All testing was done with python 2.7. It is recommended that you run inside ipython using `execfile('train.py')`.

To stop the train.py script, create the file `train_quit` in the directory where you ran the code. This roundabout approach is required because the code starts background processes to load data, and it's difficult to guarantee that these background threads will be terminated if the code is interrupted via Ctrl+C.

If train.py is re-started after it is quit, it will examine the output directory and attempt to continue from the snapshot with the highest iteration number.
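For example, from a second shell in the run directory a plain `touch train_quit` works; the equivalent from a python session is:
```python
# Create the empty sentinel file; train.py watches for it and shuts down
# itself and its data-loading workers cleanly.
open('train_quit', 'w').close()
```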
You can pause the training at any time by creating the file `train_pause` in the directory where you ran the code. This will let you use pycaffe to examine the network state. Re-run train.py to continue.

For our experiments, we ran for 1.7M iterations. After this point, you can run debatchnorm.py on the output (you'll need your own copy of a caffenet with the groups removed). Once you've run it, then you have a model that can be fine-tuned. I recommend using our data-dependent initialization and calibration procedure [Krähenbühl et al.] before fine-tuning, as debatchnorm.py will lead to badly-scaled weights.
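As a sketch of what examining the network state might look like, here is a hypothetical pycaffe session that loads a snapshot and prints the parameter shapes of each layer; the prototxt and caffemodel filenames are placeholders for whatever your output directory actually contains:
```python
import caffe
# Placeholder filenames -- substitute your net definition and the latest
# snapshot from the output directory you configured.
net = caffe.Net('deepcontext_train.prototxt',
                'snapshot_iter_100000.caffemodel', caffe.TEST)
for name in net.params:
    print name, [p.data.shape for p in net.params[name]]
```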
The network trained using this procedure and fine-tuned with fast-rcnn on VOC2007 achieves 51.4% mAP.