
cdoersch/deepcontext

Author's implementation of 'Unsupervised Visual Representation Learning by Context Prediction'

Created by Carl Doersch (Carnegie Mellon / UC Berkeley)

Introduction

This code is designed to train a visual representation from a raw, unlabeled image collection. The resulting representation seems to be useful for standard vision tasks like object detection, surface normal estimation, and visual data mining.

This algorithm was originally described in 'Unsupervised Visual Representation Learning by Context Prediction', which was presented at ICCV 2015.

This code is significantly refactored from what was used to produce the results in the paper, and minor modifications have been made. While I do not expect these modifications to significantly impact results, I have not yet fully tested the new codebase, and will need a few more weeks to do so.
Qualitative behavior early in training appears to be equivalent, but you should still use this code with caution.
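As a rough illustration of the pretext task from the paper (not the exact data pipeline in this codebase): sample a patch from an unlabeled image together with one of its eight grid neighbors, and train the network to predict the neighbor's relative position. The snippet below is a minimal sketch, assuming a numpy image array in (height, width, channels) layout; the real pipeline also jitters patch positions and applies the color-handling tricks described in the paper.

    import random

    def sample_context_pair(image, patch_size=96, gap=48):
        # Sample a (center, neighbor, label) triple from one image. The label
        # in {0..7} says which of the 8 grid positions the neighbor occupies
        # relative to the center; the gap discourages trivial boundary cues.
        # Assumes the image is at least 2 * (patch_size + gap) + patch_size
        # pixels on each side.
        h, w = image.shape[:2]
        step = patch_size + gap
        # Top-left corner of the center patch, keeping the 3x3 grid in bounds.
        y = random.randint(step, h - step - patch_size)
        x = random.randint(step, w - step - patch_size)
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
        label = random.randrange(8)
        dy, dx = offsets[label]
        center = image[y:y + patch_size, x:x + patch_size]
        neighbor = image[y + dy * step:y + dy * step + patch_size,
                         x + dx * step:x + dx * step + patch_size]
        return center, neighbor, label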

Citing this codebase

If you find this code useful, please consider citing:

    @inproceedings{doersch2015unsupervised,
        Author = {Doersch, Carl and Gupta, Abhinav and Efros, Alexei A.},
        Title = {Unsupervised Visual Representation Learning by Context Prediction},
        Booktitle = {International Conference on Computer Vision ({ICCV})},
        Year = {2015}
    }

Installation

  1. Clone the deepcontext repository

    # Make sure to clone with --recursive
    git clone --recursive https://github.com/cdoersch/deepcontext.git
  2. Build Caffe and pycaffe

    cd $DEEPCONTEXT_ROOT/caffe_ext
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    # If you're experienced with Caffe and have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make -j8 && make pycaffe

    External caffe installations should work as well, but need to be downloaded from Github later than November 22, 2015 to support all required features.

  3. Copy deepcontext_config.py.example to deepcontext_config.py and edit it to supply your path to ImageNet, and to provide an output directory that the code can use for temporary files, including snapshots. A sketch of what this file might look like follows this list.
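The authoritative variable names are whatever deepcontext_config.py.example defines; the sketch below is only illustrative, with hypothetical names standing in for the real ones.

    # deepcontext_config.py -- hypothetical sketch; copy
    # deepcontext_config.py.example and edit that instead, since the actual
    # variable names may differ from the ones shown here.

    # Root directory of your ImageNet image collection (hypothetical name).
    imagenet_dir = '/path/to/imagenet'

    # Scratch directory for temporary files and snapshots (hypothetical name).
    out_dir = '/path/to/deepcontext_output'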

Running

  1. Execute train.py inside python. This will begin an infinite training loop, which snapshots every 2000 iterations. The results in the paper used a model that trained for about 1M iterations.

    By default, the code will run on GPU 0; you can use the environment variable CUDA_VISIBLE_DEVICES to change the GPU.

    All testing was done with python 2.7. It is recommended that you run inside ipython using execfile('train.py'); see the sketch after this list.

  2. To stop the train.py script, create the file train_quit in the directory where you ran the code (also shown in the sketch after this list). This roundabout approach is required because the code starts background processes to load data, and it's difficult to guarantee that these background threads will be terminated if the code is interrupted via Ctrl+C.

    If train.py is re-started after it is quit, it will examine the output directory and attempt to continue from the snapshot with the highest iteration number.

  3. You can pause the training at any time by creating the file train_pause in the directory where you ran the code. This will let you use pycaffe to examine the network state. Re-run train.py to continue.

  4. For our experiments, we ran for 1.7M iterations. After this point, you can run debatchnorm.py on the output (you'll need your own copy of a caffenet with the groups removed). Once you've run it, you have a model that can be fine-tuned. I recommend using our data-dependent initialization and calibration procedure [Krähenbühl et al.] before fine-tuning, as debatchnorm.py will lead to badly-scaled weights.
    The network trained using this procedure and fine-tuned with fast-rcnn on VOC 2007 achieves 51.4% mAP.
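The following is a minimal sketch of a typical session, assuming you run from the repository root inside ipython; the sentinel-file names (train_pause, train_quit) are the ones described above.

    import os

    # Train on GPU 1 instead of the default GPU 0. This must be set before
    # caffe initializes the GPU.
    os.environ['CUDA_VISIBLE_DEVICES'] = '1'

    # Start the infinite training loop (python 2.7, inside ipython).
    execfile('train.py')

    # Later, from a separate session in the directory where train.py was
    # started: pause training so the network state can be inspected with
    # pycaffe...
    open('train_pause', 'w').close()
    # ...or stop training cleanly, letting the data-loading processes exit.
    open('train_quit', 'w').close()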

