# PreView

Code for the semi-supervised method proposed in our CVPR 2018 paper "Learning Pose Specific Representations by Predicting Different Views".
Note: we used the idea implemented here in our follow-up work to achieve state-of-the-art results with only about 1% of the labeled real samples used by other works. See the code and additional material of the follow-up work.
This repository contains the code for the semi-supervised method we proposed in:

Learning Pose Specific Representations by Predicting Different Views.
Georg Poier, David Schinagl and Horst Bischof.
In Proc. CVPR, 2018. (Project Page)
We learn to predict a low-dimensional latent representation and, subsequently, a different view of the input, solely from the latent representation. The error of the view prediction is used as feedback, enforcing the latent representation to capture pose specific information without requiring labeled data.
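The objective described above can be sketched in a few lines of illustrative Python (the function names are hypothetical, not the repo's code): an encoder maps the input view to a latent code, a decoder predicts a *different* view from that code alone, and the prediction error is the label-free training signal.

```python
def mse(a, b):
    """Mean squared error between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def view_prediction_loss(encode, decode, input_view, other_view):
    """Unsupervised loss: error of predicting the second camera view
    from the latent code computed from the first view only."""
    latent = encode(input_view)     # low-dimensional, pose specific code
    predicted = decode(latent)      # predicted appearance of the other view
    return mse(predicted, other_view)
</antml>```

Since the loss depends only on a second camera view and never on pose labels, it can be computed for every unlabeled sample.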
- Download dataset
- Adapt paths in configuration to point to the dataset
- Run code
We provide data loaders for two datasets: (i) the NYU dataset [1], and (ii) the MV-hands dataset [2] published together with the paper.

You need to change the respective paths in `config/config_data_nyu.py` for the NYU dataset, or `config/config_data_icg.py` for the MV-hands/ICG dataset, respectively. For the MV-hands data you also need to switch to the corresponding configuration by uncommenting the following line in `main_run.py`:

```python
from config.config_data_icg import args_data
```

To start training, run:

```
python main_run.py
```
It will log the training and validation error using crayon (see https://github.com/torrvision/crayon), and output intermediate images and final results in the `results` folder. When using the MV-hands dataset you need to change the camera view which is to be predicted, by adding `--output-cam-ids-train 2` to the call. To change further settings you can adapt the respective configuration files in the `config` folder, or use the command line (see `python main_run.py --help` for details). The default settings should be fine to reproduce the results from the paper, however.
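The interplay between configuration-file defaults and command-line overrides can be sketched with Python's standard `argparse` (the `defaults` dict and flag names here are illustrative stand-ins, not the repo's actual code):

```python
import argparse

# Stand-in for values read from a configuration module such as config/config.py.
defaults = {"epochs": 100, "training_type": 1}

parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=defaults["epochs"])
parser.add_argument("--training-type", type=int, default=defaults["training_type"])

# Flags given on the command line override the configuration defaults;
# everything else keeps its configured value.
args = parser.parse_args(["--epochs", "1"])
</antml>```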
In our case, loading the data is the bottleneck. Hence, it is very beneficial if the data is stored on a disk with fast access times (e.g., an SSD). Several workers concurrently load (and pre-process) data samples. The number of workers can be changed by adjusting `args.num_loader_workers` in `config/config.py`.
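Why multiple loader workers help when I/O dominates can be illustrated with a minimal sketch (not the repo's code): a pool of workers fetches and pre-processes samples concurrently, so the compute side never waits on a single disk read.

```python
from concurrent.futures import ThreadPoolExecutor

def load_sample(index):
    # Stand-in for reading and pre-processing one sample from disk.
    return index * 2

# Corresponds conceptually to args.num_loader_workers in config/config.py.
num_loader_workers = 4
with ThreadPoolExecutor(max_workers=num_loader_workers) as pool:
    # map() preserves sample order even though loading runs concurrently.
    batch = list(pool.map(load_sample, range(8)))
</antml>```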
We use binary files to speed up training/testing for the NYU dataset. The binary files can be loaded faster, which usually yields a significant speed-up for training and testing.

To make use of the binary files, set `args_data.use_pickled_cache = True` in `config/config_data_nyu.py`. The binary files are then used instead of the original images. If the binary file for an image does not exist yet, it is automatically written the first time the image is loaded. Hence, the process will be slower the first time training/testing is run with `args_data.use_pickled_cache = True`.
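The caching pattern described above can be sketched as follows (paths, the `.pkl` suffix, and the `load_image` callable are illustrative assumptions, not the repo's exact code): use the pickled binary file if it exists, otherwise load the original image and write the cache on first access.

```python
import os
import pickle

def load_with_cache(image_path, load_image):
    """Return the sample for image_path, using a pickle cache next to it."""
    cache_path = image_path + ".pkl"
    if os.path.exists(cache_path):
        # Fast path: deserialize the pre-processed sample.
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Slow path (first access): load the original image ...
    sample = load_image(image_path)
    # ... and write the binary cache so subsequent epochs are fast.
    with open(cache_path, "wb") as f:
        pickle.dump(sample, f)
    return sample
</antml>```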
To ensure that all binary files are properly written, it is probably easiest to remove the `WeightedRandomSampler` for a single epoch the first time you use the binary cache files. To do so, just comment out the `sampler` keyword argument at the creation of the `DataLoader` in `data/LoaderFactory.py`, train for one epoch (e.g., using the command-line parameter `--epochs 1`), and uncomment the `sampler` again. Currently, the `sampler` creation can be found in lines 97-99 of `data/LoaderFactory.py`. (And/or use only a single worker to load the data by setting `args.num_loader_workers` in `config/config.py`.) Note, this process is not always necessary, but it prevents possible issues during creation of the binary files.
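The reason for removing the sampler once is visible in a pure-Python analogue of what a weighted random sampler does (illustrative, not the repo's or PyTorch's code): indices are drawn with probability proportional to per-sample weights, so within one epoch some samples may never be visited, and their cache files would not get written.

```python
import random

def weighted_indices(weights, num_samples, seed=0):
    """Draw num_samples indices, each with probability proportional
    to its weight -- samples with low weight may never be drawn."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=num_samples)
</antml>```

A plain sequential pass (the effect of dropping the `sampler` argument) visits every index exactly once, which guarantees every cache file is written.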
For training with the additional adversarial loss, just change the training type using the corresponding command-line parameter, i.e., call `python main_run.py --training-type 2` instead. However, note that with this additional loss we merely obtained similar results at the cost of additional training time (see the paper for details).
In `./source/results` you find a model pre-trained on the NYU dataset. You can generate results with it by calling:

```
python main_run.py --model-filepath </path/to/model.mdl> --no-train
```
We used Python 2.7. To run the code you can, e.g., install the following requirements:
The code sends the data to port 8889 of "localhost". That is, you could start the crayon server exactly as in the usage example in the crayon README, i.e., by calling:

```
docker run -d -p 8888:8888 -p 8889:8889 --name crayon alband/crayon
```

See https://github.com/torrvision/crayon for details.
If you can make use of this work, please cite:
Learning Pose Specific Representations by Predicting Different Views.
Georg Poier, David Schinagl and Horst Bischof.
In Proc. CVPR, 2018.
Bibtex:
```
@inproceedings{Poier2018cvpr_preview,
  author    = {Georg Poier and David Schinagl and Horst Bischof},
  title     = {Learning Pose Specific Representations by Predicting Different Views},
  booktitle = {{Proc. IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR)}},
  year      = {2018}
}
```
[1] https://cims.nyu.edu/~tompson/NYU_Hand_Pose_Dataset.htm
[2] https://files.icg.tugraz.at/f/a190309bd4474ec2b13f/