CMU-Perceptual-Computing-Lab/caffe_train
Our modified Caffe for training a multi-person pose estimator. It is based on the Caffe version from July 2016. This repository runs on at least Ubuntu 14.04, OpenCV 2.4.10, CUDA 7.5/8.0, and cuDNN 5.
The full project repo includes detailed training steps and the testing code in MATLAB, C++, and Python.
We add a customized Caffe layer for data augmentation, cpm_data_transformer.cpp, which implements scale augmentation (e.g., in the range of 0.7 to 1.3), rotation augmentation (e.g., in the range of -40 to 40 degrees), flip augmentation, and image cropping. This augmentation strategy lets the method handle a large range of scales and orientations. You can set the augmentation parameters in setLayers.py. Example data layer parameters in the training prototxt:
layer {
  name: "data"
  type: "CPMData"
  top: "data"
  top: "label"
  data_param {
    source: "/home/zhecao/COCO_kpt/lmdb_trainVal"
    batch_size: 10
    backend: LMDB
  }
  cpm_transform_param {
    stride: 8
    max_rotate_degree: 40
    visualize: false
    crop_size_x: 368
    crop_size_y: 368
    scale_prob: 1
    scale_min: 0.5
    scale_max: 1.1
    target_dist: 0.6
    center_perterb_max: 40
    do_clahe: false
    num_parts: 56
    np_in_lmdb: 17
  }
}
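To make the role of these parameters more concrete, here is a minimal Python sketch of how one set of augmentation values might be drawn per training sample. It is not the code in cpm_data_transformer.cpp. It assumes uniform sampling within each range, and the `flip_prob` argument and the function names `sample_augmentation` and `label_grid_size` are hypothetical and introduced only for illustration. The label grid size also assumes that label maps are produced at 1/stride of the crop resolution.

```python
import random

# Values copied from the example cpm_transform_param above.
PARAMS = {
    "stride": 8,
    "max_rotate_degree": 40,
    "crop_size_x": 368,
    "crop_size_y": 368,
    "scale_prob": 1.0,
    "scale_min": 0.5,
    "scale_max": 1.1,
    "center_perterb_max": 40,  # spelled as in the prototxt
}


def sample_augmentation(params, rng, flip_prob=0.5):
    """Draw one set of augmentation values.

    Assumption: every value is sampled uniformly from its range.
    flip_prob is a hypothetical argument, not part of the example prototxt.
    """
    # Scale is perturbed with probability scale_prob, otherwise left at 1.0.
    if rng.random() < params["scale_prob"]:
        scale = rng.uniform(params["scale_min"], params["scale_max"])
    else:
        scale = 1.0

    # Rotation angle in degrees, symmetric around zero.
    max_deg = params["max_rotate_degree"]
    rotate_degree = rng.uniform(-max_deg, max_deg)

    # Random shift of the crop center, in pixels.
    shift = params["center_perterb_max"]
    offset_x = rng.uniform(-shift, shift)
    offset_y = rng.uniform(-shift, shift)

    # Horizontal flip decision.
    flip = rng.random() < flip_prob

    return {
        "scale": scale,
        "rotate_degree": rotate_degree,
        "offset_x": offset_x,
        "offset_y": offset_y,
        "flip": flip,
    }


def label_grid_size(params):
    """Label map size, assuming labels live at 1/stride of the crop size."""
    return (
        params["crop_size_x"] // params["stride"],
        params["crop_size_y"] // params["stride"],
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    print(sample_augmentation(PARAMS, rng))
    print(label_grid_size(PARAMS))
```

With the example values, a 368 x 368 crop and a stride of 8 give a 46 x 46 label grid under the assumption above. Widening `scale_min`/`scale_max` or `max_rotate_degree` widens the range of person sizes and orientations seen during training.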
This project is licensed under the terms of the GPL v3 license. We will merge it with the caffe testing version (https://github.com/CMU-Perceptual-Computing-Lab/caffe_rtpose) later.
Please cite the paper in your publications if it helps your research:
@article{cao2016realtime,
  title={Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
  author={Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
  journal={arXiv preprint arXiv:1611.08050},
  year={2016}
}

@inproceedings{wei2016cpm,
  author = {Shih-En Wei and Varun Ramakrishna and Takeo Kanade and Yaser Sheikh},
  booktitle = {CVPR},
  title = {Convolutional pose machines},
  year = {2016}
}
Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and community contributors.
Check out the project site for all the details like
- DIY Deep Learning for Vision with Caffe
- Tutorial Documentation
- BVLC reference models and the community model zoo
- Installation instructions
and step-by-step examples.
Please join the caffe-users group or gitter chat to ask questions and talk about methods and models. Framework development discussions and thorough bug reports are collected on Issues.
Happy brewing!
Caffe is released under the BSD 2-Clause license. The BVLC reference models are released for unrestricted use.
Please cite Caffe in your publications if it helps your research:
@article{jia2014caffe,
  Author = {Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor},
  Journal = {arXiv preprint arXiv:1408.5093},
  Title = {Caffe: Convolutional Architecture for Fast Feature Embedding},
  Year = {2014}
}