This project is the official implementation of our ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208).
microsoft/human-pose-estimation.pytorch
- Our new work, High-Resolution Representations for Labeling Pixels and Regions, is available at HRNet. HRNet has been applied to a wide range of vision tasks, such as image classification, object detection, semantic segmentation, and facial landmark detection.
- Our new work, Deep High-Resolution Representation Learning for Human Pose Estimation, has been released at https://github.com/leoxiaobin/deep-high-resolution-net.pytorch. The best single HRNet model obtains an AP of 77.0 on the COCO test-dev2017 dataset and 92.3% PCKh@0.5 on the MPII test set. The new repository also supports the SimpleBaseline method, and you are welcome to try it.
- Our entry using this repo won the PoseTrack2018 Multi-person Pose Tracking Challenge!
- Our entry using this repo ranked 2nd in the keypoint detection task of COCO 2018!
This is an official PyTorch implementation of Simple Baselines for Human Pose Estimation and Tracking. This work provides baseline methods that are surprisingly simple and effective, and thus helpful for inspiring and evaluating new ideas in the field. State-of-the-art results are achieved on challenging benchmarks. On the COCO keypoints validation set, our best single model achieves an mAP of 74.3. You can reproduce our results using this repo. All models are provided for research purposes.
| Arch | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean | Mean@0.1 |
|---|---|---|---|---|---|---|---|---|---|
| 256x256_pose_resnet_50_d256d256d256 | 96.351 | 95.329 | 88.989 | 83.176 | 88.420 | 83.960 | 79.594 | 88.532 | 33.911 |
| 384x384_pose_resnet_50_d256d256d256 | 96.658 | 95.754 | 89.790 | 84.614 | 88.523 | 84.666 | 79.287 | 89.066 | 38.046 |
| 256x256_pose_resnet_101_d256d256d256 | 96.862 | 95.873 | 89.518 | 84.376 | 88.437 | 84.486 | 80.703 | 89.131 | 34.020 |
| 384x384_pose_resnet_101_d256d256d256 | 96.965 | 95.907 | 90.268 | 85.780 | 89.597 | 85.935 | 82.098 | 90.003 | 38.860 |
| 256x256_pose_resnet_152_d256d256d256 | 97.033 | 95.941 | 90.046 | 84.976 | 89.164 | 85.311 | 81.271 | 89.620 | 35.025 |
| 384x384_pose_resnet_152_d256d256d256 | 96.794 | 95.618 | 90.080 | 86.225 | 89.700 | 86.862 | 82.853 | 90.200 | 39.433 |
- Flip test is used.
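The per-joint scores in the table above are PCKh-style accuracies (the Mean@0.1 column presumably uses a stricter threshold of 0.1). As a rough illustration of how such a score can be computed, the sketch below counts a joint as correct when its distance to the ground truth, normalized by a per-person head size, falls within a threshold. The function name, the array layout, and the assumption that head sizes are already available are hypothetical choices made for this example; it is not the repository's evaluation code.

```python
import numpy as np


def pckh(pred, gt, head_sizes, visible, threshold=0.5):
    """Per-joint PCKh in percent (illustrative sketch).

    pred, gt:    (N, K, 2) predicted / ground-truth joint coordinates.
    head_sizes:  (N,) reference head length per person (assumed precomputed).
    visible:     (N, K) boolean mask of annotated joints.
    """
    # Euclidean distance between prediction and ground truth, per joint.
    dist = np.linalg.norm(pred - gt, axis=2)
    # Normalize each person's errors by that person's head size.
    norm_dist = dist / head_sizes[:, None]
    # A joint counts only if it is annotated and within the threshold.
    correct = (norm_dist <= threshold) & visible
    counts = visible.sum(axis=0)
    # Guard against joints that are never annotated.
    return 100.0 * correct.sum(axis=0) / np.maximum(counts, 1)
```

Calling `pckh(pred, gt, head_sizes, visible, threshold=0.1)` would give the stricter variant; averaging the returned vector gives a mean score over joints.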
| Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|
| 256x192_pose_resnet_50_d256d256d256 | 0.704 | 0.886 | 0.783 | 0.671 | 0.772 | 0.763 | 0.929 | 0.834 | 0.721 | 0.824 |
| 384x288_pose_resnet_50_d256d256d256 | 0.722 | 0.893 | 0.789 | 0.681 | 0.797 | 0.776 | 0.932 | 0.838 | 0.728 | 0.846 |
| 256x192_pose_resnet_101_d256d256d256 | 0.714 | 0.893 | 0.793 | 0.681 | 0.781 | 0.771 | 0.934 | 0.840 | 0.730 | 0.832 |
| 384x288_pose_resnet_101_d256d256d256 | 0.736 | 0.896 | 0.803 | 0.699 | 0.811 | 0.791 | 0.936 | 0.851 | 0.745 | 0.858 |
| 256x192_pose_resnet_152_d256d256d256 | 0.720 | 0.893 | 0.798 | 0.687 | 0.789 | 0.778 | 0.934 | 0.846 | 0.736 | 0.839 |
| 384x288_pose_resnet_152_d256d256d256 | 0.743 | 0.896 | 0.811 | 0.705 | 0.816 | 0.797 | 0.937 | 0.858 | 0.751 | 0.863 |
| Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|
| 256x192_pose_resnet_50_caffe_d256d256d256 | 0.704 | 0.914 | 0.782 | 0.677 | 0.744 | 0.735 | 0.921 | 0.805 | 0.704 | 0.783 |
| 256x192_pose_resnet_101_caffe_d256d256d256 | 0.720 | 0.915 | 0.803 | 0.693 | 0.764 | 0.753 | 0.928 | 0.821 | 0.720 | 0.802 |
| 256x192_pose_resnet_152_caffe_d256d256d256 | 0.728 | 0.925 | 0.804 | 0.702 | 0.766 | 0.760 | 0.931 | 0.828 | 0.729 | 0.806 |
- Flip test is used.
- Person detector has person AP of 56.4 on COCO val2017 dataset.
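- The difference between PyTorch-style and Caffe-style ResNet is the position of the stride=2 convolution: the Caffe-style bottleneck applies it in the first 1x1 convolution, whereas the PyTorch-style bottleneck applies it in the 3x3 convolution.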
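The notes above state that a flip test is used. A common way to do this is to run the model on the image and on its horizontally mirrored copy, map the second set of heatmaps back, and average the two. The NumPy sketch below illustrates that idea under stated assumptions: `predict` is any callable that maps an image to `(K, H, W)` heatmaps, and the left/right channel pairs are hypothetical. It is not the repository's exact procedure.

```python
import numpy as np


def flip_back(heatmaps, pairs):
    """Undo a horizontal flip on (K, H, W) heatmaps.

    Mirrors the width axis, then swaps each left/right joint channel pair.
    """
    out = heatmaps[:, :, ::-1].copy()
    for left, right in pairs:
        # The right-hand side is copied before assignment, so this is a safe swap.
        out[[left, right]] = out[[right, left]]
    return out


def flip_test(predict, image, pairs):
    """Average the heatmaps of an image and of its mirrored copy.

    image is (H, W) or (H, W, C); predict(image) returns (K, H, W).
    """
    original = predict(image)
    flipped = predict(image[:, ::-1])  # mirror along the width axis
    return 0.5 * (original + flip_back(flipped, pairs))
```

The channel swap matters because a left wrist in the mirrored image appears where a right wrist would be, so the two channels have to be exchanged before averaging.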
The code was developed with Python 3.6 on Ubuntu 16.04 and requires NVIDIA GPUs. It was developed and tested on 4 NVIDIA P100 GPU cards; other platforms and GPU cards are not fully tested.
Install pytorch >= v0.4.0 following the official instructions.
Disable cudnn for batch_norm:
```
# PYTORCH=/path/to/pytorch
# for pytorch v0.4.0
sed -i "1194s/torch\.backends\.cudnn\.enabled/False/g" ${PYTORCH}/torch/nn/functional.py
# for pytorch v0.4.1
sed -i "1254s/torch\.backends\.cudnn\.enabled/False/g" ${PYTORCH}/torch/nn/functional.py
```

Note that instructions like `# PYTORCH=/path/to/pytorch` indicate that you should pick a path where you'd like to have pytorch installed and then set an environment variable (`PYTORCH` in this case) accordingly.
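The `sed` commands above address one fixed line number (1194 for v0.4.0, 1254 for v0.4.1), so only that line of `functional.py` is changed. To preview what the substitution does before editing an installed file, the same replacement can be sketched in Python on an in-memory string. The function name below is hypothetical and the sketch does not touch any file on disk.

```python
import re


def disable_cudnn_on_line(source, line_number):
    """Replace `torch.backends.cudnn.enabled` with `False` on one line only.

    source:      full text of the file as a string.
    line_number: 1-based line number, as in the sed address.
    """
    lines = source.splitlines(keepends=True)
    index = line_number - 1
    # Same pattern and replacement as the sed expression, applied to one line.
    lines[index] = re.sub(r"torch\.backends\.cudnn\.enabled", "False", lines[index])
    return "".join(lines)
```

Because the edit is tied to a line number, it is worth checking that the targeted line actually contains `torch.backends.cudnn.enabled` in your installed version before running `sed -i`.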
Clone this repo; we'll refer to the cloned directory as ${POSE_ROOT}.
Install dependencies:
```
pip install -r requirements.txt
```

Make libs:

```
cd ${POSE_ROOT}/lib
make
```

Install COCOAPI:
```
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python3 setup.py install --user
```

Note that instructions like `# COCOAPI=/path/to/clone/cocoapi` indicate that you should pick a path where you'd like to have the software cloned and then set an environment variable (`COCOAPI` in this case) accordingly.
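After the installation steps above, a quick way to confirm that the packages are visible to Python is to ask the import system whether it can locate them. The helper below is a small sketch with a hypothetical name; `torch` and `pycocotools` are the import names of PyTorch and the COCO API.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # An empty list means both packages can be located.
    print(missing_modules(["torch", "pycocotools"]))
```

This only checks that the modules can be found; it does not import them or verify their versions.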
Download pytorch imagenet pretrained models from the pytorch model zoo and caffe-style pretrained models from GoogleDrive.
Download mpii and coco pretrained models from OneDrive or GoogleDrive. Please download them under ${POSE_ROOT}/models/pytorch, and make them look like this:
```
${POSE_ROOT}
 `-- models
     `-- pytorch
         |-- imagenet
         |   |-- resnet50-19c8e357.pth
         |   |-- resnet50-caffe.pth.tar
         |   |-- resnet101-5d3b4d8f.pth
         |   |-- resnet101-caffe.pth.tar
         |   |-- resnet152-b121ed2d.pth
         |   `-- resnet152-caffe.pth.tar
         |-- pose_coco
         |   |-- pose_resnet_101_256x192.pth.tar
         |   |-- pose_resnet_101_384x288.pth.tar
         |   |-- pose_resnet_152_256x192.pth.tar
         |   |-- pose_resnet_152_384x288.pth.tar
         |   |-- pose_resnet_50_256x192.pth.tar
         |   `-- pose_resnet_50_384x288.pth.tar
         `-- pose_mpii
             |-- pose_resnet_101_256x256.pth.tar
             |-- pose_resnet_101_384x384.pth.tar
             |-- pose_resnet_152_256x256.pth.tar
             |-- pose_resnet_152_384x384.pth.tar
             |-- pose_resnet_50_256x256.pth.tar
             `-- pose_resnet_50_384x384.pth.tar
```

Init output (training model output directory) and log (tensorboard log directory) directories:
```
mkdir output
mkdir log
```

Your directory tree should look like this:
```
${POSE_ROOT}
├── data
├── experiments
├── lib
├── log
├── models
├── output
├── pose_estimation
├── README.md
└── requirements.txt
```
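The top-level layout shown above can be checked programmatically before training. The sketch below lists whichever expected entries are absent under a given root. The entry names are taken from the tree above; the function and constant names are hypothetical.

```python
import os

# Entries copied from the directory tree shown above.
EXPECTED_DIRS = ["data", "experiments", "lib", "log", "models", "output", "pose_estimation"]
EXPECTED_FILES = ["README.md", "requirements.txt"]


def missing_entries(pose_root):
    """Return expected directories and files that do not exist under pose_root."""
    missing = [d for d in EXPECTED_DIRS
               if not os.path.isdir(os.path.join(pose_root, d))]
    missing += [f for f in EXPECTED_FILES
                if not os.path.isfile(os.path.join(pose_root, f))]
    return missing
```

For example, `missing_entries(os.environ["POSE_ROOT"])` returns an empty list when the layout is complete, assuming `POSE_ROOT` has been exported in your shell.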
For MPII data, please download from the MPII Human Pose Dataset. The original annotation files are in MATLAB format. We have converted them into JSON format; you also need to download them from OneDrive or GoogleDrive. Extract them under ${POSE_ROOT}/data, and make them look like this:
```
${POSE_ROOT}
|-- data
`-- |-- mpii
    `-- |-- annot
        |   |-- gt_valid.mat
        |   |-- test.json
        |   |-- train.json
        |   |-- trainval.json
        |   `-- valid.json
        `-- images
            |-- 000001163.jpg
            |-- 000003072.jpg
```

For COCO data, please download from COCO download; 2017 Train/Val is needed for COCO keypoints training and validation. We also provide the person detection results on COCO val2017 to reproduce our multi-person pose estimation results. Please download them from OneDrive or GoogleDrive. Download and extract them under ${POSE_ROOT}/data, and make them look like this:
```
${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        |-- person_detection_results
        |   |-- COCO_val2017_detections_AP_H_56_person.json
        `-- images
            |-- train2017
            |   |-- 000000000009.jpg
            |   |-- 000000000025.jpg
            |   |-- 000000000030.jpg
            |   |-- ...
            `-- val2017
                |-- 000000000139.jpg
                |-- 000000000285.jpg
                |-- 000000000632.jpg
                |-- ...
```

Validate on MPII using the pretrained model:

```
python pose_estimation/valid.py \
    --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml \
    --flip-test \
    --model-file models/pytorch/pose_mpii/pose_resnet_50_256x256.pth.tar
```

Train on MPII:

```
python pose_estimation/train.py \
    --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml
```

Validate on COCO val2017 using the pretrained model:

```
python pose_estimation/valid.py \
    --cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml \
    --flip-test \
    --model-file models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar
```

Train on COCO train2017:

```
python pose_estimation/train.py \
    --cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml
```

If you use our code or models in your research, please cite with:
```
@inproceedings{xiao2018simple,
    author={Xiao, Bin and Wu, Haiping and Wei, Yichen},
    title={Simple Baselines for Human Pose Estimation and Tracking},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = {2018}
}
```