NeurIPS-2021: Direct Multi-view Multi-person 3D Human Pose Estimation


This is the official implementation of our NeurIPS-2021 work: Multi-view Pose Transformer (MvP). MvP is a simple algorithm that directly regresses multi-person 3D human poses from multi-view images.

⭐⭐⭐ [News] A re-implementation has been integrated into xrmocap: https://github.com/openxrlab/xrmocap

Framework

(figure: MvP framework overview)

Example Result

(figure: example result)

Reference

@article{wang2021mvp,
  title={Direct Multi-view Multi-person 3D Human Pose Estimation},
  author={Tao Wang and Jianfeng Zhang and Yujun Cai and Shuicheng Yan and Jiashi Feng},
  journal={Advances in Neural Information Processing Systems},
  year={2021}
}

1. Installation

  1. Set the project root directory as ${POSE_ROOT}.
  2. Install all the required Python packages (listed in requirements.txt).
  3. Compile the deformable operations for projective attention:

cd ./models/ops
sh ./make.sh
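
Putting the steps together, a minimal setup might look like this (a sketch only; it assumes pip and a CUDA-capable environment, since the deformable ops compile against your installed PyTorch/CUDA):

cd ${POSE_ROOT}
pip install -r requirements.txt
cd ./models/ops
sh ./make.sh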

2. Data and Pre-trained Model Preparation

2.1 CMU Panoptic

Please follow VoxelPose to download the CMU Panoptic Dataset and the PoseResNet-50 pre-trained model.

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|-- data
|   |-- panoptic
|   |   |-- 160224_haggling1
|   |   |   |-- hdImgs
|   |   |   |-- hdvideos
|   |   |   |-- hdPose3d_stage1_coco19
|   |   |   |-- calibration_160224_haggling1.json
|   |   |-- 160226_haggling1
|   |   |-- ...

2.2 Shelf/Campus

Please follow VoxelPose to download the Shelf/Campus Dataset.

Due to the limited and incomplete annotations of the two datasets, we use pseudo ground-truth 3D poses generated by VoxelPose to train the model; we expect MvP would perform much better with absolute ground-truth pose data.

Please use VoxelPose or other methods to generate pseudo ground truth for the training set. You can also use our generated pseudo GT: psudo_gt_shelf, psudo_gt_campus, psudo_gt_campus_fix_gtmorethanpred.
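
To sanity-check a downloaded pseudo-GT file before training, the pickle can be inspected directly (a minimal sketch; it assumes only that the file is a standard Python pickle placed as in the directory trees below, not any particular internal structure):

python -c "import pickle; d = pickle.load(open('data/pesudo_gt/voxelpose_pesudo_gt_shelf.pickle', 'rb')); print(type(d), len(d) if hasattr(d, '__len__') else '')"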

Due to the small dataset sizes, we fine-tune the Panoptic pre-trained model on Shelf and Campus. Download the MvP models pretrained on Panoptic from model_best_5view, and model_best_3view_horizontal_view or model_best_3view_2horizon_1lookdown.

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- model_best_5view.pth.tar
|   |-- model_best_3view_horizontal_view.pth.tar
|   |-- model_best_3view_2horizon_1lookdown.pth.tar
|-- data
|   |-- Shelf
|   |   |-- Camera0
|   |   |-- ...
|   |   |-- Camera4
|   |   |-- actorsGT.mat
|   |   |-- calibration_shelf.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |-- CampusSeq1
|   |   |-- Camera0
|   |   |-- Camera1
|   |   |-- Camera2
|   |   |-- actorsGT.mat
|   |   |-- calibration_campus.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle

2.3 Human3.6M dataset

Please follow CHUNYUWANG/H36M-Toolbox to prepare the data.

2.4 Full Directory Tree

The data and pre-trained model directory tree should look like this. To reproduce the main MvP results and the ablation studies, you only need to download the Panoptic dataset and PoseResNet-50:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|   |-- model_best_5view.pth.tar
|   |-- model_best_3view_horizontal_view.pth.tar
|   |-- model_best_3view_2horizon_1lookdown.pth.tar
|-- data
|   |-- pesudo_gt
|   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle
|   |-- panoptic
|   |   |-- 160224_haggling1
|   |   |   |-- hdImgs
|   |   |   |-- hdvideos
|   |   |   |-- hdPose3d_stage1_coco19
|   |   |   |-- calibration_160224_haggling1.json
|   |   |-- 160226_haggling1
|   |   |-- ...
|   |-- Shelf
|   |   |-- Camera0
|   |   |-- ...
|   |   |-- Camera4
|   |   |-- actorsGT.mat
|   |   |-- calibration_shelf.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |-- CampusSeq1
|   |   |-- Camera0
|   |   |-- Camera1
|   |   |-- Camera2
|   |   |-- actorsGT.mat
|   |   |-- calibration_campus.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle
|   |-- HM36
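
A quick existence check before training can save a failed run (a sketch for the minimal Panoptic-only setup; adjust the paths to whichever datasets you actually downloaded):

ls ${POSE_ROOT}/models/pose_resnet50_panoptic.pth.tar
ls ${POSE_ROOT}/data/panoptic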

3. Training and Evaluation

The evaluation results are printed after every epoch; the best result can be found in the log.

3.1 CMU Panoptic dataset

We train and validate on the five selected camera views. We trained our models on 8 GPUs with batch_size=1 per GPU; note that the number of iterations per epoch should be 3205. If it is not, please check your data.

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/panoptic/best_model_config.yaml
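
With fewer GPUs, the same launcher should still work by lowering --nproc_per_node (an untested sketch; with batch_size=1 per GPU, iterations per epoch scale inversely with the GPU count, e.g. roughly 8 x 3205 on a single GPU, and the effective training schedule changes accordingly):

python -m torch.distributed.launch --nproc_per_node=1 --use_env run/train_3d.py --cfg configs/panoptic/best_model_config.yaml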

Pre-trained models

| Datasets | AP25 | AP50 | AP100 | AP150 | MPJPE (mm) | pth |
| --- | --- | --- | --- | --- | --- | --- |
| Panoptic | 92.3 | 96.6 | 97.5 | 97.7 | 15.8 | here |

3.1.1 Ablation Experiments

You can find several ablation experiment configs under ./configs/panoptic/, for example, removing RayConv:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/panoptic/ablation_remove_rayconv.yaml

3.2 Shelf/Campus datasets

As Shelf/Campus are very small datasets with incomplete annotations, we fine-tune the pretrained MvP with pseudo ground-truth 3D poses extracted with VoxelPose; we expect more accurate GT would help MvP achieve much higher performance. To fine-tune on Shelf:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/shelf/mvp_shelf.yaml
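
Campus is fine-tuned the same way with its own config; the path below is a guess based on the repo layout (check ./configs/ for the actual file name):

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/campus/mvp_campus.yaml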

Pre-trained models

| Datasets | Actor 1 | Actor 2 | Actor 3 | Average | pth |
| --- | --- | --- | --- | --- | --- |
| Shelf | 99.3 | 95.1 | 97.8 | 97.4 | here |
| Campus | 98.2 | 94.1 | 97.4 | 96.6 | here |

All numbers are PCP (%).

3.3 Human3.6M dataset

MvP also applies to the plain single-person setting, e.g. with Human3.6M (more to come):

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/h36m/mvp_h36m.yaml

4. Evaluation Only

To evaluate a trained model, pass its config file and checkpoint (.pth) path:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/validate_3d.py --cfg xxx --model_path xxx
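
For example, to evaluate the 5-view Panoptic model from Section 2.2 (assuming that checkpoint pairs with the Panoptic best-model config, and the paths from the directory tree above):

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/validate_3d.py --cfg configs/panoptic/best_model_config.yaml --model_path models/model_best_5view.pth.tar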

LICENSE

This repo is under the Apache-2.0 license. For commercial use, please contact the authors.
