PyTorch code for the ECCV'22 paper ShAPO: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization
zubair-irshad/shapo
This repository is the PyTorch implementation of our paper:
ShAPO: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization
Muhammad Zubair Irshad, Sergey Zakharov, Rares Ambrus, Thomas Kollar, Zsolt Kira, Adrien Gaidon
European Conference on Computer Vision (ECCV), 2022
[Project Page] [arXiv] [PDF] [Video] [Poster]
Previous ICRA'22 work:
CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation
Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone, Zsolt Kira
International Conference on Robotics and Automation (ICRA), 2022
[Project Page] [arXiv] [PDF] [Video] [Poster]
If you find this repository useful, please consider citing:
@inproceedings{irshad2022shapo,
  title = {ShAPO: Implicit Representations for Multi-Object Shape Appearance and Pose Optimization},
  author = {Muhammad Zubair Irshad and Sergey Zakharov and Rares Ambrus and Thomas Kollar and Zsolt Kira and Adrien Gaidon},
  journal = {European Conference on Computer Vision (ECCV)},
  year = {2022}
}

@inproceedings{irshad2022centersnap,
  title = {CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation},
  author = {Muhammad Zubair Irshad and Thomas Kollar and Michael Laskey and Kevin Stone and Zsolt Kira},
  journal = {IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2022}
}
If you want to experiment with ShAPO, we have written a Colab. It is quite comprehensive and easy to set up. It goes through the following experiments / ShAPO properties:
- Single Shot inference
  - Visualize peak and depth output
  - Decode shape with predicted textures
  - Project 3D Pointclouds and 3D bounding boxes on 2D image
- Shape, Appearance and Pose Optimization
  - Core optimization loop
  - Visualizing optimized 3D output (i.e. textured asset creation)
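As a rough illustration of the projection step listed above, here is a minimal pinhole-camera sketch. The function name `project_points` and the intrinsics (`fx`, `fy`, `cx`, `cy`) are illustrative assumptions and are not taken from this repository or from the Colab.

```python
from typing import Iterable, List, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project_points(points: Iterable[Point3D], fx: float, fy: float,
                   cx: float, cy: float) -> List[Pixel]:
    """Project camera-frame 3D points (x, y, z) to pixel coordinates (u, v)."""
    pixels = []
    for x, y, z in points:
        if z <= 0:
            # Points at or behind the camera plane cannot be projected.
            raise ValueError("point must have positive depth z")
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# Hypothetical intrinsics for a 640x480 image (not the dataset's real values).
pixels = project_points([(0.0, 0.0, 1.0), (0.1, -0.05, 2.0)],
                        fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(pixels)
```

Projecting the eight corners of a 3D bounding box with the same function and connecting them with line segments would give the 2D box overlay.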
Create a Python 3.8 virtual environment and install requirements:
cd $ShAPO_Repo
conda create -y --prefix ./env python=3.8
conda activate ./env/
./env/bin/python -m pip install --upgrade pip
./env/bin/python -m pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
The code was built and tested on CUDA 10.2.
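After installation, a quick check of the interpreter and the PyTorch CUDA build can save debugging time later. The sketch below is not part of the repository; `describe_environment` is a hypothetical helper, and it treats PyTorch as optional so that it also runs before the requirements are installed.

```python
import sys


def describe_environment() -> dict:
    """Report the Python version and, if PyTorch is installed, its CUDA build."""
    info = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch  # optional: only present once requirements are installed
    except ImportError:
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    return info


env = describe_environment()
print(env)
```

If `cuda_build` differs from the CUDA version the code was tested on, the install may still work, but that combination is untested here.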
Download camera_train, camera_val, real_train, real_test, ground-truth annotations, camera_composed_depth, mesh models and eval_results provided by NOCS and nocs preprocess data.
Also download sdf_rgb_pretrained_weights. Unzip and organize these files in $ShAPO_Repo/data as follows:
data
├── CAMERA
│   ├── train
│   └── val
├── Real
│   ├── train
│   └── test
├── camera_full_depths
│   ├── train
│   └── val
├── gts
│   ├── val
│   └── real_test
├── results
│   ├── camera
│   ├── mrcnn_results
│   ├── nocs_results
│   └── real
├── sdf_rgb_pretrained
│   ├── LatentCodes
│   ├── Reconstructions
│   ├── ModelParameters
│   ├── OptimizerParameters
│   └── rgb_net_weights
└── obj_models
    ├── train
    ├── val
    ├── real_train
    ├── real_test
    ├── camera_train.pkl
    ├── camera_val.pkl
    ├── real_train.pkl
    ├── real_test.pkl
    └── mug_meta.pkl
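Before starting the data generation, it can help to confirm that the layout matches the tree above. The following is a small sketch, not a script shipped with the repository; `EXPECTED` is copied from the tree, and `missing_entries` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List

# Entries copied from the directory tree above, relative to the data root.
EXPECTED = [
    "CAMERA/train", "CAMERA/val",
    "Real/train", "Real/test",
    "camera_full_depths/train", "camera_full_depths/val",
    "gts/val", "gts/real_test",
    "results/camera", "results/mrcnn_results",
    "results/nocs_results", "results/real",
    "sdf_rgb_pretrained/LatentCodes", "sdf_rgb_pretrained/Reconstructions",
    "sdf_rgb_pretrained/ModelParameters",
    "sdf_rgb_pretrained/OptimizerParameters",
    "sdf_rgb_pretrained/rgb_net_weights",
    "obj_models/train", "obj_models/val",
    "obj_models/real_train", "obj_models/real_test",
    "obj_models/camera_train.pkl", "obj_models/camera_val.pkl",
    "obj_models/real_train.pkl", "obj_models/real_test.pkl",
    "obj_models/mug_meta.pkl",
]


def missing_entries(data_root: str) -> List[str]:
    """Return the expected files and folders that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


# "data" assumes the script is run from $ShAPO_Repo.
missing = missing_entries("data")
print("layout OK" if not missing else "missing: {}".format(missing))
```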
Create image lists
./runner.sh prepare_data/generate_training_data.py --data_dir /home/ubuntu/shapo/data/nocs_data/
Now run the distributed script to collect the data locally; this takes a few hours. The data will be saved under data/NOCS_data.
Note: The script uses multiple GPUs and runs 8 workers per GPU on a 16GB GPU. Change the worker_per_gpu variable depending on your GPU size.
python prepare_data/distributed_generate_data.py --data_dir /home/ubuntu/shapoplusplus/data/nocs_data --type camera_train
--type: choose from 'camera_train', 'camera_val', 'real_train', 'real_val'
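To see why worker_per_gpu matters, consider how a list of frames might be split across (GPU, worker) slots. The sketch below only illustrates round-robin sharding. It is an assumption about the general pattern, not the scheduling used by distributed_generate_data.py, and `shard_items` is a hypothetical name.

```python
from typing import Dict, List, Sequence, Tuple


def shard_items(items: Sequence[str], num_gpus: int,
                workers_per_gpu: int) -> Dict[Tuple[int, int], List[str]]:
    """Assign items round-robin to (gpu_id, worker_id) slots."""
    if num_gpus < 1 or workers_per_gpu < 1:
        raise ValueError("need at least one GPU and one worker per GPU")
    total = num_gpus * workers_per_gpu
    shards = {(slot // workers_per_gpu, slot % workers_per_gpu): []
              for slot in range(total)}
    for index, item in enumerate(items):
        slot = index % total
        shards[(slot // workers_per_gpu, slot % workers_per_gpu)].append(item)
    return shards


frames = ["frame_{:04d}".format(i) for i in range(10)]
shards = shard_items(frames, num_gpus=2, workers_per_gpu=2)
for (gpu, worker), chunk in sorted(shards.items()):
    print(gpu, worker, chunk)
```

Each worker on a GPU holds its own working set in GPU memory, so fewer workers per GPU lowers peak memory at the cost of longer wall-clock time.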
ShAPO is a two-stage process. First, a single-shot network predicts 3D shape, pose and size codes along with segmentation masks in a per-pixel manner. Second, test-time optimization jointly refines the shape, pose and size codes given a single-view RGB-D observation of a new instance.
- Train on NOCS Synthetic (requires 13GB GPU memory):
./runner.sh net_train.py @configs/net_config.txt
Note that runner.sh is equivalent to using python to run the script. Additionally, it sets up the PYTHONPATH and the ShAPO environment path automatically. Also note that this part of the code is similar to CenterSnap. We predict implicit shapes as an SDF MLP instead of pointclouds, and additionally predict an appearance embedding and object masks in this stage.
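For readers new to implicit shapes: a signed distance function (SDF) maps a 3D point to its distance from the surface, by common convention negative inside and positive outside, so the shape is the zero level set. The sketch below uses an analytic sphere as a stand-in for the learned SDF MLP. `sphere_sdf` and `classify` are hypothetical names and are not code from this repository.

```python
import math
from typing import Tuple

Point3D = Tuple[float, float, float]


def sphere_sdf(point: Point3D, radius: float = 0.5) -> float:
    """Signed distance from `point` to a sphere centered at the origin."""
    return math.sqrt(sum(c * c for c in point)) - radius


def classify(point: Point3D, radius: float = 0.5, tol: float = 1e-9) -> str:
    """Label a query point by the sign of its signed distance."""
    distance = sphere_sdf(point, radius)
    if abs(distance) <= tol:
        return "surface"
    return "inside" if distance < 0 else "outside"


for p in [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]:
    print(p, sphere_sdf(p), classify(p))
```

A learned SDF replaces the closed-form expression with a network evaluated at the query point, which is what allows one decoder to represent many object shapes.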
- Finetune on NOCS Real Train (Note that good results can be obtained after finetuning on the Real train set for only a few epochs i.e. 1-5):
./runner.sh net_train.py @configs/net_config_real_resume.txt --checkpoint /path/to/best/checkpoint
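The `@configs/...txt` argument looks like argparse's file-prefix convention, where flags are read from a text file and later command-line flags take precedence. The sketch below demonstrates that mechanism on its own, assuming the training script builds its parser with `fromfile_prefix_chars="@"` (not verified here). `--max_epochs` and the file name are hypothetical; only `--checkpoint` appears in the commands above.

```python
import argparse
import os
import tempfile

# fromfile_prefix_chars makes argparse expand "@file" into the arguments
# stored in that file, one argument per line.
parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
parser.add_argument("--max_epochs", type=int, default=1)  # hypothetical flag
parser.add_argument("--checkpoint", default=None)

with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "demo_config.txt")
    with open(config_path, "w") as handle:
        handle.write("--max_epochs=5\n--checkpoint=from_config.ckpt\n")
    # A flag given after the @file overrides the value read from the file.
    args = parser.parse_args(["@" + config_path,
                              "--checkpoint", "override.ckpt"])

print(args.max_epochs, args.checkpoint)
```

Under that assumption, appending `--checkpoint` after the config file is enough to resume from a different checkpoint without editing the config.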
- Inference on a NOCS Real Test Subset
Download a small Real test subset from here, our shape and texture decoder pretrained checkpoints from here and shapo pretrained checkpoints on real dataset here. Unzip and organize these files in $ShAPO_Repo/data as follows:
test_data
├── Real
│   └── test
├── ckpts
└── sdf_rgb_pretrained
    ├── LatentCodes
    ├── Reconstructions
    ├── ModelParameters
    ├── OptimizerParameters
    └── rgb_net_weights
Now run the inference script to visualize the single-shot predictions as follows:
./runner.sh inference/inference_real.py @configs/net_config.txt --test_data_dir path_to_nocs_test_subset --checkpoint checkpoint_path_here
You should see the visualizations saved in results/ShAPO_real. Change the --ouput_path in *config.txt to save them to a different folder.
- Optimization
This is the core optimization script, which updates the latent shape and appearance codes along with 6D pose and sizes to better fit the unseen single-view RGB-D observation. For a quick run of the core optimization loop along with visualization, see this notebook.
./runner.sh opt/optimize.py @configs/net_config.txt --data_dir /path/to/test_data_dir/ --checkpoint checkpoint_path_here
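The idea of fitting shape and pose parameters to observed points can be shown with a toy example. This is a sketch under strong assumptions and is not the method in opt/optimize.py: an analytic sphere SDF stands in for the learned decoder, a translation `t` and radius `s` stand in for pose, size and latent codes, the points cover the whole sphere rather than a single view, appearance is ignored, and gradients are written by hand. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def fibonacci_sphere(n: int, center: Vec3, radius: float) -> List[Vec3]:
    """Deterministic, roughly uniform samples on a sphere surface."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden * i
        points.append((center[0] + radius * r * math.cos(theta),
                       center[1] + radius * r * math.sin(theta),
                       center[2] + radius * z))
    return points


def fit_sphere(points: Sequence[Vec3], steps: int = 300,
               lr: float = 0.2) -> Tuple[Vec3, float]:
    """Gradient descent on the mean squared SDF residual over (t, s)."""
    t = [0.0, 0.0, 0.0]  # initial translation guess
    s = 0.3              # initial radius guess
    n = len(points)
    for _ in range(steps):
        grad_t = [0.0, 0.0, 0.0]
        grad_s = 0.0
        for p in points:
            diff = [t[k] - p[k] for k in range(3)]
            dist = max(math.sqrt(sum(d * d for d in diff)), 1e-12)
            residual = dist - s  # SDF value at the observed point
            grad_s += -2.0 * residual / n
            for k in range(3):
                grad_t[k] += 2.0 * residual * diff[k] / dist / n
        s -= lr * grad_s
        t = [t[k] - lr * grad_t[k] for k in range(3)]
    return (t[0], t[1], t[2]), s


observed = fibonacci_sphere(200, center=(0.1, -0.2, 0.3), radius=0.5)
translation, size = fit_sphere(observed)
print(translation, size)
```

The recovered translation and radius should approach the values used to generate the points. The general pattern is the same when the SDF is a network: evaluate the SDF at observed points, treat nonzero values as error, and update the parameters to reduce it.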
Please see the FAQs from CenterSnap here.
- This code is built upon the implementation from CenterSnap.
- This repository is released under the CC BY-NC 4.0 license.