
[CVPR 2022] Focal Length and Object Pose Estimation via Render and Compare


Georgy Ponimatkin, Yann Labbé, Bryan Russell, Mathieu Aubry, Josef Sivic

CVPR: Conference on Computer Vision and Pattern Recognition, 2022

[Paper] [Project page]

Preparing the environment and data

To prepare the environment, run the following commands:

conda env create -n focalpose --file environment.yaml
conda activate focalpose
git clone https://github.com/ylabbe/bullet3.git && cd bullet3
python setup.py build
python setup.py install
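
As an optional sanity check (not part of the original instructions), you can verify from inside the focalpose environment that the bullet3 fork built and installed its Python bindings:

# Optional sanity check: confirm the pybullet bindings from the bullet3 fork
# import correctly inside the activated focalpose environment.
import pybullet as p

print("pybullet API version:", p.getAPIVersion())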

To download the data, run bash download_data.sh. This will download all the files except for the CompCars and texture datasets. For CompCars, please follow these instructions and download the full .zip archive named CompCars.zip into the local_data directory. The same needs to be done for the texture dataset, which can be found at this link. After all files are downloaded, just run:

bash prepare_data.sh
bash preprocess_data.sh

This will prepare and preprocess all the files necessary for the codebase.
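
Before running the preparation scripts, it can help to confirm that the manually downloaded archives are in place. The sketch below only checks for CompCars.zip, since the exact filename of the texture archive is not specified here; it is a convenience check, not part of the provided scripts:

# Pre-flight check (convenience sketch, not part of the provided scripts):
# verify that the manually downloaded CompCars archive is in local_data/.
from pathlib import Path

compcars = Path("local_data") / "CompCars.zip"
if not compcars.is_file():
    raise FileNotFoundError(f"Expected {compcars}; download CompCars.zip into local_data/ first.")
print("CompCars.zip found; prepare_data.sh and preprocess_data.sh can be run.")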

Rendering synthetic data

The synthetic data needed for training can be generated via:

python -m focalpose.scripts.run_dataset_recording --config CONFIG --local

You can see all possible configs in the run_dataset_recording.py file. Synthetic data for the Pix3D chair, CompCars, and Stanford Cars datasets are split into multiple chunks to reduce possible rendering artifacts due to the large number of meshes: there are 21 chunks for Pix3D chair, 10 for CompCars, and 13 for Stanford Cars. The rendering process can potentially be sped up by running the command without the --local flag, which uses the SLURM backend of the dask_jobqueue library. You will need to adjust the configuration of the SLURMCluster in record_dataset.py according to your cluster.
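
As a rough illustration of what adjusting the SLURMCluster involves, the sketch below shows a typical dask_jobqueue setup. The queue name, resource values, and GPU directive are placeholders, parameter names vary slightly between dask_jobqueue versions, and the actual configuration used in record_dataset.py may differ:

# Illustrative dask_jobqueue setup only; the real SLURMCluster configuration
# lives in record_dataset.py and must be matched to your cluster.
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    queue="gpu",                             # placeholder partition name
    cores=8,                                 # CPU cores per SLURM job
    memory="32GB",                           # memory per SLURM job
    walltime="04:00:00",                     # time limit per SLURM job
    job_extra_directives=["--gres=gpu:1"],   # placeholder GPU request
)
cluster.scale(jobs=4)                        # number of SLURM jobs to spawn
client = Client(cluster)                     # workers register with this client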

Alternatively, the synthetic data can be downloaded at this link. The downloaded data should be unpacked into the local_data/synt_datasets folder.

Training and evaluating the models

The model can be trained via the following command:

python -m focalpose.scripts.run_pose_training --config pix3d-sofa-coarse-disent-F05p

This particular config will train the coarse model on the Pix3D sofa dataset using the disentangled loss and a 0.5% real-to-synthetic data ratio. As another example, the following command will train the refiner model on the Stanford Cars dataset with a 10% real-to-synthetic data ratio, using the Huber loss:

python -m focalpose.scripts.run_pose_training --config stanfordcars3d-refine-huber-F10p

We also provide example submission scripts for the SLURM and PBS batch systems.

To evaluate the trained coarse and refiner models, run (using the provided checkpoints as an example):

python -m focalpose.scripts.run_pose_evaluation --dataset pix3d-bed.test \
    --coarse-run-id pix3d-bed-coarse-F05p-disent--cvpr2022 \
    --refine-run-id pix3d-bed-refine-F05p-disent--cvpr2022 \
    --mrcnn-run-id detector-pix3d-bed-real-two-class--cvpr2022 \
    --niter 15

The pretrained models are located in the local_data/experiments folder, which appears after running the data preparation scripts.
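
If you are unsure which run IDs are available for the --coarse-run-id, --refine-run-id, and --mrcnn-run-id arguments, a quick way to list them is to enumerate the experiment folders; this sketch assumes each run ID is a directory directly under local_data/experiments:

# List candidate run IDs (assumes one directory per run under
# local_data/experiments, as suggested by the evaluation command above).
from pathlib import Path

experiments = Path("local_data/experiments")
for run_dir in sorted(experiments.iterdir()):
    if run_dir.is_dir():
        print(run_dir.name)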

Running inference on a single image

You can also directly run inference on a given image after running the data preparation scripts via:

python -m focalpose.scripts.run_single_image_inference --img path/to/image.jpg \
    --cls class_on_image \
    --niter 15 \
    --topk 15

This will run the inference on an image, with the class manually provided to the script. The pose will be refined for 15 iterations, and the script will output the top-15 model instances predicted by our instance retrieval pipeline. The output will consist of images with aligned meshes, and .txt files containing the camera matrix and camera pose.
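
If you want to post-process the results, the .txt files can be read back with NumPy. The file names and matrix shapes below are assumptions for illustration only; adjust them to the files the script actually writes:

# Hedged example of loading the .txt outputs with NumPy. The file names and
# shapes are assumptions, not guaranteed by run_single_image_inference.
import numpy as np

K = np.loadtxt("path/to/output/camera_matrix.txt")   # assumed 3x3 intrinsics
pose = np.loadtxt("path/to/output/camera_pose.txt")  # assumed 4x4 camera pose
print("Camera matrix:\n", K)
print("Camera pose:\n", pose)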

Citation

If you use this code in your research, please cite the following paper:

@inproceedings{ponimatkin2022focal,
  title     = {Focal Length and Object Pose Estimation via Render and Compare},
  author    = {G. {Ponimatkin} and Y. {Labbe} and B. {Russell} and M. {Aubry} and J. {Sivic}},
  booktitle = {Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2022}
}

This project is derived from the original CosyPose codebase.
