Multi-View Stereo by Temporal Nonparametric Fusion


Yuxin Hou · Juho Kannala · Arno Solin

Code for the paper:

  • Yuxin Hou, Arno Solin, and Juho Kannala (2019). Multi-view stereo by temporal nonparametric fusion. International Conference on Computer Vision (ICCV). Seoul, Korea. [arXiv] [video] [project page]

Summary

We propose a novel idea for depth estimation from unstructured multi-view image-pose pairs, where the model can leverage information from previous latent-space encodings of the scene. Pairs of images and poses are passed through an encoder-decoder model for disparity estimation. The novelty lies in soft-constraining the bottleneck layer with a nonparametric Gaussian process prior.
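To make the idea concrete, below is a minimal sketch (not this repository's implementation) of GP fusion over bottleneck codes: latent vectors from several frames are treated as noisy observations of a latent function over camera poses and smoothed with the GP posterior mean. The pose-distance metric, Matérn kernel, and all names are illustrative assumptions; the actual code performs the fusion inside the network.

```python
# Illustrative sketch of GP fusion of per-frame latent codes.
# Kernel, pose distance, and weighting are assumptions, not the repo's code.
import numpy as np

def pose_distance(T_i, T_j):
    """Distance between two 4x4 camera poses (translation + rotation angle)."""
    R = T_i[:3, :3].T @ T_j[:3, :3]
    rot = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    trans = np.linalg.norm(T_i[:3, 3] - T_j[:3, 3])
    return trans + 0.5 * rot  # relative weighting is an assumption

def matern32(d, lengthscale=1.0, variance=1.0):
    """Matern-3/2 covariance as a function of distance d."""
    s = np.sqrt(3.0) * d / lengthscale
    return variance * (1.0 + s) * np.exp(-s)

def gp_fuse_latents(latents, poses, noise=1e-2):
    """GP-smooth a sequence of flattened bottleneck codes (N x D)."""
    N = len(latents)
    D = np.array([[pose_distance(poses[i], poses[j]) for j in range(N)]
                  for i in range(N)])
    K = matern32(D)
    A = K @ np.linalg.inv(K + noise * np.eye(N))  # posterior-mean weights
    return A @ np.asarray(latents)                # fused codes, N x D
```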

Example depth estimation result running in real-time on an iPad.

Prerequisites

  • Python3
  • Numpy
  • Pytorch 0.4.0
  • CUDA 9 (you can also run without CUDA, but then you need to remove all .cuda() calls in the code; see the device-agnostic sketch after this list)
  • opencv
  • tensorboardX
  • imageio
  • path.py
  • blessings
  • progressbar2
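As an alternative to deleting the .cuda() calls one by one, a device-agnostic pattern works on both CPU and GPU. The module and tensor below are placeholders, not names from this repository:

```python
# Minimal device-agnostic sketch; model/batch are illustrative placeholders.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Conv2d(3, 32, 3).to(device)           # instead of model.cuda()
batch = torch.randn(1, 3, 256, 320).to(device)   # instead of batch.cuda()
out = model(batch)
```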

Training

As mentioned in the paper, training uses the split pretrained MVDepthNet model as a starting point. Check the link to get the pretrained model.

python train.py train_dataset_path --pretrained-dict pretrained_mvdepthnet --log-output

Testing

For testing run

python test.py formatted_seq_path --savepath disparity.npy --encoder encoder_path --gp gp_path --decoder decoder_path

Our pretrained model can be downloaded here.
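To inspect the result saved via --savepath, a quick sketch like the one below can be used; it assumes the file holds a NumPy array of per-frame disparities, which may differ in shape from what test.py actually writes:

```python
# Sketch for viewing the saved output; the array layout is an assumption.
import numpy as np
import matplotlib.pyplot as plt

disp = np.load("disparity.npy")
disp = disp if disp.ndim == 3 else disp[None]     # treat as (N, H, W)
depth = 1.0 / np.clip(disp, 1e-6, None)           # disparity -> depth

plt.imshow(depth[0], cmap="magma")
plt.colorbar(label="depth")
plt.show()
```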

Use your own data for testing

A formatted sequence should have the following folder structure:

  • K.txt: A text file storing the camera intrinsic matrix.
  • poses.txt: A text file storing the extrinsic matrices for all frames in the sequence, in order.
  • images: A folder containing all RGB images (.png), ordered by name.
  • depth: A folder containing all ground-truth depth maps (.npy), with names matching the corresponding images.

We also provide one example sequence: redkitchen seq-01-formatted.
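A minimal sketch for laying out your own sequence in this structure is shown below. The exact text layout of K.txt and poses.txt (e.g. one row-major 4x4 matrix per line) and the image naming are assumptions here; check the example sequence above for the precise format:

```python
# Sketch only: write a sequence in the folder structure described above.
import os
import numpy as np
import imageio

def format_sequence(out_dir, K, poses, rgb_frames, depth_frames):
    """K: 3x3 intrinsics; poses: list of 4x4 extrinsics; rgb_frames: uint8 images."""
    os.makedirs(os.path.join(out_dir, "images"), exist_ok=True)
    os.makedirs(os.path.join(out_dir, "depth"), exist_ok=True)
    np.savetxt(os.path.join(out_dir, "K.txt"), K)
    with open(os.path.join(out_dir, "poses.txt"), "w") as f:
        for T in poses:  # assumed: one flattened 4x4 pose per line
            f.write(" ".join(str(v) for v in np.asarray(T).ravel()) + "\n")
    for i, (rgb, d) in enumerate(zip(rgb_frames, depth_frames)):
        name = "%06d" % i
        imageio.imwrite(os.path.join(out_dir, "images", name + ".png"), rgb)
        np.save(os.path.join(out_dir, "depth", name + ".npy"), d)
```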

Acknowledgements

The encoder/decoder code builds on MVDepthNet. Some useful utility functions used during training are from SfmLearner. Most of the training data were collected by DeMoN. We appreciate their work!

License

Copyright Yuxin Hou, Juho Kannala, and Arno Solin.

This software is provided under the MIT License. See the accompanying LICENSE file for details.
