Self-supervised indoor scene point cloud completion from a single panorama

  • Research
  • Published in: The Visual Computer

Abstract

In this paper, we propose a self-supervised learning method for point cloud completion of indoor scenes. Because a single-view image covers only a limited field of view and acquiring multi-view images is time-consuming and labor-intensive, we take panoramas as input, which makes acquisition easier and covers a wider portion of the scene. Since complete scene point clouds are difficult to obtain, we design an auxiliary task that simulates missing scene regions by shifting the viewpoint of the panorama, thereby extracting supervision from the scene itself. Because completing large-scale scene point clouds is challenging, we design a neighborhood integration and feature spreading module for feature extraction and preservation before substantial point cloud downsampling, enabling the completion network to handle large-scale point clouds. We then propose a transformer-based scene point cloud completion network and show completion results that are competitive with relevant supervised learning methods.
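The full method details are not included on this page, but the core self-supervision idea stated in the abstract (simulating a missing region by shifting the panorama viewpoint) can be illustrated with a minimal sketch. Assuming an equirectangular depth panorama, the sketch below back-projects it to a point cloud, then uses a coarse z-buffer from a shifted viewpoint to keep only the points still visible; the occluded remainder plays the role of the missing area the completion network must recover. The function names, resolutions, viewpoint offset, and tolerance are illustrative assumptions, not the authors' implementation.

import numpy as np

def panorama_to_points(depth, eps=1e-6):
    # Back-project an equirectangular depth map (H x W) to a 3D point cloud.
    h, w = depth.shape
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi    # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi    # latitude in (-pi/2, pi/2)
    lon, lat = np.meshgrid(lon, lat)                        # both shaped (h, w)
    dirs = np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)                 # unit viewing directions
    pts = (dirs * depth[..., None]).reshape(-1, 3)
    return pts[depth.reshape(-1) > eps]                     # drop invalid pixels

def visible_mask(points, viewpoint, h=256, w=512, tol=1e-3):
    # Coarse equirectangular z-buffer: keep points that remain visible from `viewpoint`.
    rel = points - viewpoint
    r = np.linalg.norm(rel, axis=1)
    lon = np.arctan2(rel[:, 1], rel[:, 0])
    lat = np.arcsin(np.clip(rel[:, 2] / np.maximum(r, 1e-9), -1.0, 1.0))
    u = np.clip(((lon + np.pi) / (2.0 * np.pi) * w).astype(int), 0, w - 1)
    v = np.clip(((np.pi / 2.0 - lat) / np.pi * h).astype(int), 0, h - 1)
    zbuf = np.full((h, w), np.inf)
    np.minimum.at(zbuf, (v, u), r)                          # nearest depth per pixel
    return r <= zbuf[v, u] + tol

# Hypothetical usage: `partial` (points still visible after the viewpoint shift) would be
# the network input, and the occluded points form the self-supervised completion target.
depth = np.random.uniform(1.0, 5.0, size=(256, 512))        # placeholder panorama depth
full = panorama_to_points(depth)
partial = full[visible_mask(full, viewpoint=np.array([0.3, 0.0, 0.0]))]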

Data availability

Data is provided within the manuscript.


Author information

Authors and Affiliations

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning, China

    Tong Li, Zhaoxuan Zhang, Yuxin Wang, Baocai Yin & Xin Yang

  2. Zhuhai 4DAGE Technology Co., Ltd, Zhuhai, Guangdong, China

    Yan Cui

  3. Ningbo University, Ningbo, Zhejiang, China

    Yuqi Li

  4. Dalian University, Dalian, Liaoning, China

    Dongsheng Zhou

Authors
  1. Tong Li
  2. Zhaoxuan Zhang
  3. Yuxin Wang
  4. Yan Cui
  5. Yuqi Li
  6. Dongsheng Zhou
  7. Baocai Yin
  8. Xin Yang

Contributions

T.L., Z.Z., Y.W., Y.C., Y.L., D.Z., B.Y., and X.Y. collaborated on this research work. T.L., Z.Z., and X.Y. primarily conducted the data collection, experimental design, and data analysis and interpretation. Y.W., Y.C., Y.L., and D.Z. contributed to the literature review and provided critical insights. B.Y. and X.Y. supervised the project and provided overall guidance. All authors jointly wrote the main manuscript text, made substantial contributions to this study, and reviewed the final version.

Corresponding author

Correspondence to Xin Yang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, T., Zhang, Z., Wang, Y. et al. Self-supervised indoor scene point cloud completion from a single panorama. Vis Comput 41, 1891–1905 (2025). https://doi.org/10.1007/s00371-024-03509-w
