Stroke-based semantic segmentation for scene-level free-hand sketches

  • Original article
  • Published in: The Visual Computer (2023)

Abstract

Sketching is a simple and efficient way for humans to express their perceptions of the world. Sketch semantic segmentation plays a key role in sketch understanding and is widely used in sketch recognition, sketch-based image retrieval, and editing. Due to the modality difference between images and sketches, existing image segmentation methods, which overlook the sparse nature and stroke-based representation of sketches, may not perform well. Moreover, existing sketch semantic segmentation methods are mainly designed for single-instance sketches. In this paper, we present a new stroke-based sequential-spatial neural network (S\(^3\)NN) for scene-level free-hand sketch semantic segmentation, which leverages a bidirectional LSTM and a graph convolutional network to capture the sequential and spatial features of sketches. To address the lack of data, we propose the first scene-level free-hand sketch dataset (SFSD). SFSD is composed of 12K sketch-photo pairs over 40 object categories, where the sketches are completely hand-drawn and each contains seven objects on average. We conduct comparative and ablative experiments on SFSD to evaluate the effectiveness of our method. The experimental results demonstrate that our method outperforms state-of-the-art methods. The code, models, and dataset will be made public after acceptance.
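As a rough illustration of the sequential-spatial idea summarized above, the following is a minimal, hypothetical PyTorch sketch (not the authors' released S\(^3\)NN code): a bidirectional LSTM encodes the point sequence of each stroke, and a Kipf-Welling-style graph convolution propagates features between spatially adjacent strokes before a per-stroke classifier. All class names, shapes, and hyper-parameters (e.g. StrokeSeqSpatialNet, point_dim=3, hidden=128) are assumptions made for illustration only.

# Hypothetical minimal sketch of a sequential-spatial stroke segmentation network.
import torch
import torch.nn as nn

class StrokeSeqSpatialNet(nn.Module):
    def __init__(self, point_dim=3, hidden=128, num_classes=40):
        super().__init__()
        # Sequential branch: BiLSTM over the (x, y, pen-state) points of each stroke.
        self.lstm = nn.LSTM(point_dim, hidden, batch_first=True, bidirectional=True)
        # Spatial branch: one graph-convolution weight applied after mixing
        # stroke features through a normalised stroke adjacency matrix.
        self.gcn_weight = nn.Linear(2 * hidden, 2 * hidden)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, strokes, adj):
        # strokes: (num_strokes, max_points, point_dim) padded point sequences
        # adj:     (num_strokes, num_strokes) stroke adjacency (e.g. spatial proximity)
        _, (h, _) = self.lstm(strokes)                  # h: (2, num_strokes, hidden)
        stroke_feat = torch.cat([h[0], h[1]], dim=-1)   # (num_strokes, 2*hidden)
        # Symmetrically normalised adjacency with self-loops (Kipf & Welling style).
        a_hat = adj + torch.eye(adj.size(0))
        deg_inv_sqrt = a_hat.sum(dim=1).clamp(min=1e-6).pow(-0.5)
        norm_adj = deg_inv_sqrt[:, None] * a_hat * deg_inv_sqrt[None, :]
        spatial_feat = torch.relu(self.gcn_weight(norm_adj @ stroke_feat))
        return self.classifier(spatial_feat)            # per-stroke class logits

# Toy usage: 5 strokes, up to 20 points each, fully connected adjacency.
net = StrokeSeqSpatialNet()
logits = net(torch.randn(5, 20, 3), torch.ones(5, 5))
print(logits.shape)  # torch.Size([5, 40])

In this toy setup the final hidden states of the forward and backward LSTM passes serve as the per-stroke sequential feature, and a single graph convolution supplies the spatial context; the actual S\(^3\)NN architecture and training details are described in the full article.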



Acknowledgements

This work was supported by the Natural Science Foundation of China under Grant 61872346, the Beijing Natural Science Foundation under Grant L222008, and the 2019 China Prize of Newton Prize Project under Grant NP2PB/100047.

Author information

Authors and Affiliations

  1. State Key Laboratory of Computer Science and Beijing Key Lab of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences, Beijing, China

    Zhengming Zhang, Xiaoming Deng, Jinyao Li, Cuixia Ma & Hongan Wang

  2. University of Chinese Academy of Sciences, Beijing, China

    Zhengming Zhang, Xiaoming Deng, Jinyao Li, Cuixia Ma & Hongan Wang

  3. Cardiff University, Cardiff, UK

    Yukun Lai

  4. Tsinghua University, Beijing, China

    Yongjin Liu

Authors

  1. Zhengming Zhang
  2. Xiaoming Deng
  3. Jinyao Li
  4. Yukun Lai
  5. Cuixia Ma
  6. Yongjin Liu
  7. Hongan Wang

Corresponding author

Correspondence to Cuixia Ma.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Zhang, Z., Deng, X., Li, J. et al.: Stroke-based semantic segmentation for scene-level free-hand sketches. Vis Comput 39, 6309–6321 (2023). https://doi.org/10.1007/s00371-022-02731-8

