- Zhengming Zhang1,2,
- Xiaoming Deng1,2,
- Jinyao Li1,2,
- Yukun Lai3,
- Cuixia Ma ORCID:orcid.org/0000-0003-3999-74291,2,
- Yongjin Liu4 &
- …
- Hongan Wang1,2
766Accesses
3Citations
1Altmetric
Abstract
Sketching is a simple and efficient way for humans to express their perceptions of the world. Sketch semantic segmentation plays a key role in sketch understanding and is widely used in sketch recognition, sketch-based image retrieval, or editing. Due to modality difference between images and sketches, existing image segmentation methods may not perform best, which overlook the sparse nature and stroke-based representation in sketches. The existing sketch semantic segmentation methods are mainly designed for single-instance sketches. In this paper, we present a new stroke-based sequential-spatial neural network (S\(^3\)NN) for scene-level free-hand sketch semantic segmentation, which leverages a bidirectional LSTM and graph convolutional network to capture the sequential and spatial features of sketches. In order to address the data lacking issue, we propose the first scene-level free-hand sketch dataset (SFSD). SFSD is composed of 12K sketch-photo pairs over 40 object categories, where the sketches were completely hand-drawn and each contains 7 objects on average. We conduct comparative and ablative experiments on SFSD to evaluate the effectiveness of our method. The experimental results demonstrate that our method outperforms state-of-the-art methods. The code, models, and dataset will be made public after acceptance.
This is a preview of subscription content,log in via an institution to check access.
Access this article
Subscribe and save
- Get 10 units per month
- Download Article/Chapter or eBook
- 1 Unit = 1 Article or 1 Chapter
- Cancel anytime
Buy Now
Price includes VAT (Japan)
Instant access to the full article PDF.







Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Zou, C. et al.: Sketchyscene: Richly-annotated scene sketches. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 421–436 (2018)
Eitz, M., Hays, J., Alexa, M.: How do humans sketch objects? ACM Trans. Graph.31(4), 1–10 (2012)
Ha, D., Eck, D.A.: Neural representation of sketch drawings. In: International Conference on Learning Representations (ICLR) (2018)
Gao, C., et al.: Sketchycoco: image generation from freehand scene sketches. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5174–5183 (2020)
Sangkloy, P., Burnell, N., Ham, C., Hays, J.: The sketchy database: learning to retrieve badly drawn bunnies. ACM Trans. Graph.35(4), 1–12 (2016)
Yu, Q., et al. Sketch me that shoe. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 799–807 (2016)
Delaye, A., Lee, K.: A flexible framework for online document segmentation by pairwise stroke distance learning. Pattern Recogn.48(4), 1197–1210 (2015)
Gennari, L., Kara, L.B., Stahovich, T.F., Shimada, K.: Combining geometry and domain knowledge to interpret hand-drawn diagrams. Comput. Graph.29(4), 547–562 (2005)
Sun, Z., Wang, C., Zhang, L., Zhang, L.: Free hand-drawn sketch segmentation. In: European Conference on Computer Vision (ECCV), pp. 626–639. Springer (2012)
Schneider, R.G., Tuytelaars, T.: Example-based sketch segmentation and labeling using CRFS. ACM Trans. Graph.35(5), 1–9 (2016)
Huang, Z., Fu, H., Lau, R.W.: Data-driven segmentation and labeling of freehand sketches. ACM Trans. Graph.33(6), 1–10 (2014)
Qi, Y., et al.: Making better use of edges via perceptual grouping. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1856–1865 (2015)
Li, L., Fu, H., Tai, C.-L.: Fast sketch segmentation and labeling with deep learning. IEEE Comput. Graph. Appl.39(2), 38–51 (2018)
Wang, F., et al.: Multi-column point-CNN for sketch segmentation. Neurocomputing392, 50–59 (2020)
Zhu, X., Xiao, Y., Zheng, Y.: 2d freehand sketch labeling using CNN and CRF. Multimedia Tools Appl.79(1), 1585–1602 (2020)
Sarvadevabhatla, R.K., Dwivedi, I., Biswas, A., Manocha, S.: Sketchparse: towards rich descriptions for poorly drawn sketches using multi-task hierarchical deep networks. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 10–18 (2017)
Wu, X., Qi, Y., Liu, J., Yang, J.: Sketchsegnet: a RNN model for labeling sketch strokes. In: 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6. IEEE (2018)
Qi, Y., Tan, Z.-H.: Sketchsegnet+: an end-to-end learning of RNN for multi-class sketch semantic segmentation. IEEE Access7, 102717–102726 (2019)
Li, K., et al.: Toward deep universal sketch perceptual grouper. IEEE Trans. Image Process.28(7), 3219–3231 (2019)
Kaiyrbekov, K., Sezgin, M.: Deep stroke-based sketched symbol reconstruction and segmentation. IEEE Comput. Graph. Appl.40(1), 112–126 (2019)
Li, K. et al.: Universal sketch perceptual grouping. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 582–597 (2018)
Yang, L., et al.: Sketchgnn: semantic sketch segmentation with graph neural networks. ACM Trans. Graph.40(3), 1–13 (2021)
Hähnlein, F., Gryaditskaya, Y., Bousseau, A. Bitmap or vector? A study on sketch representations for deep stroke segmentation. Journées Francaises d’Informatique Graphique et de Réalité virtuelle (2019)
Lin, T.-Y. et al.: Microsoft coco: common objects in context. In: European Conference on Computer Vision (ECCV), pp. 740–755. Springer (2014)
Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw.18(5–6), 602–610 (2005)
Welling, M., Kipf, T.N.: Semi-supervised classification with graph convolutional networks. In: Journal of International Conference on Learning Representations (ICLR) (2016)
Kirillov, A., He, K., Girshick, R., Dollár, P.: A unified architecture for instance and semantic segmentation (2017).http://presentations.cocodataset.org/COCO17-Stuff-FAIR.pdf (2017)
Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F. & Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV) (2018)
Ge, C., Sun, H., Song, Y.-Z., Ma, Z., Liao, J.: Exploring local detail perception for scene sketch semantic segmentation. IEEE Trans. Image Process.31, 1447–1461 (2022)
Long, J., Shelhamer, E. & Darrell, T.: Fully convolutional networks for semantic segmentation. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431–3440 (2015)
Acknowledgements
This work was supported by the Natural Science Foundation of China under Grant 61872346, Beijing Natural Science Foundation under Grant L222008, and 2019 China Prize of Newton Prize Project under Grant NP2PB/100047.
Author information
Authors and Affiliations
State Key Laboratory of Computer Science and Beijing Key Lab of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences, Beijing, China
Zhengming Zhang, Xiaoming Deng, Jinyao Li, Cuixia Ma & Hongan Wang
University of Chinese Academy of Sciences, Beijing, China
Zhengming Zhang, Xiaoming Deng, Jinyao Li, Cuixia Ma & Hongan Wang
Cardiff University, Cardiff, UK
Yukun Lai
Tsinghua University, Beijing, China
Yongjin Liu
- Zhengming Zhang
You can also search for this author inPubMed Google Scholar
- Xiaoming Deng
You can also search for this author inPubMed Google Scholar
- Jinyao Li
You can also search for this author inPubMed Google Scholar
- Yukun Lai
You can also search for this author inPubMed Google Scholar
- Cuixia Ma
You can also search for this author inPubMed Google Scholar
- Yongjin Liu
You can also search for this author inPubMed Google Scholar
- Hongan Wang
You can also search for this author inPubMed Google Scholar
Corresponding author
Correspondence toCuixia Ma.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Z., Deng, X., Li, J.et al. Stroke-based semantic segmentation for scene-level free-hand sketches.Vis Comput39, 6309–6321 (2023). https://doi.org/10.1007/s00371-022-02731-8
Accepted:
Published:
Issue Date:
Share this article
Anyone you share the following link with will be able to read this content:
Sorry, a shareable link is not currently available for this article.
Provided by the Springer Nature SharedIt content-sharing initiative