Multi-thread Parsing for Recognizing Complex Events in Videos

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5304)

Abstract

This paper presents a probabilistic grammar approach to recognizing complex events in videos. First, a rule induction algorithm learns the event rules from the original motion features. Then, a multi-thread parsing (MTP) algorithm recognizes complex events whose sub-events involve parallel temporal relations, whereas commonly used parsers can handle only sequential relations. Additionally, a Viterbi-like error recovery strategy is embedded in the parsing process to correct large-time-scale errors, such as insertion and deletion errors. Extensive experiments are performed on indoor gymnastic exercises and outdoor traffic events. The experimental results indicate that the MTP algorithm recognizes complex events effectively, owing to its strongly discriminative representation and its error recovery strategy.
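To make the idea of Viterbi-like recovery from insertion and deletion errors concrete, the following is a minimal sketch and not the paper's MTP algorithm. It assumes a single expected sequence of sub-event symbols (rather than a grammar with parallel threads) and finds the highest-probability alignment with an observed sequence by dynamic programming. The function name `best_alignment_logprob` and the probabilities `p_match`, `p_ins`, and `p_del` are hypothetical values chosen for illustration only.

```python
import math


def best_alignment_logprob(expected, observed,
                           p_match=0.9, p_ins=0.05, p_del=0.05):
    """Return the max log-probability of aligning `observed` to `expected`.

    Hypothetical illustration: each step either matches a symbol,
    skips a spurious observed symbol (insertion error), or skips a
    missed expected symbol (deletion error).
    """
    n, m = len(expected), len(observed)
    # score[i][j]: best log-probability of explaining observed[:j]
    # with expected[:i]
    score = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0 and expected[i - 1] == observed[j - 1]:
                # Symbols agree: consume one from each sequence.
                candidates.append(score[i - 1][j - 1] + math.log(p_match))
            if j > 0:
                # Insertion error: the observed symbol is spurious.
                candidates.append(score[i][j - 1] + math.log(p_ins))
            if i > 0:
                # Deletion error: the expected symbol was never observed.
                candidates.append(score[i - 1][j] + math.log(p_del))
            # Viterbi-style step: keep only the best-scoring path.
            score[i][j] = max(candidates)
    return score[n][m]


rule = ["approach", "stop", "leave"]
clean = best_alignment_logprob(rule, ["approach", "stop", "leave"])
with_insertion = best_alignment_logprob(
    rule, ["approach", "noise", "stop", "leave"])
with_deletion = best_alignment_logprob(rule, ["approach", "leave"])
```

Under these assumed probabilities, a noisy observation still receives a finite score instead of being rejected outright, which is the general purpose of an error recovery strategy in a parser.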

Author information

Authors and Affiliations

  1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China

    Zhang Zhang, Kaiqi Huang & Tieniu Tan

Editor information

Editors and Affiliations

  1. Computer Science Department, University of Illinois at Urbana Champaign, 3310 Siebel Hall, IL 61801, Urbana, USA

    David Forsyth

  2. Department of Computing, Oxford Brookes University, OX33 1HX, Wheatley, Oxford, UK

    Philip Torr

  3. Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK

    Andrew Zisserman

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Z., Huang, K., Tan, T. (2008). Multi-thread Parsing for Recognizing Complex Events in Videos. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88690-7_55
