Abstract
Abnormal event detection plays an important role in video surveillance. The LSTM encoder–decoder is used to learn representations of video sequences and has been applied to detecting abnormal events in complex environments. The representation is learned by the encoder and is crucial for the decoder. However, because this representation has a fixed dimension, the LSTM encoder–decoder generally fails to account for global context. In this paper, we explore a hybrid autoencoder architecture that not only extracts better spatio-temporal context but also improves the extrapolation capability of the corresponding decoder through a shortcut connection. Experiments show that the hybrid model performs better than state-of-the-art anomaly detection methods, both qualitatively and quantitatively, on benchmark datasets.
References
Zhao B, Li FF, Xing EP (2011) Online detection of unusual events in videos via dynamic sparse coding. In: IEEE conference on computer vision and pattern recognition, pp 3313–3320
Cong Y, Yuan J, Liu J (2011) Sparse reconstruction cost for abnormal event detection. In: IEEE conference on computer vision and pattern recognition, pp 3449–3456
Chen Z, Saligrama V (2012) Video anomaly detection based on local statistical aggregates. In: IEEE conference on computer vision and pattern recognition, pp 2112–2119
Ricci E, Zen G, Sebe N, Messelodi S (2013) A prototype learning framework using EMD: application to complex scenes analysis. IEEE Trans Pattern Anal Mach Intell 35:513–526
Sabokrou M, Fathy M, Hoseini M (2016) Video anomaly detection and localisation based on the sparsity and reconstruction error of auto-encoder. Electron Lett 52:1122–1124
Xu D, Ricci E, Yan Y, Song J, Sebe N (2015) Learning deep representations of appearance and motion for anomalous event detection. In: The British machine vision conference
Hasan M, Choi J, Neumann J, Roy-Chowdhury AK, Davis LS (2016) Learning temporal regularity in video sequences. In: IEEE conference on computer vision and pattern recognition, pp 733–742
Zhou XG, Zhang LQ (2015) Abnormal event detection using recurrent neural network. In: International conference on computer science and applications, pp 222–226
Chong YS, Tay YH (2017) Abnormal event detection in videos using spatiotemporal autoencoder. In: International symposium on neural networks, pp 189–196
Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. Adv Neural Inf Process Syst 3:2672–2680
Ravanbakhsh M, Nabi M, Sangineto E, Marcenaro L, Regazzoni C, Sebe N (2017) Abnormal event detection in videos using generative adversarial nets. In: International conference on image processing
Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, pp 234–241
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Graves A, Jaitly N (2014) Towards end-to-end speech recognition with recurrent neural networks. In: Proceedings of the 31st international conference on machine learning (ICML-14), pp 1764–1772
Yildirim O (2018) A novel wavelet sequences based on deep bidirectional LSTM network model for ECG signal classification. In: Computers in biology and medicine, S0010482518300738
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Adv Neural Inf Process Syst 4:3104–3112
Cho K, Courville A, Bengio Y (2015) Describing multimedia content using attention-based encoder–decoder networks. IEEE Trans Multimed 17(11):1875–1886
Kim HY, Won CH (2018) Forecasting the volatility of stock price index: a hybrid model integrating LSTM with multiple GARCH-type models. In: Expert systems with applications, S0957417418301416
Venugopalan S, Xu H, Donahue J et al (2014) Translating videos to natural language using deep recurrent neural networks. arXiv preprint arXiv:1412.4729
Vinyals O, Toshev A, Bengio S et al (2015) Show and tell: a neural image caption generator. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3156–3164
Srivastava N, Mansimov E, Salakhutdinov R (2015) Unsupervised learning of video representations using LSTMs. In: International conference on machine learning, pp 843–852
Wang X, Gao L, Song J et al (2016) Beyond frame-level CNN: saliency-aware 3D CNN with LSTM for video action recognition. IEEE Signal Process Lett PP(99):1–1
Song S, Lan C, Xing J et al (2018) Spatio-temporal attention based LSTM networks for 3D action recognition and detection. IEEE Trans Image Process 1–1
Wang L, Zhou F, Li Z et al (2018) Abnormal event detection in videos using hybrid spatio-temporal autoencoder. In: 2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, pp 2276–2280
Ji Y, Cohn T, Kong L et al (2015) Document context language models. arXiv preprint arXiv:1511.03962
Shi X, Chen Z, Wang H, Yeung DY, Wong WK, Woo WC (2015) Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Conference and workshop on neural information processing systems, pp 802–810
Wu L, Shen C, van den Hengel A (2016) Convolutional LSTM networks for video-based person re-identification. arXiv preprint arXiv:1606.01609v1
He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for image recognition. In: IEEE conference on computer vision and pattern recognition, pp 770–778
Huang G, Liu Z, Weinberger KQ, van der Maaten L (2016) Densely connected convolutional networks. In: IEEE conference on computer vision and pattern recognition
Graves A (2013) Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850
Medel JR (2016) Anomaly detection using predictive convolutional long short-term memory units. Master’s thesis
Vondrick C, Pirsiavash H, Torralba A (2016) Anticipating visual representations from unlabeled video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 98–106
Kozlov Y, Weinkauf T. Persistence1D: extracting and filtering minima and maxima of 1D functions. http://people.mpi-inf.mpg.de/~weinkauf/notes/persistence1d.html
Mahadevan V, Li W, Bhalodia V, Vasconcelos N (2010) Anomaly detection in crowded scenes. In: IEEE conference on computer vision and pattern recognition, pp 1975–1981
Lu C, Shi J, Jia J (2013) Abnormal event detection at 150 FPS in MATLAB. In: IEEE international conference on computer vision, no 3, pp 2720–2727
Adam A, Rivlin E, Shimshoni I et al (2008) Robust real-time unusual event detection using multiple fixed-location monitors. IEEE Trans Pattern Anal Mach Intell 30(3):555–560
Wang T, Snoussi H (2013) Histograms of optical flow orientation for abnormal events detection. In: 2013 IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS). IEEE, pp 45–52
Acknowledgements
This is an extended version of our paper accepted at the 2018 IEEE ICIP [24]. This work is supported by the National Natural Science Foundation of China (NSFC) (No. 61471123).
Author information
Authors and Affiliations
School of Instrumentation Science and Opto-electronics Engineering, Beihang University, Beijing, 100191, China
Fuqiang Zhou, Lin Wang, Zuoxin Li & Wangxia Zuo
Department of Electronic Information Engineering, Foshan University, Foshan, 528000, China
Haishu Tan
Corresponding author
Correspondence to Fuqiang Zhou.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zhou, F., Wang, L., Li, Z. et al. Unsupervised Learning Approach for Abnormal Event Detection in Surveillance Video by Hybrid Autoencoder. Neural Process Lett 52, 961–975 (2020). https://doi.org/10.1007/s11063-019-10113-w