Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12248)
Abstract
Image spam emails are often used to evade text-based spam filters, which detect spam by its frequently used keywords. In this paper, we propose a new image spam email detection tool called DeepCapture, built on a convolutional neural network (CNN) model. There have been many efforts to detect image spam emails, but existing detectors suffer significant performance degradation on entirely new and unseen image spam emails because they overfit during the training phase. To address this challenging issue, we focus on developing a model that is more robust to overfitting. Our key idea is to build a CNN-XGBoost framework with only eight layers and to train it on a large number of samples generated by data augmentation techniques tailored to the image spam detection task. To show the feasibility of DeepCapture, we evaluate its performance on publicly available datasets consisting of 6,000 spam and 2,313 non-spam image samples. The experimental results show that DeepCapture achieves an F1-score of 88%, a 6 percentage point improvement over the best existing spam detection model, CNN-SVM [19], which has an F1-score of 82%. Moreover, DeepCapture outperformed existing image spam detection solutions on new and unseen image datasets.
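The F1-score reported above is the harmonic mean of precision and recall for the spam class. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `f1_score` and the confusion counts in the usage example are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2PR / (P + R) from confusion counts for the positive (spam) class.

    tp: spam images correctly flagged as spam
    fp: non-spam images wrongly flagged as spam
    fn: spam images that were missed
    """
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the formula.
    print(f"{f1_score(tp=50, fp=25, fn=25):.3f}")
```

Because F1 ignores true negatives, it is a reasonable summary when the classes are imbalanced, as they are in the 6,000 spam versus 2,313 non-spam split described above.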
References
Annadatha, A., Stamp, M.: Image spam analysis and detection. J. Comput. Virol. Hacking Tech. 14(1), 39–52 (2016). https://doi.org/10.1007/s11416-016-0287-x
Attar, A., Rad, R.M., Atani, R.E.: A survey of image spamming and filtering techniques. Artif. Intell. Rev. 40(1), 71–105 (2013)
Bappy, J.H., Roy-Chowdhury, A.K.: CNN based region proposals for efficient object detection. In: Proceedings of the 23rd International Conference on Image Processing, pp. 3658–3662 (2016)
Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13(10), 281–305 (2012)
Biggio, B., Fumera, G., Pillai, I., Roli, F.: A survey and experimental evaluation of image spam filtering techniques. Pattern Recognit. Lett. 32(10), 1436–1446 (2011)
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016)
Dredze, M., Gevaryahu, R., Elias-Bachrach, A.: Learning fast classifiers for image spam. In: Proceedings of the 4th Conference on Email and Anti-Spam, pp. 487–493 (2007)
Fatichah, C., Lazuardi, W.F., Navastara, D.A., Suciati, N., Munif, A.: Image spam detection on Instagram using convolutional neural network. In: Proceedings of the 3rd Conference on Intelligent and Interactive Computing, pp. 295–303 (2019)
Fumera, G., Pillai, I., Roli, F.: Spam filtering based on the analysis of text information embedded into images. J. Mach. Learn. Res. 7, 2699–2720 (2006)
Gao, Y., et al.: Image spam hunter. In: Proceedings of the 32nd International Conference on Acoustics, Speech and Signal Processing, pp. 1765–1768 (2008)
Ismail, A., Khawandi, S., Abdallah, F.: Image spam detection: problem and existing solution. Int. Res. J. Eng. Technol. 6(2), 1696–1710 (2019)
Kim, J., Kim, H., Lee, J.H.: Analysis and comparison of fax spam detection algorithms. In: Proceedings of the 11th International Conference on Ubiquitous Information Management and Communication, pp. 1–4 (2017)
Klangpraphant, P., Bhattarakosol, P.: PIMSI: a partial image SPAM inspector. In: Proceedings of the 5th International Conference on Future Information Technology, pp. 1–6 (2010)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
Kumar, P., Biswas, M.: SVM with Gaussian kernel-based image spam detection on textual features. In: Proceedings of the 3rd International Conference on Computational Intelligence and Communication Technology, pp. 1–6 (2017)
Leszczynski, M.: Emails going to spam? 12 reasons why that happens and what you can do about it (2019). https://www.getresponse.com/blog/why-emails-go-to-spam
Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of the 30th International Conference on Machine Learning, pp. 1–6 (2013)
Ng, A.Y.: Feature selection, L1 vs. L2 regularization, and rotational invariance. In: Proceedings of the 21st International Conference on Machine Learning, pp. 1–8 (2004)
Shang, E.X., Zhang, H.G.: Image spam classification based on convolutional neural network. In: Proceedings of the 15th International Conference on Machine Learning and Cybernetics, pp. 398–403 (2016)
Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J. Big Data 6(1), 60 (2019)
Soranamageswari, M., Meena, C.: Statistical feature extraction for classification of image spam using artificial neural networks. In: Proceedings of the 2nd International Conference on Machine Learning and Computing, pp. 101–105 (2010)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Acknowledgement
Hyoungshick Kim is the corresponding author. This work has been supported in part by the Cyber Security Research Centre Limited, whose activities are partially funded by the Australian Government’s Cooperative Research Centres Programme, and by the NRF grant (No. 2017H1D8A2031628) and the ITRC Support Program (IITP-2019-2015-0-00403) funded by the Korea government. The authors would like to thank all the anonymous reviewers for their valuable feedback.
Author information
Authors and Affiliations
Sungkyunkwan University, Suwon, Republic of Korea
Bedeuro Kim & Hyoungshick Kim
Data61, CSIRO, Sydney, Australia
Bedeuro Kim, Sharif Abuadbba & Hyoungshick Kim
Cyber Security Cooperative Research Centre, Joondalup, Australia
Sharif Abuadbba
Corresponding author
Correspondence to Hyoungshick Kim.
Editor information
Editors and Affiliations
Faculty of Information Technology, Monash University, Clayton, VIC, Australia
Joseph K. Liu
Murdoch University, Perth, WA, Australia
Hui Cui
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Kim, B., Abuadbba, S., Kim, H. (2020). DeepCapture: Image Spam Detection Using Deep Learning and Data Augmentation. In: Liu, J., Cui, H. (eds) Information Security and Privacy. ACISP 2020. Lecture Notes in Computer Science, vol 12248. Springer, Cham. https://doi.org/10.1007/978-3-030-55304-3_24
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55303-6
Online ISBN: 978-3-030-55304-3
eBook Packages: Computer Science, Computer Science (R0)