http://dx.doi.org/10.13088/jiis.2022.28.2.101

Anomaly Detection Methodology Based on Multimodal Deep Learning  

Lee, DongHoon (Graduate School of Business IT, Kookmin University)
Kim, Namgyu (Graduate School of Business IT, Kookmin University)
Publication Information
Journal of Intelligence and Information Systems / v.28, no.2, 2022, pp. 101-125
Abstract
Advances in computing technology and cloud environments have driven the development of deep learning, and attempts to apply it to various fields are increasing. A typical example is anomaly detection, a technique for identifying values or patterns that deviate from normal data. Among the representative types of anomaly, contextual anomalies are particularly difficult to detect because they require an understanding of the overall situation. Anomaly detection on image data is generally performed with a model pre-trained on large-scale data. However, because such pre-trained models were built with a focus on object classification, they are of limited use for anomaly detection that must understand complex situations formed by multiple objects. In this study, we therefore propose a two-step pre-trained model for detecting abnormal situations. The methodology performs additional learning through image captioning so that the model understands not only individual objects but also the complex situation they form. Specifically, it transfers the knowledge of a model pre-trained for object classification on ImageNet data to an image captioning model, and uses captions that describe the situation represented by each image. The weights obtained by learning situational characteristics from images and captions are then extracted and fine-tuned to produce an anomaly detection model. To evaluate the proposed methodology, we performed an anomaly detection experiment on 400 situational images; the results showed that it outperformed the existing pre-trained model in both anomaly detection accuracy and F1-score.
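The two evaluation measures named in the abstract, accuracy and F1-score, can be illustrated with a minimal sketch. The function name `accuracy_and_f1` and the toy labels below are assumptions made for illustration only; they are not the authors' code, data, or results.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    `positive` is the label treated as the anomaly class (assumed to be 1 here).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # anomalies caught
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false alarms
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # anomalies missed
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy labels (1 = abnormal situation, 0 = normal), not from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Reporting F1 alongside accuracy is useful when anomalies are rare, since a model that labels everything as normal can still score high accuracy while its F1 on the anomaly class is zero.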
Keywords
Object Recognition; Deep Learning; Multi-modal; Image Captioning; Anomaly Detection