Anomaly Detection Methodology Based on Multimodal Deep Learning

  • Lee, DongHoon (이동훈; Graduate School of Business IT, Kookmin University)
  • Kim, Namgyu (김남규; Graduate School of Business IT, Kookmin University)
  • Received : 2022.05.24
  • Accepted : 2022.06.19
  • Published : 2022.06.30

Abstract

Advances in computing technology and cloud environments have accelerated the development of deep learning, and attempts to apply it to diverse fields are increasing. A typical application is anomaly detection, a technique for identifying values or patterns that deviate from normal data. Among the representative types of anomalies, contextual anomalies are particularly difficult to detect because they require an understanding of the overall situation. Anomaly detection in image data is generally performed with a model pre-trained on large-scale data. However, because such pre-trained models were built with a focus on object classification, they are of limited use for detecting abnormal situations, which requires understanding the complex situations created by multiple objects. In this study, we therefore propose a two-step pre-trained model for detecting abnormal situations. The methodology performs additional training on image captioning so that the model understands not only individual objects but also the situations they create. Specifically, the proposed methodology transfers the knowledge of a model pre-trained on object classification with ImageNet data to an image captioning model, and trains that model with captions describing the situation represented in each image. The weights that have learned situational characteristics from images and captions are then extracted and fine-tuned to generate an anomaly detection model. To evaluate the proposed methodology, we performed an anomaly detection experiment on 400 situational images. The results showed that the proposed methodology outperformed the conventional pre-trained model in terms of anomaly detection accuracy and F1-score.
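The weight hand-off between the stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: plain dictionaries stand in for a deep learning framework's parameter tables, and every name (`encoder.`, `decoder.text`, `head.weight`, `extract_encoder`, `build_anomaly_model`) and every numeric value is a hypothetical placeholder chosen for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for a framework's parameter table: name -> values.
Weights = Dict[str, List[float]]


def extract_encoder(caption_weights: Weights, prefix: str = "encoder.") -> Weights:
    """Keep only the image-encoder parameters of a captioning model.

    The prefix is stripped so the result can be loaded into another model.
    """
    return {
        name[len(prefix):]: list(values)
        for name, values in caption_weights.items()
        if name.startswith(prefix)
    }


def build_anomaly_model(encoder: Weights, head_size: int = 2) -> Weights:
    """Attach a freshly initialised head (normal vs. abnormal) to the encoder."""
    model = {f"encoder.{name}": list(values) for name, values in encoder.items()}
    model["head.weight"] = [0.0] * head_size  # new layer, learned during fine-tuning
    return model


# Starting point: an encoder pre-trained on object classification (placeholder values).
imagenet_encoder: Weights = {"conv1": [0.1, 0.2], "conv2": [0.3, 0.4]}

# Step 1: the captioning model starts from that encoder and adds a text decoder.
caption_model: Weights = {f"encoder.{k}": list(v) for k, v in imagenet_encoder.items()}
caption_model["decoder.text"] = [0.0, 0.0]

# Assumption for illustration: caption training has updated the encoder weights.
caption_model["encoder.conv2"] = [0.35, 0.45]

# Step 2: drop the decoder and reuse the situation-aware encoder for anomaly detection.
situational_encoder = extract_encoder(caption_model)
anomaly_model = build_anomaly_model(situational_encoder)

print(sorted(anomaly_model))
```

In this sketch the decoder parameters are discarded after caption training, so only the image-side weights carry over to the anomaly detection model, and the new head is the only part that starts untrained before fine-tuning.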

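With recent advances in computing technology and improvements in cloud environments, deep learning has developed rapidly, and attempts to apply it to a variety of fields are increasing. A representative example is anomaly detection, a technique for identifying values or patterns that deviate from normal data. Of the representative types of anomaly (point, collective, and contextual), contextual anomalies are known to be especially difficult to detect because they require grasping the overall situation. Abnormal situations in image data are generally detected with a pre-trained model trained on large-scale data. However, because such pre-trained models were built with a focus on classifying the object classes in an image, they are limited when applied directly to abnormal situation detection, which must recognize the complex situations created by multiple objects. This study therefore proposes a methodology for building a two-step pre-trained model suited to abnormal situation detection, which requires understanding not only the objects but also the situations they create; the model additionally performs image captioning training on top of a pre-trained model that has learned object class classification. Specifically, the proposed methodology transfers a pre-trained model that has learned class classification with ImageNet data to an image captioning model, and trains it using captions that describe the situation shown in each image as input data. The weights that have learned situational features from the images and captions are then extracted and fine-tuned to produce an abnormal situation detection model. To evaluate the proposed methodology, we conducted an anomaly detection experiment on a self-constructed dataset of 400 situational images, and the results confirmed that the proposed methodology outperforms the existing simple pre-trained model in terms of abnormal situation detection accuracy and F1-score.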
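The two evaluation measures named in the abstract, accuracy and F1-score, can be computed for a binary normal/abnormal task as in the following minimal sketch. The function name `accuracy_and_f1` is hypothetical, and the labels are invented toy data, not results from the paper; label 1 is assumed to mean "abnormal".

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, where 1 marks the abnormal class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # abnormal, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # abnormal, missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return accuracy, f1


# Toy labels for eight images (invented for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")
```

Reporting F1 alongside accuracy is useful when abnormal images are rarer than normal ones, since accuracy alone can look high even when many abnormal cases are missed.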

Acknowledgement

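This research was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea in 2021 (NRF-2021S1A5A2A01061459). It was also supported by the "High-Performance Computing Support" project of the Ministry of Science and ICT and the National IT Industry Promotion Agency (NIPA).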
