Association Analysis for Detecting Abnormal in Graph Database Environment

An Association Analysis Technique for Detecting Anomalies in a Graph Database Environment

  • Received : 2020.05.07
  • Accepted : 2020.08.20
  • Published : 2020.08.28

Abstract

The Fourth Industrial Revolution and rapid changes in the data environment have exposed the technical limitations of existing relational databases (RDBs). Interest in graph database (GDB) technology is growing across fields such as IDC, finance, and insurance as a new way to analyze unstructured data. A graph database is an efficient technology for representing interlinked data and for analyzing associations across a wide-ranging network. This study extended an existing RDB to a GDB model and applied machine learning algorithms (pattern recognition, clustering, path distance, and core extraction) to detect new signs of abnormal behavior. The performance analysis confirmed that abnormal-behavior detection performance improved greatly (by a factor of about 180 or more) and that patterns of abnormal signs beyond five steps, which could not be analyzed with the RDB, could be extracted.
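The "path distance" idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that the "steps" are relationship hops between entities, and the node names (`acct_A`, `ip_1`, and so on) and the helper `hop_distances` are hypothetical. It uses a plain breadth-first search over an in-memory adjacency map to find how many hops separate a starting entity from every entity reachable within a chosen limit.

```python
from collections import deque


def hop_distances(graph, start, max_hops):
    """Breadth-first search from `start`.

    Returns a dict mapping each node reachable within `max_hops`
    relationship hops to its hop count (the start node maps to 0).
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


# Hypothetical toy data: accounts linked through shared IPs and devices.
edges = [
    ("acct_A", "ip_1"),
    ("ip_1", "acct_B"),
    ("acct_B", "device_7"),
    ("device_7", "acct_C"),
    ("acct_C", "ip_9"),
    ("ip_9", "acct_D"),
]

# Build an undirected adjacency map from the edge list.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

distances = hop_distances(graph, "acct_A", max_hops=6)
for node, hops in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{node}: {hops} hop(s) from acct_A")
```

In a relational schema, each additional hop of this kind is typically expressed as another self-join on the relationship table, which is one common reason deep traversals are easier to express and run in a graph model.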
