• Title/Summary/Keyword: sequential pattern mining

Search Result 82, Processing Time 0.017 seconds

Emotion Prediction of Paragraph using Big Data Analysis (빅데이터 분석을 이용한 문단 내의 감정 예측)

  • Kim, Jin-su
    • Journal of Digital Convergence
    • /
    • v.14 no.11
    • /
    • pp.267-273
    • /
    • 2016
  • Creation and Sharing of information which is structured data as well as various unstructured data. makes progress actively through the spread of mobile. Recently, Big Data extracts the semantic information from SNS and data mining is one of the big data technique. Especially, the general emotion analysis that expresses the collective intelligence of the masses is utilized using large and a variety of materials. In this paper, we propose the emotion prediction system architecture which extracts the significant keywords from social network paragraphs using n-gram and Korean morphological analyzer, and predicts the emotion using SVM and these extracted emotion features. The proposed system showed 82.25% more improved recall rate in average than previous systems and it will help extract the semantic keyword using morphological analysis.

An Interpretable Log Anomaly System Using Bayesian Probability and Closed Sequence Pattern Mining (베이지안 확률 및 폐쇄 순차패턴 마이닝 방식을 이용한 설명가능한 로그 이상탐지 시스템)

  • Yun, Jiyoung;Shin, Gun-Yoon;Kim, Dong-Wook;Kim, Sang-Soo;Han, Myung-Mook
    • Journal of Internet Computing and Services
    • /
    • v.22 no.2
    • /
    • pp.77-87
    • /
    • 2021
  • With the development of the Internet and personal computers, various and complex attacks begin to emerge. As the attacks become more complex, signature-based detection become difficult. It leads to the research on behavior-based log anomaly detection. Recent work utilizes deep learning to learn the order and it shows good performance. Despite its good performance, it does not provide any explanation for prediction. The lack of explanation can occur difficulty of finding contamination of data or the vulnerability of the model itself. As a result, the users lose their reliability of the model. To address this problem, this work proposes an explainable log anomaly detection system. In this study, log parsing is the first to proceed. Afterward, sequential rules are extracted by Bayesian posterior probability. As a result, the "If condition then results, post-probability" type rule set is extracted. If the sample is matched to the ruleset, it is normal, otherwise, it is an anomaly. We utilize HDFS datasets for the experiment, resulting in F1score 92.7% in test dataset.