• Title/Summary/Keyword: time mining (시간 마이닝)

The use of Local API(Anomaly Process Instances) Detection for Analyzing Container Terminal Event (로컬 API(Anomaly Process Instances) 탐지법을 이용한 컨테이너 터미널 이벤트 분석)

  • Jeon, Daeuk;Bae, Hyerim
    • The Journal of Society for e-Business Studies / v.20 no.4 / pp.41-59 / 2015
  • Information systems have been developed and deployed across many business areas, so large volumes of historical data (log data) have accumulated and need to be analyzed. Previous studies have focused on discovering relationships between events rather than identifying anomalous instances, which were treated as noise and simply ignored. However, such anomalous instances can occur repeatedly, so a methodology for detecting them is needed. In this paper, we propose LAPID (Local Anomaly Process Instance Detection), a methodology for discriminating anomalous process instances in log data. We define a distance metric based on the activity relation matrix of each instance and use it to detect APIs (Anomaly Process Instances). To verify the methodology, we identified the characteristics of exceptional situations in log data, and we demonstrate it in an experiment on real data from a domestic port terminal.
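
The abstract above does not specify how the activity relation matrix or the distance metric is built, so the following is only a minimal sketch under stated assumptions: the matrix is taken to be a binary directly-follows matrix, the distance is the number of differing cells, and the anomaly score is the mean distance to the k nearest instances (a simple stand-in, not the LAPID procedure itself). The activity names in the usage note are hypothetical.

```python
def relation_matrix(trace, activities):
    """Binary directly-follows matrix of one process instance (assumed form)."""
    index = {a: i for i, a in enumerate(activities)}
    n = len(activities)
    matrix = [[0] * n for _ in range(n)]
    for prev, nxt in zip(trace, trace[1:]):
        matrix[index[prev]][index[nxt]] = 1
    return matrix


def distance(m1, m2):
    """Number of cells in which two relation matrices differ (assumed metric)."""
    return sum(x != y for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


def knn_scores(traces, k=2):
    """Mean distance of each instance to its k nearest neighbours.

    A larger score means the instance is less similar to its local
    neighbourhood. Needs at least two traces.
    """
    if len(traces) < 2:
        raise ValueError("need at least two traces")
    activities = sorted({a for t in traces for a in t})
    mats = [relation_matrix(t, activities) for t in traces]
    scores = []
    for i, mi in enumerate(mats):
        dists = sorted(distance(mi, mj) for j, mj in enumerate(mats) if j != i)
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest))
    return scores
```

For example, with three identical traces `["gate_in", "yard", "load"]` and one trace `["gate_in", "load", "yard", "load"]`, only the last instance receives a non-zero score.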

A Study of protective measures of the source program for the development of the Internet of Things (IoT): Protection of the program as well as plagiarism research (사물인터넷(IoT)발전을 위한 소스프로그램 보호방안 연구: 프로그램의 보호와 유사표절 연구)

  • Lee, Jong-Sik
    • Journal of the Korea Convergence Society / v.9 no.4 / pp.31-45 / 2018
  • The recent dramatic development of computer and internet technology has intensified disputes over software for computers and smart devices. Research on software has flourished as fierce competition among nations over software development has become a political issue. In particular, industrial growth in Ethernet-based big data and the IoT (Internet of Things) has promoted the building and development of open source programs based on Java, Xcode, and C. Under these circumstances, software piracy has emerged as an issue despite the basic security policies that protect software intellectual property rights, so protecting the originality rights of source program licenses is of substantial importance. At the same time, protecting developers' source technology may hinder the advancement of industry and culture through program development. This study discusses ways to enhance the legal stability of IoT application program development and to make program plagiarism inspection more precise by analyzing source programs with a newly introduced text mining technique, and thereby suggests an alternative means of protection against the infringement of personal information caused by program duplication.

Weighted Bayesian Automatic Document Categorization Based on Association Word Knowledge Base by Apriori Algorithm (Apriori알고리즘에 의한 연관 단어 지식 베이스에 기반한 가중치가 부여된 베이지만 자동 문서 분류)

  • 고수정;이정현
    • Journal of Korea Multimedia Society / v.4 no.2 / pp.171-181 / 2001
  • Previous Bayesian document categorization methods require considerable time and effort for word clustering and hardly reflect the semantic information between words. In this paper, we propose a weighted Bayesian document categorization method based on an association word knowledge base acquired by a mining technique. The proposed method constructs a weighted association word knowledge base from the documents in a training set. A classifier using Bayesian probability then categorizes documents based on the constructed knowledge base. To evaluate the proposed method, we compare its experimental results with those of a weighted Bayesian method using a vocabulary dictionary built by mutual information, a weighted Bayesian method, and a simple Bayesian method. The results show that the weighted Bayesian method using the association word knowledge base improves performance by 0.87%, 2.77%, and 5.09% over the weighted Bayesian method using the vocabulary dictionary by mutual information, the weighted Bayesian method, and the simple Bayesian method, respectively.

A Functional Unit Dynamic API Birthmark for Windows Programs Code Theft Detection (Windows 프로그램 도용 탐지를 위한 기능 단위 동적 API 버스마크)

  • Choi, Seok-Woo;Cho, Woo-Young;Han, Tai-Sook
    • Journal of KIISE:Software and Applications / v.36 no.9 / pp.767-776 / 2009
  • A software birthmark is a set of characteristics extracted from a program itself to detect code theft. A dynamic API birthmark is extracted from the run-time API call sequences of a program. The dynamic Windows API birthmarks of Tamada et al. are extracted from API call sequences during the startup period of a program. Therefore, these dynamic birthmarks cannot reflect the characteristics of the program's main functions. In this paper, we propose a functional unit birthmark (FDAPI), defined as the API call sequences recorded during the execution of essential functions of a program. To determine whether functional units of a program were copied from an original program, two FDAPIs are extracted by executing the programs with the same input. The FDAPIs are compared using the semi-global alignment algorithm to compute a similarity between the two programs. Programs with the same functionality are compared to show the credibility of our birthmark. Binary executables compiled differently from the same source code are compared to demonstrate the resilience of our birthmark. The experimental results show that our birthmark can detect module theft of software, to which the existing birthmarks of Tamada et al. cannot be applied.
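
The comparison step described above can be illustrated with a short sketch of semi-global alignment over two API call sequences. The scoring values (match +1, mismatch -1, gap -1), the normalization by the shorter sequence length, and the example call names are assumptions made for illustration; the abstract does not state the parameters the authors used.

```python
def semi_global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best semi-global alignment score of sequences a and b.

    Leading and trailing gaps in either sequence are not penalized,
    so a short sequence can align inside a longer one at no cost.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Row 0 and column 0 stay at 0: leading gaps are free.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trailing gaps are free: take the best cell in the last row or column.
    return max(max(score[-1]), max(row[-1] for row in score))


def similarity(a, b):
    """Score normalized by the shorter length (assumed normalization)."""
    if not a or not b:
        return 0.0
    return semi_global_score(a, b) / min(len(a), len(b))
```

With `a = ["CreateFileW", "ReadFile", "CloseHandle"]` and `b = ["RegOpenKeyExW", "CreateFileW", "ReadFile", "CloseHandle", "ExitProcess"]`, the shorter sequence aligns entirely inside the longer one, so `similarity(a, b)` is 1.0 under these assumed scores.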

Automatic Processing Techniques of Rotorcraft Flight Data Using Data Mining (회전익항공기 운동모델 개발을 위한 데이터마이닝을 이용한 비행데이터 자동 처리 기법)

  • Oh, Hyeju;Jo, Sungbeom;Choi, Keeyoung;Roh, Eun-Jung;Kang, Byung-Ryong
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.46 no.10 / pp.823-832 / 2018
  • In general, the fidelity of an aircraft dynamic model is verified by comparison with the flight test results of the target aircraft. Therefore, reference flight data for performance comparison must be extracted. Extracting data useful for fidelity comparison from the vast quantity of noisy flight test data requires a lot of time and manpower. In particular, processing flight data is complex because rotorcraft are highly non-linear, with effects such as coupling and wake interference, and perform various maneuvers such as hover and backward flight. This study defines flight data processing criteria for rotorcraft and provides procedures and methods for the automated processing of static and dynamic flight data using data mining techniques. Finally, the presented methods are validated using flight data.

A Study of Research on Methods of Automated Biomedical Document Classification using Topic Modeling and Deep Learning (토픽모델링과 딥 러닝을 활용한 생의학 문헌 자동 분류 기법 연구)

  • Yuk, JeeHee;Song, Min
    • Journal of the Korean Society for Information Management / v.35 no.2 / pp.63-88 / 2018
  • This research evaluated differences in classification performance across feature selection methods (the LDA topic model and Doc2Vec, which is based on word embedding using deep learning), feature corpus sizes, and classification algorithms. In addition, to find the feature corpus with the highest classification performance, experiments were conducted with feature corpora composed differently according to location within the document and with the feature corpus size adjusted. Finally, the deep learning experiments evaluated training frequency and specifically considered information for context inference. This study constructed a biomedical document dataset, Disease-35083, which consists of biomedical scholarly documents provided by PMC and categorized by disease category. The study verifies which type and size of feature corpus produces the highest performance, and also suggests feature corpora that can be extended to specific features by showing efficiency in training time. Additionally, it compares deep learning with existing methods and suggests an appropriate method for each classification environment.

Analysis on Status and Trends of SIAM Journal Papers using Text Mining (텍스트마이닝 기법을 활용한 미국산업응용수학 학회지의 연구 현황 및 동향 분석)

  • Kim, Sung-Yeun
    • The Journal of the Korea Contents Association / v.20 no.7 / pp.212-222 / 2020
  • The purpose of this study is to understand the current status and trends of research published by the Society for Industrial and Applied Mathematics, a world leader in the field of industrial mathematics. To this end, titles and abstracts were collected from 6,255 research articles published between 2016 and 2019, and the R program was used to fit an LDA topic model and a regression model. The analyses showed, first, that a variety of topics have been studied in industrial mathematics, such as algebra, discrete mathematics, geometry, topological mathematics, and probability and statistics. Second, the ascending research subjects were fluid mechanics, graph theory, and stochastic differential equations, and the descending research subjects were computational theory and classical geometry. By clarifying the overall flows and changes in the intellectual structure of industrial mathematics, the results are expected to give researchers implications for the future direction of research and for building an industrial mathematics curriculum that reflects the zeitgeist in the field of education.

Classification of False Alarms based on the Decision Tree for Improving the Performance of Intrusion Detection Systems (침입탐지시스템의 성능향상을 위한 결정트리 기반 오경보 분류)

  • Shin, Moon-Sun;Ryu, Keun-Ho
    • Journal of KIISE:Databases / v.34 no.6 / pp.473-482 / 2007
  • A network-based IDS (Intrusion Detection System) gathers network packet data and classifies it as attack or normal. It raises an alarm when a possible intrusion occurs, but it often outputs a large amount of low-level, incomplete alert information. A large volume of incomplete alerts, which can be unmanageable and mixed with false alerts, can prevent intrusion response systems and security administrators from adequately understanding and analyzing the state of network security and from initiating an appropriate response in a timely fashion. It is therefore important for the security administrator to reduce the redundancy of alerts, integrate and correlate security alerts, construct attack scenarios, and present high-level aggregated information. The false alarm rate is the ratio of the number of normal connections misclassified as attacks to the total number of normal connections. In this paper, we propose a false alarm classification model that reduces the false alarm rate using classification analysis, a data mining technique. The proposed model classifies the alarms from intrusion detection systems as false alerts or true attacks. Our approach is useful for reducing false alerts and improving the detection rate of network-based intrusion detection systems.
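
The false alarm rate defined in the abstract above can be written as a small function. This is a sketch that only restates the definition (false alarms divided by all normal connections); the label strings and the sample data in the usage note are hypothetical and do not come from the paper.

```python
def false_alarm_rate(actual, predicted, attack_label="attack"):
    """Share of normal connections that were flagged as attacks.

    actual and predicted are equal-length sequences of labels; any label
    other than attack_label is treated as a normal connection.
    """
    normal_total = 0
    false_alarms = 0
    for truth, guess in zip(actual, predicted):
        if truth != attack_label:
            normal_total += 1
            if guess == attack_label:
                false_alarms += 1
    if normal_total == 0:
        raise ValueError("no normal connections in the data")
    return false_alarms / normal_total
```

For instance, if four of six connections are actually normal and one of those four is flagged as an attack, the function returns 0.25.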

Storage Policies for Versions Management of XML Documents using a Change Set (변경 집합을 이용한 XML 문서의 버전 관리를 위한 저장 기법)

  • Yun Hong Won
    • The KIPS Transactions:PartD / v.11D no.7 s.96 / pp.1349-1356 / 2004
  • Interest in version management is increasing in electronic commerce, which requires data mining, and in document processing systems related to digital government applications. In this paper, we define a change set for managing historical information and maintaining XML documents over a long period of time, and we propose several storage policies for XML documents using a change set. A change set includes a change operation set and temporal dimensions, and a change operation set is composed of schema change operations and data change operations. We propose three storage policies using a change set: (1) storing all the change sets, (2) storing the change sets and the versions periodically, and (3) storing the aggregation of change sets and the versions at a proper point in time. We also compare the performance of the existing storage policy with that of the proposed storage policies. Through the performance evaluation, we show that storing the aggregation of change sets and the versions at a proper point in time outperforms the others.

Automatic Construction of Class Hierarchies and Named Entity Dictionaries using Korean Wikipedia (한국어 위키피디아를 이용한 분류체계 생성과 개체명 사전 자동 구축)

  • Bae, Sang-Joon;Ko, Young-Joong
    • Journal of KIISE:Computing Practices and Letters / v.16 no.4 / pp.492-496 / 2010
  • Wikipedia, an open encyclopedia, contains immense human knowledge written by thousands of volunteer editors, and its reliability is high. In this paper, we propose automatically constructing a Korean named entity dictionary using several features of Wikipedia. First, we generate class hierarchies using the class information from each Wikipedia article. Second, the title of each article is mapped to our class hierarchies, and we calculate the entropy value of the root node in each class hierarchy. Finally, we construct a high-performance named entity dictionary by removing the class hierarchies whose entropy value exceeds a threshold. Our experiments achieved an overall F1-measure of 81.12% (precision: 83.94%, recall: 78.48%).
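
The entropy filter and the reported F1-measure can be illustrated with a brief sketch. The abstract does not say which distribution the root-node entropy is computed over, so the code assumes it is the Shannon entropy of the entity-type labels of the titles mapped under each root; the hierarchy names, labels, and threshold are hypothetical. The `f1` function is the standard harmonic mean of precision and recall.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(labels).values()
    )


def filter_hierarchies(hierarchies, threshold):
    """Keep hierarchies whose root entropy does not exceed the threshold.

    hierarchies maps a root name to the labels of the titles mapped
    under it (an assumed representation).
    """
    return {
        root: labels
        for root, labels in hierarchies.items()
        if entropy(labels) <= threshold
    }


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

As a consistency check, `f1(83.94, 78.48)` rounds to 81.12, which matches the F1-measure reported in the abstract.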