• Title/Summary/Keyword: 빈발 패턴

Search Result 128, Processing Time 0.031 seconds

Semi-Automatic Ontology Generation about XML Documents using Data Mining Method (데이터 마이닝 기법을 이용한 XML 문서의 온톨로지 반자동 생성)

  • Gu Mi-Sug;Hwang Jeong-Hee;Ryu Keun-Ho;Hong Jang-Eui
    • The KIPS Transactions:PartD
    • /
    • v.13D no.3 s.106
    • /
    • pp.299-308
    • /
    • 2006
  • As recently XML is becoming the standard of exchanging web documents and public documentations, XML data are increasing in many areas. To retrieve the information about XML documents efficiently, the semantic web based on the ontology is appearing. The existing ontology has been constructed manually and it was time and cost consuming. Therefore in this paper, we propose the semi-automatic ontology generation technique using the data mining technique, the association rules. The proposed method solves what type and how many conceptual relationships and determines the ontology domain level for the automatic ontology generation, using the data mining algorithm. Appying the association rules to the XML documents, we intend to find out the conceptual relationships to construct the ontology, finding the frequent patterns of XML tags in the XML documents. Using the conceptual ontology domain level extracted from the data mining, we implemented the semantic web based on the ontology by XML Topic Maps (XTM) and the topic map engine, TM4J.

Personalized Recommendation System using FP-tree Mining based on RFM (RFM기반 FP-tree 마이닝을 이용한 개인화 추천시스템)

  • Cho, Young-Sung;Ho, Ryu-Keun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.2
    • /
    • pp.197-206
    • /
    • 2012
  • A exisiting recommedation system using association rules has the problem, such as delay of processing speed from a cause of frequent scanning a large data, scalability and accuracy as well. In this paper, using a Implicit method which is not used user's profile for rating, we propose the personalized recommendation system which is a new method using the FP-tree mining based on RFM. It is necessary for us to keep the analysis of RFM method and FP-tree mining to be able to reflect attributes of customers and items based on the whole customers' data and purchased data in order to find the items with high purchasability. The proposed makes frequent items and creates association rule by using the FP-tree mining based on RFM without occurrence of candidate set. We can recommend the items with efficiency, are used to generate the recommendable item according to the basic threshold for association rules with support, confidence and lift. To estimate the performance, the proposed system is compared with existing system. As a result, it can be improved and evaluated according to the criteria of logicality through the experiment with dataset, collected in a cosmetic internet shopping mall.

An Anomaly Intrusion Detection Method using Multiple System Log (사용자 로그의 분석을 통한 실시간 비정상행위 탐지 기술)

  • Kim, Myung-Soo;Shin, Jong-Cheol;Jung, Jae-Myung;Ko, You-Sun;Lee, Won-Suk
    • 한국IT서비스학회:학술대회논문집
    • /
    • 2009.05a
    • /
    • pp.361-364
    • /
    • 2009
  • 침입의 방법이 점차 치밀해지고 다양해짐에 따라 새로운 방식의 침입 탐지 기법 역시 지속적으로 요구되어진다. 기존의 오용 탐지 방법론은 탐지율은 뛰어나지만 새로운 침입형태에 대한 대응 능력이 부족하다. 이러한 단점을 보완하고자 등장한 것이 비정상 행위 탐지 방법론이다. 하지만 현재까지의 연구는 네트워크나 서버 OS, 데이터베이스 등 각 개별 분야에 대해서만 진행되고 있어 그 탐지 능력에 한계가 있다. 본 논문에서는 이러한 한계를 극복하고자 사용자의 네트워크 및 운영체제 로그를 통합 하고, 데이터마이닝 기법 중 빈발 패턴 마이닝 기법을 이용한 보다 정확한 비정상 행위 탐지 기술을 제안한다.

  • PDF

A Methodology for Searching Frequent Pattern Using Graph-Mining Technique (그래프마이닝을 활용한 빈발 패턴 탐색에 관한 연구)

  • Hong, June Seok
    • Journal of Information Technology Applications and Management
    • /
    • v.26 no.1
    • /
    • pp.65-75
    • /
    • 2019
  • As the use of semantic web based on XML increases in the field of data management, a lot of studies to extract useful information from the data stored in ontology have been tried based on association rule mining. Ontology data is advantageous in that data can be freely expressed because it has a flexible and scalable structure unlike a conventional database having a predefined structure. On the contrary, it is difficult to find frequent patterns in a uniformized analysis method. The goal of this study is to provide a basis for extracting useful knowledge from ontology by searching for frequently occurring subgraph patterns by applying transaction-based graph mining techniques to ontology schema graph data and instance graph data constituting ontology. In order to overcome the structural limitations of the existing ontology mining, the frequent pattern search methodology in this study uses the methodology used in graph mining to apply the frequent pattern in the graph data structure to the ontology by applying iterative node chunking method. Our suggested methodology will play an important role in knowledge extraction.

A Technique for Making Efficient Travel Routes using the Mining Method of Frequent Patterns-growth (FP-growth 마이닝을 이용한 효율적인 여행경로 수립 기법)

  • Yoo, Kibeom;Cho, Kyungsoo;Kim, Ung-Mo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2010.11a
    • /
    • pp.10-13
    • /
    • 2010
  • 컴퓨터의 활용이 다양해 지면서 예전과 다르게 다양한 이유로 많은 사람들이 여행을 하고 나서 여행에 대한 정보 블로그나 웹 상에 저장하고 공개한다. 이렇게 웹 상에 많은 양의 여행 관련 데이터가 존재함에도 불구하고 데이터들이 산발적으로 존재하고 체계적으로 데이터 베이스화 되어 있지 않아서 여전히 정보를 검색하고 여행 일정을 세우는 데에 많은 시간과 노력이 필요하다. 따라서 본 논문은 FP-tree 기반의 빈발 패턴 증가 기법을 이용한 여행 계획 수립 기법을 제안한다. 제안되는 기법에서 데이터들은 FP-tree 방식으로 저장되어 검색에 필요한 시간과 노력을 극적으로 줄이고, FP-growth 마이닝 기법을 이용해 효과적인 여행 경로를 선택할 수 있게 도와준다.

Mining Technique of Tour Destination by weighted FP-tree (가중치가 부여된 FP-tree를 이용한 여행지 추출 기법)

  • MinJu Kim;EunJu Lee;Eung-Mo Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2008.11a
    • /
    • pp.233-236
    • /
    • 2008
  • 최근 컴퓨터와 통신의 기술이 빠르게 발달함에 따라 사회 각 부분은 그동안 경험하지 못했던 정보화라는 새로운 변화를 겪었다. 그 결과 정보화 수준이 점점 고도화 될수록 더욱 다양하고 방대한 데이터가 생성되어 데이터베이스를 이루게 되었다. 방대한 데이터에서 유용한 정보를 얻는 데이터마이닝 기법이 중요한 문제로 대두되었다. 데이터마이닝 기법은 점점 더 많은 분야에서 합리적인 선택을 위해 필수적으로 사용된다. 본 논문은 마이닝 기법을 적용하여 방대한 데이터베이스가 최적의 여행 경로 선택을 제공한다. 본 논문은 빈발 패턴 증가 기법에 가중치를 두어 여행자가 여행지를 선별하기 좋은 환경을 제공한다. 미래 산업 중 가장 중요한 산업 중 하나인 관광 산업은 계속적으로 성장하고 있으며 논문에서 제시하는 데이터 마이닝 기법으로 더 큰 발전을 기대한다.

Performance Analysis of Siding Window based Stream High Utility Pattern Mining Methods (슬라이딩 윈도우 기반의 스트림 하이 유틸리티 패턴 마이닝 기법 성능분석)

  • Ryang, Heungmo;Yun, Unil
    • Journal of Internet Computing and Services
    • /
    • v.17 no.6
    • /
    • pp.53-59
    • /
    • 2016
  • Recently, huge stream data have been generated in real time from various applications such as wireless sensor networks, Internet of Things services, and social network services. For this reason, to develop an efficient method have become one of significant issues in order to discover useful information from such data by processing and analyzing them and employing the information for better decision making. Since stream data are generated continuously and rapidly, there is a need to deal with them through the minimum access. In addition, an appropriate method is required to analyze stream data in resource limited environments where fast processing with low power consumption is necessary. To address this issue, the sliding window model has been proposed and researched. Meanwhile, one of data mining techniques for finding meaningful information from huge data, pattern mining extracts such information in pattern forms. Frequency-based traditional pattern mining can process only binary databases and treats items in the databases with the same importance. As a result, frequent pattern mining has a disadvantage that cannot reflect characteristics of real databases although it has played an essential role in the data mining field. From this aspect, high utility pattern mining has suggested for discovering more meaningful information from non-binary databases with the consideration of the characteristics and relative importance of items. General high utility pattern mining methods for static databases, however, are not suitable for handling stream data. To address this issue, sliding window based high utility pattern mining has been proposed for finding significant information from stream data in resource limited environments by considering their characteristics and processing them efficiently. In this paper, we conduct various experiments with datasets for performance evaluation of sliding window based high utility pattern mining algorithms and analyze experimental results, through which we study their characteristics and direction of improvement.

Mining Maximal Frequent Contiguous Sequences in Biological Data Sequences (생물학적 데이터 서열들에서 빈번한 최대길이 연속 서열 마이닝)

  • Kang, Tae-Ho;Yoo, Jae-Soo
    • The KIPS Transactions:PartD
    • /
    • v.15D no.2
    • /
    • pp.155-162
    • /
    • 2008
  • Biological sequences such as DNA sequences and amino acid sequences typically contain a large number of items. They have contiguous sequences that ordinarily consist of hundreds of frequent items. In biological sequences analysis(BSA), a frequent contiguous sequence search is one of the most important operations. Many studies have been done for mining sequential patterns efficiently. Most of the existing methods for mining sequential patterns are based on the Apriori algorithm. In particular, the prefixSpan algorithm is one of the most efficient sequential pattern mining schemes based on the Apriori algorithm. However, since the algorithm expands the sequential patterns from frequent patterns with length-1, it is not suitable for biological dataset with long frequent contiguous sequences. In recent years, the MacosVSpan algorithm was proposed based on the idea of the prefixSpan algorithm to significantly reduce its recursive process. However, the algorithm is still inefficient for mining frequent contiguous sequences from long biological data sequences. In this paper, we propose an efficient method to mine maximal frequent contiguous sequences in large biological data sequences by constructing the spanning tree with the fixed length. To verify the superiority of the proposed method, we perform experiments in various environments. As the result, the experiments show that the proposed method is much more efficient than MacosVSpan in terms of retrieval performance.

Multi-layer Caching Scheme Considering Sub-graph Usage Patterns (서브 그래프의 사용 패턴을 고려한 다중 계층 캐싱 기법)

  • Yoo, Seunghun;Jeong, Jaeyun;Choi, Dojin;Park, Jaeyeol;Lim, Jongtae;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.3
    • /
    • pp.70-80
    • /
    • 2018
  • Due to the recent development of social media and mobile devices, graph data have been using in various fields. In addition, caching techniques for reducing I/O costs in the process of large capacity graph data have been studied. In this paper, we propose a multi-layer caching scheme considering the connectivity of the graph, which is the characteristics of the graph topology, and the history of the past subgraph usage. The proposed scheme divides a cache into Used Data Cache and Prefetched Cache. The Used Data Cache maintains data by weights according to the frequently used sub-graph patterns. The Prefetched Cache maintains the neighbor data of the recently used data that are not used. In order to extract the graph patterns, their past history information is used. Since the frequently used sub-graphs have high probabilities to be reused, they are cached. It uses a strategy to replace new data with less likely data to be used if the memory is full. Through the performance evaluation, we prove that the proposed caching scheme is superior to the existing cache management scheme.

An Effective Memory Compression Scheme for Embedded System (임베디드 시스템을 위한 효율적인 메모리 압축 기법)

  • Woo JangBok;Choi ByeongChang;Suh Hyo-Joong
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.11a
    • /
    • pp.871-873
    • /
    • 2005
  • 최근 임베디스 시스템의 성능이 향상됨에 따라, 임베디드 시스템을 구성하는 CPU와 주변 장치들의 성능 격차를 해소하는 문제가 점차 중요해지고 있다. 그 중에서 시스템의 성능에 가장 큰 영향을 미치는 것이 CPU와 메모리간의 통신이다. 고성능 컴퓨터 시스템에서는 그동안 CPU와 메모리간의 성능 격차를 줄이기 위한 여러 가지 연구들이 활발하게 진행되었는데, 여러 가지 연구들 중에서 메모리를 압축하여 메모리의 기억공간을 효율적으로 확장하는 방법이 효과적으로 사용되고 있다. 임베디드 시스템에서도 이러한 기법을 적절하게 적용하면 메모리를 압축함으로써 동일 공간에 보다 많은 데이터를 저장할 수 있고, 버스를 이용하여 데이터를 전송할 때, 보다 많은 정보를 전송할 수 있게 된다. 또한, CPU와 메모리 간의 전송되는 정보의 크기를 줄일 수 있으므로 임베디드 시스템에서 전력소모의 대부분을 차지하고 있는 CPU와 메모리 간의 전력소모를 크게 줄일 수 있는 장점이 있다. 본 논문에서는 빈발 패턴 압축 기법을 적절하게 변형하여 임베디드 시스템을 위한 효율적인 메모리 압축 기법을 제시하고자 한다.

  • PDF