• Title/Summary/Keyword: association rule algorithm (연관 규칙 알고리즘)

Advanced Improvement for Frequent Pattern Mining using Bit-Clustering (비트 클러스터링을 이용한 빈발 패턴 탐사의 성능 개선 방안)

  • Kim, Eui-Chan;Kim, Kye-Hyun;Lee, Chul-Yong;Park, Eun-Ji
    • Journal of Korea Spatial Information System Society / v.9 no.1 / pp.105-115 / 2007
  • Data mining extracts interesting knowledge from large databases. Among the many data mining techniques, research has concentrated on clustering and association rules. Clustering research mainly deals with analyzing spatial and attribute data, while association rule research deals with identifying frequent patterns. An earlier study improved the Apriori algorithm using a bit-clustering algorithm. Seeking an alternative that improves on Apriori, we investigated FP-Growth and examined whether bit-clustering could be adopted to address the problems of FP-Growth. FP-Growth with bit-clustering performed better than the existing method. We evaluated pattern mining on the chess data set, building FP-Trees with different minimum support values. At high minimum support values the results were similar to those of the existing technique; at other values the proposed technique performed better. We therefore consider that the proposed technique leads to higher performance. In addition, a method for applying bit-clustering to GML data is proposed.
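
As a rough illustration of why bit-level representations can help frequent-pattern mining, the sketch below stores each item's transaction set as a Python integer bitmask and counts itemset support with a bitwise AND. It is not the paper's bit-clustering method or its FP-Growth variant; the toy transactions and all function names are assumptions made for illustration.

```python
from itertools import combinations


def build_bitmaps(transactions):
    """Map each item to an int whose bit i is set if transaction i contains it."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(itemset, bitmaps, n_transactions):
    """Count transactions that contain every item in `itemset`."""
    mask = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        mask &= bitmaps.get(item, 0)  # keep only transactions that have `item`
    return bin(mask).count("1")


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs meeting `min_support`."""
    bitmaps = build_bitmaps(transactions)
    n = len(transactions)
    frequent_items = sorted(
        item for item in bitmaps if support([item], bitmaps, n) >= min_support
    )
    result = {}
    for pair in combinations(frequent_items, 2):
        count = support(pair, bitmaps, n)
        if count >= min_support:
            result[pair] = count
    return result


# Hypothetical toy data, not the chess data set used in the paper.
TRANSACTIONS = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
print(frequent_pairs(TRANSACTIONS, min_support=3))
```

Each support count here costs one AND per item instead of a scan over the transactions, which is the general appeal of bitmap encodings.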


Personalized Group Recommendation Using Collaborative Filtering and Frequent Pattern (협업 필터링과 빈발 패턴을 이용한 개인화된 그룹 추천)

  • Kim, Jung Woo;Park, Kwang-Hyun
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.7 / pp.768-774 / 2016
  • This paper presents a method that recommends a combination of items as a group, based on similarity, for application areas such as fashion and cooking, whereas previous methods recommend a single item such as a book, a song, or a movie. Collaborative filtering recommends items selected by users with similar tendencies, based on the similarity between users. The proposed method generates sets of frequent items using collaborative filtering and association rules, and recommends a group according to the similarity between groups. To show the validity of the proposed method, experiments are performed on purchase data collected from e-commerce over four months.
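
The general idea of combining user similarity with frequent item groups can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: it uses Jaccard similarity between purchase sets, restricts groups to item pairs, and ranks by co-occurrence among the nearest users. The basket data and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_groups(target, baskets, k=2, min_support=2):
    """Recommend item pairs that co-occur among the k users most similar to `target`.

    baskets: dict mapping user id -> set of purchased items.
    """
    neighbours = sorted(
        (user for user in baskets if user != target),
        key=lambda user: (-jaccard(baskets[target], baskets[user]), user),
    )[:k]
    counts = Counter()
    for user in neighbours:
        unseen = sorted(baskets[user] - baskets[target])  # items the target lacks
        counts.update(combinations(unseen, 2))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pair for pair, count in ranked if count >= min_support]


# Hypothetical purchase data for illustration only.
BASKETS = {
    "u1": {"shirt", "jeans"},
    "u2": {"shirt", "jeans", "belt", "boots"},
    "u3": {"shirt", "belt", "boots"},
    "u4": {"pasta", "basil"},
}
print(recommend_groups("u1", BASKETS))
```

The recommendation unit is a pair rather than a single item, which mirrors the abstract's point about recommending combinations.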

An Efficient Algorithm for Detecting Tables in HTML Documents (HTML 문서의 테이블 식별을 위한 효율적인 알고리즘)

  • Kim Yeon-Seok;Lee Kyong-Ho
    • Journal of Korea Multimedia Society / v.7 no.10 / pp.1339-1353 / 2004
  • <TABLE> tags in HTML documents are widely used for laying out Web documents as well as for describing genuine tables with relational information. As a prerequisite for information extraction from the Web, this paper presents an efficient method for sophisticated table detection. The proposed method consists of two phases: preprocessing and attribute-value relation extraction. In the preprocessing phase, where genuine or non-genuine tables are filtered out, appropriate rules are devised based on a careful examination of the general characteristics of <TABLE> tags. The remaining tables are handled in the attribute-value relation extraction phase. Specifically, a value area is extracted and checked for syntactic coherency. Furthermore, for tables that are not suited to the syntactic coherency check, the method looks for semantic coherency between the attribute area and the value area. Experimental results with 11,477 <TABLE> tags from 1,393 HTML documents show that the method performs better than previous work, with a precision of 97.54% and a recall of 99.22% on average.
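
A rule-based preprocessing filter of the kind described above might look like the following sketch. The specific rules (at least two rows, at least two cells per row on average, no nested table) are assumptions chosen for illustration and are not the rules devised in the paper; the class and function names are hypothetical. Only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class TableStats(HTMLParser):
    """Collect row and cell counts for every <table>, tracking nesting."""

    def __init__(self):
        super().__init__()
        self.stack = []   # indices of currently open tables
        self.tables = []  # one stats dict per <table>, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.stack:  # a table opened inside another table
                self.tables[self.stack[-1]]["has_nested"] = True
            self.tables.append({"rows": 0, "cells": 0, "has_nested": False})
            self.stack.append(len(self.tables) - 1)
        elif self.stack and tag == "tr":
            self.tables[self.stack[-1]]["rows"] += 1
        elif self.stack and tag in ("td", "th"):
            self.tables[self.stack[-1]]["cells"] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self.stack:
            self.stack.pop()


def candidate_tables(html):
    """Return one boolean per <table>: True if it passes the illustrative rules."""
    parser = TableStats()
    parser.feed(html)
    parser.close()
    return [
        t["rows"] >= 2 and t["cells"] >= 2 * t["rows"] and not t["has_nested"]
        for t in parser.tables
    ]


# A layout table wrapping a small data table (hypothetical markup).
SAMPLE = (
    "<table><tr><td>"
    "<table><tr><th>Name</th><th>Price</th></tr>"
    "<tr><td>Pen</td><td>2</td></tr></table>"
    "</td></tr></table>"
)
print(candidate_tables(SAMPLE))
```

Cheap structural rules like these only narrow the field; the coherency checks the abstract describes would still be needed for the tables that pass.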


A Topic Analysis of Abstracts in Journal of Korean Data Analysis Society (한국자료분석학회지에 대한 토픽분석)

  • Kang, Changwan;Kim, Kyu Kon;Choi, Seungbae
    • Journal of the Korean Data Analysis Society / v.20 no.6 / pp.2907-2915 / 2018
  • The Journal of the Korean Data Analysis Society, founded in 1998, has played the role of a major application-oriented journal. In this study, we examined the aims of the journal by analyzing its abstracts over 10 years. Abstract data were crawled from the online journal site (kdas.jems.or.kr) and analyzed with a topic model. As a result, we found 18 topics in 2680 abstracts covering a range of content, for example nursing, marketing, economics, regression, factor analysis, data mining, and statistical inference. Topic 1 (regression) was the most frequent, with 460 documents, which indicates the usefulness of regression in applied science. We confirmed 10 significant association rules using Fisher's exact test. To explore topic trends, we also conducted the topic analysis for two periods, 2006-2011 and 2012-2016. We found that control studies became more frequent than survey studies over time, and that regression and factor analysis were frequent regardless of period.

Adaptive Frequent Pattern Algorithm using CAWFP-Tree based on RHadoop Platform (RHadoop 플랫폼기반 CAWFP-Tree를 이용한 적응 빈발 패턴 알고리즘)

  • Park, In-Kyu
    • Journal of Digital Convergence / v.15 no.6 / pp.229-236 / 2017
  • An efficient frequent pattern algorithm is essential for mining association rules as well as for many other mining tasks, with applications across a very broad spectrum. Models for pattern mining have been proposed that use an FP-tree to store compressed information about frequent patterns. In this paper, we propose a centroid frequent pattern growth algorithm, called "CAWFP-Growth", that enhances the FP-Growth algorithm by computing the center of the weights and frequencies of the itemsets. Because the conventional constraint of maximum weighted support is not needed to maintain the downward closure property, the algorithm is more likely to reduce the search time and the information loss of frequent patterns. The experimental results show that, by using the centroid of the items, the proposed algorithm achieves better performance than other algorithms without sacrificing accuracy or increasing processing time. A MapReduce framework model is provided to handle large amounts of data in a pseudo-distributed computing environment. In addition, the proposed algorithm still needs to be modeled in fully distributed mode.

On Design of the intelligent Intrusion Detection System (지능형 침입 탐지 시스템에 관한 연구)

  • 이민규;한명묵
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.05a / pp.23-27 / 2002
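  • This paper proposes a model for an intelligent intrusion detection system (IDS) for information security. The model combines methods from data mining and information security. Deployed IDSs often need to be updated when the computing environment is upgraded or when new attack methods appear. Because many IDSs in current use encode expert knowledge by hand, changing them is expensive and slow. The proposed model therefore uses a data mining framework to build intrusion detection models adaptively. By combining data mining (DM) techniques, namely association rules, sequential patterns, classification, and clustering, with the genetic algorithm (GA) functions of selection, crossover, mutation, evaluation, and fitness function, we propose an intelligent IDS model that compensates for these drawbacks, maximizes processing performance, and is therefore more secure.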


A Study on e-Document Encryption using Key Management Method based on the RRM (RRM기반 키 관리 방안에 의한 전자문서 암호화에 관한 연구)

  • Sung, Kyung-Sang;Oh, Hae-Seok
    • Proceedings of the Korean Society of Computer Information Conference / 2009.01a / pp.395-400 / 2009
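  • Various security technologies for electronic documents have been studied and proposed, but the difficulty of key management and the heavy cost of cryptographic algorithms create a trade-off between security and efficiency. To address this, this study applies the proposed RRM technique, which is applicable to electronic document encryption systems, to key management, so that encryption is performed efficiently and the protection of electronic documents is improved. To this end, we propose a key management scheme for an improved electronic document encryption system: it overcomes the difficulty of key generation by imposing regularity on random-number information, solves the key management problem through a key table and key-set information, and uses the key-set information to speed up the operations needed for decryption. By resolving the problem of correlation in key generation, the proposed scheme improves security against key exposure, and it improves efficiency by using operations with the same complexity and execution time as a simple encryption and decryption process. Managing electronic documents after encrypting them also addresses the problem of leakage.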


An Algorithm for Mining Association Rules by Minimizing the Number of Candidate 2-Itemset (후보 2-항목집합의 개수를 최소화한 연관규칙 탐사 알고리즘)

  • 황종원;강맹규
    • Journal of Korean Society of Industrial and Systems Engineering / v.21 no.48 / pp.53-63 / 1998
  • Mining association rules between items in a large database of sales transactions has been described as an important data mining problem. The mining of association rules can be mapped to the problem of discovering large itemsets. In this paper we present an efficient algorithm for mining association rules by minimizing the total number of candidate 2-itemsets, $|C_2|$. The fewer candidate 2-itemsets there are, the less time the algorithm takes to execute. The overall performance of the algorithm depends on the time needed to find the large 2-itemsets; hence, minimizing the total number of candidate 2-itemsets is very important. We have performed extensive experiments and compared the performance of our algorithm with the DHP algorithm, the best existing algorithm.
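
To make the role of $|C_2|$ concrete, the sketch below compares the plain Apriori candidate set (all pairs of frequent items) with a DHP-style set pruned by hash-bucket counts gathered during the first pass. It illustrates the DHP baseline idea only, not the algorithm proposed in the paper; the hash function, bucket count, toy data, and names are assumptions.

```python
from collections import Counter
from itertools import combinations


def candidate_pairs(transactions, min_support, n_buckets=7):
    """Return (apriori_c2, hashed_c2) candidate 2-itemset lists."""
    item_counts = Counter(item for t in transactions for item in t)
    index = {item: i for i, item in enumerate(sorted(item_counts))}

    def bucket(a, b):
        # Deterministic toy hash over item positions (an illustrative choice).
        return (index[a] * len(index) + index[b]) % n_buckets

    # First pass: hash every 2-subset of every transaction into a bucket.
    bucket_counts = Counter()
    for t in transactions:
        for a, b in combinations(sorted(t), 2):
            bucket_counts[bucket(a, b)] += 1

    frequent = sorted(i for i, c in item_counts.items() if c >= min_support)
    apriori_c2 = list(combinations(frequent, 2))
    # A pair can only be frequent if its bucket count reaches min_support.
    hashed_c2 = [p for p in apriori_c2 if bucket_counts[bucket(*p)] >= min_support]
    return apriori_c2, hashed_c2


# Hypothetical toy transactions.
TRANSACTIONS = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}, {"a", "c"}]
apriori_c2, hashed_c2 = candidate_pairs(TRANSACTIONS, min_support=2)
print(len(apriori_c2), hashed_c2)
```

A bucket count is an upper bound on the support of every pair hashed into it, so the pruning never discards a truly frequent pair; it only removes candidates whose support would otherwise have to be counted in the second pass.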


A Study on the Advanced Association Rules Algorithm of n-Items (개선된 n-항목 연관 규칙 알고리즘 연구)

  • 황현숙;어윤양
    • Journal of the Korean Operations Research and Management Science Society / v.27 no.4 / pp.29-39 / 2002
  • The transaction tables used by existing association algorithms have two column attributes: a transaction identifier (Transaction_id) and an item identifier (item). With this structure, the performance of the applicable SQL query statements decreases as the volume of data grows. We therefore propose an advanced n-item association rules algorithm that handles multiple items per row (Transaction_id, Item 1, Item 2, …, Item n). With this structure, execution time can be shorter than with the single-item structure, because counts can be computed by querying the input transaction table. Our experimental results indicate that the n-item structure performs up to 2 times better than the single-item structure. The proposed algorithm can be applied to Internet shopping, search engines, and similar applications.
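
The difference between the two table layouts can be sketched with SQLite from Python's standard library. The schema, the fixed width of three item columns, the sample baskets, and the exact queries are assumptions for illustration; the abstract does not give the paper's SQL.

```python
import sqlite3


def build_tables(baskets, width=3):
    """Load the same baskets into a single-item table and an n-item table.

    Assumes each basket has at most `width` distinct items.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE single_item (transaction_id INTEGER, item TEXT);
        CREATE TABLE n_items (transaction_id INTEGER, item1 TEXT, item2 TEXT, item3 TEXT);
        """
    )
    for tid, items in baskets.items():
        padded = (list(items) + [None] * width)[:width]  # NULL-pad short baskets
        conn.executemany(
            "INSERT INTO single_item VALUES (?, ?)", [(tid, i) for i in items]
        )
        conn.execute("INSERT INTO n_items VALUES (?, ?, ?, ?)", (tid, *padded))
    return conn


def pair_support_single(conn, a, b):
    """Single-item layout: counting a pair needs a self-join."""
    return conn.execute(
        "SELECT COUNT(*) FROM single_item x JOIN single_item y "
        "ON x.transaction_id = y.transaction_id "
        "WHERE x.item = ? AND y.item = ?",
        (a, b),
    ).fetchone()[0]


def pair_support_n(conn, a, b):
    """n-item layout: counting a pair is one scan of the input table."""
    return conn.execute(
        "SELECT COUNT(*) FROM n_items "
        "WHERE ? IN (item1, item2, item3) AND ? IN (item1, item2, item3)",
        (a, b),
    ).fetchone()[0]


# Hypothetical baskets keyed by transaction id.
BASKETS = {1: ["bread", "milk", "eggs"], 2: ["bread", "milk"], 3: ["milk", "eggs"]}
conn = build_tables(BASKETS)
print(pair_support_single(conn, "bread", "milk"), pair_support_n(conn, "bread", "milk"))
```

Both queries return the same support; the n-item layout simply avoids the join, which is one plausible reading of why the abstract reports shorter execution times for it.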

A Study on the Error Detection based on Ontology (온톨로지 기반의 에러검출 방법에 관한 연구)

  • Seo, Jin-Won;Lim, Jae-Hyun;Kim, Chi-Su
    • Proceedings of the KAIS Fall Conference / 2008.05a / pp.220-223 / 2008
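  • This paper aims to improve the quality of software design, and thereby of the resulting software product, through an improved error detection method applied at design time. It builds on the software design methodology MOA (Methodology for Object to Agents), which uses the ontology-based OSSD (Ontology for Software Specification and Design) model as its common information model. Using a combination of the OSSD model, an inter-view inconsistency detection technique, and rules associated with the ontological properties of a consistency framework, the paper defines a new form of error detection that is performed during the transformation from a UML model to the OSSD model. This transformation includes lexical and semantic analysis of the software design, using multiple mapping tables in the algorithm that generates instances of the OSSD model.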
