Search | Korea Science

Explanation-based Data Mining in Data Warehouse (데이타 웨어하우스 환경에서의 설명기반 데이타 마이닝)

김현수;이창호
- Proceedings of the Korea Database Society Conference / 1999.06a / pp.115-123 / 1999
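Years of operating information systems across industry have led to the accumulation of large volumes of data. Many data mining techniques have been studied to extract useful knowledge from such data, and the emergence of the data warehouse in particular provides the data supply environment that data mining needs. However, data mining results that have not passed through an expert's judgment and interpretation can pour out countless findings that are trivial, spurious, or irrelevant. Therefore, even when data mining results are statistically significant, a verification process and methodology for establishing their validity and usefulness are needed. The hardest part of data mining is that, to eliminate inductive errors, a person must interpret and judge the results directly and also propose new search directions. The purpose of this paper is to establish a methodology that verifies the results extracted by data mining and suggests new directions for knowledge discovery. We propose a technique that verifies the results of association rule mining by judging whether they can be explained, and we present an architecture that generalizes the verified knowledge into new hypotheses and then verifies the corresponding association rules against the data warehouse. We first present the need for explaining data mining results, briefly describe data warehouses and data mining techniques, give the definition and method of association rule mining, and show the data warehouse schema for the target domain. Next, we propose Relational Predicate Logic as the knowledge representation for both domain knowledge and the results of association rule mining. To explain the results, we show that a mined association rule can be explained by domain knowledge expressed in Relational Predicate Logic. Based on this explanation, we also present an iterative Explanation-based Data Mining Architecture that generalizes verified knowledge, deductively generates new hypotheses, verifies them through association rule mining, and thereby obtains new knowledge. The contribution of this study is an integrated data mining approach that combines the inductive and deductive approaches: inductively generated knowledge is checked for inductive errors by showing that it can be explained by domain knowledge, and the explanations are used to deductively generate new hypotheses, which are then verified by hypothesis testing.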
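The verification idea in this abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the paper's Relational Predicate Logic is replaced here by a much simpler, hypothetical set of directed "leads to" links, and the shopping data, item names, and the `explainable` helper are assumptions made for illustration. A mined rule is accepted only if its consequent can be reached from its antecedent through the domain links.

```python
from typing import FrozenSet, List, Set, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (antecedent, consequent)


def support(transactions: List[Set[str]], items: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[Set[str]], rule: Rule) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent, consequent = rule
    base = support(transactions, antecedent)
    return support(transactions, antecedent | consequent) / base if base else 0.0


def reachable(start: str, domain_links: Set[Tuple[str, str]]) -> Set[str]:
    """All concepts reachable from `start` by following directed domain links."""
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for src, dst in domain_links:
            if src == node and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


def explainable(rule: Rule, domain_links: Set[Tuple[str, str]]) -> bool:
    """Hypothetical check: the rule is 'explained' if domain links connect
    some antecedent item to every consequent item."""
    antecedent, consequent = rule
    covered: Set[str] = set()
    for item in antecedent:
        covered |= reachable(item, domain_links)
    return consequent <= covered


# Toy data (assumed, not from the paper).
transactions = [
    {"rain", "umbrella"},
    {"rain", "umbrella", "boots"},
    {"rain", "umbrella"},
    {"sun", "sunscreen"},
    {"sun", "umbrella"},
]
domain_links = {("rain", "getting_wet"), ("getting_wet", "umbrella")}

for rule in [(frozenset({"rain"}), frozenset({"umbrella"})),
             (frozenset({"sun"}), frozenset({"umbrella"}))]:
    print(sorted(rule[0]), "->", sorted(rule[1]),
          "confidence:", confidence(transactions, rule),
          "explained:", explainable(rule, domain_links))
```

In this toy setting the second rule has nonzero confidence but no supporting explanation, which is the kind of result the abstract calls spurious.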

Explanation-based Data Mining in Data Warehouse (데이터 웨어하우스 환경에서의 설명기반 데이터 마이닝)

김현수;이창호
- Proceedings of the Korea Intelligent Information System Society Conference / 1999.03a / pp.115-123 / 1999
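Years of operating information systems across industry have led to the accumulation of large volumes of data. Many data mining techniques have been studied to extract useful knowledge from such data, and the emergence of the data warehouse in particular provides the data supply environment that data mining needs. However, data mining results that have not passed through an expert's judgment and interpretation can pour out countless findings that are trivial, spurious, or irrelevant. Therefore, even when data mining results are statistically significant, a verification process and methodology for establishing their validity and usefulness are needed. The hardest part of data mining is that, to eliminate inductive errors, a person must interpret and judge the results directly and also propose new search directions. We propose a technique that verifies the results of association rule mining by judging whether they can be explained, and we present an architecture that generalizes the verified knowledge into new hypotheses and then verifies the corresponding association rules against the data warehouse. We first present the need for explaining data mining results, briefly describe data warehouses and data mining techniques, give the definition and method of association rule mining, and show the data warehouse schema for the target domain. Next, we propose Relational Predicate Logic as the knowledge representation for both domain knowledge and the results of association rule mining. To explain the results, we show that a mined association rule can be explained by domain knowledge expressed in Relational Predicate Logic. Based on this explanation, we also present an iterative Explanation-based Data Mining Architecture that generalizes verified knowledge, deductively generates new hypotheses, verifies them through association rule mining, and thereby obtains new knowledge. The contribution of this study is an integrated data mining approach that combines the inductive and deductive approaches: inductively generated knowledge is checked for inductive errors by showing that it can be explained by domain knowledge, and the explanations are used to deductively generate new hypotheses, which are then verified by hypothesis testing.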

A Rule Generation Technique Utilizing a Parallel Expansion Method (병렬확장을 활용한 규칙생성 기법)

Lee, Kee-Cheol;Kim, Jin-Bong
- The Transactions of the Korea Information Processing Society / v.5 no.4 / pp.942-950 / 1998
Extracting knowledge, especially in the form of rules, from raw data is very important in data mining, whose aim is to help users who lack knowledge despite an abundance of data. Logic minimization tools derive optimized knowledge from a given ON set and DC set. In this work, the parallel expansion scheme of logic minimization is first extracted and used to obtain initial knowledge, from which the final rules are derived; these rules are successfully applicable to real-world data. A prototype system based on this approach was tested on real-world data and shown to be as practical as conventional, long-studied decision tree methods such as the C4.5 system.
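As a rough illustration of what an expansion step in logic minimization does, the sketch below greedily drops literals from a cube (a conjunction of attribute tests) as long as the enlarged cube covers no negative example. It is a sequential, textbook-style expand step, not the parallel expansion scheme of the paper. The variable names and data are hypothetical, and it assumes the OFF set is given explicitly, with every combination outside the ON and OFF sets treated as don't-care.

```python
from typing import Dict, List

Cube = Dict[str, int]  # variable -> required value; absent variables are "don't care"


def covers(cube: Cube, example: Dict[str, int]) -> bool:
    """True if the example satisfies every literal that remains in the cube."""
    return all(example[var] == val for var, val in cube.items())


def expand(cube: Cube, off_set: List[Dict[str, int]]) -> Cube:
    """Drop literals one at a time, keeping a drop only if the larger cube
    still covers no OFF-set example."""
    result = dict(cube)
    for var in list(cube):
        trial = {k: v for k, v in result.items() if k != var}
        if not any(covers(trial, ex) for ex in off_set):
            result = trial
    return result


# Hypothetical data: one ON-set example used as the starting cube, two OFF-set examples.
on_example = {"a": 1, "b": 1, "c": 0}
off_set = [{"a": 0, "b": 1, "c": 0}, {"a": 0, "b": 0, "c": 1}]

rule = expand(on_example, off_set)
print(rule)  # the literals that survive form the antecedent of the rule
```

The result depends on the order in which literals are tried, which is one reason practical minimizers use more careful heuristics than this sketch.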

Intrusion Detection Model based on Intelligent System (지능형 시스템기반의 침입탐지모델)

김명준;양지흥;한명묵
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.12a / pp.243-248 / 2002
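In a rapidly changing information society, an intrusion detection system needs precision, adaptability, and scalability. In addition, protecting important and confidential resources in complex network environments calls for a more structured and intelligent IDS (Intrusion Detection System). The aim of this study is to derive a model that generates intrusion patterns for such an intelligent IDS. Intrusion patterns involve a vast amount of data; to manage it accurately and efficiently, we use link analysis and sequence analysis, two major areas of data mining, to derive a model that generates accurate and reliable intrusion rules. The model consists of a "Time Based Traffic Model", a "Host Based Traffic Model", and a "Content Model", each of which generates a different kind of intrusion pattern. With this model, patterns can be generated more efficiently and stably; that is, an intrusion detection model based on an intelligent system can be implemented. The rules generated by the model represent the intrusion data and classify users as abnormal or normal. The model uses data from the KDD contest.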

An Efficient Algorithm Using the locality of Data for Mining Quantitative Association Rules (수량 연관규칙 생성을 위한 데이터의 지역성을 고려한 효과적인 알고리즘 제안)

이혜정;박원환;박두순
- Proceedings of the Korea Multimedia Society Conference / 2003.05b / pp.126-129 / 2003
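Recent research has extended association rules, which are discovered in and applied to large databases, so that they can also be applied to quantitative items. When large interval itemsets are generated to convert quantitative items into binary items, the values of a quantitative item may be concentrated in a particular region or unevenly distributed. This paper proposes a method that takes this locality into account when generating large interval itemsets. Compared with existing methods, it generates a larger number of fine-grained large interval items. Because the items are generated in order around the meaningful intervals, users can judge the granularity they need, and the loss of the characteristics of the original data is minimized. Performance evaluation on real data, including population census data, shows that the method outperforms existing methods.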
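One way to picture locality-aware interval generation is to start at the most frequent value of a quantitative attribute and widen the interval toward the denser neighbor until it reaches the minimum support. The sketch below does exactly that and nothing more; it is an assumption-laden illustration rather than the algorithm of the paper, and the function name, the tie-breaking rule, and the sample ages are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def frequent_interval(values: List[int], min_support: float) -> Tuple[int, int, float]:
    """Grow an interval outward from the most frequent value until it
    reaches min_support, always extending toward the denser neighbor.
    Returns (low, high, support of the interval)."""
    counts = Counter(values)
    distinct = sorted(counts)
    mode = max(distinct, key=lambda v: counts[v])
    lo = hi = distinct.index(mode)
    covered = counts[mode]
    needed = min_support * len(values)
    while covered < needed and (lo > 0 or hi < len(distinct) - 1):
        # -1 marks a side that cannot be extended any further.
        left = counts[distinct[lo - 1]] if lo > 0 else -1
        right = counts[distinct[hi + 1]] if hi < len(distinct) - 1 else -1
        if left >= right:
            lo -= 1
            covered += left
        else:
            hi += 1
            covered += right
    return distinct[lo], distinct[hi], covered / len(values)


# Hypothetical ages concentrated around 30.
ages = [30, 30, 30, 31, 31, 29, 45, 60, 18, 30]
print(frequent_interval(ages, min_support=0.5))
```

Because the interval grows from the dense region outward, sparse values such as 18 or 60 are pulled in only when the support threshold cannot be met otherwise.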

The Optimal Reduction of Fuzzy Rules using a Rough Set (러프집합을 이용한 퍼지 규칙의 효율적인 감축)

No, Eun-Yeong;Jeong, Hwan-Muk
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.261-264 / 2007
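Fuzzy inference has the advantage of handling vague knowledge effectively. However, the associated attributes of rules cause an excessive number of rules to be generated, which makes it difficult to determine the useful and important rules. This paper proposes a method that removes unnecessary attributes by considering the correlation among fuzzy rules and uses the relative cardinality of the fuzzy rules to minimize the number of rules while maintaining the accuracy of the inference results. To verify the validity of the proposed method, its inference results are compared with those of an existing rule reduction method.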
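The attribute-removal part of this idea can be sketched with the basic rough-set notion of a consistent decision table: an attribute is dispensable if, without it, rows that agree on the remaining attributes still agree on the decision. The sketch below covers only this crisp reduction step; it does not model the relative cardinality of fuzzy rules used in the paper, and the rule table and attribute names are hypothetical.

```python
from typing import Dict, List


def consistent(rows: List[Dict[str, str]], attrs: List[str], decision: str) -> bool:
    """True if rows with identical values on `attrs` never disagree on the decision."""
    seen: Dict[tuple, str] = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True


def reduce_attributes(rows: List[Dict[str, str]], attrs: List[str], decision: str) -> List[str]:
    """Greedily drop each attribute whose removal keeps the table consistent."""
    kept = list(attrs)
    for a in attrs:
        trial = [x for x in kept if x != a]
        if consistent(rows, trial, decision):
            kept = trial
    return kept


# Hypothetical rule table with linguistic labels.
rules = [
    {"temp": "high", "humidity": "high", "wind": "low", "fan": "on"},
    {"temp": "high", "humidity": "low", "wind": "high", "fan": "on"},
    {"temp": "low", "humidity": "high", "wind": "low", "fan": "off"},
    {"temp": "low", "humidity": "low", "wind": "high", "fan": "off"},
]
print(reduce_attributes(rules, ["temp", "humidity", "wind"], "fan"))
```

After the reduction, rules that differ only in the dropped attributes collapse into one, which is how removing attributes also shrinks the rule base.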

Fuzzy Rules Generation and Inference System of Scatter Partition Method (분산 분할 방식의 퍼지 규칙 생성 및 추론 시스템)

Park, Keon-jun;Jang, Tae-Su;Kim, Sung-Hun;Kim, Yong-kab
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.10a / pp.35-36 / 2012
The generation of fuzzy rules is essential for constructing a fuzzy model, but in general the number of rules increases exponentially with the input dimension. To solve this problem, we introduce a system that generates fuzzy rules and performs inference based on the FCM clustering algorithm, which partitions the input space in scatter form. The parameters in the premise part of the fuzzy rules are determined as the membership matrix produced by the FCM clustering algorithm, and the consequence part of the fuzzy rules is expressed as a polynomial function. The proposed model is evaluated using numerical data.
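For readers unfamiliar with FCM (fuzzy c-means), the following minimal sketch shows how the membership matrix mentioned in the abstract is computed. It uses the standard FCM update formulas on one-dimensional data in plain Python; the data, the initial centers, the fuzzifier `m = 2`, and the fixed iteration count are assumptions, and the polynomial consequence part of the rules is not modeled.

```python
from typing import List, Tuple


def memberships(points: List[float], centers: List[float], m: float = 2.0) -> List[List[float]]:
    """Return u[i][k], the degree to which point k belongs to cluster i."""
    u = [[0.0] * len(points) for _ in centers]
    for k, x in enumerate(points):
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: assign crisp membership.
            hit = dists.index(0.0)
            for i in range(len(centers)):
                u[i][k] = 1.0 if i == hit else 0.0
            continue
        for i, d_i in enumerate(dists):
            u[i][k] = 1.0 / sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
    return u


def update_centers(points: List[float], u: List[List[float]], m: float = 2.0) -> List[float]:
    """Each center is the mean of the points weighted by membership ** m."""
    centers = []
    for row in u:
        weights = [w ** m for w in row]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


def fcm(points: List[float], centers: List[float], m: float = 2.0,
        iterations: int = 50) -> Tuple[List[float], List[List[float]]]:
    """Alternate membership and center updates for a fixed number of iterations."""
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)


# Hypothetical data with two obvious groups.
data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
final_centers, u = fcm(data, centers=[0.0, 6.0])
print([round(c, 2) for c in final_centers])
```

Each column of the resulting matrix sums to 1, so every data point distributes its membership across the clusters, and each cluster then serves as the premise of one fuzzy rule.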

A Clustering Technique Using Association Rules for The Library and Information Science Terminology (연관규칙을 이용한 문헌정보학 전문용어 클러스터링 기법에 관한 연구)

Seung, Hyon-Woo;Park, Mi-Young
- Journal of the Korean Society for Library and Information Science / v.37 no.2 / pp.89-105 / 2003
In this paper, an effective method for clustering terminology extracted from text is proposed, in order to develop a search engine that extracts relevant information from large collections of web documents. To prevent the frequent occurrence of meaningless association rules among general terms, only useful association rules among terms are produced, using database tables that consist of domain-specific terminology. These association rules are produced by applying the Apriori algorithm after forming transaction units from groups of association rules in a document. The group of association rules produced from a term forms a cluster.
https://doi.org/10.4275/KSLIS.2003.37.2.089
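The Apriori step named in this abstract can be sketched as follows. This is a compact, generic version of the algorithm's frequent-itemset search, not the clustering pipeline of the paper; treating each document's set of domain terms as one transaction is an assumption, and the terms themselves are invented for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(transactions: List[Set[str]], min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose support is at least min_support."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    frequent: Dict[FrozenSet[str], float] = {}
    while current:
        # Count each candidate and keep those that meet the threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        keys = list(level)
        size = len(keys[0]) + 1 if keys else 0
        # Join step: merge pairs of frequent itemsets into candidates one item larger.
        candidates = {a | b for a, b in combinations(keys, 2) if len(a | b) == size}
        # Prune step: every subset one item smaller must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, size - 1))]
    return frequent


# Hypothetical transactions: the domain terms found in each document.
docs = [
    {"catalog", "metadata", "indexing"},
    {"catalog", "metadata"},
    {"metadata", "indexing"},
    {"catalog", "metadata", "thesaurus"},
]
for itemset, sup in sorted(apriori(docs, 0.5).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

Terms that co-occur in frequent itemsets are natural candidates for the same cluster, which is the link between the rule mining and the clustering described above.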

Generation of Efficient Fuzzy Classification Rules for Intrusion Detection (침입 탐지를 위한 효율적인 퍼지 분류 규칙 생성)

Kim, Sung-Eun;Khil, A-Ra;Kim, Myung-Won
- Journal of KIISE:Software and Applications / v.34 no.6 / pp.519-529 / 2007
In this paper, we investigate the use of fuzzy rules for efficient intrusion detection. We use an evolutionary algorithm to optimize the set of fuzzy rules for intrusion detection by constructing fuzzy decision trees. For efficient execution of the evolutionary algorithm, we use supervised clustering to generate an initial set of membership functions for the fuzzy rules. In our method, both the performance and the complexity of the fuzzy rules (or fuzzy decision trees) are taken into account in fitness evaluation. We also use evaluation with data partition, membership degree caching, and zero-pruning to reduce the time needed to construct and evaluate fuzzy decision trees. For performance evaluation, we tested our method on the intrusion detection data of KDD'99 Cup and confirmed that it outperformed the existing methods. Compared with the KDD'99 Cup winner, the accuracy increased by 1.54% while the cost was reduced by 20.8%.
KSCI

Fuzzy Modeling and Fuzzy Rule Generation in Global Approximate Response Surfaces (전역근사화 반응표면의 생성을 위한 퍼지모델링 및 퍼지규칙의 생성)

Lee, Jong-Soo;Hwang, Jeong-Su
- Journal of the Korean Institute of Intelligent Systems / v.12 no.3 / pp.231-238 / 2002
Evolutionary fuzzy modeling combines the merits of fuzzy inference systems and evolutionary computation to perform global approximate optimization. This paper proposes fuzzy clustering as the fuzzy rule generation process, which is one of the most important steps in evolutionary fuzzy modeling. By applying fuzzy clustering to experimental or simulation results, fuzzy rules that properly describe nonlinear and complex design problems can be obtained. The efficiency of evolutionary fuzzy modeling can be improved by using the membership degrees of the data to the clusters obtained from fuzzy clustering. To confirm the validity of the proposed method, it is applied to a real design problem, an automotive inner trim, and a global approximation is achieved. Evolutionary fuzzy modeling is performed for several cases that differ in the number of clusters and the rule selection criterion, and the results are compared to show that the proposed method can provide proper fuzzy rules for a given system and reduce computation time while keeping the modeling errors at a satisfactory level.
https://doi.org/10.5391/JKIIS.2002.12.3.231 KSCI
