• Title/Summary/Keyword: 연관규칙 분석 (association rule analysis)


A study on the relatively causal strength measures in a viewpoint of interestingness measure (흥미도 측도 관점에서 상대적 인과 강도의 고찰)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.49-56 / 2017
  • Among the techniques for analyzing big data, association rule mining finds relationships between items using various relevance evaluation criteria. Depending on the direction of rule creation, association rules are classified as positive, negative, or inverse. The purpose of this paper is to investigate, from the point of view of interestingness measures, how applicable various types of relatively causal strength measures are to these types of association rules. We also clarify the relationship between various types of confidence measures. As a result, if the rate of occurrence of the posterior (consequent) item is greater than 0.5, the first measure ($RCS_{IJ1}$) proposed by Good (1961) is preferable to the first measure ($RCS_{LR1}$) proposed by Lewis (1986), because its value varies more than that of $RCS_{LR1}$; if the rate is less than 0.5, $RCS_{LR1}$ is preferable to $RCS_{IJ1}$.

Design and Implementation of Mining System for Audit Data Analysis (감사데이터 분석을 위한 마이닝 시스템 설계 및 구현)

  • 김은희;문호성;신문선;류근호;김기영
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.4-6 / 2002
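  • As networks grow wider in scale and new attack types emerge, the manual steps of adding new sequences and building intrusion detection models have become a problem for intrusion detection systems. In particular, existing intrusion detection systems rely on separate dedicated systems to collect and analyze network information from a large network infrastructure. It is therefore necessary to analyze the growing volume of audit data in intrusion detection systems so that diverse attack types can be handled proactively. Recently, there has been active research on applying data mining techniques to intrusion detection in order to build proactive intrusion detection systems. In this paper, we design and implement a mining system for analyzing large volumes of audit data accurately and efficiently. Because audit data has characteristics that differ from those of a transaction database, the mining system was designed with these differences in mind. The implemented system uses association rules to discover associations among audit data attributes, and applies frequent episode mining to group events that occur in relation to one another within a given time window. In this way it can discover events that occur frequently within consecutive time intervals and generate rules that predict or describe sequence behavior from known events. The rules produced by mining the audit data can be used to build proactive security policies. The reduction in data volume will also help minimize intrusion detection time.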


Utilizing the Effect of Market Basket Size for Improving the Practicality of Association Rule Measures (연관규칙 흥미성 척도의 실용성 향상을 위한 장바구니 크기 효과 반영 방안)

  • Kim, Won-Seo;Jeong, Seung-Ryul;Kim, Nam-Gyu
    • The KIPS Transactions:PartD / v.17D no.1 / pp.1-8 / 2010
  • Association rule mining techniques enable us to acquire knowledge about sales patterns among individual items from voluminous transactional data. One of the major purposes of association rule mining is to use the acquired knowledge to support marketing strategies such as catalogue design, cross-selling, and shop allocation. However, extracting only the actionable and profitable knowledge from the tremendous number of discovered patterns requires too much time and cost. In the available literature, a number of interest measures have been devised to accelerate and systematize the process of pattern evaluation. Unfortunately, most of these measures, including support and confidence, are prone to yielding impractical results because they are calculated only from the sales frequencies of items. For instance, traditional measures cannot differentiate between purchases in a small basket and those in a large shopping cart. Some adjustment for the size of market baskets is therefore needed, because mutually irrelevant items are quite likely to appear together in a large shopping cart. In contrast to previous approaches, we take the size of the market basket into account when calculating interest measures. Because the devised measure assigns different weights to individual purchases according to their basket sizes, we expect it to minimize the distortion of results caused by accidental patterns. Additionally, we performed intensive computer simulations under various environments, as well as real case analyses, to evaluate the correctness and consistency of the devised measure.
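
The basket-size effect described in this abstract can be illustrated with a small sketch. The abstract does not state the authors' formula, so the weight `1 / len(basket)` used below is purely an assumption chosen for illustration, as are the function names and the toy baskets.

```python
from typing import FrozenSet, List

Basket = FrozenSet[str]


def support(itemset: Basket, baskets: List[Basket]) -> float:
    """Plain support: fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= b for b in baskets) / len(baskets)


def weighted_support(itemset: Basket, baskets: List[Basket]) -> float:
    """Support in which each basket counts with weight 1 / len(basket).

    ASSUMPTION: the inverse-size weight is hypothetical, not the measure
    from the paper. Baskets are assumed to be non-empty.
    """
    weights = [1.0 / len(b) for b in baskets]
    hit = sum(w for w, b in zip(weights, baskets) if itemset <= b)
    return hit / sum(weights)


# Toy data: {"milk", "beer"} co-occurs only in the two large carts.
baskets = [
    frozenset({"milk", "bread"}),
    frozenset({"beer", "chips"}),
    frozenset({"milk", "bread", "beer", "chips", "soap", "eggs", "rice", "tea"}),
    frozenset({"milk", "bread", "beer", "chips", "soap", "eggs", "rice", "jam"}),
]
pair = frozenset({"milk", "beer"})

plain = support(pair, baskets)
weighted = weighted_support(pair, baskets)
```

With this toy data the plain support of the pair is 0.5, while the weighted version drops to 0.2, because the only co-occurrences come from large carts where unrelated items are likely to meet by chance.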

Weighted Association Rule Discovery for Item Groups with Different Properties (상이한 특성을 갖는 아이템 그룹에 대한 가중 연관 규칙 탐사)

  • 김정자;정희택
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.6 / pp.1284-1290 / 2004
  • In market-basket analysis, weighted association rule (WAR) discovery can mine rules that carry more useful information by reflecting the importance of particular items. However, when items are divided into more than one group and item importance must be measured separately, or by a different measure, for each group, traditional weighted association rule discovery cannot be applied directly. To solve this problem, we propose a novel methodology for discovering weighted association rules. In this methodology, the items are first divided into sub-groups according to their properties, and item importance is defined or calculated only among the items belonging to the same sub-group. Our algorithm makes qualitative evaluation for network risk assessment possible by generating a risk rule set for each risk factor from network security data, and makes quantitative evaluation possible by calculating a risk value from statistical factors such as the weights applied in rule generation. It can also be widely used as a new model for finer-grained analysis of market-basket databases in which the data items are distinctly separated.

A New Association Rule Mining based on Coverage and Exclusion for Network Intrusion Detection (네트워크 침입 탐지를 위한 Coverage와 Exclusion 기반의 새로운 연관 규칙 마이닝)

  • Tae Yeon Kim;KyungHyun Han;Seong Oun Hwang
    • Journal of Internet of Things and Convergence / v.9 no.1 / pp.77-87 / 2023
  • Applying association rule mining algorithms to the network intrusion detection task involves two critical issues: the generated rule set is too large to be used on IoT systems, and the false negative/positive rates are hard to control. In this research, we propose an association rule mining algorithm based on two newly defined measures, coverage and exclusion. Coverage shows how frequently a pattern is discovered among the transactions of a class, and exclusion shows how frequently a pattern is not discovered in the transactions of the other classes. We compare our algorithm experimentally with Apriori, the best-known algorithm, on the public KDDcup99 dataset. Compared to Apriori, the proposed algorithm reduces the size of the resulting rule set by up to 93.2 percent while fully preserving accuracy. The proposed algorithm also lets the false negative/positive rates of the generated rules be controlled through parameters. Network analysts can therefore apply the proposed association rule mining effectively to the network intrusion detection task, since both issues are addressed.
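
The two measures defined in this abstract can be sketched as follows. This is a minimal reading of the verbal definitions only (coverage as the share of a class's transactions that contain the pattern, exclusion as the share of other-class transactions that do not), not the authors' implementation; the function names, the handling of empty classes, and the toy records are assumptions.

```python
from typing import FrozenSet, List, Tuple

# Each labelled transaction is (set of items, class label).
Labelled = Tuple[FrozenSet[str], str]


def coverage(pattern: FrozenSet[str], target: str, data: List[Labelled]) -> float:
    """Fraction of transactions of class `target` that contain `pattern`."""
    in_class = [items for items, label in data if label == target]
    if not in_class:
        return 0.0  # assumption: no transactions of the class means no coverage
    return sum(pattern <= items for items in in_class) / len(in_class)


def exclusion(pattern: FrozenSet[str], target: str, data: List[Labelled]) -> float:
    """Fraction of transactions of the other classes that do NOT contain `pattern`."""
    others = [items for items, label in data if label != target]
    if not others:
        return 1.0  # assumption: nothing to exclude means full exclusion
    return sum(not (pattern <= items) for items in others) / len(others)


# Hypothetical toy records loosely styled after connection features.
data = [
    (frozenset({"tcp", "http", "SF"}), "normal"),
    (frozenset({"tcp", "http", "SF"}), "normal"),
    (frozenset({"icmp", "ecr_i", "SF"}), "normal"),
    (frozenset({"icmp", "ecr_i", "SF"}), "attack"),
    (frozenset({"icmp", "ecr_i", "SF"}), "attack"),
    (frozenset({"tcp", "private", "S0"}), "attack"),
]
pattern = frozenset({"icmp", "ecr_i"})

cov = coverage(pattern, "attack", data)
exc = exclusion(pattern, "attack", data)
```

In this toy data the pattern appears in two of the three attack records and is absent from two of the three normal records, so both values are 2/3. A rule miner in this style would presumably keep only patterns whose coverage and exclusion exceed chosen thresholds, which is how such parameters could trade off false negatives against false positives.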

Association Rule Discovery for Sequence Analysis (서열 분석을 위한 연관 규칙 탐사)

  • 김정자;이도헌
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.91-93 / 2001
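  • As the Genome project has recently revealed nucleic acid and protein sequence information, techniques for handling gene information at the molecular level are being actively studied, and effective tools and computer algorithms are needed to store the vast amount of sequence information in databases and analyze it. In this paper, we propose a sequence association rule algorithm for discovering associations among sequence information that is common to multiple proteins. Unlike existing algorithms, which handle atomic items, the algorithm must account for a property of sequence data, namely that repetition has to be reflected. Experiments were carried out on protein sequence data. First, we find the sets of subsequences that occur frequently across multiple sequences, and then we discover the relationships among these subsequence sets. The discovered rules can be used to predict the structure and function of other proteins, and this information will suggest a direction for the biological analysis that is needed. This can save considerable time, effort, and cost by minimizing the candidate combinations of biological experiment subjects.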


Mining Positive and Negative Association Rules Algorithm based on Correlation and Chi-squared analysis (상관관계와 카이-제곱 분석에 기반한 긍정과 부정 연관 규칙 알고리즘)

  • Kim, Na-hee;Youn, Sung-dae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.10a / pp.223-226 / 2009
  • Recently, mining negative association rules has received some attention and has proved to be useful. Negative association rules are useful in market-basket analysis for identifying products that conflict with each other or products that complement each other. Several algorithms have been proposed. However, these algorithms have some problems; for example, misleading rules occur when positive and negative rules are mined simultaneously. The chi-squared test, which rests on mature theory, together with the correlation coefficient can avoid this problem. In this paper, we propose the algorithm PNCCR, which is based on the chi-squared test and correlation. The experimental results show that the misleading rules are pruned, which suggests that the algorithm is correct and efficient.


A Personalized Clothing Recommender System Based on the Algorithm for Mining Association Rules (연관 규칙 생성 알고리즘 기반의 개인화 의류 추천 시스템)

  • Lee, Chong-Hyeon;Lee, Suk-Hoon;Kim, Jang-Won;Baik, Doo-Kwon
    • Journal of the Korea Society for Simulation / v.19 no.4 / pp.59-66 / 2010
  • We present a personalized clothing recommender system that mines association rules from transactions described in ontologies and infers recommendations from the rules. The recommender system can forecast frequently changing clothing trends using the Onto-Apriori algorithm, and it makes appropriate recommendations possible for each user through inference marked as meta nodes. We simulate the rule generator and the inferential search engine of the system with a focus on accuracy and efficiency, and our results validate the system.

Mining Association Rules on Significant Rare Data using Relative Support (상대 지지도를 이용한 의미 있는 희소 항목에 대한 연관 규칙 탐사 기법)

  • Ha, Dan-Shim;Hwang, Bu-Hyun
    • Journal of KIISE:Databases / v.28 no.4 / pp.577-586 / 2001
  • Recently, data mining, which analyzes stored data and discovers potential knowledge and information in large databases, has become a key topic in database research. In this paper, we study methods for discovering association rules, one of the data mining techniques. We propose a technique for discovering association rules that uses relative support to take into account significant rare data, that is, data with high relative support among some data. We also compare and evaluate existing methods and the proposed method with respect to discovering significant rare data.


Keyword Collection System based on Association Rules to Track Pornography of Children on Dark Webs (다크웹 아동 음란물 추적을 위한 연관규칙 기반 키워드 수집체계)

  • Jin-Gyeong Kim;Jiyeon Kim;Chang-Hoon Kim
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.207-208 / 2023
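  • As cybercrime over the dark web, such as drug dealing, financial transactions, and hacking, increases, the need for cyber investigation to track crime on the dark web is growing. Because dark web browsers such as Tor provide strong anonymity, analyzing vulnerabilities in dark web operation and conducting sting operations that use malware are generally regarded as the most effective dark web investigation techniques. However, because sites on the dark web are opened and shut down frequently, collecting up-to-date crime information requires advanced crawler technology that can gather vast amounts of dark web information in real time and actively expand its search keywords. This paper proposes a collection system that gathers child pornography sites, among the various dark web sites, by crawling and expands the search keywords through association analysis of the collected text.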
