• Title/Summary/Keyword: Association Rule Algorithm


Network Anomaly Detection using Association Rule Mining in Network Packets (네트워크 패킷에 대한 연관 마이닝 기법을 적용한 네트워크 비정상 행위 탐지)

  • Oh, Sang-Hyun;Chang, Joong-Hyuk
    • Journal of Korea Society of Industrial Information Systems / v.14 no.3 / pp.22-29 / 2009
  • Anomaly-based intrusion detection techniques have been widely used to detect intrusions into computer systems because they can handle previously unknown intrusion methods. However, most previous work assumes that the set of normal network connections is fixed, so a new network connection may be regarded as an anomalous event. This paper proposes a new anomaly detection method based on an association-mining algorithm. The proposed method consists of two phases: intra-packet association mining and inter-packet association mining. Its performance is evaluated in comparison with JAM, a representative conventional intrusion detection method.

An Efficient Tree Structure Method for Mining Association Rules (트리 구조를 이용한 연관규칙의 효율적 탐색)

  • Kim, Chang-Oh;Ahn, Kwang-Il;Kim, Seong-Jip;Kim, Jae-Yearn
    • Journal of Korean Institute of Industrial Engineers / v.27 no.1 / pp.30-36 / 2001
  • We present a new algorithm for mining association rules in large databases. Association rules describe relationships among items that occur in the same transaction, and they provide useful information for marketing. Since the Apriori algorithm was introduced in 1994, many researchers have worked to improve it. However, the drawback of Apriori-based algorithms is that they scan the transaction database repeatedly. The algorithm we propose scans the database only twice. The first scan collects the frequent 1-itemsets. The second scan constructs a data structure, the Common-Item Tree, which stores information about frequent itemsets. To find all frequent itemsets, the algorithm then scans the Common-Item Tree instead of the database. Because scanning the Common-Item Tree takes less time than scanning the database, the proposed algorithm is more efficient than Apriori-based algorithms.
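The repeated-scan drawback of Apriori mentioned in this abstract can be illustrated with a minimal sketch of the level-wise baseline. This is not the Common-Item Tree method, whose structure the abstract does not specify; the function name `apriori`, the absolute `min_support` count, and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise frequent itemset mining (baseline Apriori sketch).

    Returns (frequent itemsets with counts, number of database scans).
    min_support is an absolute transaction count (an assumption here).
    """
    transactions = [frozenset(t) for t in transactions]
    scans = 0

    # Scan 1: count single items to obtain the frequent 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    scans += 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join frequent (k-1)-itemsets, then prune candidates that
        # contain any infrequent (k-1)-subset.
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(s) in prev for s in combinations(union, k - 1)
                ):
                    candidates.add(union)
        if not candidates:
            break
        # One more full pass over the database for every level k.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        scans += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result, scans


# Toy data: the longest frequent itemset has 3 items, so 3 scans are needed.
transactions = [{"a", "b", "c"}] * 3 + [{"a", "b"}]
itemsets, scans = apriori(transactions, min_support=3)
print(scans, len(itemsets))
```

In this baseline the number of passes grows with the length of the longest frequent itemset, which is the cost a two-scan approach aims to avoid.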


Design and implementation of data mining tool using PHP and WEKA (피에이치피와 웨카를 이용한 데이터마이닝 도구의 설계 및 구현)

  • You, Young-Jae;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.425-433 / 2009
  • Data mining is a method for finding useful information in the large amounts of data stored in a database. It is used to uncover hidden knowledge, unexpected patterns, relations, and new rules in massive data. Exploring this much information requires a data mining tool. Many data mining tools and solutions exist, such as E-Miner, Clementine, WEKA, and R. Most of them focus on breadth and general-purpose use, and they are not easy for laymen to use. In this paper, we design and implement a web-based data mining tool using PHP and WEKA. The system makes results easy to interpret, so general users can operate it. We implement the Apriori algorithm for association rules, the K-means algorithm for cluster analysis, and the J48 algorithm for decision trees.


Personalized Recommendation Algorithm of Interior Design Style Based on Local Social Network

  • Guohui Fan;Chen Guo
    • Journal of Information Processing Systems / v.19 no.5 / pp.576-589 / 2023
  • To improve home-style recommendations and user satisfaction, this paper proposes a personalized, optimized recommendation algorithm for interior design style based on a local social network. The algorithm comprises data acquisition from three-dimensional (3D) models, home-style feature definition, and style association mining. A user interest model is established by analyzing user behavior. An association rule mining algorithm, combined with the location-based social network, is then applied to the 3D model dataset of interior design styles to obtain relevant home-style recommendations. The experimental results show that the proposed algorithm can effectively analyze 3D interior home styles, with a recommendation accuracy of 82% and a recommendation time of 1.1 minutes, indicating that it works well in application.

Automatic Error Detection of Morpho-syntactic Errors of English Writing Using Association Rule Analysis Algorithm (연관 규칙 분석 알고리즘을 활용한 영작문 형태.통사 오류 자동 발견)

  • Kim, Dong-Sung
    • Annual Conference on Human and Language Technology / 2010.10a / pp.3-8 / 2010
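  • In this study, we generate association rules from refined data on English writing error types collected in a series of earlier studies, and we use association rules whose usefulness has been verified through learning to automatically detect morpho-syntactic errors in English writing data. Finding morpho-syntactic errors in English writing data takes considerable time and resources, so automation is essential. Whereas previous studies concentrate either on lexical errors using statistical models or on syntactic processing grounded in linguistic theory, this study uses data mining to generate association rules from refined data, validates them, and then detects morpho-syntactic errors. Previous studies require building a large knowledge base, such as rules tailored to a theoretical framework or large corpora for constructing language models, whereas this study uses a small amount of refined data. The Apriori algorithm was used to generate morpho-syntactic association rules for English writing error types. Because some of the rules generated by the algorithm may be invalid, only valid rules were learned, using statistical checks of rule usefulness such as correlation tests and cosine similarity. The association rules accumulated in this way were then used in experiments on automatically detecting English writing errors.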


Algorithm mining Association Rules by considering Weight Support (중요지지도를 고려한 연관규칙 탐사 알고리즘)

  • Kim, Keun-Hyung;Whang, Byung-Woong;Kim, Min-Chul
    • The KIPS Transactions:PartD / v.11D no.3 / pp.545-552 / 2004
  • Association rule mining, one of the data mining technologies, searches a database for data that are frequent and related to each other. However, even data that are rare in the database can carry valuable business information if they are important and strongly related to each other. In this paper, we propose an algorithm that discovers association rules consisting of data that are rare but important and strongly related to each other. The proposed algorithm was evaluated through simulation. We found that it efficiently discovered association rules among data that are infrequent but important.
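The idea of letting importance compensate for rarity can be sketched with one simple weighting scheme. The abstract does not give the paper's definition of weighted support, so the formula below (mean item weight multiplied by ordinary support), the function names, and the item weights are all assumptions for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def weighted_support(itemset, transactions, weights):
    """Assumed scheme: mean item weight times ordinary support."""
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return mean_weight * support(itemset, transactions)


# Hypothetical data: a rare but high-value pair and a common low-value pair.
transactions = [{"caviar", "vodka"}] + [{"bread", "milk"}] * 3
weights = {"caviar": 10.0, "vodka": 6.0, "bread": 1.0, "milk": 1.0}

rare = weighted_support({"caviar", "vodka"}, transactions, weights)
common = weighted_support({"bread", "milk"}, transactions, weights)
print(rare, common)
```

Under a plain support threshold the rare pair would be pruned first; under this assumed weighting it outranks the common pair.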

An Efficient Algorithm for Mining Ranged Association Rules (영역 연관규칙 탐사를 위한 효율적 알고리즘)

  • 조일래
    • Journal of the Korea Institute of Information and Communication Engineering / v.1 no.2 / pp.169-181 / 1997
  • Some association rules have very high confidence in a sub-interval or subrange of the domain, even though their confidence over the whole domain is not high. In this paper, we define a ranged association rule, an association with high confidence in a sub-domain that is worthy of special attention, and propose an efficient algorithm for finding ranged association rules. The proposed algorithm is a data-driven method in the sense that hypothetical subranges are built from the data distribution itself. In addition, to avoid redundant database scans, we devise an effective in-memory data structure that is built in a single database scan. The simulation shows that the suggested algorithm performs reliably at an acceptable time cost in actual application areas.
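The notion of a rule that is weak over the whole domain but strong in a subrange can be shown with a small worked example. This only illustrates the definition, not the paper's search algorithm or its in-memory structure; the function name `confidence_in_range` and the sample records are hypothetical.

```python
def confidence_in_range(records, lo, hi):
    """Confidence of the rule 'lo <= value <= hi => outcome'.

    records is a list of (value, outcome) pairs with a boolean outcome.
    """
    outcomes = [outcome for value, outcome in records if lo <= value <= hi]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


# Hypothetical (age, bought_product) records.
records = [
    (20, False), (25, False), (30, True), (32, True), (35, True),
    (38, True), (45, False), (50, False), (55, True), (60, False),
]

whole_domain = confidence_in_range(records, 20, 60)  # 5 of 10 records
subrange = confidence_in_range(records, 30, 40)      # 4 of 4 records
print(whole_domain, subrange)
```

Here the rule holds for only half of the records overall but for every record in the 30 to 40 subrange, which is the kind of pattern a ranged association rule is meant to surface.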


A Prefetch Algorithm for a Mobile Host using Association Rules (연관 규칙을 이용한 이동 호스트의 선반입 알고리즘)

  • 김호숙;용환승
    • Journal of KIISE:Databases / v.31 no.2 / pp.163-173 / 2004
  • Location-based services have recently become very popular in mobile environments. In this paper, we propose a new association-based prefetch algorithm, called STAP, that efficiently supports information services over large spatial databases in mobile environments. We apply the spatial-temporal relations that are meaningful for location-based queries in mobile environments. Moreover, STAP considers the user's mobility and the weight of spatial data. The relation among services is a new aspect not considered in previous cache policies. STAP is thus the first prefetch algorithm to consider spatial-temporal relations, which adds a new dimension to cache policy. We evaluate the performance of STAP and demonstrate its efficiency.

An Efficient Algorithm For Mining Association Rules In Main Memory Systems (대용량 주기억장치 시스템에서 효율적인 연관 규칙 탐사 알고리즘)

  • Lee, Jae-Mun
    • The KIPS Transactions:PartD / v.9D no.4 / pp.579-586 / 2002
  • This paper proposes an efficient algorithm for mining association rules in large main memory systems. To this end, the paper first extends conventional algorithms such as DHP and Partition to suit large main memory systems, and then proposes an algorithm that improves the Partition algorithm by applying hash table and bitmap techniques. The proposed algorithm is compared with the extended DHP in an experimental environment, and the results show a performance improvement of up to 65% over the extended DHP.
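A bitmap can speed up in-memory support counting, as the following minimal sketch shows. It is a generic vertical-bitmap technique, not the paper's improved Partition algorithm, whose details the abstract does not give; the function names and toy transactions are assumptions.

```python
def build_bitmaps(transactions):
    """Map each item to an integer whose bit i is set if transaction i has it."""
    bitmaps = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def bitmap_support(itemset, bitmaps, n_transactions):
    """Count transactions containing all items via bitwise AND and popcount."""
    mask = (1 << n_transactions) - 1  # start with every transaction selected
    for item in itemset:
        mask &= bitmaps.get(item, 0)  # unknown items clear the mask
    return bin(mask).count("1")


transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
bitmaps = build_bitmaps(transactions)
print(bitmap_support({"a", "b"}, bitmaps, len(transactions)))
```

Once the bitmaps are built in a single pass, each candidate itemset is counted with a few integer operations instead of another pass over the transactions, which suits a setting where the whole database fits in main memory.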

Comparison of Association Rule Learning and Subgroup Discovery for Mining Traffic Accident Data (교통사고 데이터의 마이닝을 위한 연관규칙 학습기법과 서브그룹 발견기법의 비교)

  • Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.1-16 / 2015
  • Traffic accidents have been one of the major causes of death worldwide for the last several decades. According to World Health Organization statistics, approximately 1.24 million deaths occurred on the world's roads in 2010. To reduce future traffic accidents, multipronged approaches have been adopted, including traffic regulations, injury-reducing technologies, and driver training programs. Records of traffic accidents are generated and maintained for this purpose. To make these records meaningful and effective, it is necessary to analyze the relationship between traffic accidents and related factors such as vehicle design, road design, weather, and driver behavior. Insights derived from such analyses can be used in accident prevention. Traffic accident data mining is the activity of finding useful knowledge about these relationships that is not well known and that may interest the user. Many studies on mining accident data have been reported over the past two decades. Most of them focus on predicting accident risk from accident-related factors, using supervised learning methods such as decision trees, logistic regression, k-nearest neighbors, and neural networks. However, the prediction models derived from these algorithms are too complex for humans to understand, because the main purpose of these algorithms is prediction, not explanation of the data. Some studies use unsupervised clustering algorithms to divide the data into several groups, but the derived groups are still not easy for humans to understand, so additional analysis is needed. Rule-based learning methods are appropriate when we want to derive a comprehensible form of knowledge about the target domain. They derive a set of if-then rules that represent the relationship between the target feature and other features.
Rules are fairly easy for humans to understand, so they can provide insight and comprehensible results. Association rule learning and subgroup discovery are representative rule-based learning methods for descriptive tasks. These two methods have been used in a wide range of areas, including transaction analysis, accident data analysis, detection of statistically significant patient risk groups, and discovery of key persons in social communities. We use both the association rule learning method and the subgroup discovery method to discover useful patterns in a traffic accident dataset consisting of many features, including driver profile, accident location, accident type, vehicle information, and regulation violations. The association rule learning method, which is an unsupervised learning method, searches the data for frequent itemsets and translates them into rules. In contrast, the subgroup discovery method is a supervised learning method that discovers rules for user-specified concepts that satisfy a certain degree of generality and unusualness. Depending on which aspect of the data we focus on, we may combine multiple relevant features of interest into a synthetic target feature and give it to the rule learning algorithms. After a set of rules is derived, postprocessing steps make the rule set more compact and easier to understand by removing uninteresting or redundant rules. We conducted a set of experiments mining our traffic accident data in both unsupervised and supervised modes to compare these rule-based learning algorithms. The experiments reveal that association rule learning, in its pure unsupervised mode, can discover some hidden relationships among the features.
Under a supervised learning setting with a combinatorial target feature, however, the subgroup discovery method finds good rules much more easily than the association rule learning method, which requires considerable effort to tune its parameters.
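The trade-off between generality and unusualness in subgroup discovery can be sketched with weighted relative accuracy (WRAcc), a commonly used quality function. The abstract does not say which quality function the authors used, so WRAcc is an assumption here, as are the field names `weather` and `severe` and the toy records.

```python
def wracc(records, condition, target):
    """Weighted relative accuracy of the rule 'condition => target'.

    WRAcc = P(condition) * (P(target | condition) - P(target)).
    The first factor rewards generality, the second unusualness.
    """
    n = len(records)
    covered = [r for r in records if condition(r)]
    if not covered:
        return 0.0
    p_condition = len(covered) / n
    p_target = sum(1 for r in records if target(r)) / n
    p_target_given_condition = sum(1 for r in covered if target(r)) / len(covered)
    return p_condition * (p_target_given_condition - p_target)


# Hypothetical accident records: 4 in rain (3 severe), 4 in clear weather (1 severe).
records = (
    [{"weather": "rain", "severe": True}] * 3
    + [{"weather": "rain", "severe": False}]
    + [{"weather": "clear", "severe": True}]
    + [{"weather": "clear", "severe": False}] * 3
)

score = wracc(
    records,
    condition=lambda r: r["weather"] == "rain",
    target=lambda r: r["severe"],
)
print(score)
```

A positive score means the subgroup contains the target more often than the data as a whole, and a negative score means less often. Because the target is fixed by the user, candidate conditions can be ranked directly by this score, whereas unsupervised association rule learning first enumerates frequent itemsets under support and confidence thresholds that have to be tuned.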