• Title/Summary/Keyword: 빈발 항목 (frequent items)

Search Result 99, Processing Time 0.023 seconds

Mining Association Rule on Service Data using Frequency and Weight (빈발도와 가중치를 이용한 서비스 연관 규칙 마이닝)

  • Hwang, Jeong Hee
    • Journal of Digital Contents Society
    • /
    • v.17 no.2
    • /
    • pp.81-88
    • /
    • 2016
  • General frequent pattern mining considers only the frequency and support of items. To extract useful information, it is also necessary to consider item weights that reflect how user interest changes over time. Because users request services suited to a particular time or location, a weighted mining method is needed. We propose a weighted frequent pattern mining method based on a service ontology. Weights that account for time and location are assigned to service items and applied to association rule mining. The extracted rules are combined with stored service rules and used to offer timely services to the user.
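As a rough illustration of how item weights can change which itemsets count as frequent, the sketch below scales plain support by the mean weight of the items in an itemset. This definition, the function names, and the sample service items and weights are all assumptions made for illustration; the abstract does not state the paper's actual weighting formula or its use of the service ontology.

```python
from itertools import combinations


def weighted_support(itemset, transactions, weights):
    """Plain support of `itemset` scaled by the mean weight of its items.

    This definition is an assumption for illustration, not the paper's formula.
    """
    itemset = frozenset(itemset)
    count = sum(1 for t in transactions if itemset <= t)
    support = count / len(transactions)
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return support * mean_weight


def weighted_frequent_itemsets(transactions, weights, min_wsup, max_size=2):
    """Brute-force enumeration of itemsets whose weighted support >= min_wsup.

    Candidates are enumerated exhaustively because weighted support does not,
    in general, satisfy the downward closure property used by Apriori pruning.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            ws = weighted_support(combo, transactions, weights)
            if ws >= min_wsup:
                result[combo] = ws
    return result


# Hypothetical service items; weights stand in for time/location relevance.
transactions = [
    {"cafe", "news"},
    {"cafe", "traffic"},
    {"cafe", "news", "traffic"},
    {"news"},
]
weights = {"cafe": 1.0, "news": 0.5, "traffic": 0.9}
frequent = weighted_frequent_itemsets(transactions, weights, min_wsup=0.4)
```

In this toy data, "news" appears as often as "cafe" but falls below the threshold because of its low weight, while the less frequent "traffic" is kept.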

Frequent Items Mining based on Regression Model in Data Streams (스트림 데이터에서 회귀분석에 기반한 빈발항목 예측)

  • Lee, Uk-Hyun
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.1
    • /
    • pp.147-158
    • /
    • 2009
  • Data in a stream environment is massive, continuous, and unbounded. However, stream processing tasks such as query processing and data analysis must run with limited disk or memory capacity. In such an environment, traditional frequent pattern discovery over a transaction database cannot be applied, because it is difficult to keep tracking whether each item in a continuous stream is frequent. In this paper, we propose a method that predicts frequent items by applying a regression model to continuous stream data. Constructing the regression model over the stream yields a prediction model for items whose status is not yet determined. We show through a variety of experiments that the proposed method can be used efficiently in a stream data environment.
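One simple way to picture regression-based prediction of frequent items is to fit a straight line to each item's count per window and extrapolate one window ahead. The sketch below does that with ordinary least squares; the windowing scheme, the function names, and the sample counts are assumptions for illustration, and the paper's actual regression model may differ.

```python
def fit_line(ys):
    """Ordinary least squares fit of y = a + b*x for x = 0, 1, ..., n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def predict_frequent(window_counts, window_size, min_support):
    """Predict, per item, whether it will be frequent in the next window.

    `window_counts` maps each item to its occurrence count in past windows
    (a hypothetical summary kept instead of the raw stream).
    """
    predicted = {}
    for item, counts in window_counts.items():
        intercept, slope = fit_line(counts)
        next_count = intercept + slope * len(counts)
        predicted[item] = next_count / window_size >= min_support
    return predicted


# Hypothetical counts over four past windows of 100 transactions each.
window_counts = {
    "A": [10, 20, 30, 40],  # rising trend
    "B": [40, 30, 20, 10],  # falling trend
}
prediction = predict_frequent(window_counts, window_size=100, min_support=0.3)
```

Only a short list of counts per item is stored, which is the kind of bounded summary a memory-limited stream setting calls for.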

Search Method of the time sensitive frequent itemsets (시간에 따른 가변성을 고려한 상대적인 빈발항목 탐색방법)

  • Park, Tae-Su;Lee, Ju-Hong;Park, Sun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2005.11a
    • /
    • pp.97-100
    • /
    • 2005
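  • As interest in ubiquitous computing and Internet services has grown, so has the demand to quickly process the information latent in large volumes of data and turn it into new knowledge. Existing studies that use data mining techniques to find frequent items in data streams ignore time and find frequent items by simple aggregation, so they cannot guarantee accuracy. This paper therefore proposes a new algorithm that finds relatively frequent items in data streams by taking the temporal aspect into account. The performance of the proposed algorithm is verified through a variety of experiments.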


A New Method for Efficiently Generating of Frequent Items by IRG in Data Mining (데이터 마이닝에서 IRG에 의한 효율적인 빈발항목 생성방법)

  • 허용도;이광형
    • Journal of Korea Multimedia Society
    • /
    • v.5 no.1
    • /
    • pp.120-127
    • /
    • 2002
  • Data mining methods currently in use share the following problems. First, searching for frequent items is inefficient when the minimum support value changes. Second, they do not cope well with the generation of useless association rules. Third, it is very difficult to reuse earlier results when new transactions are added. In this paper, we introduce a new method named SPM-IRG (Selective Patterns Mining using Item Relation Graph), designed to solve the problems listed above. SPM-IRG generates frequent items using minimum support values obtained by investigating the direct and indirect relations among all items in the transactions. Moreover, the new method reduces the inefficiency of existing methods by constructing frequent items from only the items of interest.


A Method for Generating Large-Interval Itemset using Locality of Data (데이터의 지역성을 이용한 빈발구간 항목집합 생성방법)

  • 박원환;박두순
    • Journal of Korea Multimedia Society
    • /
    • v.4 no.5
    • /
    • pp.465-475
    • /
    • 2001
  • Recently, there has been growing attention to research on inducing association rules from large databases. One line of work covers methods that can be applied to quantitative attribute data. This paper presents a new method for generating large-interval itemsets that uses locality to partition the range of the data. By generating denser large-interval items than other methods, it minimizes the loss of the data's inherent characteristics. Performance evaluation results show that the new approach is more efficient than previously proposed techniques.


Dummy Data Insert Scheme for Privacy Preserving Frequent Itemset Mining in Data Stream (데이터 스트림 빈발항목 마이닝의 프라이버시 보호를 위한 더미 데이터 삽입 기법)

  • Jung, Jay Yeol;Kim, Kee Sung;Jeong, Ik Rae
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.23 no.3
    • /
    • pp.383-393
    • /
    • 2013
  • Data stream mining is a technique for obtaining useful information by analyzing data generated in real time. Within data stream mining, frequent itemset mining finds frequent itemsets while the data is being transmitted, and these itemsets are used for pattern analysis and marketing in various fields. Existing frequent itemset mining techniques have a problem: when a malicious attacker sniffs the data, the data provider's real-time information is revealed. This problem can be solved by inserting dummy data, so that an attacker cannot distinguish the original data from the transmitted data. In this paper, we propose a privacy-preserving frequent itemset mining method based on dummy data insertion. In addition, the proposed method is computationally efficient because it does not require encryption or other mathematical operations.

A Large-Interval Itemsets Generation Method for Mining Quantitative Association Rules (수량 연관규칙 탐사를 위한 빈발구간 항목집합 생성방법)

  • 박원환;박두순;유기형;손진곤
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 2001.11a
    • /
    • pp.402-407
    • /
    • 2001
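  • Research on discovering association rules from large databases is active, and studies that extend these methods to quantitative data items have recently been introduced. This paper proposes a method that, when generating large-interval itemsets to convert quantitative data items into binary items, exploits locality: the tendency of values to concentrate in particular regions of a quantitative item's domain. Compared with existing methods, this method generates a larger number of finer large-interval items, and it also keeps generation-order information that can be used to judge the degree of fineness, so it minimizes the loss of the characteristics of the original data. Performance evaluation shows that it outperforms existing methods.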


Improved Association Rule Mining by Modified Trimming (트리밍 방식 수정을 통한 연관규칙 마이닝 개선)

  • Hwang, Won-Tae;Kim, Dong-Seung
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.45 no.3
    • /
    • pp.15-21
    • /
    • 2008
  • This paper presents a new association mining algorithm that uses two-phase sampling to shorten execution time at the cost of some precision in the mining result. The earlier FAST (Finding Association by Sampling Technique) algorithm has the weakness that it considers only frequent 1-itemsets in trimming/growing, so it has no way of accounting for multi-itemsets, including 2-itemsets. The new algorithm reflects multi-itemsets when sampling transactions. It improves the mining results by adjusting the counts of both missing itemsets and false itemsets. Experimentally, on a representative synthetic database, the algorithm produces a sampled subset of results with increased accuracy in terms of 2-itemsets while maintaining the same quality of the data set.

An Algorithm for reducing the search time of Frequent Items (빈발 항목의 탐색 시간을 단축하기 위한 알고리즘)

  • Yun, So-Young;Youn, Sung-Dae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.1
    • /
    • pp.147-156
    • /
    • 2011
  • As information systems have come into wider use, methods for quickly picking out needed products from large amounts of data have been studied. Association rule search methods for finding hidden patterns have drawn much attention, and the Apriori algorithm is a major one. However, the Apriori algorithm increases search time because of its repeated scans. This paper proposes an algorithm that reduces the search time for frequent items. The proposed algorithm creates a matrix from the transaction database and searches for frequent items using the mean number of items per transaction in the matrix and a defined minimum support. The mean number of items per transaction is used to reduce the number of transactions, and the minimum support is used to cut down the number of items. The performance of the proposed algorithm is assessed by comparing its search time and precision with those of existing algorithms. The results indicate that the proposed algorithm extracts the final frequent items more quickly and efficiently than the existing Apriori and Matrix algorithms.
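The matrix representation mentioned above can be sketched as a 0/1 transaction-by-item table: column sums give item supports and row sums give transaction lengths, all from a single pass over the data. The code below is a minimal sketch under that assumption, with hypothetical function names and sample data. It computes the mean transaction length but does not use it for pruning, because the abstract does not say how the paper applies it.

```python
def build_matrix(transactions):
    """Return (items, matrix) where matrix[r][c] is 1 if row r contains items[c]."""
    items = sorted({item for t in transactions for item in t})
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


def frequent_items(transactions, min_support):
    """Frequent single items found from column sums of the 0/1 matrix."""
    items, matrix = build_matrix(transactions)
    n = len(matrix)
    column_counts = [sum(row[c] for row in matrix) for c in range(len(items))]
    return {
        item: count / n
        for item, count in zip(items, column_counts)
        if count / n >= min_support
    }


def mean_transaction_length(transactions):
    """Mean number of items per transaction, taken from row sums of the matrix."""
    _, matrix = build_matrix(transactions)
    return sum(sum(row) for row in matrix) / len(matrix)


# Hypothetical transaction database.
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"a", "d"},
    {"b", "c"},
]
frequent = frequent_items(transactions, min_support=0.5)
mean_len = mean_transaction_length(transactions)
```

Once the matrix is built, later counting works on it directly and the database does not need to be rescanned, which is the cost the abstract attributes to Apriori.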

Adaptive Frequent Pattern Algorithm using CAWFP-Tree based on RHadoop Platform (RHadoop 플랫폼기반 CAWFP-Tree를 이용한 적응 빈발 패턴 알고리즘)

  • Park, In-Kyu
    • Journal of Digital Convergence
    • /
    • v.15 no.6
    • /
    • pp.229-236
    • /
    • 2017
  • An efficient frequent pattern algorithm is essential for mining association rules and for many other mining tasks, with applications spread over a very broad spectrum. Models for pattern mining have been proposed that use an FP-tree to store compressed information about frequent patterns. In this paper, we propose a centroid frequent pattern growth algorithm, called "CAWFP-Growth", that enhances the FP-Growth algorithm by computing the center of the weights and frequencies of the itemsets. Because the conventional constraint of maximum weighted support is not needed to maintain the downward closure property, the algorithm is more likely to reduce both search time and the information loss of the frequent patterns. The experimental results show that, by using the centroid of the items, the proposed algorithm performs better than other algorithms without sacrificing accuracy or increasing processing time. A MapReduce framework model is provided to handle large amounts of data in a pseudo-distributed computing environment. In addition, the proposed algorithm still needs to be modeled in fully distributed mode.