• Title/Summary/Keyword: Pattern mining

Product Recommendation System on VLDB using k-means Clustering and Sequential Pattern Technique (k-means 클러스터링과 순차 패턴 기법을 이용한 VLDB 기반의 상품 추천시스템)

  • Shim, Jang-Sup;Woo, Seon-Mi;Lee, Dong-Ha;Kim, Yong-Sung;Chung, Soon-Key
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.1027-1038 / 2006
  • Recommendation systems built on a very large database (VLDB) face many technical problems, so it is necessary to study a system structure and data-mining techniques suited to a large-scale Internet shopping mall. We therefore design and implement a product recommendation system for large-scale Internet shopping malls using the k-means clustering algorithm and a sequential pattern technique. The system processes user information in batches, defines the categories in a hierarchical structure, and uses a sequential pattern mining technique for the search engine. For predictive modeling and experiments, we use real data (users' interests and preferences for given categories) extracted from 30 days of log files of a major Internet shopping mall in Korea. For evaluation we define PRP (Predictive Recommend Precision), PRR (Predictive Recommend Recall), and PF1 (Predictive Factor One-measure). In the experiments, the best recommendation time and the best learning time of our system are both on the order of O(N), and the values of the measures are excellent.

A Methodology for Improving fitness of the Latent Growth Modeling using Association Rule Mining (연관규칙을 이용한 잠재성장모형의 개선방법론)

  • Cho, Yeong Bin;Jun, Jae-Hoon;Choi, Byungwoo
    • Journal of the Korea Convergence Society / v.10 no.2 / pp.217-225 / 2019
  • Latent Growth Modeling (LGM) is a typical method for analyzing longitudinal data and can be classified into unconditional and conditional models. The growth trajectory of the unconditional LGM is commonly assumed to be linear. For the quasi-linear case, we suggest a methodology that improves model fit using sequential patterns from association rule mining. To do this, we divide the longitudinal data into quintiles, extract the periodic changes of the data in each quintile, and build sequential patterns from these periodic changes. To evaluate its effectiveness, we used the LGM module in SPSS AMOS and the Youth Panel dataset from 2001 to 2006 of the Korea Employment Information Service. Our methodology increased the fit of the model compared with the simple linear growth trajectory.

A Study on Improving the predict accuracy rate of Hybrid Model Technique Using Error Pattern Modeling : Using Logistic Regression and Discriminant Analysis

  • Cho, Yong-Jun;Hur, Joon
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.269-278 / 2006
  • This paper presents a new hybrid data mining technique that uses error pattern modeling to improve classification accuracy. The proposed method improves classification accuracy by combining two different supervised learning methods. The main algorithm models the error pattern between the two supervised learning methods (e.g., neural networks, decision trees, logistic regression). The proposed modeling method was applied to a simulation of 10,000 data sets generated from normal and exponential random distributions. The simulation results show that the proposed method outperforms existing methods such as logistic regression and discriminant analysis.

Optimization-Based Pattern Generation for LAD (최적화에 근거한 LAD의 패턴생성 기법)

  • Jang, In-Yong;Ryoo, Hong-Seo
    • Proceedings of the Korean Operations and Management Science Society Conference / 2005.10a / pp.409-413 / 2005
  • The logical analysis of data (LAD) is an effective Boolean-logic-based data mining tool. A critical step in analyzing data by LAD is the pattern generation stage, where useful knowledge and hidden structural information in the data are discovered in the form of patterns. The conventional method for pattern generation in LAD is based on term enumeration, which renders the generation of higher-degree patterns practically impossible. In this paper, we present a new optimization-based pattern generation methodology and propose two mathematical programming models, a mixed 0-1 integer and linear programming (MILP) formulation and a well-studied set covering problem (SCP) formulation, for the generation of optimal and heuristic patterns, respectively. With benchmark datasets, we demonstrate the effectiveness of our models by automatically and easily generating patterns of high complexity that cannot be generated with the conventional approach.

Association Rule Discovery Considering Web Page Visiting Time (웹 페이지 방문 시간을 고려한 연관 규칙 탐색)

  • Gang, Hyeong-Chang;Kim, Ik-Chan;Kim, Cheol-Su
    • Proceedings of the Korean Statistical Society Conference / 2005.05a / pp.263-269 / 2005
  • Users of a Web site wish to get information conveniently. To provide differentiated services to these users, Web site operators must analyze the patterns of each user. Association rules are one of the data mining techniques for pattern discovery. Once the patterns of each user have been found, differentiated per-user services can be offered. The association rule search results reveal per-user patterns, and considering web page visiting time in the association rule search makes differentiated web structure services and recommendation services possible.

Association Rule by Considering Users Web Site Visiting Time (사용자 웹 사이트 방문 시간을 고려한 연관 규칙)

  • Kang, Hyung-Chang;Kim, Chul-Soo;Lee, Dong-Cheol
    • Journal of Korean Society of Industrial and Systems Engineering / v.29 no.2 / pp.104-109 / 2006
  • We can offer suitable information to users by analyzing their patterns. An association rule is one of the data mining techniques that can discover such patterns. We use an association rule that considers web page visiting time to analyze user patterns. The proposed method weights each web page by the user's visiting time and then produces association rules. The weight of a web page is its visiting time divided by the total web page visiting time. The method proposed in this paper yields more meaningful results than association rules produced by the Apriori algorithm.
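
The weighting described in this abstract (visiting time of a page divided by total visiting time) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `page_weights` and `weighted_support`, the session layout (a dict of page to seconds), and the way weights are combined into a support score are all assumptions made here.

```python
def page_weights(session):
    """Return each page's share of the session's total visiting time.

    `session` maps page name -> seconds spent (hypothetical layout).
    """
    total = sum(session.values())
    if total == 0:
        return {}
    return {page: seconds / total for page, seconds in session.items()}


def weighted_support(sessions, itemset):
    """Time-weighted support of `itemset` (an assumed scoring rule).

    A session contributes the summed weights of the itemset's pages
    if it contains all of them, and 0 otherwise. The result is the
    average contribution over all sessions.
    """
    if not sessions:
        return 0.0
    score = 0.0
    for session in sessions:
        if all(page in session for page in itemset):
            weights = page_weights(session)
            score += sum(weights.get(page, 0.0) for page in itemset)
    return score / len(sessions)
```

Under this scoring rule, a pair of pages that users dwell on scores higher than a pair that merely co-occurs in many sessions, which is the intuition behind weighting by visiting time rather than counting visits as plain Apriori does.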

Mining Interesting Sequential Pattern with a Time-interval Constraint for Efficient Analyzing a Web-Click Stream (웹 클릭 스트림의 효율적 분석을 위한 시간 간격 제한을 활용한 관심 순차패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of Korea Society of Industrial Information Systems / v.16 no.2 / pp.19-29 / 2011
  • Due to the development of web technologies and the increasing use of smart devices such as smartphones, various web services have recently come into wide use in many application fields. In this environment, support for personalized and intelligent web services has been actively researched, and analysis of the web-click stream generated from web usage logs is one of the essential techniques for this topic. In this paper, to analyze a web-click stream of sequences efficiently, we propose a sequential pattern mining technique that satisfies the basic requirements for data stream processing and finds a refined mining result. For this purpose, we define the concept of interesting sequential patterns with a time-interval constraint, which uses not only the order of items in a sequential pattern but also their generation times. In addition, we propose a mining method that efficiently finds the interesting sequential patterns over a data stream such as a web-click stream. The proposed method can be applied effectively in various computing application fields that generate data in the form of data streams, such as e-commerce, bioinformatics, and USN environments.
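
As a rough illustration of a time-interval constraint on sequential patterns, the sketch below checks whether a pattern occurs in a timestamped click sequence with a bounded gap between consecutive matched items. It is an assumption-laden batch example, not the stream-mining method of the paper: the names `occurs_within` and `constrained_support`, the `max_gap` parameter, and the `(timestamp, item)` layout are all hypothetical.

```python
def occurs_within(sequence, pattern, max_gap):
    """True if `pattern` occurs in order in `sequence` with every gap
    between consecutive matched items greater than 0 and at most `max_gap`.

    `sequence` is a list of (timestamp, item) pairs sorted by timestamp.
    """
    ends = None  # timestamps at which the matched prefix can end
    for target in pattern:
        new_ends = []
        for timestamp, item in sequence:
            if item != target:
                continue
            # The first item may match anywhere; later items must follow
            # some valid end of the previous prefix within max_gap.
            if ends is None or any(0 < timestamp - e <= max_gap for e in ends):
                new_ends.append(timestamp)
        if not new_ends:
            return False
        ends = new_ends
    return True


def constrained_support(sequences, pattern, max_gap):
    """Fraction of sequences containing the pattern under the gap constraint."""
    if not sequences:
        return 0.0
    hits = sum(occurs_within(s, pattern, max_gap) for s in sequences)
    return hits / len(sequences)
```

Tracking every valid end time of the prefix, rather than only the earliest match, avoids missing occurrences where a later repetition of an item is the one that satisfies the gap.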

Finding Frequent Itemsets Over Data Streams in Confined Memory Space (한정된 메모리 공간에서 데이터 스트림의 빈발항목 최적화 방법)

  • Kim, Min-Jung;Shin, Se-Jung;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.15D no.6 / pp.741-754 / 2008
  • Due to the characteristics of a data stream, it is very important to confine the memory usage of a data mining process regardless of the amount of information generated in the stream. For this purpose, this paper proposes the prime pattern tree (PPT) for finding frequent itemsets over data streams within a confined memory space. Unlike a node of a prefix tree, a node of a PPT can maintain the information necessary to estimate the current supports of several itemsets together. The number of items in a prime pattern, which determines how far the total number of nodes is reduced, is controlled by the split_delta parameter $S_{\delta}$. The size and the accuracy of the PPT are determined by $S_{\delta}$. Accuracy improves as $S_{\delta}$ becomes smaller, because when $S_{\delta}$ is large the frequencies of many itemsets are estimated rather than counted. It is therefore important to consider the trade-off between the size of a PPT and the accuracy of the mining result. Based on this characteristic, the size and the accuracy of the PPT can be flexibly controlled by merging or splitting nodes during the mining process. To find all frequent itemsets over the data stream, this paper uses a PPT in place of the prefix tree in the previously proposed estDec method. This optimizes memory usage efficiently when finding frequent itemsets over a data stream in a confined memory space. Finally, the performance of the proposed method is analyzed through a series of experiments that identify its various characteristics.

Spatiotemporal Pattern Mining Technique for Location-Based Service System

  • Vu, Nhan Thi Hong;Lee, Jun-Wook;Ryu, Keun-Ho
    • ETRI Journal / v.30 no.3 / pp.421-431 / 2008
  • In this paper, we offer a new technique to discover frequent spatiotemporal patterns from a moving object database. Though searching the space of spatiotemporal knowledge is extremely challenging, imposing spatial and timing constraints on moving sequences makes the computation feasible. The proposed technique includes two algorithms, AllMOP and MaxMOP, to find all frequent patterns and maximal patterns, respectively. In addition, to support the service provider in sending information to a user in a push-driven manner, we propose a rule-based location prediction technique to predict the future location of the user. The idea is to employ the AllMOP algorithm to discover the frequent movement patterns in the user's historical movements, from which frequent movement rules are generated. These rules are then used to estimate the future location of the user. The performance is assessed with respect to precision and recall. The proposed techniques could be applied efficiently in a location-based service (LBS) system in which diverse types of data are integrated to support a variety of LBSs.
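
The idea of predicting a future location from frequent movement rules can be sketched in a simplified form. The example below is not AllMOP or MaxMOP: it substitutes plain counting of consecutive location pairs, ignores the spatial and timing constraints the abstract mentions, and uses hypothetical names (`mine_rules`, `predict_next`, `min_support`, `min_conf`).

```python
from collections import Counter, defaultdict


def mine_rules(trajectories, min_support, min_conf):
    """Build rules `a -> b` from consecutive location pairs.

    A rule is kept if the pair was seen at least `min_support` times and
    its confidence (pair count / departures from `a`) is at least `min_conf`.
    """
    pair_counts = Counter()
    departures = Counter()
    for trajectory in trajectories:
        for a, b in zip(trajectory, trajectory[1:]):
            pair_counts[(a, b)] += 1
            departures[a] += 1
    rules = defaultdict(dict)
    for (a, b), count in pair_counts.items():
        confidence = count / departures[a]
        if count >= min_support and confidence >= min_conf:
            rules[a][b] = confidence
    return dict(rules)


def predict_next(rules, current):
    """Return the most confident next location, or None if no rule applies."""
    candidates = rules.get(current)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

Precision and recall, the measures named in the abstract, could then be computed by comparing `predict_next` outputs against held-out movements.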

Design and Implementation of a Newsletter System Using Fuzzy Association Rules (퍼지 연관규칙을 이용한 뉴스레터 시스템 설계 및 구현)

  • 정연홍;박우수;박규석
    • Journal of Internet Computing and Services / v.3 no.5 / pp.41-49 / 2002
  • Web mining can be broadly defined as the discovery and analysis of useful information from the World Wide Web. In this paper, we analyze user access patterns and design a system that supplies useful information to users through web mining. The proposed system collects information on user patterns through the web site and newsletters and classifies it into categories through filtering. Fuzzy association rules are applied, for each category generated through these processes, to users who have accessed the site recently. The generated sets are compared with each user's set of accessed pages, and an appropriate newsletter is sent to each user.
