• Title/Summary/Keyword: pattern mining

Characteristics on Inconsistency Pattern Modeling as Hybrid Data Mining Techniques (혼합 데이터 마이닝 기법인 불일치 패턴 모델의 특성 연구)

  • Hur, Joon;Kim, Jong-Woo
    • Journal of Information Technology Applications and Management / v.15 no.1 / pp.225-242 / 2008
  • IPM (Inconsistency Pattern Modeling) is a hybrid supervised learning technique that uses the inconsistency patterns of input variables in mining data sets. IPM tries to improve prediction accuracy by combining more than two different supervised learning methods. Previous related studies have shown that IPM was superior to the single use of existing supervised learning methods such as neural networks, decision tree induction, and logistic regression, and also superior to existing combined model methods such as Bagging, Boosting, and Stacking. The objective of this paper is to explore the characteristics of IPM. To understand these characteristics, three experiments were performed. In these experiments, performance improved substantially when the prediction inconsistency ratio between two different supervised learning techniques was high and the distance between the supervised learning methods on the MDS (Multi-Dimensional Scaling) map was long.

Detecting smartphone user habits using sequential pattern analysis

  • Lu, Dang Nhac;Nguyen, Thu Trang;Nguyen, Thi Hau;Nguyen, Ha Nam;Choi, Gyoo Seok
    • International Journal of Internet, Broadcasting and Communication / v.7 no.1 / pp.20-22 / 2015
  • Recently, the study of smartphone user habits has become a topic of intense interest due to the rapid growth of the smartphone market. Sequential pattern analysis methods have long been used efficiently for web-based user habit mining. However, simulations show that these methods might fail for smartphone-based user habit mining. In this paper, we propose a novel approach that considerably improves the performance of traditional sequential pattern analysis methods by cutting each chronological sequence of user logs on a device into shorter ones that represent the sequential user activities in various periods of a day.
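
The cutting step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say where the period boundaries lie, so the four periods in `PERIODS`, the `(timestamp, activity)` log format, and the function names are all assumptions made for illustration.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical period boundaries (start hour, label); the abstract does not give them.
PERIODS = [(0, "night"), (6, "morning"), (12, "afternoon"), (18, "evening")]


def period_of(ts: datetime) -> str:
    """Return the label of the period of the day that contains ts."""
    label = PERIODS[0][1]
    for start_hour, name in PERIODS:
        if ts.hour >= start_hour:
            label = name
    return label


def _day_and_period(entry):
    """Grouping key: calendar date plus period label of one log entry."""
    ts, _activity = entry
    return (ts.date(), period_of(ts))


def split_by_period(log):
    """Cut one chronological log [(timestamp, activity), ...] into shorter
    activity sequences, one per contiguous (date, period) run."""
    ordered = sorted(log, key=lambda entry: entry[0])
    return [
        (key[1], [activity for _ts, activity in group])
        for key, group in groupby(ordered, key=_day_and_period)
    ]
```

Each returned `(period, activities)` pair could then be passed to any standard sequential pattern miner in place of the single long per-device sequence.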

A Comparison of Clustering Algorithm in Data Mining

  • Lee, Yung-Seop;An, Mi-Young
    • Journal of the Korean Data and Information Science Society / v.14 no.4 / pp.725-736 / 2003
  • To provide the information needed to make a decision, it is important to know the relationships or patterns among the variables in a database. Grouping objects that have similar pattern characteristics is called cluster analysis, one of the data mining techniques. This study compares several partitioning clustering algorithms based on the statistical distance or total variance in each cluster.

A Method for Optimal Moving Pattern Mining using Frequency of Moving Sequence (이동 시퀀스의 빈발도를 이용한 최적 이동 패턴 탐사 기법)

  • Lee, Yon-Sik;Ko, Hyun
    • The KIPS Transactions:PartD / v.16D no.1 / pp.113-122 / 2009
  • Traditional pattern mining methods only probe unspecified moving patterns that appear to satisfy users' requests among diverse patterns within limited scopes of time and space. They are therefore not applicable to problems that involve mining optimal moving patterns under complex time and space constraints, such as 1) searching for the optimal path between two specific points and 2) scheduling a path within a specified time. In this paper, we illustrate some problems in mining optimal moving patterns with complex time and space constraints from a vast set of historical data of numerous moving objects, and suggest a new moving pattern mining method that can be used to search for patterns of an optimal moving path as a location-based service. The proposed method determines the optimal path (the most frequently used path) between two specific points using pattern frequencies retrieved from historical data of moving objects. It carries out pattern mining efficiently by applying space generalization at the minimum level to the moving object's location attribute, taking into account the topological relationship between the object's location and the spatial scope. The efficiency of the algorithm was tested by comparing its processing time with that of the Dijkstra algorithm and the $A^*$ algorithm, which are generally used for searching for the optimal path. Although the results varied with the heuristic weight of the $A^*$ algorithm, the proposed method was more efficient than the other methods.
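
The idea of treating the most frequently used historical path as the optimal one can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it omits the space generalization step, assumes each trajectory is already a list of discrete location labels, and the function name is hypothetical.

```python
from collections import Counter


def most_frequent_path(trajectories, start, goal):
    """Return (path, frequency) for the sub-path from start to goal that
    occurs most often in the historical trajectories, or (None, 0)."""
    counts = Counter()
    for traj in trajectories:
        if start not in traj:
            continue
        i = traj.index(start)          # first visit to the start location
        if goal not in traj[i + 1:]:
            continue                   # goal is never reached after start
        j = traj.index(goal, i + 1)    # first visit to the goal after start
        counts[tuple(traj[i:j + 1])] += 1
    if not counts:
        return None, 0
    path, freq = counts.most_common(1)[0]
    return list(path), freq
```

Unlike Dijkstra or $A^*$, this lookup needs no edge weights: it only counts how often each observed route between the two points appears in the history.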

Mining Frequent Closed Sequences using a Bitmap Representation (비트맵을 사용한 닫힌 빈발 시퀀스 마이닝)

  • Kim Hyung-Geun;Whang Whan-Kyu
    • The KIPS Transactions:PartD / v.12D no.6 s.102 / pp.807-816 / 2005
  • Sequential pattern mining finds all of the frequent sequences satisfying a minimum support threshold in a large database. However, when mining long frequent sequences, or when using very low support thresholds, the performance of currently reported algorithms often degrades dramatically. In this paper, we propose a novel sequential pattern algorithm that uses only closed frequent sequences, which form a small subset of the very large set of frequent sequences. Our algorithm generates candidate sequences with a depth-first search strategy in order to prune effectively. Using a bitmap representation of the underlying database, we can calculate supports with bit operations and prune sequences in much less time. A performance study shows that our algorithm outperforms previous algorithms.
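
The bit-operation support counting mentioned in this abstract can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the paper's algorithm: each sequence is a list of single items rather than itemsets, Python integers stand in for bitmaps, closedness checking and pruning are omitted, and all function names are hypothetical.

```python
def item_bitmaps(sequence):
    """Map each item to an integer bitmap; bit p is set if the item
    occurs at position p of the sequence."""
    bitmaps = {}
    for pos, item in enumerate(sequence):
        bitmaps[item] = bitmaps.get(item, 0) | (1 << pos)
    return bitmaps


def extend(prefix_bits, item_bits):
    """Sequence extension: keep only the occurrences of the item that lie
    strictly after the earliest position where the prefix can end."""
    if prefix_bits == 0:
        return 0
    lowest = prefix_bits & -prefix_bits      # earliest end of the prefix
    after_first = ~((lowest << 1) - 1)       # all positions after that bit
    return after_first & item_bits


def support(database, pattern):
    """Count the sequences that contain the non-empty pattern as a
    (not necessarily contiguous) subsequence."""
    count = 0
    for seq in database:
        maps = item_bitmaps(seq)
        bits = maps.get(pattern[0], 0)
        for item in pattern[1:]:
            bits = extend(bits, maps.get(item, 0))
        if bits:
            count += 1
    return count
```

A real miner would build the bitmaps once per database and reuse the prefix bitmap while descending the depth-first search tree, so that each candidate costs only one `extend` call per sequence.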

Applications of artificial intelligence and data mining techniques in soil modeling

  • Javadi, A.A.;Rezania, M.
    • Geomechanics and Engineering / v.1 no.1 / pp.53-74 / 2009
  • In recent years, several computer-aided pattern recognition and data mining techniques have been developed for modeling soil behavior. The main idea behind a pattern recognition system is that it learns adaptively from experience and is able to provide predictions for new cases. Artificial neural networks are the most widely used pattern recognition methods for modeling soil behavior. Recently, the authors have pioneered the application of genetic programming (GP) and evolutionary polynomial regression (EPR) techniques to the modeling of soils and a number of other geotechnical applications. The paper reviews applications of pattern recognition and data mining systems in geotechnical engineering, with particular reference to constitutive modeling of soils. It covers applications of artificial neural network, genetic programming, and evolutionary programming approaches to soil modeling. It is suggested that these systems could be developed as efficient tools for modeling soils and analyzing geotechnical engineering problems, especially in cases where the behavior is too complex for conventional models to describe effectively. It is also recognized that these techniques are complementary to conventional soil models rather than a substitute for them.

Parallel Data Mining with Distributed Frequent Pattern Trees (분산형 FP트리를 활용한 병렬 데이터 마이닝)

  • 조두산;김동승
    • Proceedings of the IEEK Conference / 2003.07c / pp.2561-2564 / 2003
  • Data mining is an effective method for discovering useful information, such as rules and previously unknown patterns, in large databases. The discovery of association rules is an important data mining problem. We have developed a new parallel mining algorithm called Distributed Frequent Pattern Tree (DFPT) that detects association rules on a distributed shared-nothing parallel system. The DFPT algorithm is devised for parallel execution of the FP-growth algorithm. It needs only two full disk scans of the database because it eliminates the need to generate candidate items. We achieved good workload balancing throughout the mining process by distributing the work equally to all processors. We implemented the algorithm on a PC cluster system and observed that it outperformed the Improved Count Distribution scheme.

Context Ontology and Trigger Rule Design for Service Pattern Mining (서비스 패턴 마이닝을 위한 컨텍스트 온톨로지 및 트리거 규칙 설계)

  • Hwang, Jeong-Hee
    • Journal of Digital Contents Society / v.13 no.3 / pp.291-299 / 2012
  • Ubiquitous computing is a technique for providing users with appropriate services by collecting context information anywhere through attached sensors. An intelligent system needs to update services automatically according to the user's various circumstances. To this end, we propose a design of a context ontology and trigger rules for mining service patterns related to user activity, together with an active mining architecture that integrates a trigger system. The proposed system is a framework for actively mining user activity and service patterns by considering the relation between user context and objects based on the trigger system.

Temporal Classification Method for Forecasting Power Load Patterns From AMR Data

  • Lee, Heon-Gyu;Shin, Jin-Ho;Park, Hong-Kyu;Kim, Young-Il;Lee, Bong-Jae;Ryu, Keun-Ho
    • Korean Journal of Remote Sensing / v.23 no.5 / pp.393-400 / 2007
  • In this paper, we present a novel power load prediction method using temporal pattern mining from AMR (Automatic Meter Reading) data. Because power load patterns vary over time and differ greatly by hour, day, week, and so on, traditional data mining alone yields uninformative results. In addition, previous research on data mining for analyzing electric load patterns has focused on cluster analysis and classification methods. However, despite the usefulness of rules that include a temporal dimension and the fact that AMR data has temporal attributes, those methods were limited to static pattern extraction and did not consider temporal attributes. Therefore, we propose a new classification method for predicting power load patterns. The main tasks are clustering and temporal classification. Cluster analysis is used to create load pattern classes and the representative load profile for each class. The classification method then uses the representative load profiles to build a classifier able to assign different load patterns to the existing classes. The proposed classification method is calendar-based temporal mining, which discovers electric load patterns at multiple time granularities. Lastly, we show that the proposed method, applied to AMR data, discovers more interesting patterns.

A Personalized Automatic TV Program Scheduler using Sequential Pattern Mining (순차 패턴 마이닝 기법을 이용한 개인 맞춤형 TV 프로그램 스케줄러)

  • Pyo, Shin-Jee;Kim, Eun-Hui;Kim, Mun-Churl
    • Journal of Broadcast Engineering / v.14 no.5 / pp.625-637 / 2009
  • As the TV environment evolves and the variety of program content increases, users experience more varied and complex environments for watching TV content. As the viewing environment changes, users have to make more effort than before to choose the TV programs or channels that interest them. Users also tend to watch TV programs in their own regular way. In this paper, we therefore suggest a personalized TV program schedule recommendation system based on analysis of users' TV watching history data. We extract users' watched-program patterns using a sequential pattern mining method. We also propose a new sequential pattern mining method suited to the TV watching environment and verify that it performs better than the existing sequential pattern mining method in our application area. In the future, we will consider VoD characteristics in order to extend the system to IPTV program schedule recommendation.