• Title/Summary/Keyword: Pattern mining

An Efficient Search Method for High Confidence Association Rules Using CP(Confidence Pattern)-Tree Structure (CP-Tree구조를 이용한 높은 신뢰도를 갖는 연관 규칙의 효율적 탐색 방법)

  • 송한규;김재련
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.25 no.1
    • /
    • pp.1-8
    • /
    • 2002
  • Traditional approaches to association rule mining rely on a high-support condition to find interesting rules. However, in some applications, such as analyzing web page links or discovering unusual combinations of factors that have always caused a disease, we are interested in high-confidence rules that have very low support or need not have high support. In these cases the traditional algorithms are unsuitable because they rely on first satisfying a high support threshold. In this paper, we propose a new model, the CP (Confidence Pattern)-Tree, to identify high-confidence rules between two items without a support constraint. In addition, we discuss high-confidence association rules among more than two items without a support constraint.
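
The idea of ranking two-item rules by confidence alone, with no support threshold, can be illustrated with a brute-force sketch. This is not the CP-Tree structure from the paper, whose details the abstract does not give; the function name `high_confidence_pairs` and the sample transactions are hypothetical, and confidence is taken in its usual sense, conf(a → b) = count(a and b) / count(a).

```python
from collections import Counter
from itertools import permutations


def high_confidence_pairs(transactions, min_conf):
    """Return {(a, b): confidence} for rules a -> b with confidence >= min_conf.

    Brute-force illustration only (hypothetical helper, not the CP-Tree):
    no minimum support is applied, so rare but reliable rules are kept.
    """
    item_count = Counter()   # number of transactions containing each item
    pair_count = Counter()   # number of transactions containing both a and b
    for t in transactions:
        items = set(t)       # ignore duplicate items within one transaction
        item_count.update(items)
        pair_count.update(permutations(sorted(items), 2))

    rules = {}
    for (a, b), n_ab in pair_count.items():
        conf = n_ab / item_count[a]   # conf(a -> b) = count(a, b) / count(a)
        if conf >= min_conf:
            rules[(a, b)] = conf
    return rules


if __name__ == "__main__":
    data = [["a", "b"], ["a", "b"], ["b", "c"], ["b"], ["c", "b"]]
    print(high_confidence_pairs(data, min_conf=0.9))
```

In the sample data the item `a` occurs in only two of five transactions, yet `a → b` has confidence 1.0, which is the kind of rule a high-support filter would discard.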

A Criterion on Profiling for Anomaly Detection (이상행위 탐지를 위한 프로파일링 기준)

  • 조혁현;정희택;김민수;노봉남
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.7 no.3
    • /
    • pp.544-551
    • /
    • 2003
  • As Internet use becomes widespread, intrusion detection systems are needed to protect computer systems comprehensively against intrusions. We propose a profiling criterion for intrusion detection systems based on anomaly detection. We identify the causes of false positives in profiling and propose an anomaly detection method to control them. Finally, we propose a similarity function that uses a pattern database to decide whether a user's behavior pattern is anomalous.

A GEOSENSOR FILTER FOR PROCESSING GEOSENSOR QUERIES ON DATA STREAMS

  • Lee, Dong-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference
    • /
    • 2008.10a
    • /
    • pp.119-121
    • /
    • 2008
  • Pattern matching is increasingly employed in research areas such as health care services, RFID-based systems, facility management, and surveillance. A geosensor filter correlates a data stream to match specific patterns in distributed environments. In this paper, we present a geosensor query language for efficiently expressing declarative geosensor queries. We propose geosensor operators for fast query processing over spatial and temporal areas in distributed environments. We also propose a geosensor filter that matches new query predicates against incoming stream predicates. Our filter can reduce the volume of transmitted data and save sensor power consumption. It can be used in a stream data mining system to process, in real time, various data such as location, time, and geosensor information in distributed environments.

Frequent Itemset Search Using LSI Similarity (LSI 유사도를 이용한 효율적인 빈발항목 탐색 알고리즘)

  • Ko, Younhee;Kim, Hyeoncheol;Lee, Wongyu
    • The Journal of Korean Association of Computer Education
    • /
    • v.6 no.1
    • /
    • pp.1-8
    • /
    • 2003
  • We introduce an efficient vertical mining algorithm that significantly reduces the search complexity for frequent k-itemsets. The method sorts items by their LSI (Least Support Itemsets) similarity and then searches for frequent itemsets in a tree-based manner. The search tree structure provides several useful heuristics and therefore reduces the search space significantly at early stages. Experimental results on various data sets show that the proposed algorithm improves search performance compared with other algorithms, especially for databases with long patterns.
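
A generic vertical, tree-based search for frequent itemsets can be sketched as follows. The abstract does not define the LSI similarity measure, so this sketch substitutes a plain ascending-support ordering as a stand-in; that ordering, the function name `vertical_frequent_itemsets`, and the sample data are assumptions, not the authors' algorithm. Each item is stored with the set of transaction ids that contain it, and an itemset's support is the size of the intersection of those sets.

```python
def vertical_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support count} using tidset intersection.

    Illustrative depth-first vertical miner (hypothetical helper). Items are
    ordered by ascending support, an assumed stand-in for the paper's
    LSI-similarity ordering, which the abstract does not specify.
    """
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in set(t):
            tidsets.setdefault(item, set()).add(tid)

    # Keep only frequent single items, least frequent first.
    items = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: (len(pair[1]), pair[0]),
    )

    result = {}

    def extend(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            new_tids = (prefix_tids & tids) if prefix else tids
            if len(new_tids) < min_support:
                continue  # no superset of an infrequent itemset can be frequent
            itemset = prefix + (item,)
            result[frozenset(itemset)] = len(new_tids)
            # Only later items may extend this branch, so each itemset is visited once.
            extend(itemset, new_tids, candidates[i + 1:])

    extend((), set(), items)
    return result


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in vertical_frequent_itemsets(data, min_support=3).items():
        print(sorted(itemset), support)
```

Pruning a branch as soon as an intersection falls below the threshold is what shrinks the search space early; the item ordering decides how soon such branches are met.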

Pattern Matching Automata for the Extraction of Protein Names (단백질 이름 추출을 위한 패턴 매칭 오토마타)

  • Park Jun-Hyung;Hong Ki-Ho;Yang Ji-Hoon
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.06a
    • /
    • pp.28-30
    • /
    • 2006
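  • A system has been proposed that uses text mining techniques to extract protein names and the interactions among them from the biological literature [1]. In that system, the protein name extraction step could be made more flexible and better performing by using a method called Pattern Matching Automata (PMA). Using an example, this paper describes the training and testing processes of PMA and their results, and presents the potential of PMA for protein name extraction together with future directions for improving its performance.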

A multi-layered neural network learning procedure and generating architecture method for improving neural network learning capability (다층신경망의 학습능력 향상을 위한 학습과정 및 구조설계)

  • 이대식;이종태
    • Korean Management Science Review
    • /
    • v.18 no.2
    • /
    • pp.25-38
    • /
    • 2001
  • The well-known back-propagation algorithm for multi-layered neural networks has been applied successfully to pattern classification problems with remarkable flexibility. Recently, the multi-layered neural network has come into use as a powerful data mining tool. Nevertheless, in many cases with complex classification boundaries, successful learning is not guaranteed, and the problems of long learning time and attraction to local minima restrict field application. In this paper, an improved learning procedure for multi-layered neural networks is proposed. The procedure is based on the generalized delta rule, but it is distinctive in that the network architecture is not fixed but enlarged during learning. That is, the number of hidden nodes or hidden layers is increased to help find the classification boundary, and this growth is controlled by entropy evaluation. The learning speed and pattern classification performance are analyzed and compared with those of the back-propagation algorithm.

CLUSTER ANALYSIS FOR REGION ELECTRIC LOAD FORECASTING SYSTEM

  • Park, Hong-Kyu;Kim, Young-Il;Park, Jin-Hyoung;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference
    • /
    • 2007.10a
    • /
    • pp.591-593
    • /
    • 2007
  • This paper clusters AMR (Automatic Meter Reading) data. The load survey system in KEPRI AMR has been applied to record the power consumption of sampled customers in each contract type. The effect of a change in contract type on customer power consumption is determined by clustering the load survey results. Based on the cluster analysis, power can be supplied to customers according to their usage. Korea has fewer classes of electricity supply type than other countries because the Korean electricity market has a single electricity provider. The supply types need to be divided further for more efficient supply. We found consumption patterns that differ from the supply type assigned to the customer. Our experiments use Clementine, a data mining tool.

Suggesting Blasting Design for Kazakhstan mine using Korea Mining Technology (국내 광산 기술을 적용한 카자흐스탄 광산 발파설계 제안)

  • Jin, Yeon-Ho;Min, Hyung-Dong;Jeong, Min-Su;Park, Yoon-Suk;Heo, Eui-Haeng;Nurmatov, Murod
    • Explosives and Blasting
    • /
    • v.32 no.1
    • /
    • pp.10-17
    • /
    • 2014
  • This study introduces information obtained during a visit to the Kazakhmys mine in Kazakhstan and suggests an optimal blasting pattern designed for the mine by applying Korean blasting technology. The results show that the blast design can reduce the consumption of explosives and the number of drill holes. The blast design can also reduce the overall production cost of the mine.

Analysis of Commercial Facility Locational Pattern Using GIS and Spatial Data Mining (GIS와 공간데이터마이닝을 이용한 상업시설물의 입지패턴 분석)

  • Hong, Sung-Eon;Lee, Yong-Ik
    • Proceedings of the KAIS Fall Conference
    • /
    • 2010.05b
    • /
    • pp.630-633
    • /
    • 2010
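  • Although both spatial and non-spatial characteristics should be treated as important in location analysis, only simple spatial characteristics such as geometric distance or spatial position have been used, because of the processing difficulties that arise from the complexity of spatial data types, spatial relationships, and spatial autocorrelation. In this study, taking large discount stores in Seoul as a case, we build an integrated dataset of GIS spatial data and non-spatial data (such as demographics) and then apply spatial data mining techniques to analyze and extract location patterns.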

A Construction of Fuzzy Model for Data Mining (데이터 마이닝을 위한 퍼지 모델 동정)

  • Kim, Do-Wan;Park, Jin-Bae;Kim, Jung-Chan;Joo, Young-Hoon
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2002.12a
    • /
    • pp.191-194
    • /
    • 2002
  • In this paper, a new GA-based methodology with information granules is suggested for the construction of a fuzzy classifier. We deal with the selection of the fuzzy region as well as two major classification problems: feature selection and pattern classification. The proposed method consists of three steps: the selection of the fuzzy region, the construction of the fuzzy sets, and the tuning of the fuzzy rules. Genetic algorithms (GAs) are applied to the development of the information granules so as to decide satisfactory fuzzy regions. Finally, the GAs are also applied to the tuning of the fuzzy rules in order to handle misclassified data (e.g., data with unusual patterns or data on the class boundaries). To show the effectiveness of the proposed method, an example, the classification of the Iris data, is provided.