• Title/Summary/Keyword: Pattern mining

Currents in Integrative Biochip Informatics

  • Kim, Ju-Han
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2001.10a
    • /
    • pp.1-9
    • /
    • 2001
  • scale genomic and postgenomic data means that many of the challenges in biomedical research are now challenges in computational sciences and information technology. The informatics revolutions in both clinical informatics and bioinformatics will change the current paradigm of biomedical sciences and the practice of clinical medicine, including diagnostics, therapeutics, and prognostics. Postgenome informatics, powered by high-throughput technologies and genomic-scale databases, is likely to transform our biomedical understanding forever, much the same way that biochemistry did a generation ago. In this talk, I will describe how these technologies will impact biomedical research and clinical care, emphasizing recent advances in biochip-based functional genomics. Basic data preprocessing with normalization and filtering, primary pattern analysis, and machine learning algorithms will be presented. Issues of integrated biochip informatics technologies, including multivariate data projection, gene-metabolic pathway mapping, automated biomolecular annotation, text mining of factual and literature databases, and integrated management of biomolecular databases, will be discussed. Each step will be illustrated with real examples from ongoing research activities in the context of clinical relevance. Issues of linking molecular genotype and clinical phenotype information will also be discussed.

Local T2 Control Charts for Process Control in Local Structure and Abnormal Distribution Data (지역적이고 비정규분포를 갖는 데이터의 공정관리를 위한 지역기반 T2관리도)

  • Kim, Jeong-Hun;Kim, Seoung-Bum
    • Journal of Korean Society for Quality Management
    • /
    • v.40 no.3
    • /
    • pp.337-346
    • /
    • 2012
  • Purpose: A control chart is one of the important statistical process control tools that can improve processes by reducing variability and defects. Methods: In the present study, we propose the local $T^2$ multivariate control chart, which can efficiently detect abnormal observations by considering the local pattern of the in-control observations. Results: A simulation study has been conducted to examine the properties of the proposed control chart and compare it with existing multivariate control charts. Conclusion: The results demonstrate the usefulness and effectiveness of the proposed control chart.
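
The abstract does not say how the "local pattern" enters the $T^2$ statistic. As a rough illustration only, the sketch below assumes one plausible reading: Hotelling's $T^2 = (x - \bar{x})^\top S^{-1} (x - \bar{x})$ is computed from the k nearest in-control observations rather than from all of them. The function names, the 2-D restriction, and the nearest-neighbor rule are assumptions, not the authors' method.

```python
import math

def hotelling_t2(x, reference):
    """T^2 of a 2-D point x against the mean and covariance of `reference`."""
    n = len(reference)
    mx = sum(p[0] for p in reference) / n
    my = sum(p[1] for p in reference) / n
    # Sample covariance entries (divisor n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in reference) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in reference) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = x[0] - mx, x[1] - my
    # d^T S^{-1} d, using the closed-form inverse of a 2x2 matrix.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def local_t2(x, in_control, k):
    """Assumed local variant: T^2 from the k nearest in-control points only."""
    nearest = sorted(in_control, key=lambda p: math.dist(p, x))[:k]
    return hotelling_t2(x, nearest)

# Hypothetical in-control data forming two separate local groups.
in_control = [(0, 0), (1, 0), (0, 1), (1, 1),
              (10, 10), (11, 10), (10, 11), (11, 11)]
point = (4, 4)  # lies between the groups, far from both
print(hotelling_t2(point, in_control))  # global statistic is small
print(local_t2(point, in_control, k=4)) # local statistic is large
```

With data that form several local groups, a point between groups sits close to the global mean and gets a small global statistic, while the local statistic flags it. That is the kind of case a locally defined chart is meant to catch.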

An Application of Decision Tree Method for Fault Diagnosis of Induction Motors

  • Tran, Van Tung;Yang, Bo-Suk;Oh, Myung-Suck
    • Proceedings of the Korea Committee for Ocean Resources and Engineering Conference
    • /
    • 2006.11a
    • /
    • pp.54-59
    • /
    • 2006
  • The decision tree is one of the most effective and widely used methods for building classification models. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining have considered the decision tree method an effective solution to problems in their fields. In this paper, an application of the decision tree method to classify the faults of induction motors is proposed. Features are calculated from the original experimental data to obtain useful information as attributes. These data are then assigned classes, based on our experience, before being used as inputs to the decision tree. A total of nine classes are defined. An implementation of the decision tree written in Matlab is applied to these data.

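
The paper's Matlab implementation is not reproduced in this listing. To illustrate the core step of decision tree induction, the sketch below picks the threshold on one numeric attribute that maximizes information gain. It is written in Python rather than Matlab, uses an entropy criterion that the abstract does not confirm, and the attribute values and class labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, information_gain) of the best binary split."""
    base = entropy(labels)
    best = (None, 0.0)
    # Candidate thresholds: midpoints between consecutive distinct values.
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if gain > best[1]:
            best = (t, gain)
    return best

# Hypothetical vibration feature and fault labels.
feature = [1, 2, 3, 8, 9, 10]
classes = ["normal", "normal", "normal", "fault", "fault", "fault"]
print(best_threshold(feature, classes))
```

A full tree repeats this search over every attribute at every node and recurses on the two resulting subsets until the nodes are pure or a stopping rule applies.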

K-means Clustering using a Center Of Gravity for grid-based sample

  • Park, Hee-Chang;Lee, Sun-Myung
    • 한국데이터정보과학회:학술대회논문집
    • /
    • 2004.04a
    • /
    • pp.51-60
    • /
    • 2004
  • K-means clustering is an iterative algorithm in which items are moved among sets of clusters until the desired set is reached. K-means clustering has been widely used in many applications, such as market research, pattern analysis and recognition, and image processing. It can identify dense and sparse regions among data attributes or object attributes. However, the k-means algorithm can require many hours to produce the desired k clusters because it is a primitive, exploratory method. In this paper we propose a new method of k-means clustering that uses a center of gravity for a grid-based sample. It is faster than traditional clustering methods and maintains its accuracy.

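
The abstract gives no algorithmic detail, so the following is only one way the idea could work: summarize the points in each occupied grid cell by their center of gravity and point count, then run ordinary Lloyd-style k-means on those weighted representatives instead of on every point. The cell size, the function names, and the caller-supplied starting centers are all assumptions.

```python
import math
from collections import defaultdict

def grid_centroids(points, cell):
    """Summarize 2-D points by the center of gravity of each occupied cell."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(math.floor(x / cell), math.floor(y / cell))].append((x, y))
    # Each representative is (mean_x, mean_y, number_of_points_in_cell).
    return [
        (sum(p[0] for p in ps) / len(ps), sum(p[1] for p in ps) / len(ps), len(ps))
        for ps in cells.values()
    ]

def weighted_kmeans(reps, centers, iterations=20):
    """Lloyd iterations on weighted representatives, starting from `centers`."""
    centers = list(centers)
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0] for _ in centers]
        for x, y, w in reps:
            # Assign the representative to its nearest current center.
            j = min(range(len(centers)),
                    key=lambda i: math.dist((x, y), centers[i]))
            sums[j][0] += w * x
            sums[j][1] += w * y
            sums[j][2] += w
        # Move each center to the weighted mean; keep it if its cluster is empty.
        centers = [
            (sx / w, sy / w) if w else c
            for (sx, sy, w), c in zip(sums, centers)
        ]
    return centers

# Hypothetical sample with two well-separated groups.
points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (1.2, 0.1),
          (9.1, 9.2), (9.3, 9.4), (9.8, 9.9), (10.5, 9.5)]
reps = grid_centroids(points, cell=1.0)
print(weighted_kmeans(reps, centers=[(0.0, 0.0), (10.0, 10.0)]))
```

Each iteration costs time proportional to the number of occupied cells rather than the number of points, which is where a speedup would come from. Because the representatives carry their counts, the weighted mean of a cluster equals the mean of the original points in its cells.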

Load Pattern Analysis of Distribution Transformer using Data Mining Techniques (데이터마이닝 기법을 이용한 변압기 부하패턴 분석)

  • Shin, Jin-Ho;Kim, Young-Il;Yi, Bong-Jae;Song, Jae-Ju;Yang, Il-Kwon
    • Proceedings of the KIEE Conference
    • /
    • 2008.07a
    • /
    • pp.1879-1880
    • /
    • 2008
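  • Temporal data mining adds the concept of time to conventional data mining; it is a technique for discovering previously unknown but implicit and potentially useful temporal knowledge from data with temporal attributes. This paper proposes a new temporal pattern mining technique for distribution transformer load patterns that carry temporal attributes. The technique discovers knowledge whose period of applicability is clearly defined as time changes. By using the discovered rules and temporal knowledge in future load forecasting, it can predict more accurately than methods that apply conventional static classification rules.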

An MILP Approach to a Nonlinear Pattern Classification of Data (혼합정수 선형계획법 기반의 비선형 패턴 분류 기법)

  • Kim, Kwangsoo;Ryoo, Hong Seo
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.32 no.2
    • /
    • pp.74-81
    • /
    • 2006
  • In this paper, we deal with the separation of data by concurrently determined, piecewise nonlinear discriminant functions. Toward this end, we develop a new $l_1$-distance norm error metric and cast the problem as a mixed 0-1 integer and linear programming (MILP) model. Given a finite number of discriminant functions as an input, the proposed model considers the synergy as well as the individual role of the functions involved and implements the simplest nonlinear decision surface that best separates the data on hand. Hence, exploiting powerful MILP solvers, the model efficiently analyzes any given data set for its piecewise nonlinear separability. The classification of four sets of artificial data demonstrates the aforementioned strength of the proposed model. Classification results on five machine learning benchmark databases show that data separation via the proposed MILP model is an effective supervised learning methodology that compares quite favorably to well-established learning methodologies.

A Research on User's Query Processing in Search Engine for Ocean using the Association Rules (연관 규칙 탐사 기법을 이용한 해양 전문 검색 엔진에서의 질의어 처리에 관한 연구)

  • 하창승;윤병수;류길수
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2002.11a
    • /
    • pp.266-272
    • /
    • 2002
  • Recently, various information suppliers have begun providing information via the WWW, so the need for search engines is growing. However, the efficiency of most search engines is comparatively low because they use simple pattern matching between the user's query and web documents. The problem is much worse for queries in specialized expert fields. A specialized search engine returns specialized information according to each user's search goal, and developing such engines is a trend in many countries. For example, in America there are sites that search only recently updated headline news, federal law, government information, and so on. However, most such engines do not satisfy users' needs. This paper proposes a specialized search engine for ocean information that processes users' ocean-related queries using association rules from web data mining. The specialized search engine thus provides more ocean-related information by raising recall for the user's query.

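
The abstract does not describe how the association rules are mined or applied to queries. The sketch below shows one generic possibility: mine single-term rules a -> b by support and confidence from past term sets, then add the consequents to a query so that more related documents can match, which is how recall would rise. The transactions, thresholds, and function names are hypothetical.

```python
from collections import Counter
from itertools import permutations

def mine_rules(transactions, min_support, min_confidence):
    """Return {(a, b): confidence} for single-term rules a -> b."""
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for t in transactions:
        items = set(t)
        item_count.update(items)
        # Count every ordered pair so that a -> b and b -> a are both scored.
        pair_count.update(permutations(sorted(items), 2))
    rules = {}
    for (a, b), c in pair_count.items():
        support = c / n                # fraction of transactions with a and b
        confidence = c / item_count[a] # fraction of a-transactions containing b
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules

def expand_query(terms, rules):
    """Append every consequent whose antecedent appears in the query."""
    extra = {b for (a, b) in rules if a in terms and b not in terms}
    return list(terms) + sorted(extra)

# Hypothetical term sets drawn from ocean-related documents or query logs.
transactions = [
    ["tide", "harbor"],
    ["tide", "harbor", "current"],
    ["tide", "current"],
    ["fishery", "harbor"],
]
rules = mine_rules(transactions, min_support=0.5, min_confidence=0.9)
print(expand_query(["current"], rules))
```

Raising the confidence threshold keeps the expansion conservative. Lowering it adds more terms and trades precision for recall.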

Complex analysis of rock cutting with consideration of rock-tool interaction using distinct element method (DEM)

  • Zhang, Guangzhe;Dang, Wengang;Herbst, Martin;Song, Zhengyang
    • Geomechanics and Engineering
    • /
    • v.20 no.5
    • /
    • pp.421-432
    • /
    • 2020
  • Rock cutting is very commonly encountered in tunneling and mining during underground excavation. A deep understanding of rock-tool interaction can promote industrial applications significantly. In this paper, a distinct element method based approach, PFC3D, is adopted to simulate rock cutting under different operating conditions (cutting velocity, depth of cut and rake angle) and with various tool geometries (tip angle, tip wear and tip shape). Simulation results showed that the cutting force and accumulated number of cracks increase with increasing cutting velocity, cut depth, tip angle and pick abrasion. The number of cracks and cutting force decrease with increasing negative rake angle and increase with increasing positive rake angle. The numerical approach can offer better insight into the rock-tool interaction during the rock cutting process. The proposed numerical method can be used to assess rock cuttability, to estimate cutting performance, and to design the cutter head.

Pattern Mining of Biological Data by Co-evolutionary Learning with Multi-populations (다중 개체 집단의 공진화적 학습에 의한 바이오 데이터의 패턴 마이닝)

  • Kim Soo-Jin;Joung Je-Gun;Zhang Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.06a
    • /
    • pp.46-48
    • /
    • 2006
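  • As diverse experimental data are now being produced in every field, analyzing the correlations among heterogeneous data is becoming increasingly important. In particular, new data mining methods are needed to address this problem in the large volumes of biological data that are growing rapidly through large-scale experiments. This paper proposes a new algorithm that can identify mutually correlated partial patterns in two data sets with different characteristics. The proposed algorithm is based on a stochastic evolutionary computing method in which multiple populations are maintained and co-evolve with one another, and it has the advantage of searching for the optimal solution by decomposing the overall set of search points. In the experiments, we applied the algorithm to heterogeneous data consisting of expression data and motif data for yeast genes, and we present the main correlated patterns extracted from these data.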

Similarity Pattern Analysis of Web Log Data using Multidimensional FCM (다차원 FCM을 이용한 웹 로그 데이터의 유사 패턴 분석)

  • 김미라;조동섭
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2002.10d
    • /
    • pp.190-192
    • /
    • 2002
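  • Data mining is the process of discovering a variety of valuable information from large amounts of stored data using statistical and mathematical analysis methods. Data clustering is one important technique for data mining. This paper studies a method for clustering web log data, which record the behavior of web users, using the Fuzzy C-Means algorithm. The Fuzzy C-Means clustering algorithm optimizes an objective function based on a similarity measure that considers the distance between each data point and each cluster center. Among the fields of the web log data, the user IP, time, and web page fields are processed into WLDF (Web Log Data for FCM), and multidimensional Fuzzy C-Means clustering is then performed. The result is used to analyze similar patterns between sample data and arbitrary data.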

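
The distance-based objective described in the abstract can be illustrated with the standard Fuzzy C-Means updates: memberships $u_{ij} = 1 / \sum_k (d_{ij}/d_{kj})^{2/(m-1)}$ and centers equal to the $u^m$-weighted mean of the points. The sketch below is a minimal version of those textbook updates. It assumes the records have already been encoded as numeric tuples; the paper's WLDF encoding of IP, time, and page fields is not given here, so the sample data are hypothetical.

```python
import math

def memberships_for(p, centers, m):
    """Fuzzy membership of point p in each cluster (values sum to 1)."""
    # A small floor on the distance avoids division by zero at a center.
    d = [max(math.dist(p, c), 1e-12) for c in centers]
    return [1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in d) for di in d]

def fcm(points, centers, m=2.0, iterations=50):
    """Fuzzy C-Means on numeric tuples, starting from the given centers."""
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    for _ in range(iterations):
        u = [memberships_for(p, centers, m) for p in points]
        new_centers = []
        for i in range(len(centers)):
            # Each center becomes the mean of all points weighted by u^m.
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wj * p[k] for wj, p in zip(w, points)) / total
                for k in range(dim)
            ))
        centers = new_centers
    return centers, [memberships_for(p, centers, m) for p in points]

# Hypothetical numeric records (for example, encoded time and page id).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, u = fcm(points, centers=[(0, 0), (10, 10)])
print(centers)
print(u[0])  # memberships of the first record in each cluster
```

Unlike k-means, every record keeps a graded membership in every cluster, so the similarity between a new record and the sample data can be read from its membership vector.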