• Title/Abstract/Keywords: Bicluster

Validation Measures of Bicluster Solutions

  • Lee, Young-Rok;Lee, Jeong-Hwa;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / Vol. 8, No. 2 / pp. 101-108 / 2009
  • Biclustering is a method for extracting subsets of objects and features from a dataset that are characterized in some way. In contrast to traditional clustering algorithms, which group objects that are similar across the whole feature set, biclustering methods find groups of objects that have similar values or patterns in some features. In both clustering and biclustering, validating how informative or reliable the result is remains a very important task. Whereas validation methods for cluster solutions have been studied actively, there are only a few measures for validating bicluster solutions. Furthermore, the existing validation methods for bicluster solutions have critical problems that limit their use in general cases. In this paper, we review several well-known validation measures for cluster and bicluster solutions and discuss their limitations. We then propose several improved validation indices as modified versions of existing ones.

Two-Dimensional Cluster Analysis of Air Quality by Time and Area

  • 위성승;김재훈;안치경;최병수;김대선
    • 한국환경과학회지 (Journal of the Korean Environmental Sciences Society) / Vol. 17, No. 5 / pp. 517-524 / 2008
  • The purpose of this study was to investigate the characteristics of air quality using data obtained from the local air quality monitoring system for a cohort study in Chungju, Korea. We analyzed the concentration data of $NO_2$, $SO_2$, and $PM_{10}$ in Chungju and industrial cities in 2006. We compared an industrial area with the cohort study area using a bicluster algorithm. In the case of $SO_2$, the rate of the cluster time was $10{\sim}60\%$ and the number of cluster times in the two areas was similar. In the case of $NO_2$ and $PM_{10}$, the number of cluster times in the industrial area and the cohort study area was clearly different.

Bicluster and Pathway Enrichment Analysis of HCV-induced Cirrhosis and Hepatocellular Carcinoma

  • Cheng, Peng;Cheng, You;Su, Mei X.;Li, Dong;Zhao, Guo Z.;Gao, Hui;Li, Yan;Zhu, Jie Y.;Li, Hua;Zhang, Tao
    • Asian Pacific Journal of Cancer Prevention / Vol. 13, No. 8 / pp. 3741-3745 / 2012
  • Background: Hepatocellular carcinoma (HCC) is the sixth most common cancer worldwide and the most common form of liver cancer. However, although it is frequently associated with hepatitis C virus (HCV), there is only an elementary understanding of its molecular pathogenesis. Methods: To gain insight into the molecular mechanisms of HCV-induced hepatocarcinogenesis, we performed microarray analysis on 75 surgical liver samples from 48 HCV-infected patients. Results: There were 395 differentially expressed genes between cirrhotic samples and HCC samples. Of these, 125 genes were up-regulated and 270 genes were down-regulated. We performed pathway enrichment analysis and screened as described previously. Conclusions: The differentially expressed genes might be involved in hepatocarcinogenesis through up-regulating the pathways of ECM-receptor interaction, focal adhesion, cell adhesion molecules, and other cancer-related pathways, and down-regulating the pathways of "complement and coagulation cascades". We hope our results can aid the search for therapeutic targets for HCV-induced hepatocellular carcinoma.

A Biclustering Method for Time Series Analysis

  • Lee, Jeong-Hwa;Lee, Young-Rok;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / Vol. 9, No. 2 / pp. 131-140 / 2010
  • Biclustering is a method of simultaneously finding meaningful subsets of objects and attributes that may not be detected by traditional clustering methods. It is widely used for the analysis of microarray data representing the expression levels of genes under different conditions. Usually, biclustering algorithms do not consider a sequential relation between attributes. For time series data, however, bicluster solutions should keep the time sequence. This paper proposes a new biclustering algorithm for time series data by modifying the plaid model. The proposed algorithm introduces a parameter controlling the interval between two selected time points. Also, the pruning step that prevents over-fitting is modified so as to eliminate only starting or ending points. Results from artificial data sets show that the proposed method is more suitable for extracting biclusters from time series data sets. Moreover, using the proposed method, we report some interesting observations from real-world time-course microarray data sets and apartment price data sets in metropolitan areas.
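The time-sequence requirement in this abstract can be illustrated with a minimal sketch. It is not the authors' modified plaid model. It only assumes that the interval parameter acts as an upper bound on the gap between consecutive selected time points. The names `keeps_time_order` and `max_gap` are hypothetical.

```python
def keeps_time_order(time_points, max_gap):
    """Check a candidate set of selected time indices.

    Assumption (not stated in the abstract): the interval parameter is an
    upper bound `max_gap` on the distance between consecutive selected
    time points. The indices must also be strictly increasing.
    """
    pairs = zip(time_points, time_points[1:])
    return all(0 < later - earlier <= max_gap for earlier, later in pairs)


# Contiguous time points pass with max_gap=1.
contiguous_ok = keeps_time_order([2, 3, 4, 5], max_gap=1)
# Skipping time point 4 fails with max_gap=1 but passes with max_gap=2.
skip_strict = keeps_time_order([2, 3, 5], max_gap=1)
skip_relaxed = keeps_time_order([2, 3, 5], max_gap=2)
# Out-of-order indices never pass.
unordered = keeps_time_order([3, 2, 4], max_gap=2)
```

With `max_gap=1` the selected time points form one contiguous window, so removing only the first or last point (as in the pruning step described above) keeps the window contiguous.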

Macroscopic Biclustering of Gene Expression Data

  • 안재균;윤영미;박상현
    • 정보처리학회논문지D (The KIPS Transactions: Part D) / Vol. 16D, No. 3 / pp. 327-338 / 2009
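  • Microarray data are two-dimensional matrix data that quantify how strongly a set of genes is expressed under a set of conditions or samples. A bicluster is a subset of the microarray's samples together with a subset of genes that show a consistent increase/decrease pattern under that sample subset. A subset of genes showing the same pattern can be said, at a certain level of significance, to have similar functions. Biclustering algorithms are therefore very useful for identifying the set of genes associated with the same function and the set of conditions under which that function is expressed. In this paper, we propose an algorithm that can find biclusters with high functional correlation while maintaining polynomial time complexity. The algorithm 1) unlike existing algorithms, which fail to recognize patterns when the microarray data are heavily noisy, can find hidden patterns by searching for patterns that look similar at a macroscopic level even when the noise level is high; 2) allows overlap between biclusters and finds multiple biclusters with guaranteed diversity; 3) yields gene subsets with very high functional correlation; and 4) produces deterministic results regardless of the order of genes and samples. We also compare the degree of functional correlation of the biclusters found by our algorithm with that of the biclusters found by comparison algorithms, measured using the Gene Ontology.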
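The bicluster definition in this abstract (a gene subset with a consistent increase/decrease pattern under a sample subset) can be illustrated with a minimal sketch. It is not the macroscopic algorithm proposed in the paper and it ignores noise. The function names and the toy matrix `expr` are hypothetical.

```python
import numpy as np


def trend_signature(row):
    """Sign (+1, 0, -1) of each step between consecutive selected samples."""
    return tuple(np.sign(np.diff(row)).astype(int))


def shares_trend(matrix, genes, samples):
    """True if every selected gene rises and falls at the same steps
    across the selected samples (taken in the given order)."""
    sub = matrix[np.ix_(genes, samples)]
    return len({trend_signature(r) for r in sub}) == 1


# Toy expression matrix: rows are genes, columns are samples.
expr = np.array([
    [1.0, 3.0, 2.0, 5.0],
    [0.5, 0.9, 0.1, 0.4],
    [2.0, 1.0, 4.0, 3.0],
    [10.0, 30.0, 20.0, 50.0],
])

# Genes 0, 1 and 3 all go up, down, up across the four samples.
same_all = shares_trend(expr, genes=[0, 1, 3], samples=[0, 1, 2, 3])
# Genes 0 and 2 disagree on the full sample set...
differ_all = shares_trend(expr, genes=[0, 2], samples=[0, 1, 2, 3])
# ...but agree on the sample subset {0, 2}, where both increase.
same_subset = shares_trend(expr, genes=[0, 2], samples=[0, 2])
```

The last two calls show why the sample subset matters: genes 0 and 2 do not share a pattern over all samples, yet they form a consistent pair once attention is restricted to samples 0 and 2.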