• Title/Summary/Keyword: Bicluster

Validation Measures of Bicluster Solutions

  • Lee, Young-Rok; Lee, Jeong-Hwa; Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / v.8 no.2 / pp.101-108 / 2009
  • Biclustering is a method for extracting subsets of objects and features from a dataset that share some characteristic pattern. In contrast to traditional clustering algorithms, which group objects that are similar across the whole feature set, biclustering methods find groups of objects that have similar values or patterns in a subset of the features. In both clustering and biclustering, validating how informative or reliable the result is remains a very important task. Whereas validation methods for cluster solutions have been studied actively, there are only a few measures for validating bicluster solutions. Furthermore, the existing validation methods for bicluster solutions have critical problems that limit their use in general cases. In this paper, we review several well-known validation measures for cluster and bicluster solutions and discuss their limitations. We then propose several improved validation indices as modified versions of existing ones.
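To make the idea of scoring a bicluster concrete, here is a minimal sketch of one widely used coherence score, the mean squared residue. The abstract does not say which measures the paper reviews or proposes, so this is an illustration of the general idea rather than the authors' index; the function name and the toy matrix are assumptions made up for the example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix given by `rows` x `cols`.

    A value near 0 means the selected rows follow the same additive
    pattern across the selected columns. Hypothetical helper, not taken
    from the paper.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    overall = sum(row_means) / n_r

    total = 0.0
    for i in range(n_r):
        for j in range(n_c):
            # Residue: what is left after removing row, column and overall effects.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_r * n_c)


# Toy data (invented): rows 0-2 share an additive pattern on columns 0-2 only.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 1.0, 5.0, 2.0],
]

bicluster_score = mean_squared_residue(data, [0, 1, 2], [0, 1, 2])
whole_score = mean_squared_residue(data, [0, 1, 2, 3], [0, 1, 2, 3])
print(bicluster_score, whole_score)
```

The coherent submatrix scores essentially zero while the full matrix does not, which is the kind of contrast a validation measure for bicluster solutions is meant to capture.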

Two Dimensional Cluster Analysis of Air Quality by Time and Area (지역.시간별을 고려한 이차원 대기환경 군집 분석)

  • Wee, Seong-Seung; Kim, Jae-Hoon; Ahn, Chi-Kyung; Choi, Byong-Su; Kim, Dae-Seon
    • Journal of Environmental Science International / v.17 no.5 / pp.517-524 / 2008
  • The purpose of this study was to investigate the characteristics of air quality using data obtained from the local air quality monitoring system for a cohort study in Chungju, Korea. We analyzed the concentration data of $NO_2$, $SO_2$, and $PM_{10}$ in Chungju and in industrial cities in 2006. We compared an industrial area with the cohort study area using a biclustering algorithm. In the case of $SO_2$, the proportion of clustered time periods was $10{\sim}60\%$, and the number of clustered time periods was similar in the two areas. In the case of $NO_2$ and $PM_{10}$, the number of clustered time periods clearly differed between the industrial area and the cohort study area.

Bicluster and Pathway Enrichment Analysis of HCV-induced Cirrhosis and Hepatocellular Carcinoma

  • Cheng, Peng; Cheng, You; Su, Mei X.; Li, Dong; Zhao, Guo Z.; Gao, Hui; Li, Yan; Zhu, Jie Y.; Li, Hua; Zhang, Tao
    • Asian Pacific Journal of Cancer Prevention / v.13 no.8 / pp.3741-3745 / 2012
  • Background: Hepatocellular carcinoma (HCC) is the sixth most common cancer worldwide and the most common form of liver cancer. However, although it is frequently associated with hepatitis C virus (HCV), there is only an elementary understanding of its molecular pathogenesis. Methods: To gain insight into the molecular mechanisms of HCV-induced hepatocarcinogenesis, we performed microarray analysis on 75 surgical liver samples from 48 HCV-infected patients. Results: There were 395 differentially expressed genes between cirrhotic samples and HCC samples. Of these, 125 genes were up-regulated and 270 genes were down-regulated. We performed pathway enrichment analysis and screening as described previously. Conclusions: The differentially expressed genes might be involved in hepatocarcinogenesis by upregulating the pathways of ECM-receptor interaction, focal adhesion, cell adhesion molecules, and other cancer-related pathways, and by downregulating the "complement and coagulation cascades" pathway. We hope our results can aid the search for therapeutic targets for HCV-induced hepatocellular carcinoma.
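Pathway enrichment analysis is commonly carried out with a hypergeometric (one-sided Fisher) test, which asks how surprising the overlap between a gene list and a pathway is. The abstract does not state which test the authors used, so the sketch below is only an illustration of that common approach. Apart from the 395 differentially expressed genes, every number is a hypothetical placeholder, and the function name is an assumption.

```python
from math import comb


def enrichment_p_value(population, pathway_size, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population   : number of genes measured overall
    pathway_size : genes in the population annotated to the pathway
    selected     : size of the gene list being tested
    overlap      : genes in the list that belong to the pathway
    """
    upper = min(pathway_size, selected)
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, selected)


# 395 comes from the abstract; the other three values are invented for illustration.
p = enrichment_p_value(population=20000, pathway_size=200, selected=395, overlap=15)
print(f"enrichment p-value: {p:.3g}")
```

In practice the p-values for all tested pathways would also be corrected for multiple testing before any pathway is reported as enriched.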

A Biclustering Method for Time Series Analysis

  • Lee, Jeong-Hwa; Lee, Young-Rok; Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / v.9 no.2 / pp.131-140 / 2010
  • Biclustering is a method of simultaneously finding meaningful subsets of objects and attributes that may not be detected by traditional clustering methods. It is widely used for the analysis of microarray data representing the expression levels of genes under different conditions. Biclustering algorithms usually do not consider a sequential relation between attributes. For time series data, however, bicluster solutions should preserve the time sequence. This paper proposes a new biclustering algorithm for time series data by modifying the plaid model. The proposed algorithm introduces a parameter that controls the interval between two selected time points. In addition, the pruning step that prevents over-fitting is modified so that it eliminates only starting or ending points. Results from artificial data sets show that the proposed method is more suitable for extracting biclusters from time series data sets. Moreover, using the proposed method, we make some interesting observations from real-world time-course microarray data sets and apartment price data sets in metropolitan areas.

Macroscopic Biclustering of Gene Expression Data (유전자 발현 데이터에 적용한 거시적인 바이클러스터링 기법)

  • Ahn, Jae-Gyoon; Yoon, Young-Mi; Park, Sang-Hyun
    • The KIPS Transactions: Part D / v.16D no.3 / pp.327-338 / 2009
  • A microarray dataset is a 2-dimensional dataset with a set of genes and a set of conditions. A bicluster is a subset of genes that show similar behavior within a subset of conditions. Genes that show similar behavior can be considered to have the same cellular functions. Thus, a biclustering algorithm is a useful tool for uncovering groups of genes involved in the same cellular process and the groups of conditions under which this process takes place. We propose a polynomial-time algorithm to identify functionally highly correlated biclusters. Our algorithm 1) identifies gene sets that have hidden patterns even when the level of noise is high, 2) identifies multiple, possibly overlapping, and diverse gene sets, 3) identifies gene sets whose functional association is strong, and 4) produces deterministic biclustering results. We validated the level of functional association achieved by our method and compared it with current methods using Gene Ontology (GO).