Search | Korea Science

Lee, Young-Rok;Lee, Jeong-Hwa;Jun, Chi-Hyuck
- Industrial Engineering and Management Systems, v.8 no.2, pp.101-108, 2009
Biclustering is a method for extracting subsets of objects and features from a dataset that share some characteristic pattern. In contrast to traditional clustering algorithms, which group objects that are similar across the whole feature set, biclustering methods find groups of objects that have similar values or patterns in some of the features. In both clustering and biclustering, validating how informative or reliable the result is constitutes an important task. Whereas validation methods for cluster solutions have been studied actively, there are only a few measures for validating bicluster solutions. Furthermore, the existing validation methods for bicluster solutions have critical problems that limit their use in general cases. In this paper, we review several well-known validation measures for cluster and bicluster solutions and discuss their limitations. Then, we propose several improved validation indices as modified versions of existing ones.
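As a rough illustration of what it means for a subset of objects and features to be coherent, the sketch below computes the mean squared residue of a submatrix, a commonly used bicluster coherence score. It is an assumption that this score is relevant here: the abstract does not name the measures it reviews, and the function name and toy data are hypothetical.

```python
def mean_squared_residue(data, rows, cols):
    """Mean squared residue of the submatrix data[rows][cols].

    A value near 0 means the selected rows follow the same additive
    pattern over the selected columns. Illustrative only; not taken
    from the paper.
    """
    sub = [[data[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy data (hypothetical): rows 0-2 share an additive pattern on columns 0-2 only.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 1.0, 5.0, 2.0],
]
bicluster_score = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2])
full_score = mean_squared_residue(data, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
```

On this toy matrix the selected submatrix scores essentially zero while the full matrix does not, which is the kind of pattern a whole-feature-set clustering would miss.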

Mukharjee, Ajoy;Bose, Manoj Kumar
- Kyungpook Mathematical Journal, v.53 no.1, pp.125-133, 2013
In this paper, we introduce the notion of near pairwise compactness which generalizes the notion of pairwise compactness.
https://doi.org/10.5666/KMJ.2013.53.1.125

Wee, Seong-Seung;Kim, Jae-Hoon;Ahn, Chi-Kyung;Choi, Byong-Su;Kim, Dae-Seon
- Journal of Environmental Science International, v.17 no.5, pp.517-524, 2008
The purpose of this study was to investigate the characteristics of air quality using data obtained from the local air quality monitoring system for a cohort study in Chungju, Korea. We analyzed the concentration data of $NO_2$, $SO_2$, and $PM_{10}$ in Chungju and industrial cities in 2006. We compared an industrial area with a cohort study area using a bicluster algorithm. In the case of $SO_2$, the rate of the cluster time was $10{\sim}60\%$ and the number of cluster times in the two areas was similar. In the case of $NO_2$ and $PM_{10}$, the number of cluster times in the industrial area and in the cohort study area was clearly different.
https://doi.org/10.5322/JES.2008.17.5.517

Cheng, Peng;Cheng, You;Su, Mei X.;Li, Dong;Zhao, Guo Z.;Gao, Hui;Li, Yan;Zhu, Jie Y.;Li, Hua;Zhang, Tao
- Asian Pacific Journal of Cancer Prevention, v.13 no.8, pp.3741-3745, 2012
Background: Hepatocellular carcinoma (HCC) is the sixth most common cancer worldwide and the most common form of liver cancer. However, although it is frequently associated with hepatitis C virus (HCV), there is only an elementary understanding of its molecular pathogenesis. Methods: To gain insight into the molecular mechanisms of HCV-induced hepatocarcinogenesis, we performed microarray analysis on 75 surgical liver samples from 48 HCV-infected patients. Results: There were 395 differentially expressed genes between cirrhotic samples and HCC samples. Of these, 125 genes were up-regulated and 270 genes were down-regulated. We performed pathway enrichment analysis and screened as described previously. Conclusions: The differentially expressed genes might be involved in hepatocarcinogenesis through upregulating the pathways of ECM-receptor interaction, focal adhesion, cell adhesion molecules and other cancer-related pathways, and downregulating the pathways of "complement and coagulation cascades". We hope our results can aid the search for therapeutic targets for HCV-induced hepatocellular carcinoma.
https://doi.org/10.7314/APJCP.2012.13.8.3741

Lee, Jeong-Hwa;Lee, Young-Rok;Jun, Chi-Hyuck
- Industrial Engineering and Management Systems, v.9 no.2, pp.131-140, 2010
Biclustering is a method of finding meaningful subsets of objects and attributes simultaneously, which may not be detected by traditional clustering methods. It is popularly used for the analysis of microarray data representing the expression levels of genes by conditions. Usually, biclustering algorithms do not consider a sequential relation between attributes. For time series data, however, bicluster solutions should keep the time sequence. This paper proposes a new biclustering algorithm for time series data by modifying the plaid model. The proposed algorithm introduces a parameter controlling an interval between two selected time points. Also, the pruning step preventing an over-fitting problem is modified so as to eliminate only starting or ending points. Results from artificial data sets show that the proposed method is more suitable for the extraction of biclusters from time series data sets. Moreover, by using the proposed method, we find some interesting observations from real-world time-course microarray data sets and apartment price data sets in metropolitan areas.
https://doi.org/10.7232/iems.2010.9.2.131
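The two time-series constraints mentioned in the abstract above can be pictured with a small sketch. This is not the paper's modified plaid model: it assumes the interval parameter means a maximum gap between consecutive selected time indices, and that pruning drops low-scoring points from the two ends only. The function names, the `scores` mapping, and the threshold are all hypothetical.

```python
def respects_max_gap(selected, max_gap):
    """True if consecutive selected time indices are at most max_gap apart.

    Assumed reading of the 'interval between two selected time points'
    parameter; the abstract does not define it precisely.
    """
    ordered = sorted(selected)
    return all(b - a <= max_gap for a, b in zip(ordered, ordered[1:]))


def prune_ends(selected, scores, threshold):
    """Drop low-scoring time points, but only from the start or the end.

    Interior points are kept even if they score low, so the remaining
    time points still form one uninterrupted stretch of the original
    selection.
    """
    ordered = sorted(selected)
    while ordered and scores[ordered[0]] < threshold:
        ordered.pop(0)
    while ordered and scores[ordered[-1]] < threshold:
        ordered.pop()
    return ordered


# Hypothetical per-time-point fit scores for one candidate bicluster.
scores = {1: 0.1, 2: 0.9, 3: 0.2, 4: 0.8, 5: 0.3}
kept = prune_ends([1, 2, 3, 4, 5], scores, threshold=0.5)
gap_ok = respects_max_gap([2, 3, 5], max_gap=2)
gap_too_wide = respects_max_gap([2, 6], max_gap=2)
```

Pruning only at the ends keeps time point 3 despite its low score, so the time sequence of the bicluster is preserved.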

Ahn, Jae-Gyoon;Yoon, Young-Mi;Park, Sang-Hyun
- The KIPS Transactions:PartD, v.16D no.3, pp.327-338, 2009
A microarray dataset is a 2-dimensional dataset with a set of genes and a set of conditions. A bicluster is a subset of genes that show similar behavior within a subset of conditions. Genes that show similar behavior can be considered to have the same cellular functions. Thus, a biclustering algorithm is a useful tool for uncovering groups of genes involved in the same cellular process and groups of conditions under which this process takes place. We propose a polynomial-time algorithm to identify functionally highly correlated biclusters. Our algorithm identifies 1) gene sets that have hidden patterns even if the level of noise is high, 2) multiple, possibly overlapping, and diverse gene sets, 3) gene sets whose functional association is strong, and 4) deterministic biclustering results. We validated the level of functional association of our method and compared it with current methods using GO.
https://doi.org/10.3745/KIPSTD.2009.16-D.3.327