• Title/Abstract/Keywords: Microarray gene expression data

Cluster Analysis of Incomplete Microarray Data with Fuzzy Clustering

  • Kim, Dae-Won
    • Journal of Korean Institute of Intelligent Systems
    • /
    • Vol. 17, No. 3
    • /
    • pp.397-402
    • /
    • 2007
  • In this paper, we present a method for clustering incomplete microarray data using alternating optimization, which does not require a prior imputation step. To reduce the influence of imputation during preprocessing, we take an alternating optimization approach that finds better estimates during the iterative clustering process. The method improves the estimates of missing values in each iteration by exploiting cluster information such as the cluster centroids and all available non-missing values. The clustering results of the proposed method are significantly more relevant to the biological gene annotations than those of other methods, indicating its effectiveness and potential for clustering incomplete gene expression data.
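
The alternating scheme described in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a fuzzy c-means variant in which each missing entry is re-estimated as the membership-weighted average of the cluster centroids (an optimal-completion-style update). The function name `fcm_incomplete`, the use of `None` for missing entries, the column-mean starting fill, and the deterministic centroid initialization are all hypothetical choices made for illustration.

```python
def fcm_incomplete(X, n_clusters, m=2.0, n_iter=50):
    """Fuzzy c-means on rows of X, where missing entries are None.

    Returns (centroids, memberships, completed copy of X).
    Illustrative sketch only; no convergence check is performed.
    """
    n, d = len(X), len(X[0])
    # Starting fill (assumption): column mean over the observed values.
    col_mean = []
    for j in range(d):
        obs = [row[j] for row in X if row[j] is not None]
        col_mean.append(sum(obs) / len(obs))
    Xc = [[row[j] if row[j] is not None else col_mean[j] for j in range(d)]
          for row in X]
    # Deterministic initialization (assumption): evenly spaced rows.
    V = [list(Xc[(i * (n - 1)) // max(n_clusters - 1, 1)])
         for i in range(n_clusters)]
    U = [[0.0] * n_clusters for _ in range(n)]
    for _ in range(n_iter):
        # Step 1: update memberships from squared distances to centroids.
        for k in range(n):
            dist = [sum((Xc[k][j] - V[i][j]) ** 2 for j in range(d))
                    for i in range(n_clusters)]
            if min(dist) == 0.0:
                hits = [1.0 if dd == 0.0 else 0.0 for dd in dist]
                U[k] = [h / sum(hits) for h in hits]
            else:
                inv = [dd ** (-1.0 / (m - 1.0)) for dd in dist]
                U[k] = [v / sum(inv) for v in inv]
        # Step 2: update centroids as membership-weighted means.
        for i in range(n_clusters):
            w = [U[k][i] ** m for k in range(n)]
            sw = sum(w)
            V[i] = [sum(w[k] * Xc[k][j] for k in range(n)) / sw
                    for j in range(d)]
        # Step 3: re-estimate only the missing entries from the centroids.
        for k in range(n):
            w = [U[k][i] ** m for i in range(n_clusters)]
            sw = sum(w)
            for j in range(d):
                if X[k][j] is None:
                    Xc[k][j] = sum(w[i] * V[i][j]
                                   for i in range(n_clusters)) / sw
    return V, U, Xc
```

Observed values are never overwritten; only entries that were `None` in the input are revised, so the imputation and the clustering improve together rather than in separate stages.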

Gene Expression Data Analysis Using Parallel Processor based Pattern Classification Method

  • 최선욱;이종호
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • Vol. 46, No. 6
    • /
    • pp.44-55
    • /
    • 2009
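  • Disease diagnosis using gene expression data obtained from microarrays, an area of active recent research, has generally relied on machine learning algorithms because the data are difficult to analyze directly. However, recent findings that the analysis of gene expression data needs to account for interactions among genes suggest that analysis with conventional machine learning algorithms has limitations. In this paper, we classify gene expression data using a hypernetwork model, which can take high-order correlations among features into account, and compare its classification performance with that of conventional machine learning algorithms. We also propose a model that remedies the shortcomings of the existing hypernetwork model, implement it on a parallel processor, and compare its processing performance. In the experiments, the proposed model showed classification performance competitive with conventional machine learning methods, and more stable and improved classification performance than the existing hypernetwork model. We also showed that implementing it on a parallel processor can maximize processing performance.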

Effects of Inhibitors on the Function and Activity of Topoisomerase, and Gene Expression in HL-60 Human Leukemia Cells

  • 정인철;조무연;박장수
    • Journal of Life Science
    • /
    • Vol. 18, No. 1
    • /
    • pp.75-83
    • /
    • 2008
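  • Human DNA topoisomerases catalyze the transient breakage of one or both strands of DNA to resolve topological problems of DNA, and are thereby involved in DNA replication, transcription, recombination, and mitosis. These enzymes are well known as targets of many antibiotic and anticancer drugs, and research on developing various inhibitors from derivatives of these drugs and on their clinical application is actively under way. This study was conducted to determine whether topoisomerase inhibitors regulate topoisomerase functional activity and gene expression in human leukemia HL-60 cells. HL-60 cells were treated with 10-hydroxycamptothecin (10-CPT) and doxorubicin, representative inhibitors of topoisomerase type I and type II, respectively; total RNA was then isolated and analyzed with a 10K oligonucleotide microarray to examine gene expression patterns. The results showed that gene expression patterns in HL-60 cells treated with 10-CPT or doxorubicin were mainly related to signal transduction, cell adhesion, cell cycle, cell growth, cell proliferation, cell differentiation, transcription, and immune response. When the type I inhibitor 10-CPT was administered to the HL-60 cell line, expression of topoisomerases III${\alpha}$, III${\beta}$, and I, which are classified as type I, increased, whereas expression of the type II topoisomerase II${\alpha}$ and II${\beta}$ genes decreased. Conversely, when the type II inhibitor doxorubicin was administered, expression of the topoisomerase II${\alpha}$ and II${\beta}$ genes increased markedly, in contrast to the above result, and mRNA expression of topoisomerases III${\alpha}$ and III${\beta}$ tended to decrease slightly, but the difference was not significant. These results are expected to provide basic data for elucidating the mechanisms of anticancer drugs, predicting therapeutic responses to drugs, and developing new agents.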

Improving data reliability on oligonucleotide microarray

  • Yoon, Yeo-In;Lee, Young-Hak;Park, Jin-Hyun
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2004, The 3rd Annual Conference for The Korean Society for Bioinformatics Association of Asian Societies for Bioinformatics 2004 Symposium
    • /
    • pp.107-116
    • /
    • 2004
  • The advent of microarray technologies makes it possible to monitor the expression of tens of thousands of genes simultaneously. Such microarray data can be degraded by experimental errors and image artifacts, which generate non-negligible outliers estimated to make up 15% of typical microarray data. It is therefore important to detect and correct these faulty probes prior to high-level data analysis such as classification or clustering. In this paper, we propose a systematic procedure, based on multivariate statistical approaches, for detecting faulty probes in GeneChip arrays and correcting them properly. Principal component analysis (PCA), one of the most widely used multivariate statistical approaches, is applied to construct a statistical correlation model with 20 pairs of probes for each gene. The faulty probes are identified by inspecting the squared prediction error (SPE) of each probe from the PCA model. The outlying probes are then reconstructed by an iterative optimization approach that minimizes the SPE. We used public data from the gene chip project on human fibroblast cells. In this application study, the proposed approach showed good performance in correcting probes without removing the faulty ones, which may be desirable from the viewpoint of making maximum use of the information in the data.
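
The SPE criterion mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the paper's procedure: it assumes a one-component PCA model fitted by power iteration on reference data, and it scores each row (observation) by the squared residual left after projection onto that component. The function names `first_pc` and `spe`, the single-component model, and the row-wise layout are assumptions.

```python
def first_pc(X, n_iter=200):
    """Return (column means, leading principal direction) of the rows of X.

    Uses power iteration on the sample covariance matrix. Assumes the
    all-ones starting vector is not orthogonal to the leading direction.
    """
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    Z = [[row[j] - mean[j] for j in range(d)] for row in X]
    C = [[sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(d)]
         for a in range(d)]
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


def spe(X, mean, v):
    """Squared prediction error of each row under the one-component model."""
    out = []
    for row in X:
        z = [x - m for x, m in zip(row, mean)]
        t = sum(a * b for a, b in zip(z, v))  # score on the component
        out.append(sum((a - t * b) ** 2 for a, b in zip(z, v)))
    return out
```

A row that follows the correlation pattern of the reference data has an SPE near zero, while a row with one corrupted entry leaves a large residual. A threshold on the SPE would then flag candidates for correction; how that threshold is chosen is not stated in the abstract.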

Analysis of Genes Regulated by HSP90 Inhibitor Geldanamycin in Neurons

  • ;;권오유
    • Journal of Experimental & Biomedical Sciences
    • /
    • Vol. 15, No. 1
    • /
    • pp.97-99
    • /
    • 2009
  • Geldanamycin is a benzoquinone ansamycin antibiotic that binds to cytosolic HSP90 (heat shock protein 90) and alters its biological function. HSP90 plays important intracellular roles in regulating the cell cycle, cell growth, cell survival, apoptosis, angiogenesis, and oncogenesis. To identify genes whose expression changes during geldanamycin treatment of rat neuronal cells (PC12 cells), a DNA microarray was used. We isolated two groups of genes (up-regulated and down-regulated) that are differentially expressed in neurons in response to geldanamycin. Granzyme B was the most strongly increased of the 204 up-regulated genes (more than 2-fold over-expression), and chemokine (C-C motif) ligand 20 was the most strongly decreased of the 491 down-regulated genes (more than 2-fold under-expression). Among the genes with increased expression, Cxcl10, Cyp11a1, Gadd45a, Gja1, Gpx2, Ifua4, Inpp5e, Sox4, and Stip1 are stress-response genes, and Cryab, Dnaja1, Hspa1a, Hspa8, Hspca, Hspcb, Hspd1, Hspd1, and Hsph1 are strongly associated with protein folding. Cell cycle-associated genes (Bcl3, Brca2, Ccnf, Cdk2, Ddit3, Dusp6, E2f1, Il1a, and Junb) and inflammatory response-associated genes (Ccl2, Ccl20, Cxcl2, Il23a, Nos2, Nppb, Tgfb1, Tlr2, and Tnt) were down-regulated more than 2-fold by geldanamycin treatment. By DNA microarray analysis, we found that geldanamycin affects the expression of many genes associated with stress response, protein folding, the cell cycle, and inflammation. Further molecular experiments will be needed to determine the exact biological function of the genes described above and the physiological changes that geldanamycin induces in neuronal cells. The resulting data will provide a good clue for understanding the action of geldanamycin in neurons at the molecular level.

Comparison of covariance thresholding methods in gene set analysis

  • Park, Sora;Kim, Kipoong;Sun, Hokeun
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 29, No. 5
    • /
    • pp.591-601
    • /
    • 2022
  • In gene set analysis with microarray expression data, a group of genes, such as a gene regulatory pathway or a signaling pathway, is often tested for the presence of either differentially expressed (DE) or differentially co-expressed (DC) genes between two biological conditions. Recently, a statistical test based on covariance estimation has been proposed to identify DC genes. In particular, covariance regularization by hard thresholding improved the power of the test when the proportion of DC genes within a biological pathway is relatively small. In this article, we compare covariance thresholding methods using four different regularization penalties: lasso, hard, smoothly clipped absolute deviation (SCAD), and minimax concave plus (MCP). In our extensive simulation studies, we found that both the SCAD and MCP thresholding methods can outperform the hard thresholding method when the proportion of DC genes is extremely small and the number of genes in a biological pathway is much greater than the sample size. We also applied the four thresholding methods to three different microarray gene expression data sets, related to mutant p53 transcriptional activity and to epithelium and stroma in breast cancer, to compare the genetic pathways identified by each method.
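
The four thresholding rules compared in this abstract can be written as elementwise operators on the entries of a sample covariance matrix. The sketch below uses the standard textbook forms of the hard, soft (lasso), SCAD, and MCP thresholding functions. The default values `a=3.7` and `gamma=3.0`, the helper name `threshold_covariance`, and the choice to leave the diagonal untouched are assumptions, not details taken from the paper.

```python
def sign(z):
    """Return -1, 0, or 1 according to the sign of z."""
    return (z > 0) - (z < 0)


def hard(z, lam):
    """Hard thresholding: keep z unchanged if |z| > lam, else zero."""
    return z if abs(z) > lam else 0.0


def soft(z, lam):
    """Soft (lasso) thresholding: shrink z toward zero by lam."""
    return sign(z) * max(abs(z) - lam, 0.0)


def scad(z, lam, a=3.7):
    """SCAD thresholding (requires a > 2)."""
    if abs(z) <= 2 * lam:
        return soft(z, lam)
    if abs(z) <= a * lam:
        return ((a - 1) * z - sign(z) * a * lam) / (a - 2)
    return z


def mcp(z, lam, gamma=3.0):
    """MCP (firm) thresholding (requires gamma > 1)."""
    if abs(z) <= gamma * lam:
        return soft(z, lam) / (1 - 1 / gamma)
    return z


def threshold_covariance(S, lam, rule):
    """Apply a thresholding rule to the off-diagonal entries of S."""
    p = len(S)
    return [[S[i][j] if i == j else rule(S[i][j], lam) for j in range(p)]
            for i in range(p)]
```

Soft thresholding shrinks every surviving entry by `lam`, whereas SCAD and MCP leave large entries unshrunk while still being continuous in `z`, which is the usual motivation for preferring them over the lasso and hard rules.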

A Clustering Approach for Feature Selection in Microarray Data Classification Using Random Forest

  • Aydadenta, Husna;Adiwijaya, Adiwijaya
    • Journal of Information Processing Systems
    • /
    • Vol. 14, No. 5
    • /
    • pp.1167-1175
    • /
    • 2018
  • Microarray data play an essential role in diagnosing and detecting cancer. Microarray analysis allows the examination of gene expression levels in specific cell samples, where thousands of genes can be analyzed simultaneously. However, microarray data sets contain very few samples and have high dimensionality. Therefore, a dimensionality reduction process is required to classify microarray data. Dimensionality reduction can eliminate redundancy in the data, so that the features used in classification are only those that are highly correlated with their class. There are two types of dimensionality reduction, namely feature selection and feature extraction. In this paper, we used the k-means algorithm as the clustering approach for feature selection. The proposed approach groups features with the same characteristics into one cluster, so that redundancy in microarray data is removed. The features in each cluster are ranked using the Relief algorithm, so that the best-scoring element of each cluster is obtained. The best elements of all clusters are selected and used as features in the classification process. Next, the Random Forest algorithm is used. In the simulation, the proposed approach achieved accuracies of 85.87%, 98.9%, and 89% on the Colon, Lung Cancer, and Prostate Tumor data sets, respectively. The accuracy of the proposed approach is therefore higher than that of Random Forest without clustering.

Meta-analysis of Gene Expression Data Identifies Causal Genes for Prostate Cancer

  • Wang, Xiang-Yang;Hao, Jian-Wei;Zhou, Rui-Jin;Zhang, Xiang-Sheng;Yan, Tian-Zhong;Ding, De-Gang;Shan, Lei
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 14, No. 1
    • /
    • pp.457-461
    • /
    • 2013
  • Prostate cancer is a leading cause of death in male populations across the globe. With the advent of gene expression arrays, many microarray studies have been conducted in prostate cancer, but the results have varied across different studies. To better understand the genetic and biologic mechanisms of prostate cancer, we conducted a meta-analysis of two studies on prostate cancer. Eight key genes were identified to be differentially expressed with progression. After gene co-expression analysis based on data from the GEO database, we obtained a co-expressed gene list which included 725 genes. Gene Ontology analysis revealed that these genes are involved in actin filament-based processes, locomotion and cell morphogenesis. Further analysis of the gene list should provide important clues for developing new prognostic markers and therapeutic targets.

DNA Microarray Analysis of Gene Expression Profiles in Aging process of Mouse Brain

  • Lee Mi-Suk;Heo Jee-In;Kim Jae-Bong;Park Jae-Bong;Lee Jae-Yang;Han Jeong-A.;Kim Jong-Il
    • Genomics & Informatics
    • /
    • Vol. 4, No. 1
    • /
    • pp.23-32
    • /
    • 2006
  • To investigate the molecular basis of the aging process in the brain, we employed high-density oligonucleotide microarrays providing data on 10,108 gene clusters to define transcriptional patterns in three brain regions: cerebral cortex, cerebellum, and hippocampus. Comparison of the expression patterns between young (6-week-old) and aged (17-month-old) C57BL/6 male mice revealed that about ten percent (1098) of the genes showed a significant change in expression level in at least one of the three tissues. Among them, 23 genes were upregulated and 62 genes were downregulated in all three tissues of the old mice. The number of genes upregulated exclusively in hippocampus (337) was much larger than in the other tissues. Gene ontology-based analysis showed that genes related to signal transduction or molecular transport are more likely to be upregulated than downregulated in the aging process of hippocampus. These data may provide a useful means of elucidating the molecular aspects of aging in hippocampus and other regions of the brain.

Robust inference with order constraint in microarray study

  • Kang, Joonsung
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 25, No. 5
    • /
    • pp.559-568
    • /
    • 2018
  • Gene classification can involve complex order-restricted inference. Examining gene expression patterns across groups under order restrictions makes standard statistical inference ineffective and thus requires different methods. For this problem, Roy's union-intersection principle has some merit. The M-estimator, which adjusts for outlier arrays in a microarray study, produces a robust test statistic with distribution-insensitive clustering of genes. The M-estimator in conjunction with the union-intersection principle provides a nonstandard robust procedure. By exact permutation distribution theory, a conditionally distribution-free test based on the proposed test statistic generates the corresponding p-values in a small-sample setting. We apply a false discovery rate (FDR) procedure for multiple testing to the p-values in simulated data and real microarray data. The FDR procedure for the proposed test statistic controls the FDR at all levels of ${\alpha}$ and ${\pi}_0$ (the proportion of true nulls); however, the FDR procedure for test statistics based on normal theory (ANOVA) fails to control the FDR.
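
The step of applying an FDR procedure to a set of p-values can be illustrated with a short sketch. The abstract does not name the specific procedure, so the Benjamini-Hochberg step-up rule is assumed here. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k (1-based, p-values sorted in
    ascending order) with p_(k) <= k * alpha / m, then reject the k
    hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Hypothetical p-values for four genes.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5], alpha=0.05))
# prints [True, False, False, False]
```

Because the rule is step-up, a p-value that fails its own rank threshold can still be rejected when a larger p-value passes a later one, which is why the loop keeps the largest passing rank rather than stopping at the first failure.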