• Title/Summary/Keyword: Microarray Data


A Method for Gene Group Analysis and Its Application (유전자군 분석의 방법론과 응용)

  • Lee, Tae-Won;Delongchamp, Robert R.
    • The Korean Journal of Applied Statistics, v.25 no.2, pp.269-277, 2012
  • In microarray data analysis, recent efforts have focused on the discovery of gene sets from a pathway or from functional categories such as Gene Ontology terms (GO terms), rather than on individual gene function, because gene sets lend themselves to direct interpretation of genome-wide expression data. We introduce a meta-analysis method that combines the $p$-values for the changes of each gene in the group. The method measures the significance of the overall treatment-induced change in a gene group. An application of the method to real data demonstrates that it has benefits over other statistical methods such as Fisher's exact test and permutation methods. The method is implemented in a SAS program, which is available on the author's homepage (http://cafe.daum.net/go.analysis).
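
A minimal sketch of how per-gene $p$-values might be combined into a single group-level $p$-value. The abstract does not state which combining rule the authors use, so this example assumes Fisher's combined probability method (which is different from Fisher's exact test mentioned above) and assumes the per-gene $p$-values are independent. The function name `fisher_combined_pvalue` is hypothetical, and this is not the authors' SAS program.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method (an assumed rule).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k = len(pvalues).
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2

    # For even degrees of freedom 2k the chi-square survival function has
    # the closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical per-gene p-values for one gene group.
    group = [0.05, 0.05, 0.20]
    print(fisher_combined_pvalue(group))
```

Under these assumptions, several genes with modest individual changes can yield a combined $p$-value smaller than any single gene's $p$-value, which is the kind of overall change a group-level test is meant to detect.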

Permutation-Based Test with Small Samples for Detecting Differentially Expressed Genes (극소수 샘플에서 유의발현 유전자 탐색에 사용되는 순열에 근거한 검정법)

  • Lee, Ju-Hyoung;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics, v.22 no.5, pp.1059-1072, 2009
  • In the analysis of microarray data with a small number of arrays, the most important task is the detection of differentially expressed genes by a significance test. For this purpose, one needs to construct a null distribution based on a large number of genes, and one of the best ways to construct the null distribution for a small number of arrays is by means of permutation methods. In this paper we propose simple test statistics and permutation methods that are appropriate for constructing the null distribution. In a simulation study, we compare the null distributions generated by the proposed test statistics and permutation methods with those of previous approaches. Using an example microarray dataset, differentially expressed genes are determined by applying these methods.
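
The following is a minimal sketch of an exact two-sample permutation test for a single gene, assuming a difference-in-means statistic. It is not the test statistic or permutation scheme proposed in the paper; the function name `exact_permutation_pvalue` and the sample values are hypothetical.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Every way of splitting the pooled values into groups of the original
    sizes is enumerated, which is feasible only for small samples.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")

    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def mean_diff(idx_a):
        chosen = set(idx_a)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = abs(mean_diff(range(n_a)))

    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        total += 1
        # Small tolerance so that ties with the observed value are counted.
        if abs(mean_diff(idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    # Hypothetical log-expression values for one gene, 3 arrays per group.
    control = [1.0, 2.0, 3.0]
    treated = [10.0, 11.0, 12.0]
    print(exact_permutation_pvalue(control, treated))
```

With 3 arrays per group there are only 20 possible splits, so the smallest achievable two-sided $p$-value for a single gene is 2/20 = 0.1. This coarseness is one reason a null distribution built from a large number of genes, as described above, is attractive when arrays are few.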

Clustering Gene Expression Data by MCL Algorithm (MCL 알고리즘을 사용한 유전자 발현 데이터 클러스터링)

  • Shon, Ho-Sun;Ryu, Keun-Ho
    • Journal of the Institute of Electronics Engineers of Korea CI, v.45 no.4, pp.27-33, 2008
  • The clustering of gene expression data is used to analyze the results of microarray studies, and it is one of the most frequently used methods for understanding degrees of biological change and gene expression. The MCL algorithm, which clusters the nodes of a graph, is used in biological research and is quick and efficient. We modified the existing MCL algorithm and applied it to microarray data. In applying the MCL algorithm, we ran a simulation that adjusts two factors, namely inflation and diagonal tent, and converted them by making use of a Markov matrix. Furthermore, in order to distinguish classes more clearly in the modified MCL algorithm, we took the average of each row and used it as a threshold. As a result, the improved algorithm achieves higher accuracy than the existing ones: in the actual experiment, it showed an average accuracy of 70% when compared with the existing classes. We also compared the MCL algorithm with the self-organizing map (SOM), K-means, and hierarchical clustering (HC) algorithms. The results were better than those obtained with hierarchical clustering and the K-means method.
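
As a rough illustration of the baseline technique, the sketch below implements the standard MCL loop: expansion (squaring a column-stochastic Markov matrix) alternated with inflation (elementwise powering followed by column renormalization). It is not the authors' modified algorithm and does not include their row-average threshold. The names `mcl`, `inflation`, and `self_loop` are hypothetical, and adding self-loops on the diagonal is a common MCL preprocessing step that the abstract does not confirm.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (columns summing to 0 are left as is)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        if s > 0:
            for i in range(n):
                out[i][j] = m[i][j] / s
    return out


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, self_loop=1.0, iterations=50, tol=1e-9):
    """Cluster graph nodes with the basic Markov Cluster (MCL) iteration."""
    n = len(adjacency)
    # Add self-loops on the diagonal, then build the Markov matrix.
    m = [[float(adjacency[i][j]) + (self_loop if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)

    for _ in range(iterations):
        expanded = matmul(m, m)                       # expansion step
        inflated = normalize_columns(                 # inflation step
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break

    # Each distinct non-empty row of the converged matrix defines a cluster.
    clusters = []
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical similarity graph: two disconnected triangles of genes.
    graph = [[0, 1, 1, 0, 0, 0],
             [1, 0, 1, 0, 0, 0],
             [1, 1, 0, 0, 0, 0],
             [0, 0, 0, 0, 1, 1],
             [0, 0, 0, 1, 0, 1],
             [0, 0, 0, 1, 1, 0]]
    print(mcl(graph))
```

For gene expression data, the adjacency matrix would first have to be derived from the expression profiles, for example from pairwise correlations; that step is left out here because the abstract does not describe it.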

Alteration in miRNA Expression Profiling with Response to Nonylphenol in Human Cell Lines

  • Paul, Saswati;Kim, Seung-Jun;Park, Hye-Won;Lee, Seung-Yong;An, Yu-Ri;Oh, Moon-Ju;Jung, Jin-Wook;Hwang, Seung-Yong
    • Molecular & Cellular Toxicology, v.5 no.1, pp.67-74, 2009
  • Exposure to environmental chemicals that mimic endogenous hormones has been proposed as a cause of a number of adverse health effects, including infertility, abnormal prenatal and childhood development, and above all cancers. In addition, miRNA (microRNA) has recently been recognized to play an important role in various diseases and in cellular and molecular responses to toxicants. In this study, MCF-7 (human breast cancer) and HepG2 (human hepatocellular liver carcinoma) cell lines were treated with the endocrine-disrupting environmental toxicant nonylphenol (NP) for 3 hrs and 48 hrs; miRNA analysis using the mirVana™ miRNA bioarray was then performed and compared with total mRNA microarray data for the same cell lines and treatments. Robust data quality was achieved through the use of dye-swap. Analysis of the microarray data identified a total of 20 and 11 miRNA expressions at 3 hrs and 48 hrs of exposure to NP, respectively, in the MCF-7 cell line, and a total of 14 and 47 miRNA expressions at 3 hrs and 48 hrs of exposure, respectively, in the HepG2 cell line. Expression profiling of the selected miRNAs (let-7c, miR-16, miR-195, miR-200b, miR-200c, miR-205, and miR-589) revealed changes in the expression of target genes related to metabolism, immune response, apoptosis, and cell differentiation. The present study can help in understanding the role of miRNA in the molecular mechanism of chemical toxicity and its influence on hormone-dependent disease. This study may also prove to be a valuable tool for screening potential estrogen-mimicking pollutants in the environment.

Macroscopic Biclustering of Gene Expression Data (유전자 발현 데이터에 적용한 거시적인 바이클러스터링 기법)

  • Ahn, Jae-Gyoon;Yoon, Young-Mi;Park, Sang-Hyun
    • The KIPS Transactions:PartD, v.16D no.3, pp.327-338, 2009
  • A microarray dataset is a two-dimensional dataset with a set of genes and a set of conditions. A bicluster is a subset of genes that show similar behavior within a subset of conditions. Genes that show similar behavior can be considered to have the same cellular functions. Thus, a biclustering algorithm is a useful tool for uncovering groups of genes involved in the same cellular process and the groups of conditions under which this process takes place. We propose a polynomial-time algorithm to identify functionally highly correlated biclusters. Our algorithm 1) identifies gene sets that have hidden patterns even when the level of noise is high, 2) identifies multiple, possibly overlapping, and diverse gene sets, 3) identifies gene sets whose functional association is strong, and 4) produces deterministic biclustering results. We validated the level of functional association achieved by our method and compared it with current methods using GO.

New Normalization Methods using Support Vector Machine Regression Approach in cDNA Microarray Analysis

  • Sohn, In-Suk;Kim, Su-Jong;Hwang, Chang-Ha;Lee, Jae-Won
    • Proceedings of the Korean Society for Bioinformatics Conference, 2005.09a, pp.51-56, 2005
  • There are many sources of systematic variation in cDNA microarray experiments, such as differences in labeling efficiency between the two fluorescent dyes, that affect the measured gene expression levels. Print-tip lowess normalization is used in situations where dye biases can depend on overall spot intensity and/or spatial location within the array. However, print-tip lowess normalization performs poorly in situations where the error variability for each gene is heterogeneous over intensity ranges. We propose new print-tip normalization methods based on support vector machine regression (SVMR) and support vector machine quantile regression (SVMQR). SVMQR was derived by employing the basic principle of the support vector machine (SVM) for the estimation of linear and nonlinear quantile regressions. We applied our proposed methods to previous cDNA microarray data from apolipoprotein-AI-knockout (apoAI-KO) mice, diet-induced obese mice, and genistein-fed obese mice. Our statistical analysis shows that the proposed methods perform better than the existing print-tip lowess normalization method.


Variable Selection in Normal Mixture Model Based Clustering under Heteroscedasticity (이분산 상황 하에서 정규혼합모형 기반 군집분석의 변수선택)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics, v.24 no.6, pp.1213-1224, 2011
  • In high-dimensional settings, where the number of variables far exceeds the number of observations, noninformative variables must be removed in order to cluster the observations. Most model-based approaches to variable selection have been developed under the assumption of homoscedasticity, and their models are mainly estimated by a penalized likelihood method. In this paper, a different approach is proposed that effectively removes the noninformative variables and simultaneously clusters the observations based on a modified normal mixture model. The validity of the model is demonstrated, and an EM algorithm is derived to estimate the parameters. Simulation studies and an experiment using a real microarray dataset show the effectiveness of the proposed method.

Removing Non-informative Features by Robust Feature Wrapping Method for Microarray Gene Expression Data (유전자 알고리즘과 Feature Wrapping을 통한 마이크로어레이 데이타 중복 특징 소거법)

  • Lee, Jae-Sung;Kim, Dae-Won
    • Journal of KIISE:Software and Applications, v.35 no.8, pp.463-478, 2008
  • Because of the high-dimensionality problem, machine learning algorithms have typically relied on feature selection techniques in order to perform effective classification on microarray gene expression datasets. However, the large number of features compared to the number of samples makes the task of feature selection computationally prohibitive and prone to errors. One traditional feature selection approach is feature filtering, which evaluates one gene at a time. Feature filtering is therefore a univariate approach that cannot account for multivariate correlations. In this paper, we propose a function that measures both class separability and correlations. With this approach, we address the problem associated with the feature filtering approach.

Identifying statistically significant gene sets based on differential expression and differential coexpression (특이발현과 특이공발현을 고려한 유의한 유전자 집단 탐색)

  • Lee, Sunho
    • The Korean Journal of Applied Statistics, v.29 no.3, pp.437-448, 2016
  • Gene set analysis utilizing biological information is expected to produce more interpretable results because the occurrence of tumors (or diseases) is believed to be associated with the regulation of related genes. Many methods have been developed to identify statistically significant gene sets across different phenotypes; however, most focus exclusively on either the differential gene expression or the differential correlation structure in the gene set. This research provides a new method that simultaneously considers the differential expression of genes and the differential coexpression with multiple genes in the gene set. Application of this new method is illustrated with a real microarray data example, p53; subsequently, a simulation study compares its type I error rate and power with those of GSEA, SAMGS, GSCA and GSNCA.

Transcription Regulation Network Analysis of MCF7 Breast Cancer Cells Exposed to Estradiol

  • Wu, Jun-Zhao;Lu, Peng;Liu, Rong;Yang, Tie-Jian
    • Asian Pacific Journal of Cancer Prevention, v.13 no.8, pp.3681-3685, 2012
  • Background: In breast cancer, estrogen receptors have been demonstrated to interact with transcription factors to regulate target gene expression. However, high-throughput identification of the transcription regulation relationship between transcription factors and their target genes in response to estradiol is still in its infancy. Purpose: The objective of our study was to interpret the transcription regulation network of MCF7 breast cancer cells exposed to estradiol. Methods: In this work, GSE11352 microarray data were used to identify differentially expressed genes (DEGs). Results: Our results showed that MYB (v-myb myeloblastosis viral oncogene homolog [avian]), PGR (progesterone receptor), and MYC (v-myc myelocytomatosis viral oncogene homolog [avian]) were hub nodes in our transcriptome network, which may interact with ER and, in turn, regulate target gene expression. MYB can up-regulate MCM3 (minichromosome maintenance 3) and MCM7 expression; PGR can suppress BCL2 (B-cell lymphoma 2) expression; MYC can inhibit TGFB2 (transforming growth factor, beta 2) expression. These genes are associated with breast cancer progression via cell cycling and the TGFβ signaling pathway. Conclusion: Analysis of transcriptional regulation may provide a better understanding of molecular mechanisms and clues to potential therapeutic targets in the treatment of breast cancer.