• Title/Summary/Keyword: Significant gene-sets

Identifying Statistically Significant Gene-Sets by Gene Set Enrichment Analysis Using Fisher Criterion (Fisher Criterion을 이용한 Gene Set Enrichment Analysis 기반 유의 유전자 집합의 검출 방법 연구)

  • Kim, Jae-Young;Shin, Mi-Young
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.4 / pp.19-26 / 2008
  • Gene set enrichment analysis (GSEA) is a computational method that identifies gene sets showing statistically significant differences between two groups of microarray expression profiles and, by drawing on gene annotation databases such as Cytogenetic Band, KEGG pathways, and Gene Ontology, also uncovers their biological meaning. In GSEA, all genes in a given dataset are first ordered by the signal-to-noise ratio between the groups, and further analyses are then performed on the ordered list. Despite its impressive results in several previous studies, ranking genes by the signal-to-noise ratio makes it difficult to consider highly up-regulated and highly down-regulated genes at the same time as candidate significant genes, even though both may reflect events in metabolic and signaling pathways. To deal with this problem, this article investigates GSEA with the Fisher criterion for gene ranking and evaluates its effect in leukemia-related pathway analyses.
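
The contrast between the two ranking scores can be illustrated with a minimal sketch. It assumes the commonly used definitions, signal-to-noise ratio (mean_a - mean_b) / (sd_a + sd_b) and Fisher criterion (mean_a - mean_b)^2 / (var_a + var_b); the abstract does not state the exact formulas, and the gene names and expression values are hypothetical.

```python
from statistics import mean, stdev

def signal_to_noise(a, b):
    """Signed score: (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))

def fisher_criterion(a, b):
    """Unsigned score: (mean_a - mean_b)^2 / (var_a + var_b)."""
    return (mean(a) - mean(b)) ** 2 / (stdev(a) ** 2 + stdev(b) ** 2)

# Hypothetical toy data: gene -> (group A samples, group B samples)
genes = {
    "up":   ([9.0, 10.0, 11.0], [1.0, 2.0, 3.0]),
    "flat": ([5.0, 6.0, 7.0], [5.5, 6.0, 6.5]),
    "down": ([1.0, 2.0, 3.0], [9.0, 10.0, 11.0]),
}

# Rank genes from highest to lowest score under each criterion.
snr_rank = sorted(genes, key=lambda g: signal_to_noise(*genes[g]), reverse=True)
fisher_rank = sorted(genes, key=lambda g: fisher_criterion(*genes[g]), reverse=True)

print(snr_rank)     # up- and down-regulated genes land at opposite ends
print(fisher_rank)  # both strongly changed genes rank ahead of the flat gene
```

Because the Fisher score squares the mean difference, strongly up- and down-regulated genes end up together at the top of the list, which is the property the abstract is after.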

A study on alternatives to the permutation test in gene-set analysis (유전자집합분석에서 순열검정의 대안)

  • Lee, Sunho
    • The Korean Journal of Applied Statistics / v.31 no.2 / pp.241-251 / 2018
  • Analyzing gene sets in microarray data has advantages for interpreting biological functions and increasing statistical power. Many statistical methods have been proposed for detecting significant gene sets that show relations between genes and phenotypes, but there is no consensus on which performs best, and permutation-based tests are regarded as the standard tools. When many gene sets are tested simultaneously, multiple testing requires a large number of random permutations, at a high computational cost. In this paper, several parametric approximations are considered as alternatives to the permutation distribution; within a general framework, the moment-based gene set test performed best, approximating the permutation-test p-values closely and quickly.
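
To make the computational cost concrete, here is a minimal sketch of a label-permutation test for a single gene set. The set statistic (mean absolute difference in group means) and the toy data are assumptions chosen for illustration; the paper's own statistics and its moment-based approximation are not reproduced here.

```python
import random
from statistics import mean

def set_statistic(expr, labels):
    """Mean absolute difference in group means over the genes in the set."""
    total = 0.0
    for row in expr:
        g1 = [x for x, lab in zip(row, labels) if lab == 1]
        g0 = [x for x, lab in zip(row, labels) if lab == 0]
        total += abs(mean(g1) - mean(g0))
    return total / len(expr)

def permutation_p_value(expr, labels, n_perm=1000, seed=0):
    """Share of label shuffles whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = set_statistic(expr, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # shuffle sample labels, keep gene rows intact
        if set_statistic(expr, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

# Hypothetical gene set: 2 genes x 8 samples, first four samples in group 1.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
expr = [
    [5, 6, 5, 6, 1, 2, 1, 2],
    [7, 8, 7, 8, 3, 2, 3, 2],
]
p_value = permutation_p_value(expr, labels)
print(p_value)
```

Every gene set needs its own loop of `n_perm` recomputations, and small multiplicity-adjusted p-values need a large `n_perm`, which is why a closed-form approximation to the permutation distribution is attractive.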

A Method of Identifying Disease-related Significant Pathways Using Time-Series Microarray Data (시간열 마이크로어레이 데이터를 이용한 질병 관련 유의한 패스웨이 유전자 집합의 검출)

  • Kim, Jae-Young;Shin, Mi-Young
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.5 / pp.17-24 / 2010
  • The identification of biomarkers for disease diagnosis and prognosis has recently been studied actively. In particular, much attention has been paid to finding pathway gene-sets that are differentially expressed in disease patients rather than individual gene markers. In this paper we propose a novel method to identify disease-related pathway gene-sets based on time-series microarray data. We first compute individual gene scores using maSigPro (microarray Significant Profiles) and then arrange all the genes in decreasing order of score. The rank of each gene in the entire list is used to evaluate the statistical significance of candidate gene-sets with the Wilcoxon rank sum test. Candidate gene-sets are generated from MSigDB (Molecular Signatures Database) pathway information. The experiment was conducted with prostate cancer time-series microarray data, and the proposed method correctly identified 6 out of 7 biological pathways already known to be related to prostate cancer.
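
The rank-based scoring step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses the normal approximation to the Wilcoxon rank sum statistic, assumes no tied ranks, and takes the gene ranks as given (the maSigPro scoring itself is not shown). The function names and example ranks are hypothetical.

```python
import math

def rank_sum_z(ranks_in_set, n_total):
    """Normal-approximation z-score for the rank sum of a gene set.

    ranks_in_set: 1-based ranks of the set's genes in the full ordered list
                  (rank 1 = highest gene score); assumes no ties.
    n_total:      number of genes in the full list.
    """
    m = len(ranks_in_set)
    n = n_total - m
    w = sum(ranks_in_set)
    mu = m * (n_total + 1) / 2                      # expected rank sum under H0
    sigma = math.sqrt(m * n * (n_total + 1) / 12)   # its standard deviation
    return (w - mu) / sigma

def lower_tail_p(z):
    """P(Z <= z); small when the set's genes cluster near the top of the list."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Hypothetical example with 100 ranked genes.
top_p = lower_tail_p(rank_sum_z([1, 2, 3, 4, 5], 100))      # set at the top
mid_p = lower_tail_p(rank_sum_z([10, 30, 50, 70, 90], 100))  # set spread out
print(top_p, mid_p)
```

A pathway whose genes are concentrated among the highest scores gets a very small p-value, while a pathway scattered through the list stays near 0.5.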

Comparison of Univariate and Multivariate Gene Set Analysis in Acute Lymphoblastic Leukemia

  • Soheila, Khodakarim;Hamid, AlaviMajd;Farid, Zayeri;Mostafa, Rezaei-Tavirani;Nasrin, Dehghan-Nayeri;Syyed-Mohammad, Tabatabaee;Vahide, Tajalli
    • Asian Pacific Journal of Cancer Prevention / v.14 no.3 / pp.1629-1633 / 2013
  • Background: Gene set analysis (GSA) incorporates biological knowledge with statistical knowledge to identify gene sets that are differentially expressed between two or more phenotypes. Materials and Methods: In this paper, gene sets differentially expressed between acute lymphoblastic leukaemia (ALL) with BCR-ABL and ALL with no observed cytogenetic abnormalities were determined by GSA methods. BCR-ABL is an abnormal gene found in some people with ALL. Results: Of the two GSA methods, the Category test identified 30 gene sets differentially expressed between the two phenotypes, while Hotelling's $T^2$ discovered just 19 gene sets. Assessment of the genes shared among significant gene sets showed high agreement between the GSA results and the findings of biologists. In addition, the performance of these methods was compared on simulated and ALL data. Conclusions: On simulated data, as the correlation between gene pairs increased, the multivariate (Hotelling's $T^2$) test showed a lower type I error rate and higher power, in contrast to the univariate (Category) test.
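
For reference, the standard two-sample form of Hotelling's statistic is given below. The notation is an assumption introduced here, not taken from the abstract: $\bar{x}_1, \bar{x}_2$ are the mean expression vectors of the genes in the set for the two phenotypes, $n_1, n_2$ the group sizes, and $S_1, S_2$ the within-group covariance matrices.

$$
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,(\bar{x}_1 - \bar{x}_2)^{\top} S_p^{-1} (\bar{x}_1 - \bar{x}_2),
\qquad
S_p = \frac{(n_1 - 1) S_1 + (n_2 - 1) S_2}{n_1 + n_2 - 2}
$$

Because $S_p$ carries the gene-gene covariances, the statistic responds to correlation within the set, whereas a univariate test that combines per-gene statistics does not. The formula requires $S_p$ to be invertible; how the paper handles gene sets larger than the sample size is not stated in the abstract.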

Gene Set Analysis - Absolute and Trim (절대치와 절삭을 이용한 유전자 집단 분석)

  • Lee, Kwang-Hyun;Lee, Sun-Ho
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.523-535 / 2008
  • Early microarray data analysis focused on identifying differentially expressed genes; more recently, the focus has moved to discovering significant sets of functionally related genes. We describe some problems with GSEA and PAGE and propose a modified method to identify significant gene sets. Results from a simulated experiment and from an analysis of publicly available real data show that the proposed method, GSA-AT, is superior in detecting significant pathways accurately.

Identifying statistically significant gene sets based on differential expression and differential coexpression (특이발현과 특이공발현을 고려한 유의한 유전자 집단 탐색)

  • Lee, Sunho
    • The Korean Journal of Applied Statistics / v.29 no.3 / pp.437-448 / 2016
  • Gene set analysis utilizing biological information is expected to produce more interpretable results because the occurrence of tumors (or diseases) is believed to be associated with the regulation of related genes. Many methods have been developed to identify statistically significant gene sets across different phenotypes; however, most focus exclusively on either the differential gene expression or the differential correlation structure in the gene set. This research provides a new method that simultaneously considers the differential expression of genes and differential coexpression with multiple genes in the gene set. Application of the new method is illustrated with a real microarray data example, p53; a simulation study then compares its type I error rate and power with those of GSEA, SAMGS, GSCA and GSNCA.

Significant Gene Selection Using Integrated Microarray Data Set with Batch Effect

  • Kim Ki-Yeol;Chung Hyun-Cheol;Jeung Hei-Cheul;Shin Ji-Hye;Kim Tae-Soo;Rha Sun-Young
    • Genomics & Informatics / v.4 no.3 / pp.110-117 / 2006
  • In microarray technology, many experimental features can cause biases, including RNA sources, microarray production or platform differences, sample processing, and experimental protocols. These systematic effects are a substantial obstacle in the analysis of microarray data. When data sets derived from different experimental processes are used together, the analysis results are largely inconsistent and unreliable. Therefore, one of the most pressing challenges in the microarray field is how to combine data that come from two different groups. As a new attempt to integrate two data sets with a batch effect, we simply applied standardization to the microarray data before significant gene selection. In the gene selection step, we used a newly defined measure that considers the distance between a gene and an ideal gene as well as the between-slide and within-slide variations. We also discuss the association between biological functions and the different expression patterns in the selected discriminative gene set. As a result, we could confirm that the batch effect was minimized by standardization and that the genes selected from the standardized data included various expression patterns and significant biological functions.
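
A minimal sketch of the standardization idea, assuming a per-gene z-score computed separately within each batch before the batches are merged. The abstract does not specify the exact standardization used, and the values below are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-score one gene's values within a single batch."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant gene: no variation to scale
    return [(v - mu) / sd for v in values]

# Hypothetical expression of one gene in two batches on different scales.
batch_a = [2.0, 4.0, 6.0]
batch_b = [20.0, 40.0, 60.0]

# Standardize within each batch first, then merge.
merged = standardize(batch_a) + standardize(batch_b)
print(merged)
```

After within-batch standardization the two batches share a common scale, so a downstream gene selection measure is no longer dominated by the scale difference between them.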

Possibility of the Use of Public Microarray Database for Identifying Significant Genes Associated with Oral Squamous Cell Carcinoma

  • Kim, Ki-Yeol;Cha, In-Ho
    • Genomics & Informatics / v.10 no.1 / pp.23-32 / 2012
  • Many studies have attempted to identify expression changes in oral squamous cell carcinoma, but most include too few samples to apply statistical methods for detecting significant gene sets. This study combined two small microarray datasets from a public database and identified significant genes associated with the progression of oral squamous cell carcinoma. The two datasets had different expression scales even though both were generated on the same platform, Affymetrix U133A gene chips. To obtain more reliable information, we discretized the gene expression values of the two datasets, adjusting for the differences between them. From the combined datasets, we detected 51 significant genes that were upregulated in oral squamous cell carcinoma. Most of them had been reported in previous studies as cancer-related genes, including in oral squamous cell carcinoma. From these selected genes, significant genetic pathways associated with expression changes were identified. By combining several datasets from the public database, sufficient samples can be obtained for detecting reliable information. Several unknown genes can be biologically evaluated in further studies.

Meta- and Gene Set Analysis of Stomach Cancer Gene Expression Data

  • Kim, Seon-Young;Kim, Jeong-Hwan;Lee, Heun-Sik;Noh, Seung-Moo;Song, Kyu-Sang;Cho, June-Sik;Jeong, Hyun-Yong;Kim, Woo Ho;Yeom, Young-Il;Kim, Nam-Soon;Kim, Sangsoo;Yoo, Hyang-Sook;Kim, Yong Sung
    • Molecules and Cells / v.24 no.2 / pp.200-209 / 2007
  • We generated gene expression data from the tissues of 50 gastric cancer patients, and applied meta-analysis and gene set analysis to these data and three other stomach cancer gene expression data sets to define the gene expression changes in gastric tumors. By meta-analysis we identified genes consistently changed in gastric carcinomas, while gene set analysis revealed consistently changed biological themes. Genes and gene sets involved in digestion, fatty acid metabolism, and ion transport were consistently down-regulated in gastric carcinomas, while those involved in cellular proliferation, cell cycle, and DNA replication were consistently up-regulated. We also found significant differences between the genes and gene sets expressed in diffuse and intestinal type gastric carcinoma. By gene set analysis of cytogenetic bands, we identified many chromosomal regions with possible gross chromosomal changes (amplifications or deletions). Similar analysis of transcription factor binding sites (TFBSs) revealed transcription factors that may have caused the observed gene expression changes in gastric carcinomas, and we confirmed the overexpression of one of these, E2F1, in many gastric carcinomas by tissue array and immunohistochemistry. We have incorporated the results of our meta- and gene set analyses into a web accessible database (http://human-genome.kribb.re.kr/stomach/).

Relationship between angiotensin-converting enzyme gene polymorphism and muscle damage parameters after eccentric exercise

  • Kim, Jooyoung;Kim, Chang-Sun;Lee, Joohyung
    • Korean Journal of Exercise Nutrition / v.17 no.2 / pp.25-34 / 2013
  • This study investigated the relationship between ACE gene polymorphism and muscle damage parameters after eccentric exercise. Eighty collegiate males performed eccentric exercise of the elbow flexor muscles on a modified preacher curl machine for 2 sets of 25 cycles (50 cycles in total). Maximal isometric strength, muscle soreness, creatine kinase (CK), and myoglobin (Mb) were measured before exercise and at 0, 24, 48, 72, and 96 hrs after exercise. After the eccentric exercise, maximal isometric strength decreased significantly, by more than 50% (p < 0.001), and muscle soreness, CK, and Mb increased significantly compared with pre-exercise values (p < 0.001). The ACE gene polymorphism of the subjects was classified using real-time polymerase chain reaction (real-time PCR); it consisted of 38 cases of type II (46.4%), 33 cases of type ID (43.4%), and 9 cases of type DD (10.2%). The test of Hardy-Weinberg equilibrium for the ACE gene polymorphism gave p = 0.653, indicating no significant deviation from equilibrium. Although maximal isometric strength, muscle soreness, CK, and Mb changed significantly over the time course (p < 0.001), the changes did not differ significantly according to ACE gene polymorphism. Furthermore, no significant interaction between ACE gene polymorphism and time course was found for the changes in the muscle damage parameters (p > 0.05). In conclusion, the muscle damage parameters changed in the injured muscle after eccentric exercise, but these changes were not affected by ACE gene polymorphism. The results of this study indicate that the ACE gene is not a candidate gene that explains muscle damage.
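
The Hardy-Weinberg check can be sketched with a chi-square goodness-of-fit test on the genotype counts given in the abstract (38 II, 33 ID, 9 DD). The abstract does not say which test the authors used, so the one-degree-of-freedom chi-square test below is an assumption; the function name is hypothetical.

```python
import math

def hwe_chi_square(n_ii, n_id, n_dd):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium for a biallelic locus."""
    n = n_ii + n_id + n_dd
    p = (2 * n_ii + n_id) / (2 * n)  # frequency of the I allele
    q = 1 - p                        # frequency of the D allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ii, n_id, n_dd)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = hwe_chi_square(38, 33, 9)
print(round(chi2, 3), round(p_value, 3))
```

With these counts the p-value should come out close to the reported 0.653, consistent with no significant deviation from equilibrium.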