Search | Korea Science

Multiple Group Testing Procedures for Analysis of High-Dimensional Genomic Data

Ko, Hyoseok;Kim, Kipoong;Sun, Hokeun
- Genomics & Informatics
- /
- v.14 no.4
- /
- pp.187-195
- /
- 2016
In genetic association studies with high-dimensional genomic data, multiple group testing procedures are often required to identify disease- or trait-related genes or genetic regions, where multiple genetic sites or variants are located within the same gene or genetic region. However, statistical testing procedures based on individual tests suffer from multiple testing issues such as control of the family-wise error rate and dependence among tests. Moreover, detecting only a few genes associated with a phenotype outcome among tens of thousands of genes is of main interest in genetic association studies. For this reason, regularization procedures, in which a phenotype outcome is regressed on all genomic markers and the regression coefficients are estimated from a penalized likelihood, have been considered a good alternative approach to the analysis of high-dimensional genomic data. However, the selection performance of regularization procedures has rarely been compared with that of statistical group testing procedures. In this article, we performed extensive simulation studies in which commonly used group testing procedures such as principal component analysis, Hotelling's $T^2$ test, and the permutation test are compared with group lasso (least absolute shrinkage and selection operator) in terms of true positive selection. We also applied all methods considered in the simulation studies to identify genes associated with ovarian cancer from over 20,000 genetic sites generated from the Illumina Infinium HumanMethylation27K BeadChip. We found a large discrepancy in the selected genes between the multiple group testing procedures and group lasso.
https://doi.org/10.5808/GI.2016.14.4.187
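
For readers unfamiliar with the group lasso named in the abstract above, its usual textbook form is sketched here. The notation is assumed for illustration and is not taken from the article: $G$ groups (for example, genes), $p_g$ markers in group $g$, coefficient sub-vector $\beta_g$, log-likelihood $\ell(\beta)$, and tuning parameter $\lambda \ge 0$.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}
\left\{ -\ell(\beta) \;+\; \lambda \sum_{g=1}^{G} \sqrt{p_g}\, \lVert \beta_g \rVert_2 \right\}
```

Because the Euclidean norm $\lVert \beta_g \rVert_2$ is not squared, the penalty can shrink an entire sub-vector $\beta_g$ to exactly zero, which is what allows a gene to be selected or dropped as a unit.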

Identification of druggable genes for multiple myeloma based on genomic information

Rahmat Dani Satria;Lalu Muhammad Irham;Wirawan Adikusuma;Anisa Nova Puspitaningrum;Arief Rahman Afief;Riat El Khair;Abdi Wira Septama
- Genomics & Informatics
- /
- v.21 no.3
- /
- pp.31.1-31.8
- /
- 2023
Multiple myeloma (MM) is a hematological malignancy. It is widely believed that genetic factors play a significant role in the development of MM, as investigated in numerous studies. However, the application of genomic information for clinical purposes, including diagnostic and prognostic biomarkers, remains largely confined to research. In this study, we utilized genetic information from the Genomic-Driven Clinical Implementation for Multiple Myeloma database, which is dedicated to clinical trial studies on MM. This genetic information was sourced from the genome-wide association studies catalog database. We prioritized genes with the potential to cause MM based on established annotations, as well as biological risk genes for MM, as potential drug target candidates. The DrugBank database was employed to identify drug candidates targeting these genes. Our research led to the discovery of 14 MM biological risk genes and the identification of 10 drugs that target three of these genes. Notably, only one of these 10 drugs, panobinostat, has been approved for use in MM. The two most promising genes, calcium signal-modulating cyclophilin ligand (CAMLG) and histone deacetylase 2 (HDAC2), were targeted by four drugs (cyclosporine, belinostat, vorinostat, and romidepsin), all of which have clinical evidence supporting their use in the treatment of MM. Interestingly, five of the 10 drugs have been approved for indications other than MM, but they may also be effective in treating MM. Therefore, this study aimed to clarify the genomic variants involved in the pathogenesis of MM and to highlight the potential benefits of these genomic variants in drug discovery.
https://doi.org/10.5808/gi.23011

RFLP Analysis of cry1 and cry2 Genes of Bacillus thuringiensis Isolates from India

Patel, Ketan D.;Ingle, Sanjay S.
- Journal of Microbiology and Biotechnology
- /
- v.22 no.6
- /
- pp.729-735
- /
- 2012
The PCR-RFLP method has been useful for the detection of known genes and the identification of novel genes. In the present study, degenerate primers were designed from five groups of cry1 genes for PCR-RFLP analysis. Bacillus thuringiensis (Bt) isolates from different regions were evaluated for PCR amplification of various cry1 genes using the newly designed primers and of cry2 genes using reported primers. PCR analysis showed an abundance of cry1A genes, and especially cry1Ac genes, in isolates from all regions. RFLP analysis revealed the presence of multiple cry1A genes in isolates from the central and southern regions. Unique digestion patterns of cry1A genes were observed in isolates from each region. A few of the isolates showed a digestion pattern of cry1A genes that did not match any of the known cry1A genes. RFLP analysis suggested an abundance of cry2Ab along with a novel cry2 gene in Bt isolates from different regions of India. Sequence analysis of the novel cry2 gene revealed 95% sequence identity to the cry2Ab and cry2Ah genes. Phylogenetic analysis revealed that the novel cry2 gene could have diverged earlier than the other cry2 genes. Our results encourage the search for more diverse cry2 genes in Bt isolates. Rarefaction analysis was used to compare cry1A gene diversity in isolates from different soil types. It showed a higher degree of cry1A gene diversity in isolates from the central region. In the present study, we propose the use of novel degenerate primers for cry1 genes and the PCR-RFLP method using a single enzyme to distinguish multiple cry1A and cry2 genes as well as to identify novel genes.
https://doi.org/10.4014/jmb.1111.11046

Multiple Testing in Genomic Sequences Using Hamming Distance

Kang, Moonsu
- Communications for Statistical Applications and Methods
- /
- v.19 no.6
- /
- pp.899-904
- /
- 2012
High-dimensional categorical data models with small sample sizes have not been used extensively for genomic sequences that involve count (or discrete) or purely qualitative responses. A basic task is to identify differentially expressed genes (or positions) among a large number of genes. This requires an appropriate test statistic and a corresponding multiple testing procedure, because a multivariate analysis of variance is not feasible in this setting. The family-wise error rate (FWER) is not appropriate for testing thousands of genes simultaneously in a multiple testing procedure; the false discovery rate (FDR) is better suited than the FWER to such multiple testing problems. Data from the 2002-2003 SARS epidemic show that a conventional FDR procedure and the proposed test statistic, which is based on a pseudo-marginal approach with Hamming distance, perform better.
https://doi.org/10.5351/CKSS.2012.19.6.899
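
The two building blocks named in the abstract above, Hamming distance between sequences and FDR control, can be sketched as follows. This is a minimal illustration only: it uses the standard Benjamini-Hochberg step-up rule as an assumed stand-in for "a conventional FDR procedure", it does not reproduce the article's pseudo-marginal test statistic, and the function names and example inputs are hypothetical.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up rule (assumed here, not taken
    from the article): find the largest rank k with p_(k) <= alpha*k/m
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical inputs, for illustration only.
print(hamming_distance("ACGT", "ACGA"))
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]))
```

The step-up behaviour matters: a p-value that fails its own threshold is still rejected if a larger p-value further down the sorted list passes its threshold.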

Agrobacterium-Mediated Co-transformation of Multiple Genes in Metarhizium robertsii

Padilla-Guerrero, Israel Enrique;Bidochka, Michael J.
- Mycobiology
- /
- v.45 no.2
- /
- pp.84-89
- /
- 2017
Fungi of the Metarhizium genus are a very versatile model for understanding pathogenicity in insects and their symbiotic relationship with plants. To establish a co-transformation system for the transformation of multiple M. robertsii genes using Agrobacterium tumefaciens, we evaluated whether the antibiotic nourseothricin has the same marker selection efficiency as phosphinothricin using separate vectors. Subsequently, eGFP and mCherry expression cassettes were inserted into the two vectors containing the nourseothricin and phosphinothricin resistance cassettes, respectively. These new vectors were then introduced independently into A. tumefaciens and used to transform M. robertsii either in independent events or in one single co-transformation event using an equimolar mixture of A. tumefaciens cultures. The number of transformants obtained by co-transformation was similar to that obtained by the individual transformation events. This method provides an additional strategy for the simultaneous insertion of multiple genes into M. robertsii.
https://doi.org/10.5941/MYCO.2017.45.2.84

Dominance effects of ion transport and ion transport regulator genes on the final weight and backfat thickness of Landrace pigs by dominance deviation analysis

Lee, Young-Sup;Shin, Donghyun;Song, Ki-Duk
- Genes and Genomics
- /
- v.40 no.12
- /
- pp.1331-1338
- /
- 2018
Although there have been plenty of dominance deviation analyses, few studies have dealt with multiple phenotypes. Because this study focused on multiple phenotypes (final weight and backfat thickness) of Landrace pigs, classification of the genes was possible. With genome-wide association studies (GWASs), we analyzed the additive and dominance effects of the single nucleotide polymorphisms (SNPs). Classifying the pig genes into four categories (overdominance in final weight and overdominance in backfat thickness; overdominance in final weight and underdominance in backfat thickness; etc.) enables us not only to analyze each phenotype's dominance effects, but also to carry out gene ontology (GO) analysis from different aspects. We aimed to determine the additive and dominance effects on backfat thickness and final weight and performed GO analysis. Using an additive model and dominance deviation analysis in GWASs, the overdominant and underdominant SNP effects on final weight and backfat thickness in Landrace pigs were surveyed. Then, through GO analysis, we investigated the genes that were classified in the GWASs. The major GO term for the underdominant effects in final weight and overdominant effects in backfat thickness was ion transport, with the SLC8A3, KCNJ16, P2RX7 and TRPC3 genes. Interestingly, the major GO term for the underdominant effects in final weight and the underdominant effects in backfat thickness was the regulation of ion transport, with the STAC, GCK, TRPC6, UBASH3B, CAMK2D, CACNG4 and SCN4B genes. These results demonstrate that ion transport and ion transport regulation genes have distinct dominance effects. In GWASs using the linear additive model and dominance deviation, the overdominant and underdominant effects in backfat thickness were contrary to each other in GO terms (ion transport and ion transport regulation, respectively). Additionally, because ion transport and ion transport regulation genes are associated with adipose tissue accumulation, we could infer that these two groups of genes are involved in unique fat accumulation mechanisms in Landrace pigs.
https://doi.org/10.1007/s13258-018-0728-7

Multiple shRNA expressing vector enhances efficiency of gene silencing

Song, Jun;Giang, An;Lu, Yingchun;Pang, Shen;Chiu, Robert
- BMB Reports
- /
- v.41 no.5
- /
- pp.358-362
- /
- 2008
RNA interference (RNAi) is the process of sequence-specific gene silencing. However, RNAi efficiency still needs to be improved for effective inhibition of target genes. We have developed an effective strategy to express multiple shRNAs (small hairpin RNA) simultaneously using multiple RNA Polymerase III (Pol III) promoters in a single vector. Our data demonstrate that multiple shRNAs expressed from Pol III promoters have a synergistic effect in repressing the target gene. Silencing of endogenous cyclophilin A (CypA) or key HIV viral genes by multiple shRNAs results in significant inhibition of the target gene.
https://doi.org/10.5483/BMBRep.2008.41.5.358

Ensemble Gene Selection Method Based on Multiple Tree Models

Mingzhu Lou
- Journal of Information Processing Systems
- /
- v.19 no.5
- /
- pp.652-662
- /
- 2023
Identifying highly discriminating genes is a critical step in tumor recognition tasks based on microarray gene expression profile data and machine learning. Gene selection based on tree models has been the subject of several studies. However, these methods are based on a single-tree model, which is often not robust to ultra-high-dimensional microarray datasets, resulting in the loss of useful information and unsatisfactory classification accuracy. Motivated by the limitations of single-tree-based gene selection, in this study, ensemble gene selection methods based on multiple-tree models were studied to improve the classification performance of tumor identification. Specifically, we selected the three most representative tree models: ID3, random forest, and gradient boosting decision tree. Each tree model selects top-n genes from the microarray dataset based on its intrinsic mechanism. Subsequently, three ensemble gene selection methods were investigated, namely multiple-tree model intersection, multiple-tree module union, and multiple-tree module cross-union. Experimental results on five benchmark public microarray gene expression datasets showed that the multiple-tree module union is significantly superior in classification accuracy to gene selection based on a single-tree model and to other competitive gene selection methods.
https://doi.org/10.3745/JIPS.04.0290
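
The intersection and union ensembles named in the abstract above can be illustrated with plain set operations. This is only a sketch under stated assumptions: each tree model is assumed to have already produced its top-n gene list, the gene names below are hypothetical, and the "cross-union" variant is omitted because the abstract does not define it.

```python
def intersect_selection(selections):
    """Keep only the genes chosen by every model."""
    if not selections:
        return set()
    return set.intersection(*(set(s) for s in selections))


def union_selection(selections):
    """Keep every gene chosen by at least one model."""
    if not selections:
        return set()
    return set.union(*(set(s) for s in selections))


# Hypothetical top-n gene lists from three tree models (illustration only).
top_id3 = ["g1", "g2", "g3"]
top_random_forest = ["g2", "g3", "g4"]
top_gbdt = ["g3", "g4", "g5"]
selections = [top_id3, top_random_forest, top_gbdt]

print(sorted(intersect_selection(selections)))
print(sorted(union_selection(selections)))
```

The intersection is the most conservative choice and can be very small when the models disagree, while the union keeps more candidate genes at the cost of a larger feature set.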

Discovery of Cellular RhoA Functions by the Integrated Application of Gene Set Enrichment Analysis

Chun, Kwang-Hoon
- Biomolecules & Therapeutics
- /
- v.30 no.1
- /
- pp.98-116
- /
- 2022
The small GTPase RhoA has been studied extensively for its role in actin dynamics. In this study, multiple bioinformatics tools were applied cooperatively to the microarray dataset GSE64714 to explore previously unidentified functions of RhoA. Comparative gene expression analysis revealed 545 differentially expressed genes in RhoA-null cells versus controls. Gene set enrichment analysis (GSEA) was conducted with three gene set collections: (1) the hallmark, (2) the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, and (3) the Gene Ontology Biological Process. GSEA results showed that RhoA is related strongly to diverse pathways: cell cycle/growth, DNA repair, metabolism, keratinization, response to fungus, and vesicular transport. These functions were verified by heatmap analysis, KEGG pathway diagramming, and directed acyclic graphing. The use of multiple gene set collections restricted the leakage of information extracted. However, gene sets from individual collections are heterogeneous in gene element composition, number, and the contextual meaning embraced in names. Indeed, there was a limit to deriving functions with high accuracy and reliability simply from gene set names. The comparison of multiple gene set collections showed that although the gene sets had similar names, the gene elements were extremely heterogeneous. Thus, the type of collection chosen and the analytical context influence the interpretation of GSEA results. Nonetheless, the analyses of multiple collections made it possible to derive robust and consistent function identifications. This study confirmed several well-described roles of RhoA and revealed less explored functions, suggesting future research directions.
https://doi.org/10.4062/biomolther.2021.075

Screening of Multiple Abiotic Stress-Induced Genes in Italian Ryegrass leaves

Lee, Sang-Hoon;Rahman, Md. Atikur;Kim, Kwan-Woo;Lee, Jin-Wook;Ji, Hee Chung;Choi, Gi Jun;Song, Yowook;Lee, Ki-Won
- Journal of The Korean Society of Grassland and Forage Science
- /
- v.38 no.3
- /
- pp.190-195
- /
- 2018
Cold, salt and heat are the most critical factors that restrict the full genetic potential, growth and development of crops globally. Clarification of gene expression and regulation is a fundamental approach to understanding the adaptive response of plants under unfavorable environments. In this study, we applied an annealing control primer (ACP) based on the GeneFishing approach to identify differentially expressed genes (DEGs) in Italian ryegrass (cv. Kowinearly) leaves under cold, salt and heat stresses. Two-week-old seedlings were exposed to cold ($4^{\circ}C$), salt (NaCl 200 mM) and heat ($42^{\circ}C$) treatments for six hours. A total of 8 differentially expressed genes were isolated from ryegrass leaves. These genes were sequenced, then identified and validated using the National Center for Biotechnology Information (NCBI) database. We identified several promising genes encoding light harvesting chlorophyll a/b binding protein, alpha-galactosidase b, chromosome 3B, elongation factor 1-alpha, FLbaf106f03, Lolium multiflorum plastid, complete genome, translation initiation factor SUI1, and glyceraldehyde-3-phosphate dehydrogenase. These genes were potentially involved in photosynthesis, plant development, protein synthesis and abiotic stress tolerance in plants. This study provides new insight into molecular information about several genes in response to multiple abiotic stresses. Additionally, these genes may be useful for the enhancement of abiotic stress tolerance in fodder crops as well as for crop improvement under unfavorable environmental conditions.
https://doi.org/10.5333/KGFS.2018.38.3.190
