• Title/Abstract/Keywords: genomic variants

Search results: 106 items (processing time: 0.028 s)

Multi-omics techniques for the genetic and epigenetic analysis of rare diseases

  • Yeonsong Choi;David Whee-Young Choi;Semin Lee
    • Journal of Genetic Medicine / Vol. 20, No. 1 / pp.1-5 / 2023
  • Until now, rare disease studies have mainly relied on diagnostic next-generation sequencing of disease-associated gene panels to detect simple variants, such as single nucleotide substitutions and short insertions and deletions in protein-coding regions, and to relate them to patient phenotypes. However, several recent studies reported that the detection rate hardly exceeds 50% even when whole-exome sequencing is applied. Whole-genome sequencing is therefore increasingly needed to discover more diverse genomic variants and examine their association with rare diseases. When whole-genome sequencing yields no diagnosis, additional omics techniques such as RNA-seq can also be considered to further interrogate causal variants. This paper describes these multi-omics techniques and their applications in rare disease studies.

Selection probability of multivariate regularization to identify pleiotropic variants in genetic association studies

  • Kim, Kipoong;Sun, Hokeun
    • Communications for Statistical Applications and Methods / Vol. 27, No. 5 / pp.535-546 / 2020
  • In genetic association studies, pleiotropy is a phenomenon where a variant or a genetic region affects multiple traits or diseases. Many studies have identified cross-phenotype genetic associations. However, most statistical approaches for detecting pleiotropy are based on individual tests, where the association of a single variant with multiple traits is tested one variant at a time. These approaches fail to account for relations among correlated variants. Recently, multivariate regularization methods have been proposed to detect pleiotropy in the analysis of high-dimensional genomic data. However, they suffer from the problem of tuning parameter selection, which often results in either too many false positives or too few true positives. In this article, we applied selection probability to multivariate regularization methods in order to identify pleiotropic variants associated with multiple phenotypes. Selection probability was applied to individual elastic-net, unified elastic-net, and multi-response elastic-net regularization methods. In simulation studies, the selection performance of the three multivariate regularization methods was evaluated while varying the total number of phenotypes, the number of phenotypes associated with a variant, and the correlations among phenotypes. We also applied the regularization methods to a wild bean dataset consisting of 169,028 variants and 17 phenotypes.
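
The idea of selection probability in the abstract above can be illustrated with a minimal sketch. It assumes a single phenotype, a plain elastic-net fitted by coordinate descent, half-sample resampling, and the maximum selection frequency over a tuning-parameter grid; these choices, and the names `elastic_net_cd` and `selection_probability`, are illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net_cd(X, y, lam, alpha=0.5, n_iter=50):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float)  # residual y - Xb (b starts at zero)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]              # remove variable j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            r -= X[:, j] * b[j]              # add the updated contribution back
    return b

def selection_probability(X, y, lambdas, alpha=0.5, n_resamples=30, seed=0):
    """Fraction of half-samples in which each variable gets a nonzero
    coefficient, maximized over the grid of tuning parameters."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros((len(lambdas), p))
    for _ in range(n_resamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        Xs = X[idx] - X[idx].mean(axis=0)    # center within the subsample
        ys = y[idx] - y[idx].mean()
        for k, lam in enumerate(lambdas):
            counts[k] += (elastic_net_cd(Xs, ys, lam, alpha) != 0)
    return (counts / n_resamples).max(axis=0)
```

Ranking variants by this probability, rather than by the fit at one chosen tuning parameter, is what sidesteps the tuning-parameter selection problem the abstract mentions.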

Short Reads Phasing to Construct Haplotypes in Genomic Regions That Are Associated with Body Mass Index in Korean Individuals

  • Lee, Kichan;Han, Seonggyun;Tark, Yeonjeong;Kim, Sangsoo
    • Genomics & Informatics / Vol. 12, No. 4 / pp.165-170 / 2014
  • Genome-wide association (GWA) studies have found many important genetic variants that affect various traits. Because these studies can be used to investigate untyped but causal variants through linkage disequilibrium (LD), it would be useful to explore the haplotypes of single-nucleotide polymorphisms (SNPs) within the same LD block as significant associations, based on high-density variants from population references. Here, we tried to build a catalog of haplotypes affecting body mass index (BMI) through an integrative analysis of previously published whole-genome next-generation sequencing (NGS) data of 7 representative Korean individuals and previously known Korean GWA signals. We selected 435 SNPs that were significantly associated with BMI in the GWA analysis and searched 53 LD ranges near those SNPs. With the NGS data, the haplotypes were phased within the LD ranges. A total of 44 possible haplotype blocks for Korean BMI were cataloged. Although the current result is based on limited data, this study provides new insights that may help to identify important haplotypes for traits and low variants near significant SNPs. Furthermore, we can build a more comprehensive catalog as a larger dataset becomes available.

Association of the X-linked Androgen Receptor Leu57Gln Polymorphism with Monomelic Amyotrophy

  • Park, Young-Mi;Lim, Young-Min;Kim, Dae-Seong;Lee, Jong-Keuk;Kim, Kwang-Kuk
    • Genomics & Informatics / Vol. 9, No. 2 / pp.64-68 / 2011
  • Monomelic amyotrophy (MA), also known as Hirayama disease, occurs mainly in young men and manifests as weakness and wasting of the muscles of the distal upper limbs. Here, we sought to identify a genetic basis for MA. Given the predominance of MA in males, we focused on candidate neurological disease genes located on the X chromosome, selecting two X-linked candidate genes, androgen receptor (AR) and ubiquitin-like modifier activating enzyme 1 (UBA1). Screening for genetic variants using patients' genomic DNA revealed three known genetic variants in the coding region of the AR gene: one nonsynonymous single-nucleotide polymorphism (SNP; rs78686797) encoding Leu57Gln, and two variants of polymorphic trinucleotide repeat segments that encode polyglutamine (CAG repeat; rs5902610) and polyglycine (GGC repeat; rs3138869) tracts. Notably, the Leu57Gln polymorphism was found in two of the 24 patients with MA, whereas no variants were found in 142 healthy male controls. However, the numbers of CAG and GGC repeats in the AR gene were within the normal range. These data suggest that the Leu57Gln polymorphism encoded by the X-linked AR gene may contribute to the development of MA.

Chromosome-Centric Human Proteome Study of Chromosome 11 Team

  • Hwang, Heeyoun;Kim, Jin Young;Yoo, Jong Shin
    • Mass Spectrometry Letters / Vol. 12, No. 3 / pp.60-65 / 2021
  • As a part of the Chromosome-centric Human Proteome Project (C-HPP), we have developed several algorithms for the accurate identification of missing proteins, alternative splicing variants (ASVs), and single amino acid variants (SAAVs), and for the characterization of functionally unannotated proteins. We have found missing proteins, novel and known ASVs, and SAAVs using LC-MS/MS data from human brain and olfactory epithelial tissue, and validated their existence using synthetic peptides. According to the neXtProt database, the number of missing proteins in chromosome 11 shows a decreasing pattern. The development of genomic and transcriptomic sequencing techniques has tremendously increased the number of protein variants in chromosome 11. We developed a web solution named SAAvpedia for the identification and functional annotation of SAAVs, and the SAAV information is automatically transformed into the neXtProt web page using a REST API service. For the 73 uPE1 proteins in chromosome 11, we have studied the functional annotation of CCDC90B (NX_Q9GZT6), SMAP (NX_O00193), and C11orf52 (NX_Q96A22).

Generation of Protein Lineages with new Sequence Spaces by Functional Salvage Screen

  • Kim, Geun-Joong;Cheon, Young-Hoon;Park, Min-Soon;Park, Hee-Sung;Kim, Hak-Sung
    • Korean Society for Microbiology and Biotechnology: Conference Proceedings / Korean Society for Microbiology and Biotechnology, 2001, Proceedings of 2001 International Symposium / pp.77-80 / 2001
  • A variety of methods to generate diverse proteins, including random mutagenesis and recombination, are currently available, and most of them accumulate mutations on the target gene of a protein whose sequence space remains unchanged. On the other hand, a pool of diverse genes, generated by random insertions, deletions, and exchange of homologous domains with different lengths in the target gene, would present protein lineages resulting in new fitness landscapes. Here we report a method to generate a pool of protein variants with different sequence spaces by employing green fluorescent protein (GFP) as a model protein. This process, designated functional salvage screen (FSS), comprises the following procedures: a defective GFP template expressing no fluorescence is first constructed by genetically disrupting a predetermined region(s) of the protein, and a library of GFP variants is generated from the defective template by incorporating randomly fragmented genomic DNA from E. coli into the defined region(s) of the target gene, followed by screening for the functionally salvaged, fluorescence-emitting GFPs. Two approaches, sequence-directed and PCR-coupled methods, were attempted to generate the library of GFP variants with new sequences derived from the genomic segments of E. coli. The functionally salvaged GFPs were selected and analyzed in terms of sequence space and functional properties. The results demonstrate that the functional salvage process not only can be a simple and effective method to create protein lineages with new sequence spaces, but also can be useful in elucidating the involvement of a specific region(s) or domain(s) in the structure and function of a protein.

Genetic Characterization of Molecular Targets in Korean Patients with Gastrointestinal Stromal Tumors

  • Park, Joonhong;Yoo, Han Mo;Sul, Hae Jung;Shin, Soyoung;Lee, Seung Woo;Kim, Jeong Goo
    • Journal of Gastric Cancer / Vol. 20, No. 1 / pp.29-40 / 2020
  • Purpose: Gastrointestinal stromal tumors (GISTs) frequently harbor activating gene mutations in either KIT or platelet-derived growth factor receptor A (PDGFRA) and are highly responsive to several selective tyrosine kinase inhibitors. In this study, a targeted next-generation sequencing (NGS) assay with an Oncomine Focus Assay (OFA) panel was used for the genetic characterization of molecular targets in 30 Korean patients with GIST. Materials and Methods: Using the OFA, which enables rapid and simultaneous detection of hotspots, single nucleotide variants (SNVs), insertions and deletions (indels), copy number variants (CNVs), and gene fusions across 52 genes relevant to solid tumors, targeted NGS was performed on genomic DNA extracted from formalin-fixed and paraffin-embedded samples of 30 GISTs. Results: Forty-three hotspot/other likely pathogenic variants (33 SNVs, 8 indels, and 2 amplifications) in 16 genes were identified in 26 of the 30 GISTs. KIT variants were most frequent (44%, 19/43), followed by 6 variants in PIK3CA, 3 in PDGFRA, 2 each in JAK1 and EGFR, and 1 each in AKT1, ALK, CCND1, CTNNB1, FGFR3, FGFR4, GNA11, GNAQ, JAK3, MET, and SMO. By mutation type, the majority of the variants were missense mutations (60%, 26/43), followed by 8 frameshifts, 6 nonsense, 1 stop-loss, and 2 amplifications. Conclusions: Our study confirmed the advantage of using targeted NGS with a cancer gene panel to efficiently identify mutations associated with GISTs. These findings may provide a molecular genetic basis for developing new drugs targeting these gene mutations for GIST therapy.

Genomic partitioning of growth traits using a high-density single nucleotide polymorphism array in Hanwoo (Korean cattle)

  • Park, Mi Na;Seo, Dongwon;Chung, Ki-Yong;Lee, Soo-Hyun;Chung, Yoon-Ji;Lee, Hyo-Jun;Lee, Jun-Heon;Park, Byoungho;Choi, Tae-Jeong;Lee, Seung-Hwan
    • Asian-Australasian Journal of Animal Sciences / Vol. 33, No. 10 / pp.1558-1565 / 2020
  • Objective: The objective of this study was to characterize the number of loci affecting growth traits and the distribution of single nucleotide polymorphism (SNP) effects on growth traits, and to understand the genetic architecture of growth traits in Hanwoo (Korean cattle) using a genome-wide association study (GWAS), genomic partitioning, and hierarchical Bayesian mixture models. Methods: GWAS: A single-marker regression-based mixed model was used to test the association between SNPs and causal variants. A genotype relationship matrix was fitted as a random effect in this linear mixed model to correct for the genetic structure of sire families. Genomic restricted maximum likelihood and BayesR: A priori information included setting the fixed additive genetic variance to a pre-specified value; the first mixture component was set to zero, the second to $0.0001\times\sigma_g^2$, the third to $0.001\times\sigma_g^2$, and the fourth to $0.01\times\sigma_g^2$. Under this fixed a priori information in BayesR, no SNP in the mixture distribution accounted for more than 1% of the genetic variance. Results: The GWAS revealed common genomic regions of 2 Mb on bovine chromosome 14 (BTA14) and BTA3 with a moderate effect that may contain causal variants for body weight at 6, 12, 18, and 24 months. This genomic region explained approximately 10% of the total additive genetic variance and of body weight heritability at 12, 18, and 24 months. BayesR identified the exact genomic region containing causal SNPs on BTA14, 3, and 22. However, the genetic variance explained by each chromosome or SNP was estimated to be very small compared to the total additive genetic variance. Causal SNPs for growth traits on BTA14 explained only 0.04% to 0.5% of the genetic variance. Conclusion: Segregating mutations have a moderate effect on BTA14, 3, and 19; many other loci with small effects on growth traits at different ages were also identified.
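
The four-component mixture described in the Methods of the abstract above can be written compactly. This is a sketch in the usual BayesR form; the symbols $\beta_j$ (effect of SNP $j$), $\pi_k$ (mixing proportions), and $\delta_0$ (point mass at zero) are notation assumed here and do not appear in the abstract.

```latex
\beta_j \mid \pi, \sigma_g^2 \;\sim\;
  \pi_1\,\delta_0
  + \pi_2\,N\!\left(0,\; 10^{-4}\sigma_g^2\right)
  + \pi_3\,N\!\left(0,\; 10^{-3}\sigma_g^2\right)
  + \pi_4\,N\!\left(0,\; 10^{-2}\sigma_g^2\right),
\qquad \sum_{k=1}^{4} \pi_k = 1 .
```

The largest component variance is $0.01\times\sigma_g^2$, which matches the statement that no single SNP is assigned more than 1% of the genetic variance a priori.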

Multiple Group Testing Procedures for Analysis of High-Dimensional Genomic Data

  • Ko, Hyoseok;Kim, Kipoong;Sun, Hokeun
    • Genomics & Informatics / Vol. 14, No. 4 / pp.187-195 / 2016
  • In genetic association studies with high-dimensional genomic data, multiple group testing procedures are often required in order to identify disease/trait-related genes or genetic regions, where multiple genetic sites or variants are located within the same gene or genetic region. However, statistical testing procedures based on an individual test suffer from multiple testing issues such as the control of the family-wise error rate and dependent tests. Moreover, detecting only a few genes associated with a phenotype outcome among tens of thousands of genes is of main interest in genetic association studies. For this reason, regularization procedures, where a phenotype outcome is regressed on all genomic markers and the regression coefficients are then estimated based on a penalized likelihood, have been considered a good alternative approach to the analysis of high-dimensional genomic data. However, the selection performance of regularization procedures has rarely been compared with that of statistical group testing procedures. In this article, we performed extensive simulation studies where commonly used group testing procedures such as principal component analysis, Hotelling's $T^2$ test, and permutation test are compared with group lasso (least absolute shrinkage and selection operator) in terms of true positive selection. We also applied all methods considered in the simulation studies to identify genes associated with ovarian cancer from over 20,000 genetic sites generated from the Illumina Infinium HumanMethylation27K Beadchip. We found a large discrepancy in the selected genes between the multiple group testing procedures and group lasso.
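
For reference, the Hotelling's $T^2$ statistic named in the abstract above has the following standard two-sample form. The sketch assumes a case/control comparison of the $p$ sites within one gene, multivariate normality, and a common covariance matrix; the symbols $n_1$, $n_2$, $\bar{x}_1$, $\bar{x}_2$, and $S_p$ are notation introduced here, not taken from the abstract.

```latex
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,
      (\bar{x}_1 - \bar{x}_2)^{\top} S_p^{-1} (\bar{x}_1 - \bar{x}_2),
\qquad
\frac{n_1 + n_2 - p - 1}{p\,(n_1 + n_2 - 2)}\; T^2 \;\sim\; F_{p,\; n_1 + n_2 - p - 1},
```

where $\bar{x}_1$ and $\bar{x}_2$ are the group mean vectors and $S_p$ is the pooled sample covariance matrix. The $F$ reference distribution requires $n_1 + n_2 > p + 1$, which is one reason such group tests are applied gene by gene rather than to all sites at once.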

Repeated Random Mutagenesis of ${\alpha}$-Amylase from Bacillus licheniformis for Improved pH Performance

  • Priyadharshini, Ramachandran;Manoharan, Shankar;Hemalatha, Devaraj;Gunasekaran, Paramasamy
    • Journal of Microbiology and Biotechnology / Vol. 20, No. 12 / pp.1696-1701 / 2010
  • The activity of ${\alpha}$-amylase was improved by random mutagenesis and screening. A region comprising residues 34-281 was randomly mutated in B. licheniformis ${\alpha}$-amylase (AmyL), and a library with low, medium, and high mutation frequencies was generated. The library was screened using an effective liquid-phase screening method to isolate mutants with an altered pH profile. Sequencing of the improved variants indicated 2-5 amino acid changes. Among them, mutant TP8H5 showed an altered pH profile compared with that of the wild type. Sequencing of variant TP8H5 indicated 2 amino acid changes, Ile157Ser and Trp193Arg, which were located in the solvent-accessible flexible loop region in domain B.