• Title/Summary/Keyword: reference genome

Birth of an 'Asian cool' reference genome: AK1

  • Kim, Changhoon
    • BMB Reports / v.49 no.12 / pp.653-654 / 2016
  • The human reference genome, maintained by the Genome Reference Consortium, is arguably the most complete genome assembly ever produced. Since its first construction, it has been continually improved by incorporating corrections to previous assemblies, thanks to various technological advances. Many ongoing population sequencing projects are based on this reference genome, and the recent maturation of high-throughput sequencing technologies has heightened hopes of developing useful medical applications of genomic information. However, a single reference genome does not fit all populations across the globe, because of the large diversity in genomic structure and the technical limitations inherent to short-read sequencing methods. The recent de novo construction of the highly contiguous Asian diploid genome AK1, which combined single-molecule technologies with routine sequencing data without resorting to traditional clone-by-clone sequencing and physical mapping, reveals the nature of genomic structural variation by detecting thousands of novel structural variations and by finally filling in some of the gaps that had persisted in the current human reference genome. The AK1 genome, soon to be paired with more de novo assembled genomes, is now expected to provide a chance to explore what it is really like to use ancestry-specific reference genomes instead of hg19/hg38 for population genomics. This is a major step toward genetically based precision medicine.

Prediction of an Essential Gene with Potential Drug Target Property in Streptococcus suis Using Comparative Genomics

  • Zaman, Aubhishek
    • Interdisciplinary Bio Central / v.4 no.4 / pp.11.1-11.8 / 2012
  • Genes that are indispensable for survival are referred to as essential genes. Because of their significance for cellular activity, they are potential drug targets. In this study, an essential gene of Streptococcus suis was predicted using coherent statistical analysis and a powerful computational genome comparison method. First, a whole-genome protein scatter plot was generated, and a reference genome was then chosen on the basis of statistical significance. The criteria for selecting the reference genome were that the query genome (Streptococcus suis) and the subject genome must belong to the same genus and yet differ to a good degree. Streptococcus pneumoniae was found to be suitable as the reference genome. A whole-genome comparison was performed between the reference (Streptococcus pneumoniae) and the query genome (Streptococcus suis), and 14 conserved proteins were screened for potential essential gene properties. Among those 14, only one essential gene showed a high similarity score between reference and query. This essential gene encodes a type of 'Clp protease'. Clp proteases play major roles in degrading misfolded proteins. The results should help in formulating a drug against Streptococcus suis, which is responsible for mild to severe clinical conditions in humans. However, like many other computational studies, this study has to be validated further through in vitro assays for concrete proof.

Detection of hydin Gene Duplication in Personal Genome Sequence Data

  • Kim, Jong-Il;Ju, Young-Seok;Kim, Shee-Hyun;Hong, Dong-Wan;Seo, Jeong-Sun
    • Genomics & Informatics / v.7 no.3 / pp.159-162 / 2009
  • Human personal genome sequencing can be done with high efficiency by aligning a huge number of short reads, derived from various next-generation sequencing (NGS) technologies, to the reference genome sequence. One of the major obstacles is the incompleteness of the human reference genome. We analyzed the effect of hidden gene duplication on NGS data using the known example of the hydin gene. Hydin2, a duplicated copy of hydin on chromosome 16q22, has recently been found to be localized to chromosome 1q21 and is not included in the current version of the standard human genome reference. We found that none of the eight personal genome data sets published so far contains hydin2, and that there is a large number of nsSNPs in hydin. The heterozygosity of those nsSNPs was significantly higher than expected. The sequence coverage depth in the hydin gene was about twice the average depth. We believe that these unique findings for hydin can be used as indicators to discover new hidden duplications in the human genome.
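
The two signals described in this abstract (roughly doubled coverage depth and unexpectedly high nsSNP heterozygosity) can be combined into a simple screen. The following is only a minimal sketch under stated assumptions: the `GeneStats` record, the function name, and both threshold values are hypothetical illustrations and are not taken from the paper, and the numbers in the usage example are made up.

```python
from dataclasses import dataclass


@dataclass
class GeneStats:
    """Per-gene summary from a short-read alignment (hypothetical record)."""
    name: str
    mean_depth: float   # mean read depth across the gene
    het_calls: int      # heterozygous nsSNP calls within the gene
    total_calls: int    # all nsSNP calls within the gene


def flag_hidden_duplications(genes, genome_mean_depth,
                             depth_ratio_min=1.7, het_fraction_min=0.8):
    """Return names of genes showing both signals of a hidden duplication.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    flagged = []
    for gene in genes:
        if gene.total_calls == 0:
            continue  # no variant calls, so heterozygosity is undefined
        depth_ratio = gene.mean_depth / genome_mean_depth
        het_fraction = gene.het_calls / gene.total_calls
        if depth_ratio >= depth_ratio_min and het_fraction >= het_fraction_min:
            flagged.append(gene.name)
    return flagged


# Made-up example values for illustration only.
example_genes = [
    GeneStats("HYDIN", mean_depth=60.0, het_calls=18, total_calls=20),
    GeneStats("GENE_A", mean_depth=31.0, het_calls=5, total_calls=20),
]
candidates = flag_hidden_duplications(example_genes, genome_mean_depth=30.0)
```

Requiring both signals at once is a design choice of this sketch: elevated depth alone can also arise from mapping artifacts, and a high heterozygous fraction alone can occur in small genes with few calls.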

Current status and prospects to identify mutations responsible for mutant phenotypes by using NGS technology (Korean title: Mutant analysis using NGS technology and current research status)

  • Jung, Yu Jin;Ryu, Ho Jin;Cho, Yong-Gu;Kang, Kwon Kyoo
    • Journal of Plant Biotechnology / v.43 no.4 / pp.411-416 / 2016
  • Next-generation sequencing allows the identification of mutations responsible for mutant phenotypes by whole-genome resequencing and alignment to a reference genome. However, when the resequenced cultivar/line displays significant structural variation from the reference genome, mutations in genome regions absent from the reference cannot be identified by simple alignment. In this review, we report the current status and prospects of identifying the genes underlying mutant phenotypes using the MutMap, MutMap-Gap, and MutMap+ methods. These methods delineate a candidate region harboring a mutation of interest, followed by de novo assembly, alignment, and identification of the mutation within genome gaps. These methods are likely to prove useful for cloning genes that exhibit significant structural variations, such as disease resistance genes of the nucleotide-binding site-leucine rich repeat (NBS-LRR) class.

misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny

  • Ko, Young-Joon;Kim, Jung Sun;Kim, Sangsoo
    • Genomics & Informatics / v.15 no.4 / pp.128-135 / 2017
  • As next-generation sequencing technologies have advanced, enormous amounts of whole-genome sequence information for various species have been released. However, it is still difficult to assemble a whole genome precisely, owing to the inherent limitations of short-read sequencing technologies. In particular, plant genomes are far more complex than those of microorganisms or animals because of features such as whole-genome duplications, repeat insertions, and Numt insertions. In this study, we describe a new method for detecting misassembled sequence regions of Brassica rapa with genotyping-by-sequencing, followed by MadMapper clustering. The misassembly candidate regions were cross-checked against BAC clone paired-end library sequences that had been mapped to the reference genome. The results were further verified with gene synteny relations between Brassica rapa and Arabidopsis thaliana. We conclude that this method will help detect misassembled regions and will be applicable to incompletely assembled reference genomes from a variety of species.

Perspectives of International Human Epigenome Consortium

  • Bae, Jae-Bum
    • Genomics & Informatics / v.11 no.1 / pp.7-14 / 2013
  • With the official launch of the International Human Epigenome Consortium (IHEC) at the 2010 Washington meeting, a giant step toward the conquest of unexplored regions of the human genome has begun. IHEC aims to provide 1,000 reference epigenomes to the international scientific community over the next 7-10 years. Seven member institutions, including the Korea National Institute of Health (KNIH) of South Korea, will each produce 25-200 reference epigenomes, and the data produced will be made publicly available through a data center. Epigenome data will range from whole-genome bisulfite sequencing, histone modification, and chromatin accessibility information to miRNA-seq. The final goal of IHEC is the production of reference maps of human epigenomes for key cellular states relevant to health and disease.

Whole Mitochondrial Genome Sequence of an Indian Plasmodium falciparum Field Isolate

  • Tyagi, Suchi;Pande, Veena;Das, Aparup
    • Parasites, Hosts and Diseases / v.52 no.1 / pp.99-103 / 2014
  • The mitochondrial genome sequence of malaria parasites has served as a potential marker for inferring the evolutionary history of the Plasmodium genus. In Plasmodium falciparum, mitochondrial genome sequences from around the globe have provided important evolutionary understanding, but no Indian sequence has yet been utilized. We sequenced the whole mitochondrial genome of a single P. falciparum field isolate from India using novel primers and compared it with the 3D7 reference sequence and one previously reported Indian sequence. While the two Indian sequences were highly divergent from each other, the presently sequenced isolate was highly similar to the reference 3D7 strain.

Genomic Tools and Their Implications for Vegetable Breeding

  • Phan, Ngan Thi;Sim, Sung-Chur
    • Horticultural Science & Technology / v.35 no.2 / pp.149-164 / 2017
  • Next generation sequencing (NGS) technologies have led to the rapid accumulation of genome sequences through whole-genome sequencing and re-sequencing of crop species. Genomic resources provide the opportunity for a new revolution in plant breeding by facilitating the dissection of complex traits. Among vegetable crops, reference genomes have been sequenced and assembled for several species in the Solanaceae and Cucurbitaceae families, including tomato, pepper, cucumber, watermelon, and melon. These reference genomes have been leveraged for re-sequencing of diverse germplasm collections to explore genome-wide sequence variations, especially single nucleotide polymorphisms (SNPs). The use of genome-wide SNPs and high-throughput genotyping methods has led to the development of new strategies for dissecting complex quantitative traits, such as genome-wide association study (GWAS). In addition, the use of multi-parent populations, including nested association mapping (NAM) and multiparent advanced generation intercross (MAGIC) populations, has helped increase the accuracy of quantitative trait loci (QTL) detection. Consequently, a number of QTL have been discovered for agronomically important traits, such as disease resistance and fruit traits, with high mapping resolution. The molecular markers for these QTL represent a useful resource for enhancing selection efficiency via marker-assisted selection (MAS) in vegetable breeding programs. In this review, we discuss current genomic resources and marker-trait association analysis to facilitate genome-assisted breeding in vegetable species in the Solanaceae and Cucurbitaceae families.

Evaluation of DNA Microarray Approach for Identifying Strain-Specific Genes

  • Hwang, Keum-Ok;Cho, Jae-Chang
    • Journal of Microbiology and Biotechnology / v.16 no.11 / pp.1773-1777 / 2006
  • We evaluated the usefulness of DNA microarray as a comparative genomics tool, and tested the validity of the cutoff values for defining absent genes in test genomes. Three genome-sequenced E. coli strains (K-12, EDL933, and CFT073) were subjected to comparative genomic hybridization with DNA microarrays covering almost all ORFs of the reference strain K-12, and the microarray results were compared with the results obtained from in silico analyses of genome sequences. For defining the K-12 ORFs absent in test genomes (reference strain-specific ORFs), we applied and evaluated the cutoff level of -1. The average sequence similarity between ORFs, to which corresponding spots showed a log-ratio of > -1, was 96.9 ± 4.8. The numbers of spots showing a log-ratio of < -1 (P < 0.05, t-test) were 90 (2.5%) and 417 (10.6%) for the EDL933 genome and the CFT073 genome, respectively. Frequency of false negatives (FN) was ca. 0.2, and the cutoff level of -1.3 was required to achieve the FN of 0.1. The average sequence similarity of the false negative ORFs was 77.8 ± 14.8, indicating that the majority of the false negatives were caused by highly divergent genes. We concluded that the microarray is useful for identifying missing or divergent ORFs in closely related prokaryotic genomes.
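
The cutoff evaluation in this abstract can be illustrated with a short sketch. It assumes that an ORF is called absent when its log-ratio falls below the cutoff, and that a false negative is an ORF called absent by the array but found present by in silico analysis. The abstract does not spell out how the FN frequency is normalized, so the denominator used here (all ORFs called absent) is an assumption, and the function names and toy values are hypothetical.

```python
def call_absent(log_ratios, cutoff=-1.0):
    """Return the set of ORFs whose log-ratio falls below the cutoff."""
    return {orf for orf, ratio in log_ratios.items() if ratio < cutoff}


def false_negative_fraction(called_absent, present_in_silico):
    """Fraction of absent calls that in silico analysis finds present.

    The choice of denominator is an assumption of this sketch.
    """
    if not called_absent:
        return 0.0
    return len(called_absent & present_in_silico) / len(called_absent)


# Toy values for illustration only; not data from the study.
log_ratios = {"orfA": -2.1, "orfB": -1.1, "orfC": -0.2, "orfD": -1.6, "orfE": 0.1}
present_in_silico = {"orfB", "orfC", "orfE"}

absent_default = call_absent(log_ratios)               # cutoff of -1
absent_strict = call_absent(log_ratios, cutoff=-1.3)   # stricter cutoff
fn_default = false_negative_fraction(absent_default, present_in_silico)
fn_strict = false_negative_fraction(absent_strict, present_in_silico)
```

In this toy data set, the stricter cutoff drops the borderline ORF (present but divergent) from the absent calls, which mirrors the direction of the trade-off reported in the abstract.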

New role of LTR-retrotransposons for emergence and expansion of disease-resistance genes and high-copy gene families in plants

  • Kim, Seungill;Choi, Doil
    • BMB Reports / v.51 no.2 / pp.55-56 / 2018
  • Long terminal repeat retrotransposons (LTR-Rs) are major elements that create new genome structure during the expansion of plant genomes. However, the role of LTR-Rs beyond genome expansion has remained unexplored. In this study, we constructed new reference genome sequences of two pepper species (Capsicum baccatum and C. chinense) and updated the reference genome of C. annuum. We focused on the speciation of Capsicum spp. and its driving forces. We identified chromosomal translocation, unequal amplification of LTR-Rs, and recent gene duplications in the pepper genomes as major evolutionary forces for the diversification of Capsicum spp. Specifically, our analyses revealed that nucleotide-binding and leucine-rich-repeat proteins (NLRs) were massively created by LTR-R-driven retroduplication. These retroduplicated NLRs were abundant in higher plants, and most of them were lineage-specific. Retroduplication was a main process for the creation of functional disease-resistance genes in Solanaceae plants. In addition, 4-10% of all genes, including highly amplified families such as MADS-box and cytochrome P450, emerged by retroduplication in these plants. Our study provides new insight into the creation of disease-resistance genes and high-copy-number gene families by retroduplication in plants.