• Title/Summary/Keyword: Sequencing depth


Effect of Next-Generation Exome Sequencing Depth for Discovery of Diagnostic Variants

  • Kim, Kyung;Seong, Moon-Woo;Chung, Won-Hyong;Park, Sung Sup;Leem, Sangseob;Park, Won;Kim, Jihyun;Lee, KiYoung;Park, Rae Woong;Kim, Namshin
    • Genomics & Informatics / v.13 no.2 / pp.31-39 / 2015
  • Sequencing depth, which is directly related to the cost and time required for the generation, processing, and maintenance of next-generation sequencing data, is an important factor in the practical utilization of such data in clinical fields. Unfortunately, identifying an exome sequencing depth adequate for clinical use is a challenge that has not been addressed extensively. Here, we investigate the effect of exome sequencing depth on the discovery of sequence variants for clinical use. To this end, we sequenced ten germ-line blood samples from breast cancer patients on the Illumina GAII(x) platform at a high depth of ${\sim}200{\times}$. We observed that most of the diverse function-related variants in the human exonic regions could be detected at a sequencing depth of $120{\times}$. Furthermore, investigation using a diagnostic gene set showed that the number of clinical variants identified using exome sequencing reached a plateau at an average sequencing depth of about $120{\times}$. Moreover, these observations were consistent across the breast cancer samples.
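
As a rough illustration of how an average depth such as $120{\times}$ relates to the amount of sequencing, the sketch below uses the standard approximation depth = reads × read length / target size. It assumes uniform coverage and that every read falls on target, which real exome captures do not achieve; the function names, the 100 bp read length, and the 50 Mb target size are hypothetical values chosen for illustration and do not come from the study.

```python
import math


def mean_depth(num_reads: int, read_length: int, target_size: int) -> float:
    """Approximate mean depth, assuming all reads map uniformly on target."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length / target_size


def reads_for_depth(depth: float, read_length: int, target_size: int) -> int:
    """Approximate number of reads needed to reach a given mean depth."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(depth * target_size / read_length)


# Hypothetical example: 100 bp reads on a 50 Mb capture target.
example_reads = reads_for_depth(120, 100, 50_000_000)
example_depth = mean_depth(example_reads, 100, 50_000_000)
```

In practice the required read count would be higher than this estimate, because duplicates and off-target reads do not contribute to usable depth.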

ChIP-seq Analysis of Histone H3K27ac and H3K27me3 Showing Different Distribution Patterns in Chromatin

  • Kang, Jin;Kim, AeRi
    • Biomedical Science Letters / v.28 no.2 / pp.109-119 / 2022
  • Histone proteins can be modified by the addition of an acetyl or methyl group to specific amino acids. These modifications have different distribution patterns in chromatin. Histone modifications have recently been studied using ChIP-seq data, which must be analyzed in a way that suits their distribution patterns. Here, we analyzed histone H3K27ac and H3K27me3 ChIP-seq data and found that H3K27ac is enriched in narrow regions, whereas H3K27me3 is distributed broadly. To analyze the ChIP-seq data properly, we called peaks for H3K27ac and H3K27me3 using MACS2 (narrow and broad options) and SICER, and compared the suitability of the peaks using the signal-to-background ratio. As a result, H3K27ac-enriched regions were well identified by both methods, whereas H3K27me3 peaks were properly identified only by SICER, which indicates that the choice of peak calling method is more critical for broadly distributed histone modifications. When ChIP-seq data were compared at different sequencing depths (15, 30, 60, and 120 M), high sequencing depth caused a high false-positive rate in H3K27ac peak calling, but it reflected the broad distribution pattern of H3K27me3 more accurately. These results suggest that sequencing depth affects peak calling from ChIP-seq data and that high sequencing depth is required for H3K27me3. Taken together, the peak calling tool and sequencing depth should be chosen according to the distribution pattern of the histone modification in ChIP-seq analysis.

Application of next generation sequencing (NGS) system for whole-genome sequencing of porcine reproductive and respiratory syndrome virus (PRRSV)

  • Moon, Sung-Hyun;Khatun, Amina;Kim, Won-Il;Hossain, Md Mukter;Oh, Yeonsu;Cho, Ho-Seong
    • Korean Journal of Veterinary Service / v.39 no.1 / pp.41-49 / 2016
  • In the present study, fast and robust next generation sequencing (NGS) methods were developed for the analysis of full genome sequences of PRRSV, a positive-sense RNA virus with a high degree of genetic variability among isolates. Two PRRSV strains (VR2332 and VR2332-R) that have been maintained in our laboratory were used to validate our methods and were compared with the sequence registered in GenBank (GenBank accession no. EF536003). The results showed that both strains had 100% coverage of the reference; the coverage depth ranged from a minimum of 3 to a maximum of 23,012 for VR2332 and from a minimum of 3 to a maximum of 41,348 for VR2332-R, with an average depth of 22,712. Genomic data produced by the massive sequencing capacity of NGS have enabled the study of PRRSV at an unprecedented rate and level of detail. Unlike conventional sequencing methods, which require knowledge of conserved regions, NGS allows de novo assembly of full viral genomes. Therefore, our results suggest that these NGS-based methods can greatly facilitate the generation of more full-genome PRRSV sequences, locally as well as nationally, while saving time and cost.

New Lung Cancer Panel for High-Throughput Targeted Resequencing

  • Kim, Eun-Hye;Lee, Sunghoon;Park, Jongsun;Lee, Kyusang;Bhak, Jong;Kim, Byung Chul
    • Genomics & Informatics / v.12 no.2 / pp.50-57 / 2014
  • We present a new next-generation sequencing-based method to identify somatic mutations of lung cancer. It is a comprehensive mutation profiling protocol to detect somatic mutations in 30 genes found frequently in lung adenocarcinoma. The total length of the target regions is 107 kb, and a capture assay was designed to cover 99% of it. This method exhibited about 97% mean coverage at $30{\times}$ sequencing depth and 42% average specificity when sequencing of more than 3.25 Gb was carried out for the normal sample. We discovered 513 variations from targeted exome sequencing of lung cancer cells, which is 3.9-fold higher than in the normal sample. The variations in cancer cells included previously reported somatic mutations in the COSMIC database, such as variations in TP53, KRAS, and STK11 of sample H-23 and in EGFR of sample H-1650, especially with more than $1,000{\times}$ coverage. Among the somatic mutations, up to 91% of single nucleotide polymorphisms from the two cancer samples were validated by DNA microarray-based genotyping. Our results demonstrated the feasibility of high-throughput mutation profiling with lung adenocarcinoma samples, and the profiling method can be used as a robust and effective protocol for somatic variant screening.

Detection of hydin Gene Duplication in Personal Genome Sequence Data

  • Kim, Jong-Il;Ju, Young-Seok;Kim, Shee-Hyun;Hong, Dong-Wan;Seo, Jeong-Sun
    • Genomics & Informatics / v.7 no.3 / pp.159-162 / 2009
  • Human personal genome sequencing can be done with high efficiency by aligning a huge number of short reads derived from various next generation sequencing (NGS) technologies to the reference genome sequence. One of the major obstacles is the incompleteness of the human reference genome. We analyzed the effect of hidden gene duplication on NGS data using the known example of the hydin gene. Hydin2, a duplicated copy of hydin on chromosome 16q22, has recently been found to be localized to chromosome 1q21 and is not included in the current version of the standard human genome reference. We found that none of the eight personal genome data sets published so far contains hydin2, and that there is a large number of nsSNPs in hydin. The heterozygosity of those nsSNPs was significantly higher than expected. The sequence coverage depth in the hydin gene was about twofold the average depth. We believe that these unique findings for hydin can be used as indicators to discover new hidden multiplications in the human genome.
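
The depth signal described above (roughly twofold coverage over a gene whose duplicate is missing from the reference) can be sketched as a simple ratio check. This is a minimal illustration and not the authors' method: the function names, the expected ratio of 2.0, and the tolerance of 0.25 are assumptions chosen for the example.

```python
from typing import Sequence


def depth_ratio(region_depths: Sequence[float], genome_mean_depth: float) -> float:
    """Mean depth over a region divided by the genome-wide mean depth."""
    if not region_depths:
        raise ValueError("region_depths must not be empty")
    if genome_mean_depth <= 0:
        raise ValueError("genome_mean_depth must be positive")
    region_mean = sum(region_depths) / len(region_depths)
    return region_mean / genome_mean_depth


def looks_duplicated(ratio: float, expected: float = 2.0, tolerance: float = 0.25) -> bool:
    """Flag a region whose depth ratio is close to the expected fold change.

    The expected value and tolerance are illustrative assumptions.
    """
    return abs(ratio - expected) <= tolerance


# Hypothetical per-base depths over a gene, with a genome-wide mean of 30x.
ratio = depth_ratio([58, 62, 60], 30.0)
flagged = looks_duplicated(ratio)
```

A ratio near 2 on its own is only suggestive; the abstract pairs it with unexpectedly high nsSNP heterozygosity as a second indicator.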

Flanking Sequence and Copy-Number Analysis of Transformation Events by Integrating Next-Generation Sequencing Technology with Southern Blot Hybridization

  • Qin, Yang;Woo, Hee-Jong;Shin, Kong-Sik;Lim, Myung-Ho;Cho, Hyun-Suk;Lee, Seong-Kon
    • Plant Breeding and Biotechnology / v.5 no.4 / pp.269-281 / 2017
  • With the continual development of genetically modified (GM) crops, it has become necessary to develop detailed and effective molecular characterization methods to select candidate events from a large pool of transformation events. Relative to traditional molecular analysis methods such as the polymerase chain reaction (PCR) and Southern blot hybridization, next generation sequencing (NGS) technology for whole-genome sequencing of complex crop genomes has proven comparatively useful for in-depth molecular characterization. In this study, four transformation events, including one in Bacillus thuringiensis (Bt)-resistant rice, one in resveratrol-producing rice, and two in beta-carotene-enhanced soybeans, were selected for molecular characterization. By merging NGS analysis and Southern blot hybridization results, we confirmed the transgene insertion sites, insertion structures, and insertion numbers of these four transformation events. In addition, the read-coverage depth assessed by NGS analysis for inserted genes might provide consistent results for the number of inserted T-DNAs in cases of complex insertion structures and highly duplicated donor genomes, whereas PCR-based methods can produce incorrect conclusions. Our combined method provides an effective and complete analytical approach for whole-genome visual inspection of transformation events that require biosafety assessment.

Next-Generation Sequencing and Epigenomics Research: A Hammer in Search of Nails

  • Sarda, Shrutii;Hannenhalli, Sridhar
    • Genomics & Informatics / v.12 no.1 / pp.2-11 / 2014
  • After the initial enthusiasm of the human genome project, it became clear that without additional data pertaining to the epigenome (i.e., how the genome is marked at specific developmental periods, in different tissues, as well as across individuals and species), the promise of the genome sequencing project in understanding biology cannot be fulfilled. This realization prompted several large-scale efforts to map the epigenome, most notably the Encyclopedia of DNA Elements (ENCODE) project. While there is essentially a single genome in an individual, there are hundreds of epigenomes, corresponding to various types of epigenomic marks at different developmental times and in multiple tissue types. Unprecedented advances in next-generation sequencing (NGS) technologies, by virtue of low cost and high speeds that continue to improve at a rate beyond what is anticipated by Moore's law for computer hardware technologies, have revolutionized molecular biology and genetics research, and have in turn prompted innovative ways to reduce the problem of measuring cellular events involving DNA or RNA into a sequencing problem. In this article, we provide a brief overview of the epigenome, the various types of epigenomic data afforded by NGS, and some of the novel discoveries yielded by the epigenomics projects. We also provide ample references for the reader to get in-depth information on these topics.

No excessive mutations in transcription activator-like effector nuclease-mediated α-1,3-galactosyltransferase knockout Yucatan miniature pigs

  • Choi, Kimyung;Shim, Joohyun;Ko, Nayoung;Park, Joonghoon
    • Asian-Australasian Journal of Animal Sciences / v.33 no.2 / pp.360-372 / 2020
  • Objective: Specific genomic sites can be recognized and permanently modified by genome editing. The discovery of endonucleases has advanced genome editing in pigs, attenuating xenograft rejection and cross-species disease transmission. However, off-target mutagenesis caused by these nucleases is a major barrier to putative clinical applications. Furthermore, off-target mutagenesis by genome editing has not yet been addressed in pigs. Methods: Here, we generated genetically inheritable α-1,3-galactosyltransferase (GGTA1) knockout Yucatan miniature pigs by combining transcription activator-like effector nuclease (TALEN) and nuclear transfer. For precise estimation of genomic mutations induced by TALEN in GGTA1 knockout pigs, we obtained the whole-genome sequence of the donor cells for use as an internal control genome. Results: In-depth whole-genome sequencing analysis demonstrated that TALEN-mediated GGTA1 knockout pigs had a comparable mutation rate to homologous recombination-treated pigs and wild-type strain controls. RNA sequencing analysis associated with genomic mutations revealed that TALEN-induced off-target mutations had no discernible effect on RNA transcript abundance. Conclusion: Therefore, TALEN appears to be a precise and safe tool for generating genome-edited pigs, and the TALEN-mediated GGTA1 knockout Yucatan miniature pigs produced in this study can serve as a safe and effective organ and tissue resource for clinical applications.

Genome-Wide SNP Calling Using Next Generation Sequencing Data in Tomato

  • Kim, Ji-Eun;Oh, Sang-Keun;Lee, Jeong-Hee;Lee, Bo-Mi;Jo, Sung-Hwan
    • Molecules and Cells / v.37 no.1 / pp.36-42 / 2014
  • The tomato (Solanum lycopersicum L.) is a model plant for genome research in Solanaceae, as well as for studying crop breeding. Genome-wide single nucleotide polymorphisms (SNPs) are a valuable resource in genetic research and breeding. However, most methods for discovering genome-wide SNPs require expensive high-depth sequencing. Here, we describe a method for SNP calling using a modified version of SAMtools with improved sensitivity. We analyzed 90 Gb of raw sequence data from next-generation sequencing of two resequencing and seven transcriptome data sets from several tomato accessions. Our study identified 4,812,432 non-redundant SNPs. Moreover, the SNP calling workflow was improved by aligning the reference genome with its own raw data. Using this approach, 131,785 SNPs were discovered from transcriptome data of seven accessions. In addition, 4,680,647 SNPs were identified from the genome of S. pimpinellifolium, about 60 times more than the 71,637 SNPs from the PI212816 transcriptome. SNP distribution was compared between the whole genome and transcriptome of S. pimpinellifolium. Moreover, we surveyed the location of SNPs within genic and intergenic regions. Our results indicate that a sufficient number of genome-wide SNP markers and a highly sensitive SNP calling method allow for the application of marker-assisted breeding and genome-wide association studies.

Detection of Innate and Artificial Mitochondrial DNA Heteroplasmy by Massively Parallel Sequencing: Considerations for Analysis

  • Kim, Moon-Young;Cho, Sohee;Lee, Ji Hyun;Seo, Hee Jin;Lee, Soong Deok
    • Journal of Korean Medical Science / v.33 no.52 / pp.337.1-337.14 / 2018
  • Background: Mitochondrial heteroplasmy, the co-existence of different mitochondrial polymorphisms within an individual, has various forensic and clinical implications. However, there is still no guideline on the application of massively parallel sequencing (MPS) to heteroplasmy detection. We present here some critical issues that should be considered in heteroplasmy studies using MPS. Methods: Among five samples with known innate heteroplasmies, two pairs of mixtures were generated for artificial heteroplasmies with target minor allele frequencies (MAFs) ranging from 50% to 1%. Each sample was amplified by a two-amplicon method and sequenced on the Ion Torrent system. The outcomes of two different analysis tools, Torrent Suite Variant Caller (TVC) and mtDNA-Server (mDS), were compared. Results: All the innate heteroplasmies were detected correctly by both analysis tools. Average MAFs of artificial heteroplasmies correlated well with the target values. The detection rates were almost 90% for high-level heteroplasmies, but decreased for low-level heteroplasmies. TVC generally showed lower detection rates than mDS, which seems to be due to its computation algorithm, which drops out some reference-dominant heteroplasmies. Meanwhile, mDS reported several unintended low-level heteroplasmies, which were suggested to be nuclear mitochondrial DNA sequences. The average coverage depth of each sample placed on the same chip showed considerable variation. The increase of coverage depth had no effect on the detection rates. Conclusion: In addition to the general accuracy of the MPS application in detecting heteroplasmy, our study indicates that understanding the nature of mitochondrial DNA and of the analysis algorithm would be crucial for appropriate interpretation of MPS results.
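
For readers unfamiliar with the quantity being measured, a minor allele frequency at a single position can be sketched as the read count of the second most common allele divided by the total read count. This is a simplified illustration only: it is not how TVC or mDS compute their calls, and the function name and the read counts in the example are hypothetical.

```python
from typing import Mapping


def minor_allele_frequency(allele_counts: Mapping[str, int]) -> float:
    """Fraction of reads supporting the second most common allele at a position."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    if len(allele_counts) < 2:
        # Only one allele observed, so there is no minor allele.
        return 0.0
    ordered = sorted(allele_counts.values(), reverse=True)
    return ordered[1] / total


# Hypothetical position covered by 1,000 reads, 10 of which carry the minor allele.
maf = minor_allele_frequency({"A": 990, "G": 10})
```

At a target MAF of 1%, the minor allele is supported by very few reads, which is consistent with the lower detection rates the abstract reports for low-level heteroplasmies.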