Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms

  • Jeong, Haeyoung (Super-Bacteria Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB))
  • Lee, Dae-Hee (Synthetic Biology and Bioengineering Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB))
  • Ryu, Choong-Min (Super-Bacteria Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB))
  • Park, Seung-Hwan (Super-Bacteria Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB))
  • Received: 2015.07.15
  • Accepted: 2015.10.14
  • Published: 2016.01.28

Abstract

PacBio's long-read sequencing technologies, combined with recently developed non-hybrid assemblers, can produce complete bacterial genome assemblies without second-generation, high-quality short reads. However, standardized procedures that take into account data from multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM genome sequencing results from previous studies, we generated further sequencing data, including data from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach showed that the hierarchical genome assembly process (HGAP), a non-hybrid assembler, produced nearly complete assemblies at a moderate coverage of ~75x, but that different versions of the assembler produced incompatible results that required post-processing. Data from the other two platforms further improved the PacBio assemblies through scaffolding and a final error correction.
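The coverage figure quoted in the abstract can be made concrete with a short calculation. The Python sketch below is illustrative only and is not taken from the paper: it uses the standard definition of fold coverage (total sequenced bases divided by genome size), and the genome size and mean read length are hypothetical values chosen for the example.

```python
import math


def fold_coverage(read_lengths, genome_size):
    """Mean fold coverage: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


def reads_needed(target_coverage, genome_size, mean_read_length):
    """Approximate number of reads required to reach a target coverage."""
    if mean_read_length <= 0:
        raise ValueError("mean_read_length must be positive")
    return math.ceil(target_coverage * genome_size / mean_read_length)


# Hypothetical inputs (assumptions for illustration, not values from the paper):
GENOME_SIZE = 5_000_000   # a 5 Mb bacterial genome
MEAN_READ_LENGTH = 8_000  # mean long-read length in bases

n_reads = reads_needed(75, GENOME_SIZE, MEAN_READ_LENGTH)
coverage = fold_coverage([MEAN_READ_LENGTH] * n_reads, GENOME_SIZE)
print(n_reads, coverage)
```

Under these assumed numbers, roughly 47,000 reads of 8 kb would be needed to reach about 75x coverage; real read-length distributions are skewed, so an actual run would be assessed from the observed read lengths rather than from the mean alone.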
