Draft Genome of Toxocara canis, a Pathogen Responsible for Visceral Larva Migrans

  • Kong, Jinhwa (Department of Computer Engineering, College of Engineering, Hallym University)
  • Won, Jungim (Smart Computing Lab., Hallym University)
  • Yoon, Jeehee (Department of Computer Engineering, College of Engineering, Hallym University)
  • Lee, UnJoo (Department of Electronic Engineering, College of Engineering, Hallym University)
  • Kim, Jong-Il (Department of Biomedical Sciences, Seoul National University Graduate School)
  • Huh, Sun (Department of Parasitology and Institute of Medical Education, College of Medicine, Hallym University)

  • Received: 2016.06.07
  • Accepted: 2016.10.21
  • Published: 2016.12.31

Abstract

This study aimed to construct a draft genome of the adult female worm Toxocara canis using next-generation sequencing (NGS) and de novo assembly, and to find new genes after annotation with functional genomics tools. DNA read data of T. canis were produced on an NGS machine and assembled de novo using SOAPdenovo; RNA read data were assembled using Trinity. Structural annotation, homology search, functional annotation, classification of protein domains, and KEGG pathway analysis were carried out. In addition, recently developed tools such as MAKER, PASA, EVidenceModeler, and Blast2GO were used. The scaffold-level DNA assembly had an N50 of 108,950 bp and a total length of 341,776,187 bp. The transcriptome assembly had an N50 of 940 bp and a total length of 53,046,952 bp. The GC content of the entire genome was 39.3%. The total number of genes was 20,178, and the total number of protein sequences was 22,358. Of the 22,358 protein sequences, 4,992 were newly observed in T. canis. The following previously unknown proteins were found: E3 ubiquitin-protein ligase cbl-b and antigen T-cell receptor, zeta chain for T-cell and B-cell regulation; endoprotease bli-4 for cuticle metabolism; mucin 12Ea and polymorphic mucin variant C6/1/40r2.1 for mucin production; and tropomodulin-family protein and ryanodine receptor calcium release channels for muscle movement. We found new hypothetical polypeptide sequences unique to T. canis, and these findings can serve as a basis for extending our biological understanding of T. canis.
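The abstract reports assembly statistics (N50, total length, GC content) without defining them. The sketch below shows one common way such values are computed from a set of assembled sequences. It is illustrative only and is not the authors' pipeline: the function names n50 and gc_content and the toy sequences are assumptions made for this example, and the GC calculation assumes that ambiguous bases such as N are excluded from the denominator.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    N50 is the length L of the sequence at which, walking from the longest
    sequence downward, the cumulative length first reaches at least half of
    the total assembly length. Returns 0 for an empty input.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:  # integer form of running >= total / 2
            return length
    return 0


def gc_content(sequences: Iterable[str]) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Assumption for this sketch: N and other ambiguity codes are ignored.
    """
    gc = 0
    acgt = 0
    for seq in sequences:
        upper = seq.upper()
        g_c = upper.count("G") + upper.count("C")
        a_t = upper.count("A") + upper.count("T")
        gc += g_c
        acgt += g_c + a_t
    return gc / acgt if acgt else 0.0


# Hypothetical toy "scaffolds" with lengths 30, 20, 12, and 4 bp.
toy_scaffolds = ["ATGCGC" * 5, "ATAT" * 5, "GGCCAT" * 2, "ATTA"]

toy_n50 = n50(len(s) for s in toy_scaffolds)
toy_gc = gc_content(toy_scaffolds)
print(f"N50 = {toy_n50} bp, GC = {toy_gc:.1%}")
```

In the toy data the total length is 66 bp, so the cumulative length reaches half of the total at the second-longest sequence. Applied to a real assembly, the same functions would take the scaffold sequences parsed from a FASTA file.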
