Storing Digital Information in Long-Read DNA

  • Ahn, TaeJin (Department of Life Science, Handong Global University) ;
  • Ban, Hamin (Department of Life Science, Handong Global University) ;
  • Park, Hyunsoo (Department of Life Science, Handong Global University)
  • Received : 2018.11.30
  • Accepted : 2018.12.20
  • Published : 2018.12.31

Abstract

There is an urgent need for effective and cost-efficient data storage, as worldwide demand for data storage is growing rapidly. DNA has emerged as a new medium for storing digital information. Recent studies have successfully stored digital information, such as text and GIF animations, in DNA. Previous studies tackled technical hurdles arising from errors in DNA synthesis and sequencing, and they focused on strategies that use read sizes of 100-150 bp in both synthesis and sequencing. In this paper, we suggest a novel data encoding/decoding scheme that makes use of long-read DNA (~1,000 bp). This enables accurate recovery of stored digital information with a smaller number of reads than the previous approach. This approach also reduces sequencing time.
