References
- C. Gibas and P. Jambeck, Developing Bioinformatics Computer Skills, O'Reilly and Associates Inc., 2001
- R. Stevens, C. Goble, P. Baker, and A. Brass, 'A Classification of tasks in bioinformatics,' Bioinformatics, Vol. 17, No. 2, pp. 180-188, 2001 https://doi.org/10.1093/bioinformatics/17.2.180
- D. A. Benson, M. S. Boguski, D. J. Lipman, J. Ostell, and B. F. Ouellette, 'GenBank,' Nucleic Acids Research, Vol. 26, No. 1, pp. 1-7, 1998 https://doi.org/10.1093/nar/26.1.1
- H. E. Williams and J. Zobel, 'Indexing and Retrieval for Genomic Databases,' IEEE TKDE, Vol. 14, No. 1, pp. 63-78, 2002 https://doi.org/10.1109/69.979973
- Z. Tan, X. Cao, B. Ooi, and A. Tung, 'The ed-tree: An Index for Large DNA Sequence Databases,' In Proceedings of SSDBM Conference, pp. 1-10, 2003
- S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman, 'Basic local alignment search tool,' Journal of Molecular Biology, Vol. 215, No. 3, pp. 403-410, 1990 https://doi.org/10.1016/S0022-2836(05)80360-2
- S. Altschul, T. Madden, A. Schaffer, J. Zhang, W. Miller, and D. Lipman, 'Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs,' Nucleic Acids Research, Vol. 25, No. 17, pp. 3389-3402, 1997 https://doi.org/10.1093/nar/25.17.3389
- T. Smith and M. Waterman, 'Identification of Common Molecular Subsequences,' Journal of Molecular Biology, Vol. 147, pp. 195-197, 1981 https://doi.org/10.1016/0022-2836(81)90087-5
- J. Buhler, 'Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing,' Bioinformatics, Vol. 17, No. 5, pp. 419-428, 2001 https://doi.org/10.1093/bioinformatics/17.5.419
- B. Ma, J. Tromp, and M. Li, 'PatternHunter: Faster and more Sensitive Homology Search,' Bioinformatics, Vol. 18, No. 3, pp. 440-445, 2002 https://doi.org/10.1093/bioinformatics/18.3.440
- G. A. Stephen, String Searching Algorithms, World Scientific Publishing, 1994
- A. L. Delcher, S. Kasif, R. D. Fleischmann, J. Peterson, O. White, and S. L. Salzberg, 'Alignment of whole genomes,' Nucleic Acids Research, Vol. 27, No. 11, pp. 2369-2376, 1999 https://doi.org/10.1093/nar/27.11.2369
- S. Kurtz and C. Schleiermacher, 'REPuter: fast computation of maximal repeats in complete genomes,' Bioinformatics, Vol. 15, No. 5, pp. 426-427, 1999 https://doi.org/10.1093/bioinformatics/15.5.426
- G. Navarro and R. Baeza-Yates, 'A new indexing method for approximate string matching,' In Proceedings of Combinatorial Pattern Matching (CPM99), Lecture Notes in Computer Science, 1645, Springer, pp. 163-185, 1999
- E. Ukkonen, 'Approximate string matching over suffix trees,' In Proceedings of Combinatorial Pattern Matching (CPM93), Lecture Notes in Computer Science, 684, Springer, pp. 228-242, 1993
- C. Meek, J. M. Patel, and S. Kasetty, 'OASIS: An Online and Accurate Technique for Local-Alignment Searches on Biological Sequences,' In Proceedings of the 29th VLDB Conference, pp. 910-921, 2003
- E. Hunt, M. P. Atkinson and R. W. Irving, 'Database indexing for large DNA and protein sequence collections,' The VLDB Journal, Vol. 11, No. 3, pp. 256-271, 2002 https://doi.org/10.1007/s007780200064
- H. Wang et al., 'BLAST++: A Tool for BLASTing Queries in Batches,' In Proceedings of the First Asia-Pacific Bioinformatics Conference, pp. 71-79, 2003
- E. Horowitz, S. Sahni, and S. Anderson-Freed, Fundamentals of Data Structures in C, Computer Science Press, 1993
- D. W. Mount, Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press, 2001
- A. Califano and I. Rigoutsos, 'FLASH: A Fast Look-up Algorithm for String Homology,' In Proceedings of the International Conference on Intelligent Systems for Molecular Biology, pp. 56-64, 1993
- C. Fondrat and P. Dessen, 'A Rapid Access Motif database (RAMdb) with a search algorithm for the retrieval patterns in nucleic acids or protein databanks,' Computer Applications in the Biosciences, Vol. 11, No. 3, pp. 273-279, 1995 https://doi.org/10.1093/bioinformatics/11.3.273
- T. Kahveci and A. K. Singh, 'An Efficient Index Structure for String Databases,' In Proceedings of the 27th VLDB Conference, pp. 351-360, 2001
- K. Sadakane and T. Shibuya, 'Indexing huge genome sequences for solving various problems,' In Proceedings of the 12th Genome Informatics, pp. 175-183, 2001
- U. Manber and G. Myers, 'Suffix Arrays: A New Method for On-Line String Searches,' SIAM J. Comput., Vol. 22, No. 5, pp. 935-948, 1993 https://doi.org/10.1137/0222058
- K. Kelly and P. Labute, 'The A* Search and Applications to Sequence Alignment,' http://www.chemcomp.com/article/astar.htm, 1996
- E. M. McCreight, 'A Space-Economical Suffix Tree Construction Algorithm,' JACM, Vol. 23, No. 2, pp. 262-272, 1976 https://doi.org/10.1145/321941.321946
- J. Kärkkäinen and E. Ukkonen, 'Sparse Suffix Trees,' In Proceedings of COCOON, pp. 219-230, 1996
- S. Kurtz, 'Reducing the Space Requirement of Suffix Trees,' Softw. Pract. Exp., Vol. 29, pp. 1149-1171, 1999 https://doi.org/10.1002/(SICI)1097-024X(199911)29:13<1149::AID-SPE274>3.0.CO;2-O
- R. De La Briandais, 'File searching using variable length keys,' In Proceedings of Western Joint Computer Conference, Vol. 15, pp. 295-298, 1959
- D. E. Knuth, Sorting and Searching, The Art of Computer Programming: Vol. 3, Addison-Wesley, 1973
- H. Shang and T. H. Merrett, 'Tries for approximate string matching,' IEEE Trans. on Knowledge and Data Engineering, Vol. 8, No. 4, pp. 540-547, 1996 https://doi.org/10.1109/69.536247
- National Center for Biotechnology Information (NCBI), http://www.ncbi.nlm.nih.gov
- N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, 'The R*-tree: An efficient and robust access method for points and rectangles,' In Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 322-331, 1990
- P. Bieganski, J. Riedl, J. V. Carlis, 'Generalized suffix trees for biological sequence data: applications and implementation,' In Proceedings of 27th Hawaii International Conference on System Sciences, Vol. 5, pp. 35-44, 1994 https://doi.org/10.1109/HICSS.1994.323593