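A Practical Approximate Sub-Sequence Search Method for DNA Sequence Databases
Won, Jung-Im (School of Information and Communications, Hanyang University)
Hong, Sang-Kyoon (Department of Computer Science, Yonsei University)
Yoon, Jee-Hee (Division of Information and Communication Engineering, Hallym University)
Park, Sang-Hyun (Department of Computer Science, Yonsei University)
Kim, Sang-Wook (School of Information and Communications, Hanyang University)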
1 | E. Horowitz, S. Sahni, and S. Anderson-Freed, Fundamentals of Data Structures in C, Computer Science Press, 1993 |
2 | H. Wang et al., 'BLAST++: A Tool for BLASTing Queries in Batches,' In Proceedings First Asia-Pacific Bioinformatics Conference, pp. 71-79, 2003 |
3 | G. Navarro and R. Baeza-Yates, 'A Hybrid Indexing Method for Approximate String Matching,' J. of Discrete Algorithms, Vol. 1, No. 1, pp. 205-239, 2000 |
4 | S. Altschul, T. Madden, A. Schaffer, J. Zhang, W. Miller, and D. Lipman, 'Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs,' Nucleic Acids Research, Vol. 25, No. 17, pp. 3389-3402, 1997 |
5 | G. A. Stephen, String Searching Algorithms, World Scientific Publishing, 1994 |
6 | A. L. Delcher, S. Kasif, R. D. Fleischmann, J. Peterson, O. White, and S. L. Salzberg, 'Alignment of whole genomes,' Nucleic Acids Research, 27, pp. 2369-2376, 1999 |
7 | E. Hunt, M. P. Atkinson, and R. W. Irving, 'Database indexing for large DNA and protein sequence collections,' The VLDB Journal, Vol. 11, No. 3, pp. 256-271, 2002 |
8 | S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman, 'Basic local alignment search tool,' Journal of Molecular Biology, Vol. 215, No. 3, pp. 403-410, 1990 |
9 | http://www.ncbi.nlm.nih.gov |
10 | H. Shang and T. H. Merrett, 'Tries for approximate string matching,' IEEE Trans. on Knowledge and Data Engineering, Vol. 8, No. 4, pp. 540-547, 1996 |
11 | V. Makinen and G. Navarro, 'Compressed Compact Suffix Arrays,' CPM 2004, Springer-Verlag LNCS 3109, pp. 420-433 |
12 | U. Manber and G. Myers, 'Suffix arrays: A new method for on-line string searches,' SIAM J. Comput. 22, pp. 935-948, 1993 |
13 | S. Tata, R. Hankins, and J. Patel, 'Practical Suffix Tree Construction,' In Proceedings of the 30th VLDB Conference, pp. 36-47, 2004 |
14 | V. Makinen, 'Compact Suffix Array: A Space efficient Full-text Index,' Fundamenta Informaticae, 56(1-2), pp. 191-210, 2003 |
15 | K. Kelly and P. Labute, 'The A* Search and Applications to Sequence Alignment,' http://www.chemcomp.com/article/astar.htm, 1996 |
16 | T. Kahveci and A. K. Singh, 'An Efficient Index Structure for String Databases,' In Proceedings of the 27th VLDB Conference, pp. 351-360, 2001 |
17 | C. Fondrat and P. Dessen, 'A Rapid Access Motif database (RAMdb) with a search algorithm for the retrieval patterns in nucleic acids or protein databanks,' Computer Applications in the Biosciences, Vol. 11, No. 3, pp. 273-279, 1995 |
18 | R. Giegerich, S. Kurtz, and J. Stoye, 'Efficient Implementation of Lazy Suffix Trees,' Softw. Pract. Exp., Vol. 33, pp. 1035-1049, 2003 |
19 | A. Califano and I. Rigoutsos, 'FLASH: A Fast Look-up Algorithm for String Homology,' In Proceedings of Intelligent System Conference for Molecular Biology, pp. 56-64, 1993 |
20 | E. Ukkonen, 'Approximate string matching over suffix trees,' In Proceedings of Combinatorial Pattern Matching (CPM93), pp. 228-242, 1993 |
21 | S. Kurtz, 'Reducing the Space Requirement of Suffix Trees,' Softw. Pract. Exp., Vol. 29, pp. 1149-1171, 1999 |
22 | C. Meek, J. M. Patel, and S. Kasetty, 'OASIS: An Online and Accurate Technique for Local-Alignment Searches on Biological sequences,' In Proceedings of the 29th VLDB Conference, pp. 920-921, 2003 |
23 | K. Sadakane and T. Shibuya, 'Indexing huge genome sequences for solving various problems,' In Proceedings of the 12th Genome Informatics, pp. 175-183, 2001 |
24 | S. Kurtz, J. Choudhuri, E. Ohlebusch, C. Schleiermacher, J. Stoye, and R. Giegerich, 'REPuter: the manifold applications of repeat analysis on a genome scale,' Nucleic Acids Research, Vol. 29, No. 22, pp. 4633-4642, 2001 |
25 | C. Gibas and P. Jambeck, Developing Bioinformatics Computer Skills, O'Reilly and Associates Inc., 2001 |
26 | Z. Tan, X. Cao, B. Ooi, and A. Tung, 'The ed-tree: An Index for Large DNA Sequence Databases,' In Proceedings of SSDBM Conference, pp. 1-10, 2003 |
27 | H. E. Williams and J. Zobel, 'Indexing and Retrieval for Genomic Databases,' IEEE TKDE, Vol. 14, No. 1, pp. 63-78, 2002 |
28 | T. Smith and M. Waterman, 'Identification of Common Molecular Subsequences,' Journal of Molecular Biology, 147, pp. 195-197, 1981