Finding Approximate Covers of Strings  

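Sim, Jeong-Seop (School of Computer Science and Engineering, Seoul National University)
Park, Kun-Soo (School of Computer Science and Engineering, Seoul National University)
Kim, Sung-Ryul (Researcher, WiseNut, Inc.)
Lee, Jee-Soo (Department of Computer Science, Korea National Open University)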
Abstract
Repetitive strings have been studied in such diverse fields as molecular biology and data compression. Important regularities that have been studied include periods, covers, seeds, and squares. A natural extension of these repetition problems is to allow errors. Among the four notions above, approximate squares and approximate periods have been studied. In this paper, we introduce the notion of approximate covers, an approximate version of covers. Given two strings P (|P| = m) and T (|T| = n), we propose an algorithm that finds the minimum distance t such that P is a t-approximate cover of T. The algorithm takes $O(mn)$ time for the edit distance and $O(mn^2)$ time for the weighted edit distance. We also show that the problem of finding a string that is an approximate cover of T with minimum distance is NP-complete.
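To make the problem statement concrete, the following is a naive sketch in Python. It is not the paper's algorithm and does not reach the stated time bounds. It assumes a definition that the abstract does not spell out: P is a t-approximate cover of T if T can be covered by substrings of T (overlapping or adjacent) that are each within unit-cost edit distance t of P. The function names `edit_distances_from` and `min_cover_distance` are hypothetical.

```python
def edit_distances_from(p, t, start):
    """Return d where d[k] is the edit distance between p and t[start:start+k]."""
    m = len(p)
    # col[r] = edit distance between p[:r] and the current prefix of t[start:]
    col = list(range(m + 1))
    out = [col[m]]
    for ch in t[start:]:
        new = [col[0] + 1]
        for r in range(1, m + 1):
            new.append(min(col[r] + 1,                      # extra character ch in t
                           new[r - 1] + 1,                  # extra character p[r-1] in p
                           col[r - 1] + (p[r - 1] != ch)))  # match or substitution
        col = new
        out.append(col[m])
    return out


def min_cover_distance(p, t):
    """Smallest t such that p is a t-approximate cover of t (assumed definition).

    Naive dynamic programming: O(m n^2) time and O(n^2) space.
    """
    n = len(t)
    inf = float("inf")
    # dist[i][k] = edit distance between p and t[i:i+k]
    dist = [edit_distances_from(p, t, i) for i in range(n)]
    # best[j] = minimum distance needed to cover the prefix t[:j]
    best = [0] + [inf] * n
    for j in range(1, n + 1):
        run_min = inf
        # The last covering substring is t[i:j]; the rest must cover t[:h]
        # for some h with i <= h < j, so that no position is left uncovered.
        for i in range(j - 1, -1, -1):
            run_min = min(run_min, best[i])  # min of best[i..j-1]
            best[j] = min(best[j], max(dist[i][j - i], run_min))
    return best[n]


print(min_cover_distance("abab", "ababab"))  # exact cover
print(min_cover_distance("ab", "abcab"))     # the 'c' forces one edit
```

Under the assumed definition, the first call gives 0 because "abab" covers "ababab" exactly with two overlapping occurrences, and the second gives 1 because any substring containing "c" differs from "ab" by at least one edit.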
Keywords
approximate string matching; edit distance; weighted edit distance; approximate