• Title/Summary/Keyword: string algorithms

Finding All-Pairs Suffix-Prefix Matching Using Suffix Array (접미사 배열을 이용한 Suffix-Prefix가 일치하는 모든 쌍 찾기)

  • Han, Seon-Mi;Woo, Jin-Woon
    • The KIPS Transactions:PartA / v.17A no.5 / pp.221-228 / 2010
  • Since string operations have been applied to computational biology, security, and Internet search, various data structures and algorithms for efficient string operations have been studied. All-pairs suffix-prefix matching finds, for every pair of given strings, the longest suffix of one string that matches a prefix of the other. It is an important component of fast approximation algorithms for the shortest superstring problem, and is also used in bioinformatics and data compression. In this paper, we propose an algorithm that finds all-pairs suffix-prefix matches using a suffix array in $O(k \cdot m)$ time. Experiments show that the suffix array algorithm takes less time and memory than the suffix tree algorithm.
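
To make the problem statement concrete, the following is a minimal brute-force sketch of all-pairs suffix-prefix matching. It is not the suffix array algorithm proposed in the paper and is far slower; the function names `longest_overlap` and `all_pairs_suffix_prefix` are hypothetical, and allowing an overlap as long as the shorter string is an assumption, since definitions vary.

```python
def longest_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    # Try candidate lengths from longest to shortest and stop at the first hit.
    for length in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def all_pairs_suffix_prefix(strings: list[str]) -> dict[tuple[int, int], int]:
    """Map each ordered pair (i, j), i != j, to the overlap of strings[i] -> strings[j]."""
    return {
        (i, j): longest_overlap(a, b)
        for i, a in enumerate(strings)
        for j, b in enumerate(strings)
        if i != j
    }


overlaps = all_pairs_suffix_prefix(["abcd", "cdef", "efab"])
print(overlaps[(0, 1)], overlaps[(1, 0)])
```

A greedy shortest-superstring heuristic would repeatedly merge the pair with the largest overlap, which is why computing these values quickly matters.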

Fast Matching Method for DNA Sequences (DNA 서열을 위한 빠른 매칭 기법)

  • Kim, Jin-Wook;Kim, Eun-Sang;Ahn, Yoong-Ki;Park, Kun-Soo
    • Journal of KIISE:Computer Systems and Theory / v.36 no.4 / pp.231-238 / 2009
  • DNA sequences are the fundamental information for each species and a comparison between DNA sequences of different species is an important task. Since DNA sequences are very long and there exist many species, not only fast matching but also efficient storage is an important factor for DNA sequences. Thus, a fast string matching method suitable for encoded DNA sequences is needed. In this paper, we present a fast string matching method for encoded DNA sequences which does not decode DNA sequences while matching. We use four-characters-to-one-byte encoding and combine a suffix approach and a multi-pattern matching approach. Experimental results show that our method is about 5 times faster than AGREP and the fastest among known algorithms.
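
As an illustration of four-characters-to-one-byte encoding, the sketch below packs each base into 2 bits. The specific code assignment (A=0, C=1, G=2, T=3), the left-aligned padding of a short final chunk, and the names `pack` and `unpack` are assumptions; the abstract does not give these details, and the matching method itself, which works on the packed bytes without decoding, is not shown.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit code per base
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte, first base in the high bits."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for ch in chunk:
            byte = (byte << 2) | CODE[ch]
        # Left-align a final chunk shorter than four bases (pads with zero bits).
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases; the length is needed because of padding."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


packed = pack("ACGTAC")
print(list(packed), unpack(packed, 6))
```

The packed form uses a quarter of the storage of one byte per base, which is the storage benefit the abstract refers to.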

Segmentation Algorithm for Wafer ID using Active Multiple Templates Model

  • Ahn, In-Mo;Kang, Dong-Joong;Chung, Yoon-Tack
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings / 2003.10a / pp.839-844 / 2003
  • This paper presents a method for segmenting wafer ID marks in poor-quality images captured under the uncontrolled lighting conditions of the semiconductor process. An active multiple templates matching method is proposed to locate ID areas on wafers and segment them into meaningful regions, where general OCR algorithms cannot recognize the characters. The active template model is designed by applying a snake model, which is used for active contour tracking. The active multiple template model searches character areas and segments them optimally into single characters, tracking each character flexibly as it varies with the string configuration. The snake energy is optimized with a greedy algorithm, which improves efficiency by automatically controlling the gap between templates; these gaps vary according to the configuration of the character string. Experimental results using wafer images from a real factory automation (FA) environment are presented.

Nearest L-Neighbor Method with De-crossing in Vehicle Routing Problem

  • Kim, Hwan-Seong;Tran-Ngoc, Hoang-Son
    • Journal of Navigation and Port Research / v.33 no.2 / pp.143-151 / 2009
  • The field of vehicle routing is growing rapidly because of its many practical applications in truckload and less-than-truckload trucking, courier services, door-to-door services, and many other problems that generally hinder the optimization of transportation costs in a logistics network. The rapidly increasing number of customers in such networks makes it difficult to reach a globally optimal cost within an acceptable time. Fast algorithms are therefore needed to find sufficiently good solutions in limited time for real-time scheduling. In this paper, the nearest L-neighbor method (NLNM) is proposed to obtain a vehicle routing solution. String neighbors of different lengths were chosen, tested, and compared. A de-crossing procedure is then applied to the routes produced by NLNM, giving a better solution and shorter computation time than NLNM with long string neighbors.

Parallel Algorithms for Finding Consensus of Circular Strings (환형문자열에 대한 대표문자열을 찾는 병렬 알고리즘)

  • Kim, Dong Hee;Sim, Jeong Seop
    • Journal of KIISE / v.42 no.3 / pp.289-294 / 2015
  • The consensus problem is finding a representative string, called a consensus, of a given set S of k strings. Circular strings are different from linear strings in that the last symbol precedes the first symbol. Given a set S of circular strings of length n over an alphabet $\Sigma$, we first present an $O(|\Sigma| n \log n)$ time parallel algorithm for finding a consensus of S minimizing both radius and distance sum when k=3 using O(n) threads. Then we present an $O(|\Sigma| n^2 \log n)$ time parallel algorithm for finding a consensus of S minimizing distance sum when k=4 using O(n) threads. Finally, we compare execution times of our algorithms implemented using CUDA with corresponding sequential algorithms.

A Study on the Convergence of Optimal Value using Selection Method in Genetic Algorithms (유전자 알고리즘에서 선택 기법을 이용한 해의 수렴 과정에 관한 연구)

  • 김용범;김병재;박명규
    • Journal of Korean Society of Industrial and Systems Engineering / v.20 no.42 / pp.171-179 / 1997
  • Genetic algorithms face an inherent conflict between exploitation and exploration. Exploitation refers to taking advantage of information already obtained in the search. Exploration, by contrast, can reveal that a pattern of bits coupled with another pattern elsewhere in the string is more effective. This paper shows that the selection method has a major impact on the balance between exploitation and exploration. A more heavy-handed approach seeks to exploit the available information. If decisions must be made quickly, especially in real-time trading environments, quicker convergence through exploitation may be more desirable. We also present some theoretical and empirical analysis of the selection method in genetic algorithms for a GA-hard problem.

A Novel Cryptosystem Based on Steganography and Automata Technique for Searchable Encryption

  • Truong, Nguyen Huy
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.2258-2274 / 2020
  • In this paper, we first propose a new cryptosystem with high security based on our data hiding scheme (2,9,8), introduced in 2019, in which encrypting and hiding are done at once and the ciphertext does not depend on the input image size as it does in existing hybrid techniques of cryptography and steganography. We then exploit our automata approach, presented in 2019, to design two algorithms for exact and approximate pattern matching on secret data encrypted by our cryptosystem. Theoretical analysis shows that both algorithms have O(n) worst-case time complexity, where the approximate algorithm is assumed to use ⌈(1-ε)m⌉ processors, and ε, m, and n are the error of our string similarity measure, the length of the pattern, and the length of the secret data, respectively. In searchable encryption, our cryptosystem is used by users and our pattern matching algorithms are performed by cloud providers.

A Fast Algorithm for the k-Keyword Ordered Proximity Problem (순서를 고려하는 k-키워드 근접도 문제를 위한 빠른 알고리즘)

  • Kim, Jin-Wook
    • Journal of KIISE:Computing Practices and Letters / v.16 no.3 / pp.281-288 / 2010
  • In web search engines, proximity is used to compute the relevance of a document to a given query. Various results exist on proximity problems and ordered proximity problems. In this paper, we present O(n) time algorithms for the k-keyword ordered proximity problems, where n is the total number of occurrences of the k keywords in a document. Experimental results show that the proposed algorithms are about 1.2 times and over 3 times faster than previous algorithms when k=2 and k=5, respectively.

Linear-time algorithms for computing a maximal increasing subsequence (극대 증가 부분서열을 찾는 선형 알고리즘)

  • Joong Chae Na
    • Smart Media Journal / v.12 no.6 / pp.9-14 / 2023
  • The longest increasing subsequence is a fundamental problem that has long been studied in computer science. In this paper, we consider the maximal increasing subsequence problem, in which the constraint is relaxed from longest to maximal. For two notions of increasing (monotone increasing and strictly increasing), we propose linear-time algorithms that compute a maximal increasing subsequence of an input sequence over an alphabet Σ. Our algorithm for computing a maximal monotone increasing subsequence requires O(1) space, and our algorithm for computing a maximal strictly increasing subsequence requires O(|Σ|) space.
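
The monotone case can be illustrated with a single greedy pass. This is only a sketch under an assumed reading of "maximal" (no skipped input element can be inserted at its position while keeping the subsequence non-decreasing); the paper's definition and algorithm may differ, and the strictly increasing case with its O(|Σ|) space is not covered. The name `maximal_nondecreasing` is hypothetical.

```python
from typing import Iterable, Iterator


def maximal_nondecreasing(seq: Iterable[int]) -> Iterator[int]:
    """Yield a maximal non-decreasing subsequence in one pass.

    Keeps the first element, then every element that is >= the last one kept.
    Any skipped element is smaller than the element kept just before it, so it
    cannot be inserted at its position. Only `last` is stored: O(1) extra space.
    """
    last = None
    for x in seq:
        if last is None or x >= last:
            last = x
            yield x


print(list(maximal_nondecreasing([3, 1, 3, 2, 5, 4, 5])))
```

The result is maximal but not necessarily longest: for the input `[3, 1, 2, 2]` the greedy pass keeps only `[3]`, while the longest non-decreasing subsequence is `[1, 2, 2]`.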

A New Korean Search Pattern of the Operator LIKE (연산자 LIKE의 새로운 한글 탐색 패턴)

  • Park, Sung-Chul;Roh, Eun-Hyang;Park, Young-Chul;Park, Jong-Cheol
    • Journal of KIISE:Databases / v.34 no.3 / pp.244-260 / 2007
  • The operator LIKE of the database language SQL is a string pattern search operator. Given a string pattern, the operator identifies column values that match the pattern. As a phonetic symbol, each Korean syllable is composed either of a leading sound and a medial sound, or of a leading sound, a medial sound, and a trailing sound. As a search pattern of the operator LIKE for Korean syllables, in addition to the traditional Korean search pattern, this paper proposes a new search pattern based on the leading sounds and medial sounds of Korean. With the new Korean search pattern, Korean syllables having specific leading sounds, specific medial sounds, or both can be found. Formulating predicates equivalent to the new Korean search pattern with existing SQL operators is cumbersome and may cause portability problems for applications, depending on the underlying character set of the DBMS. This paper presents algorithms for executing the operator LIKE with the new Korean search pattern, based on the characters represented in KS X 1001, a Korean standard code for information interchange of Korean and Chinese.