• Title/Summary/Keyword: Approximate Matching

New Randomness Testing Methods using Approximate Periods (근사 주기를 이용한 새로운 랜덤성 테스트 기법)

  • Lim, Ji-Hyuk;Lee, Sun-Ho;Kim, Dong-Kyue
    • Journal of KIISE:Computing Practices and Letters / v.16 no.6 / pp.742-746 / 2010
  • In this paper, we propose new randomness testing methods based on approximate periods, improving on the previous randomness testing method that uses exact pattern matching. Finding approximate periods of random sequences makes it possible to detect similarly repeated parts, but it has the disadvantage of taking a long time. We propose randomness testing methods whose time complexity is $O(n^2)$ by reducing the time complexity of computing approximate periods from $O(n^3)$ to $O(n^2)$. Moreover, we perform experiments comparing pseudorandom numbers generated by the AES cryptographic algorithm with true random numbers.

Development of a Conversational Help Agent Using Approximate Pattern Matching (근사 패턴매칭을 이용한 대화형 도우미 에이전트의 개발)

  • 김수영;조성배
    • Korean Journal of Cognitive Science / v.13 no.4 / pp.1-8 / 2002
  • As the Internet grows, many web sites have been built and a great deal of information has been registered on them. The more information a web site holds, the harder it is for users to find what they want. To make that information easier to reach, a full-text search engine may be embedded in the web site. This paper describes the development of a conversational help agent that lets a user find the desired information through conversation with the agent. The proposed method is based on pattern matching from artificial intelligence, not on natural language processing. When a user enters a sentence, the agent responds to it through preprocessing and pattern matching against its knowledge, which is stored in XML format. With approximate pattern matching, the agent picks the appropriate response according to the degree of similarity. In the experiment, different sentences with the same meaning were entered; the agent recognized them as the same pattern and gave the correct answer.

Finding approximate occurrence of a pattern that contains gaps by the bit-vector approach

  • Lee, In-Bok;Park, Kun-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference / 2003.10a / pp.193-199 / 2003
  • Applications of finding occurrences of a pattern that contains gaps include information retrieval, data mining, and computational biology. As biological sequences may contain errors, it is important to find not only the exact occurrences of a pattern but also approximate ones. In this paper we present an $O(mnk_{max}/w)$ time algorithm for the approximate gapped pattern matching problem, where m is the length of the text, n is the length of the pattern, w is the word size of the target machine, and $k_{max}$ is the greatest error bound for subpatterns.
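
The bit-vector idea behind this kind of algorithm can be illustrated with a minimal sketch. The code below is not the gapped algorithm of the paper; it is the simpler, well-known Shift-And scheme extended to k edit errors (in the style of Wu and Manber), for a single pattern without gaps. The function name `approx_find` and its return convention (0-based end positions) are assumptions made for illustration. Python integers are unbounded, so the word size w plays no role here; on a real machine each state vector would occupy about (pattern length)/w words.

```python
def approx_find(pattern: str, text: str, k: int) -> list[int]:
    """Return the 0-based end positions in `text` where some substring
    ending there is within edit distance k of `pattern`.

    Illustrative sketch of Shift-And with k errors, not the paper's
    gapped-pattern algorithm.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # masks[ch] has bit i set when pattern[i] == ch.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    full = (1 << m) - 1      # keep only the m low bits
    accept = 1 << (m - 1)    # bit set when the whole pattern is matched

    # state[d] bit i: pattern[0..i] matches a suffix of the text read
    # so far with at most d errors. Before reading any text, a prefix
    # of length <= d matches the empty string using d deletions.
    state = [((1 << d) - 1) & full for d in range(k + 1)]

    ends = []
    for j, ch in enumerate(text):
        b = masks.get(ch, 0)
        new = [((state[0] << 1) | 1) & b]          # exact extension
        for d in range(1, k + 1):
            new.append((
                (((state[d] << 1) | 1) & b)        # match
                | state[d - 1]                     # extra text character
                | (state[d - 1] << 1) | 1          # substitution
                | (new[d - 1] << 1)                # skipped pattern character
            ) & full)
        state = new
        if state[k] & accept:
            ends.append(j)
    return ends
```

For example, `approx_find("abc", "xxabcxx", 0)` reports only the exact occurrence ending at index 4, while allowing one error also reports the neighbouring end positions 3 and 5.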

A Study on Shape Matching of Two-Dimensional Object using Relaxation (Relaxation을 이용한 2차원 물체의 형상매칭에 관한 연구)

  • 곽윤식;이대령
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.1 / pp.133-142 / 1993
  • This paper presents shape matching of two-dimensional objects. The shape matching is applied to two-dimensional simple closed curves represented by polygons. A large number of shape matching procedures have been proposed based on the view that a shape can be represented by a vector of numerical features, and that this representation can be matched using techniques from statistical pattern recognition. The varieties of features that have been extracted from shapes and used to represent them are numerous. But all of these feature-based approaches suffer from the shortcoming that the descriptors of a segment of a shape do not ordinarily bear any simple relationship to the description of the entire shape. We solve the segment matching problem of shape matching, defined as the recognition of a piece of a shape as an approximate match to a part of a larger shape, by using a relaxation labeling technique.

Finding Approximate Covers of Strings (문자열의 근사커버 찾기)

  • Sim, Jeong-Seop;Park, Kun-Soo;Kim, Sung-Ryul;Lee, Jee-Soo
    • Journal of KIISE:Computer Systems and Theory / v.29 no.1 / pp.16-21 / 2002
  • Repetitive strings have been studied in such diverse fields as molecular biology and data compression. Some important regularities that have been studied are periods, covers, seeds, and squares. A natural extension of the repetition problems is to allow errors. Among the four notions above, approximate squares and approximate periods have been studied. In this paper, we introduce the notion of approximate covers, which is an approximate version of covers. Given two strings P (|P| = m) and T (|T| = n), we propose an algorithm which finds the minimum distance t such that P is a t-approximate cover of T. The algorithm takes $O(mn)$ time for the edit distance and $O(mn^2)$ time for the weighted edit distance. We also show that the problem of finding a string which is an approximate cover of T with minimum distance is NP-complete.

Cooperative Query Answering Based on Abstraction Database (추상화 정보 데이터베이스 기반 협력적 질의 응답)

  • 허순영;이정환
    • Journal of the Korean Operations Research and Management Science Society / v.24 no.1 / pp.99-117 / 1999
  • Since query languages are used as a handy tool to obtain information from a database, a more intelligent query answering system is needed to provide a user-friendly and fault-tolerant human-machine interface. Frequently, database users prefer a less rigid querying structure, one which allows for vagueness in composing queries, and want the system to understand the intent behind a query. When there is no matching data available, users would rather receive approximate answers than a null response. This paper presents a knowledge abstraction database that facilitates the development of such a fault-tolerant and intelligent database system. The proposed knowledge abstraction database adopts a multilevel knowledge representation scheme called the knowledge abstraction hierarchy (KAH), extracts semantic data relationships from the underlying database, and provides query transformation mechanisms using query generalization and specialization steps. In cooperation with the underlying database, the knowledge abstraction database accepts vague queries and allows users to pose approximate queries as well as conceptually abstract queries. Specifically, four types of vague queries are discussed: approximate selection, approximate join, conceptual selection, and conceptual join. A prototype system has been implemented at KAIST and is being tested with a personnel database system to demonstrate the usefulness and practicality of the knowledge abstraction database in ordinary database application systems.

A Novel Scalable and Storage-Efficient Architecture for High Speed Exact String Matching

  • Peiravi, Ali;Rahimzadeh, Mohammad Javad
    • ETRI Journal / v.31 no.5 / pp.545-553 / 2009
  • String matching is a fundamental element of an important category of modern packet processing applications, which scan the content flowing through a network for thousands of strings at the line rate. To keep pace with high network speeds, specialized hardware-based solutions are needed that are efficient enough to scale in both speed and the number of strings. In this paper, a novel architecture is proposed that is based on a recently introduced data structure called the Bloomier filter and that supports this scalability. The Bloomier filter is a compact data structure for encoding arbitrary functions, and it supports approximate evaluation queries. By eliminating the Bloomier filter's false positives in a space-efficient way, a simple yet powerful exact string matching architecture is obtained that can handle several thousand strings at high rates and is amenable to on-chip realization. The proposed scheme is implemented in reconfigurable hardware and compared with existing solutions. The results show that the proposed approach achieves better performance than other existing architectures, measured in terms of throughput per logic cell per character.

The Use of Generalized Gamma-Polynomial Approximation for Hazard Functions

  • Ha, Hyung-Tae
    • The Korean Journal of Applied Statistics / v.22 no.6 / pp.1345-1353 / 2009
  • We introduce a simple methodology, the so-called generalized gamma-polynomial approximation, based on a moment-matching technique to approximate survival and hazard functions in the context of parametric survival analysis. We use the generalized gamma-polynomial approximation to approximate the density and distribution functions of convolutions and finite mixtures of random variables, from which the approximated survival and hazard functions are obtained. This technique provides very accurate approximations to the target functions and is also computationally efficient and easy to implement. In addition, the generalized gamma-polynomial approximations are very stable in the middle range of the target distributions, whereas saddlepoint approximations are often unstable in a neighborhood of the mean.

Analysis of Nested Case-Control Study Designs: Revisiting the Inverse Probability Weighting Method

  • Kim, Ryung S.
    • Communications for Statistical Applications and Methods / v.20 no.6 / pp.455-466 / 2013
  • In nested case-control studies, the most common way to make inference under a proportional hazards model is the conditional logistic approach of Thomas (1977). Inclusion probability methods are more efficient than the conditional logistic approach of Thomas; however, the epidemiology research community has not accepted these methods as a replacement for Thomas' method. This paper promotes the inverse probability weighting method originally proposed by Samuelsen (1997) in combination with an approximate jackknife standard error that can be easily computed using existing software. Simulation studies demonstrate that this approach yields valid type I error rates and greater power than the conditional logistic approach in nested case-control designs across various sample sizes and magnitudes of the hazard ratios. The method is also generalized to incorporate additional matching and the stratified Cox model. The proposed method is illustrated with data from a cohort of children with Wilms' tumor to study the association between histological signatures and relapses.

Parallel Computation For The Edit Distance Based On The Four-Russians' Algorithm (4-러시안 알고리즘 기반의 편집거리 병렬계산)

  • Kim, Young Ho;Jeong, Ju-Hui;Kang, Dae Woong;Sim, Jeong Seop
    • KIPS Transactions on Computer and Communication Systems / v.2 no.2 / pp.67-74 / 2013
  • Approximate string matching problems have been studied in diverse fields. Recently, fast approximate string matching algorithms have been used to reduce the time and cost of next-generation sequencing. To measure the amount of error between two strings, we use a distance function such as the edit distance. Given two strings X (|X| = m) and Y (|Y| = n) over an alphabet $\Sigma$, the edit distance between X and Y is the minimum number of edit operations needed to convert X into Y. The edit distance between X and Y can be computed using the well-known dynamic programming technique in $O(mn)$ time and space. The edit distance can also be computed using the Four-Russians' algorithm, whose preprocessing step runs in $O((3|\Sigma|)^{2t}t^2)$ time and $O((3|\Sigma|)^{2t}t)$ space and whose computation step runs in $O(mn/t)$ time and $O(mn)$ space, where t represents the size of the block. In this paper, we present a parallelized version of the computation step of the Four-Russians' algorithm. Our algorithm computes the edit distance between X and Y in $O(m+n)$ time using m/t threads. We implemented both the sequential version and our parallelized version of the Four-Russians' algorithm using CUDA to compare the execution times. When t = 1 and t = 2, our algorithm runs about 10 times and 3 times faster than the sequential algorithm, respectively.
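
As a point of reference for the abstract above, the well-known dynamic programming computation of the edit distance can be sketched as follows. This is only the sequential $O(mn)$ baseline, not the Four-Russians' algorithm or its parallel version; the function name `edit_distance` and the choice of unit costs for insertion, deletion, and substitution are assumptions made for illustration.

```python
def edit_distance(x: str, y: str) -> int:
    """Edit distance between x and y with unit-cost insertion,
    deletion, and substitution, using O(mn) time and space."""
    m, n = len(x), len(y)

    # table[i][j] holds the edit distance between x[:i] and y[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        table[i][0] = i          # delete all i characters of x[:i]
    for j in range(n + 1):
        table[0][j] = j          # insert all j characters of y[:j]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if x[i - 1] == y[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion from x
                table[i][j - 1] + 1,         # insertion into x
                table[i - 1][j - 1] + cost,  # match or substitution
            )
    return table[m][n]
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; this is the general property that parallel formulations of the table computation rely on.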