• Title/Summary/Keyword: substring (부분문자열)

Search Result 67, Processing Time 0.027 seconds

Design of a Fuzzy Classifier by Repetitive Analyses of Multifeatures (다중 특징의 반복적 분석에 의한 퍼지 분류기의 설계)

  • 신대정;나승유
    • Journal of the Korean Institute of Intelligent Systems / v.6 no.3 / pp.14-24 / 1996
  • A fuzzy classifier which needs various analyses of features using genetic algorithms is proposed. The fuzzy classifier has a simple structure, which contains a classification part based on fuzzy logic theory and a rule generation part using genetic algorithms. The rule generation part determines optimal fuzzy membership functions and the inclusion or exclusion of each feature in fuzzy classification rules. We analyzed the recognition rate of a specific object, then added finer features repetitively, if necessary, to any object with a large misclassification rate. We also introduce a repetitive analysis method to minimize the string and population sizes and to improve recognition rates. This classifier is applied to three examples: the classification of iris data, the discrimination of thyroid gland cancer cells, and the recognition of confusing handwritten and printed numerals. In the recognition of confusing handwritten and printed numerals, each sample numeral is classified into one of the groups, which are divided according to the sample structure. The fuzzy classifier proposed in this paper has recognition rates of 98.67% for iris data, 98.25% for thyroid gland cancer cells, and 96.3% for confusing handwritten and printed numerals.

Image Recognition by Fuzzy Logic and Genetic Algorithms (퍼지로직과 유전 알고리즘을 이용한 영상 인식)

  • Ryoo, Sang-Jin;Na, Chul-Hoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.5 / pp.969-976 / 2007
  • A fuzzy classifier which needs various analyses of features using genetic algorithms is proposed. The fuzzy classifier has a simple structure, which contains a classification part based on fuzzy logic theory and a rule generation part using genetic algorithms. The rule generation part determines optimal fuzzy membership functions and the inclusion or exclusion of each feature in fuzzy classification rules. We analyzed the recognition rate of a specific object, then added finer features repetitively, if necessary, to any object with a large misclassification rate. We also introduce a repetitive analysis method to minimize the string and population sizes and to improve recognition rates. This classifier is applied to two examples: the recognition of iris data and the recognition of thyroid gland cancer cells. The fuzzy classifier proposed in this paper has recognition rates of 98.67% for iris data and 98.25% for thyroid gland cancer cells.

Handwritten Korean Amounts Recognition in Bank Slips using Rule Information (규칙 정보를 이용한 은행 전표 상의 필기 한글 금액 인식)

  • Jee, Tae-Chang;Lee, Hyun-Jin;Kim, Eun-Jin;Lee, Yill-Byung
    • The Transactions of the Korea Information Processing Society / v.7 no.8 / pp.2400-2410 / 2000
  • Much research on the recognition of Korean characters has been undertaken. But while the majority addresses Korean character recognition, the development of document recognition systems has seldom been attempted. In this paper, we designed a recognizer of Korean courtesy amounts to improve error correction in recognized character strings. From the very first step of Korean character recognition, we face an enormous scale of data: there are 2350 characters in Korean. Most previous studies tried to recognize about 1000 frequently used characters, but their recognition rates are under 80%. Since using these kinds of recognizers is not efficient, we designed a statistical multiple recognizer which recognizes the 16 Korean characters used in courtesy amounts. By using a multiple recognizer, we can prevent an increase in errors. For the postprocessor of Korean courtesy amounts, we use the properties of Korean character strings. There are syntactic rules in the character strings of Korean courtesy amounts, and by using this property we can correct errors in them. This kind of error correction is restricted to the Korean characters representing the unit of the amounts. The first candidate of the Korean character recognizer shows a recognition rate of !!i.49%, and up to the fourth candidate it shows 99.72%. For postprocessed Korean character strings, the recognizer of Korean courtesy amounts shows a reliability of 96.42%. In this paper, we suggest a method to improve the reliability of Korean courtesy amount recognition by using a Korean character recognizer which recognizes a limited number of characters and a postprocessor which corrects the errors in Korean character strings.

Development of Workbench for Analysis and Visualization of Whole Genome Sequence (전유전체(Whole genome) 서열 분석과 가시화를 위한 워크벤치 개발)

  • Choe, Jeong-Hyeon;Jin, Hui-Jeong;Kim, Cheol-Min;Jang, Cheol-Hun;Jo, Hwan-Gyu
    • The KIPS Transactions:PartA / v.9A no.3 / pp.387-398 / 2002
  • As whole genome sequences of many organisms have been revealed by small-scale genome projects, intensive research on individual genes and their functions has been performed. However, in-memory algorithms are inefficient for the analysis of whole genome sequences, since the size of an individual whole genome ranges from several million base pairs to hundreds of billions of base pairs. In order to manipulate such huge sequence data effectively, it is necessary to use an indexed data structure for external memory. In this paper, we introduce a workbench system for the analysis and visualization of whole genome sequences using the string B-tree, which is suitable for the analysis of huge data. The system consists of two parts: an analysis query part and a visualization part. The query system supports various transactions such as sequence search, k-occurrence, and k-mer analysis. The visualization system helps biological scientists easily understand whole structure and specificity through many kinds of visualization, such as whole genome sequence, annotation, CGR (Chaos Game Representation), k-mer, and RWP (Random Walk Plot). Using our workbench, one can find the relations among organisms, predict the genes in a genome, and research the function of junk DNA.
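
To make the k-mer analysis mentioned in this abstract concrete, the following is a minimal Python sketch of counting k-mers (overlapping substrings of length k) in a sequence. It is an in-memory illustration only, not the paper's method: the workbench is built on a string B-tree in external memory. The function names `count_kmers` and `frequent_kmers` are hypothetical and chosen for this example.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping length-k substring (k-mer) of `sequence`.

    In-memory illustration only; it does not model the external-memory
    string B-tree index described in the abstract.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty range, hence an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def frequent_kmers(sequence: str, k: int, min_count: int) -> list:
    """Return, in sorted order, the k-mers that occur at least `min_count` times."""
    counts = count_kmers(sequence, k)
    return sorted(kmer for kmer, n in counts.items() if n >= min_count)


if __name__ == "__main__":
    print(count_kmers("ACGTACGT", 3))
    print(frequent_kmers("ACGTACGT", 3, 2))
```

A dictionary-based counter like this holds every distinct k-mer in memory, which is the scaling limit that motivates an external-memory index for whole genomes.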

An Efficient Local Alignment Algorithm for DNA Sequences including N and X (N과 X를 포함하는 DNA 서열을 위한 효율적인 지역정렬 알고리즘)

  • Kim, Jin-Wook
    • Journal of KIISE:Computing Practices and Letters / v.16 no.3 / pp.275-280 / 2010
  • A local alignment algorithm finds a pair of substrings, one from each of two given strings, that are similar to each other. A DNA sequence can consist of not only A, C, G, and T but also N and X, where N and X are used when the original bases have lost their information for various reasons. In this paper, we present an efficient local alignment algorithm for two DNA sequences including N and X using the affine gap penalty metric. Our algorithm is an extended version of the Kim-Park algorithm and can be extended to include other characters which have properties similar to those of N and X.
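
As a rough illustration of what local alignment computes, here is a minimal Python sketch of the classic Smith-Waterman recurrence. It is not the paper's algorithm, which extends the Kim-Park algorithm. Two simplifications are assumptions of this example: it uses a linear gap penalty instead of the affine gap penalty named in the abstract, and it scores any comparison involving N or X as 0. The function name and all score values are hypothetical.

```python
def local_alignment_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                          gap: int = -2, unknown: str = "NX") -> int:
    """Best local alignment score of `a` and `b` (Smith-Waterman, linear gaps).

    Assumption for illustration: a comparison involving a character in
    `unknown` (N or X) scores 0. The paper's actual scoring is not given here.
    """
    def score(x: str, y: str) -> int:
        if x in unknown or y in unknown:
            return 0  # assumed neutral score for bases with lost information
        return match if x == y else mismatch

    best = 0
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming table
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            curr[j] = max(
                0,                                        # start a new local alignment
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                prev[j] + gap,                            # gap in b
                curr[j - 1] + gap,                        # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("TTACGTT", "GGACGGG"))
    print(local_alignment_score("ACNGT", "ACTGT"))
```

Keeping only two rows suffices when just the score is needed. Recovering the aligned substrings themselves would require the full table or traceback pointers.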

A Multiple Pattern Matching Scheme to Improve Rule Application Performance (규칙 적용 성능을 개선하기 위한 다중 패턴매칭 기법)

  • Lee, Jae-Kook;Kim, Hyong-Shik
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.3 / pp.79-88 / 2008
  • On the Internet, the NIDS (Network Intrusion Detection System) has been widely deployed to protect internal networks. The NIDS builds a set of rules from the results of analyzing illegal packets and filters packets using those rules, thus protecting the internal system. The number of rules keeps increasing as attacks become more widespread and better organized. As a result, severe performance degradation has been observed in rule application for the NIDS. In this paper, we propose a multiple pattern matching scheme to improve rule application performance. We then compare our algorithm with the Wu-Manber algorithm, which is known for high-performance multi-pattern matching.

Korean Unknown-noun Recognition using Strings Following Nouns in Words (명사후문자열을 이용한 미등록어 인식)

  • Park, Ki-Tak;Seo, Young-Hoon
    • The Journal of the Korea Contents Association / v.17 no.4 / pp.576-584 / 2017
  • Unknown nouns, which are not in a dictionary, cause problems not only in morphological analysis but also in almost all areas of natural language processing. This paper describes a recognition method for Korean unknown nouns using the strings that follow nouns, such as postposition, suffix and postposition, suffix and eomi, etc. We collect and sort words including nouns from documents and divide a word including an unknown noun into two parts, a candidate noun and the string following the noun, by finding the same prefix morphemes in more than two unknown words. We use information on strings following nouns extracted from the Sejong corpus to make the final decision on each unknown noun. We obtain 99.64% precision and 99.46% recall for unknown nouns that occur in more than two forms in news from two portal sites.

A Motion Correspondence Algorithm based on Point Series Similarity (점 계열 유사도에 기반한 모션 대응 알고리즘)

  • Eom, Ki-Yeol;Jung, Jae-Young;Kim, Moon-Hyun
    • Journal of KIISE:Software and Applications / v.37 no.4 / pp.305-310 / 2010
  • In this paper, we propose a heuristic algorithm for motion correspondence based on a point series similarity. A point series is a sequence of points which are sorted in the ascending order of their x-coordinate values. The proposed algorithm clusters the points of a previous frame based on their local adjacency. For each group, we construct several potential point series by permuting the points in it, each of which is compared to the point series of the following frame in order to match the set of points through their similarity based on a proximity constraint. The longest common subsequence between two point series is used as global information to resolve the local ambiguity. Experimental results show an accuracy of more than 90% on two image sequences from the PETS 2009 and the CAVIAR data sets.
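
The longest common subsequence (LCS) that this abstract uses as global information can be sketched with the standard dynamic programming approach. This is a generic illustration in Python, not the authors' implementation. As a simplifying assumption it compares elements by exact equality, whereas the paper matches points through a similarity based on a proximity constraint. The function name is hypothetical.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences `a` and `b` as a list.

    Elements are compared with ==, an assumption made for illustration.
    """
    n, m = len(a), len(b)
    # dp[i][j] = LCS length of the prefixes a[:i] and b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])

    # Walk back from the bottom-right corner to recover one optimal subsequence.
    result = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


if __name__ == "__main__":
    # Two toy "point series", already sorted by x-coordinate.
    previous_frame = [(1, 4), (3, 2), (5, 6), (7, 1)]
    following_frame = [(1, 4), (5, 6), (7, 1), (9, 3)]
    print(longest_common_subsequence(previous_frame, following_frame))
```

The table costs O(nm) time and space for series of lengths n and m, which is small when each cluster of points is small.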

A Design and Implementation of WML Compiler for WAP Gateway for Wireless Internet Services (무선 인터넷 서비스를 위한 WAP 게이트웨이용 WML 컴파일러의 설계 및 구현)

  • Choi, Eun-Jeong;Han, Dong-Won;Lim, Kyung-Shik
    • Journal of KIISE:Computing Practices and Letters / v.7 no.2 / pp.165-182 / 2001
  • In this paper, we describe the design and implementation of a Wireless Markup Language (WML) compiler to deploy wireless Internet services effectively. The WML compiler translates textual WML decks into binary ones in order to reduce the traffic on wireless links, which have low bandwidth relative to wireline links, and to mitigate the processing overhead of WML decks on wireless terminals, which have low processing power relative to fixed workstations. In addition, it takes over the overhead of the eXtensible Markup Language (XML) well-formedness and validation processes. The WML compiler consists of lexical analyzer and parser modules. The grammar for the WML parser module is an LALR(1) context-free grammar designed on the basis of the XML 1.0 and WML 1.2 DTD (Document Type Definition), with consideration of the Wireless Application Protocol Binary XML grammar. The grammar description is converted into a C program that parses that grammar by using a parser generator. Even if the tags in WML are extended or the WML DTD is upgraded, this approach has the advantage of flexibility, because the program is generated by modifying just the changed parts. We have verified the functionality of the WML compiler by using a WML decompiler in the public domain and by using the Nokia WAP Toolkit as a WAP client. To measure the compressibility gain of the WML compiler, we have tested a large number of textual WML decks and obtained a maximum of 85%. As the effect of compression is reduced when the portion of general textual strings increases relative to that of the tags and attributes in a WML deck, an extended encoding method might be needed for specific applications, such as compiling WML decks into which Hyper Text Markup Language documents are translated dynamically.

Application and Evaluation of Object-Oriented Educational Programming Language 'Dolittle' for Computer Science Education in Secondary Education (중등 컴퓨터과학교육을 위한 객체지향형 EPL '두리틀'의 적용 및 평가)

  • Kwon, Dae-Yong;Gil, Hye-Min;Yeum, Yong-Cheul;Yoo, Seoung-Wook;Kanemune, Susumu;Kuno, Yasushi;Lee, Won-Gyu
    • The Journal of Korean Association of Computer Education / v.7 no.6 / pp.1-12 / 2004
  • Current computer education makes it difficult to teach the basic concepts and principles of computer science, because the 7th curriculum of computer education is focused on the application of software. According to the ACM K-12 report on the computer science education model, current computer education is heading in the wrong direction, and we should put a high priority on teaching the fundamentals through programming languages in order to achieve computer education oriented toward computer science. This paper introduces a new object-oriented educational programming language, "Dolittle". The design principles of Dolittle are simple Korean syntax, incremental programming, text-based programming, aliasing of functions, and object-oriented programming. Having applied it in middle school classes, we confirm that Dolittle is easy to learn, generates and sustains high interest throughout a course, and is of great practical use in classes for programming novices.
