• Title/Summary/Keyword: Korean morphological analyzer


Implementation of a morphological analyzer and spelling corrector for character recognition post-processing (문자 인식 후처리를 위한 형태소 분석기와 문자 교정기의 구현)

  • 이영화;김규성;김영훈;이상조
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.5 / pp.82-92 / 1997
  • In this paper, we propose a post-processing method that corrects misrecognized characters generated by a character recognizer, using a morphological analyzer and a spelling corrector. The proposed post-processing consists of three phases. First, the input passes through a morphological analyzer that outputs only the information needed for spelling correction, does not analyze bundles of phrases, and detects the location of the misrecognized character. Second, candidate characters are generated and tagged using the information in a character substitution table and a grapheme substitution/separation table, and the analysis is retried after the misrecognized character has been substituted. Finally we select table, we investigate misrecognized characters in the corpus. Reliability analysis uses the frequencies of about 100,000 words randomly selected from the corpus. A Korean character recognizer achieves a 93% correct rate without post-processing. With post-processing, the overall recognition rate of our system exceeds 97%.


High Speed Korean Morphological Analysis based on Adjacency Condition Check (인접 조건 검사에 의한 초고속 한국어 형태소 분석)

  • 심광섭;양재형
    • Journal of KIISE:Software and Applications / v.31 no.1 / pp.89-99 / 2004
  • This paper proposes a method that performs morphological analysis by checking conditions between two adjacent morphemes. These conditions are supplied by a dictionary. The method eliminates the code conversion module and the application of transformational rules for candidate generation. The paper claims that very high-speed morphological analysis is attainable because the adjacency condition check reduces to simple bit operations. MACH, an implementation of the proposed method, is an ultra-high-speed Korean morphological analyzer that can analyze a 1 GB document in 5 minutes on a PC with a 1.13 GHz Pentium III CPU. The analysis accuracy of MACH is 99.2%.
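
The adjacency check described in this abstract can be illustrated with a minimal sketch. The category bits, the dictionary entries, and the function names below are hypothetical and chosen only for illustration; they are not MACH's actual data structures, which the abstract does not describe.

```python
# Hypothetical sketch of an adjacency condition check with bit operations.
# The category bits and dictionary entries are illustrative assumptions.
NOUN, VERB_STEM, PARTICLE, ENDING = 1, 2, 4, 8  # one bit per category

# surface form -> (own category bit, bitmask of categories allowed to its right)
DICTIONARY = {
    "학교": (NOUN, PARTICLE),
    "에": (PARTICLE, 0),
    "먹": (VERB_STEM, ENDING),
    "다": (ENDING, 0),
}


def adjacent_ok(left, right):
    """Two morphemes may be adjacent if the right one's category bit
    is set in the left one's right-context mask (a single AND)."""
    _, right_mask = DICTIONARY[left]
    right_category, _ = DICTIONARY[right]
    return (right_mask & right_category) != 0


def analyze(eojeol):
    """Return every split of eojeol into dictionary morphemes in which
    each adjacent pair passes the adjacency check."""
    results = []

    def extend(start, path):
        if start == len(eojeol):
            results.append(path)
            return
        for end in range(start + 1, len(eojeol) + 1):
            piece = eojeol[start:end]
            if piece in DICTIONARY and (not path or adjacent_ok(path[-1], piece)):
                extend(end, path + [piece])

    extend(0, [])
    return results
```

In this sketch, `analyze("학교에")` accepts the split into a noun followed by a particle, while `analyze("먹에")` yields no analysis because the verb stem's mask does not admit a particle.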

An Efficient Method for Korean Noun Extraction Using Noun Patterns (명사 출현 특성을 이용한 효율적인 한국어 명사 추출 방법)

  • 이도길;이상주;임해창
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.173-183 / 2003
  • Morphological analysis is the most widely used method for extracting nouns from Korean texts. To extract nouns from each Eojeol, a morphological analyzer performs frequent dictionary lookups and applies many morphonological rules, so it requires many operations. Moreover, a morphological analyzer generates all possible morphological interpretations (sequences of morphemes) of a given Eojeol, which may be unnecessary from the point of view of noun extraction. To reduce this unnecessary computation, this paper proposes a method for Korean noun extraction that considers noun occurrence characteristics. Noun patterns denote conditions under which an Eojeol does or does not include nouns; these are positive cues and negative cues, respectively. Using exclusive information as negative cues makes it possible to reduce the search space of morphological analysis by ignoring Eojeols that contain no nouns. Post-noun syllable sequences (PNSS), used as positive cues, allow nouns to be extracted simply by checking the part of the Eojeol preceding the PNSS, and can also be used to guess unknown nouns. In addition, morphonological information is used instead of many morphonological rules to recover the lexical form from its altered surface form. Experimental results show that the proposed method is faster than other systems based on morphological analysis without losing accuracy.
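
As a rough illustration of the positive and negative cues described above, the sketch below strips a post-noun syllable sequence from the end of an Eojeol and skips Eojeols listed as noun-free. Both cue lists and the function name are hypothetical examples; the abstract does not give the paper's actual patterns, and the recovery of altered surface forms is not modeled here.

```python
# Hypothetical cue lists; the paper's actual patterns are not given in the abstract.
PNSS = {"에서", "은", "는", "을", "를", "이", "가"}  # positive cues (post-noun syllable sequences)
NEGATIVE_CUES = {"그리고", "매우"}  # Eojeols assumed to contain no noun


def extract_noun(eojeol):
    """Return the noun candidate preceding the longest matching PNSS,
    or None if the Eojeol is excluded or no PNSS matches."""
    if eojeol in NEGATIVE_CUES:
        return None  # negative cue: skip without any further analysis
    # Moving the cut point rightwards tries the longest suffix first.
    for cut in range(1, len(eojeol)):
        if eojeol[cut:] in PNSS:
            return eojeol[:cut]
    return None
```

Because the candidate is simply whatever precedes the PNSS, this kind of check can also propose nouns that are missing from a dictionary, which matches the abstract's remark about guessing unknown nouns.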

Error-driven Noun-Connection Rule Extraction for Morphological Analysis (오류에 기반한 복합명사 좌우접속규칙 사전 구축)

  • Lee, Kong Joo;Lee, Songwook
    • Journal of Advanced Marine Engineering and Technology / v.36 no.8 / pp.1123-1128 / 2012
  • The goal of this research is to develop error-driven noun-connection rules that are used for segmenting compound nouns in a Korean morphological analysis module. We collected compound nouns from Web sites and analyzed them with CnuMa. Whenever we found errors in the output of the analyzer, we wrote noun-connection rules to correct them. The noun-connection rules are devised by considering the left and right contexts within compound nouns. The error-driven noun-connection rules improve the precision and recall of the Korean morphological analyzer CnuMa by 2.8% and 10.8%, respectively.

Characteristic Change of Fiber Depending on the Refining Conditions of Reconstituted Tobacco Process (판상엽 고해조건에 따른 섬유특성 변화 평가)

  • Han young-Rim;Sung Yong-Joo;Kim Sam-Kon;Kim Kun-Soo;Han In-Ho
    • Journal of the Korean Society of Tobacco Science / v.27 no.2 / pp.195-200 / 2005
  • The goal of refining is to treat fibers so that they meet the requirements of the papermaking process. The refining process in papermaking has a great influence on the quality of the final product because it changes fiber properties such as fiber length, shape, and fines content. In this study, the effect of refining conditions on the morphological change of fibers was investigated using a fiber morphology analyzer. A fiber morphology analyzer is used to determine which pulps are suitable for producing particular products, and it is also widely used in paper mills to monitor paper quality. The morphological change of fibers under different refining conditions was evaluated by measuring fibers, shives, and fines. In terms of fiber morphology, the domestic reconstituted tobacco fiber has a greater average fiber length than the foreign reconstituted tobacco.

Design of an on-line morphological analyzer for a Japanese-to-Korean translation system (일한 기계번역을 위한 on-line 형태소 해석기 설계)

  • 강석훈;최병욱
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.5 / pp.127-137 / 1996
  • In this paper, an algorithm for on-line rightward Japanese parsing is proposed. Because Japanese sentences have no spaces between words, ambiguity in on-line parsing accumulates until the input is complete. A morphological analysis algorithm based on a modified chart is used to resolve it. The number of dictionary lookups required for morphological analysis is also a difficult problem. A Japanese sentence of N characters logically requires up to N(N+1)/2 lookups in ordinary on-line analysis, nearly twice as many as in normal off-line analysis. In this paper, the problem is solved by modifying the dictionary format. In experiments, we achieve an analysis rate nearly equal to that of off-line parsing. It also becomes clear that the longer a sentence is, the better the analysis efficiency.
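
The N(N+1)/2 bound quoted above is the number of contiguous substrings of an N-character sentence, each of which is a potential dictionary key when no word boundaries are known. The short sketch below only enumerates those substrings to make the count concrete; the function name and the sample sentence are illustrative assumptions, and the paper's chart algorithm and modified dictionary format are not reproduced.

```python
def candidate_substrings(sentence):
    """List every contiguous substring of an unsegmented sentence.
    Each one is a possible dictionary lookup key."""
    n = len(sentence)
    return [sentence[i:j] for i in range(n) for j in range(i + 1, n + 1)]


sample = "これは本です"  # 6 characters, chosen only for illustration
lookups = candidate_substrings(sample)
# The number of candidates equals N(N+1)/2 for N = len(sample).
expected = len(sample) * (len(sample) + 1) // 2
```

For the six-character sample this gives 21 candidate lookups, and the quadratic growth is why the abstract treats the number of dictionary searches as a problem in its own right.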


Morphological control and electrostatic deposition of silver nanoparticles produced by condensation-evaporation method (증발-응축법에 의해 발생된 은(silver) 나노입자의 구조제어 및 전기적 부착 특성 연구)

  • Kim, Whidong;Ahn, Ji Young;Kim, Soo Hyung
    • Particle and aerosol research / v.5 no.2 / pp.83-90 / 2009
  • This paper describes a condensation-evaporation method (CEM) for producing size-controlled spherical silver nanoparticles by perturbing coagulation and coalescence processes in the gas phase. Polydisperse silver nanoparticles generated by the CEM were first introduced into a differential mobility analyzer (DMA) to select a group of silver nanoparticles with the same electrical mobility, which also makes it possible to obtain a group of nanoparticles with elongated structures and the same projected area. The silver nanoparticles selected by the DMA were then sintered in situ at approximately 600 °C and were observed to turn into spherical nanoparticles through rapid coalescence. With a modified converging-type quartz reactor, we can also produce a number concentration of silver nanoparticles 10 times higher than with a general quartz reactor of uniform diameter. Finally, the 30 nm spherical silver nanoparticles were electrostatically deposited on the surface of a silicon substrate at a coverage rate of ~4%/hr. The preparation method for size-controlled monodisperse silver nanoparticles developed in this work can be applied to various studies characterizing the physical, chemical, optical, and biological properties of nanoparticles as a function of their size.


The syllable recovery rule-based system and the application of a morphological analysis method for the post-processing of continuous speech recognition (연속음성인식 후처리를 위한 음절 복원 rule-based 시스템과 형태소분석기법의 적용)

  • 박미성;김미진;김계성;최재혁;이상조
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.3 / pp.47-56 / 1999
  • Various phonological alterations occur in continuously spoken Korean. These alterations are one of the major reasons that Korean speech recognition is difficult. This paper presents a rule-based system that converts a character string produced by speech recognition into a text-based character string. The recovery results are morphologically analyzed, and only a correct text string is generated. Recovery is carried out according to four kinds of rules: a syllable-boundary final-consonant/initial-consonant recovery rule, a vowel-process recovery rule, a last-syllable final-consonant recovery rule, and a monosyllable process rule. We use x-clustering information for efficient recovery and postfix-syllable frequency information to restrict the recovery candidates passed to the morphological analyzer. Because the system is rule-based, it does not require a large pronunciation dictionary or a phoneme dictionary, and it has the advantage that an existing text-based morphological analyzer can be used.


Detecting Spelling Errors by Comparison of Words within a Document (문서내 단어간 비교를 통한 철자오류 검출)

  • Kim, Dong-Joo
    • Journal of the Korea Society of Computer and Information / v.16 no.12 / pp.83-92 / 2011
  • Unlike in conventional publications, typographical errors caused by the author's mistyping occur frequently in documents prepared with word processors. In such online documents, the most common orthographical errors are spelling errors that result from typing a neighboring key on the keyboard instead of the intended key. Typical spelling checkers detect and correct these errors by using a morphological analyzer. In other words, the morphological analysis module of a speller checks the well-formedness of input words, and all words rejected by the analyzer are regarded as misspelled. However, if the morphological analyzer accepts a mistyped word, it treats that word as correctly spelled. In this paper, I propose a simple method capable of detecting and correcting errors that previous methods cannot detect. The proposed method is based on the observation that typographical errors are generally not repeated and so tend to have very low frequency. If words generated by deletion, exchange, and transposition operations on each phoneme of a low-frequency word are in the list of high-frequency words, some of them are considered correctly spelled words. Some heuristic rules are also presented to reduce the number of candidates. The proposed method can detect not syntactic errors but some semantic errors, and it is useful for scoring candidates.
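
The frequency-based comparison in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the author's implementation: it operates on Latin letters rather than Korean phonemes, the thresholds `low` and `high` are hypothetical, and the paper's heuristic rules for pruning candidates are omitted.

```python
from collections import Counter
import string


def variants(word, alphabet=string.ascii_lowercase):
    """Words reachable from `word` by one deletion, exchange
    (substitution), or transposition of adjacent symbols."""
    deletions = {word[:i] + word[i + 1:] for i in range(len(word))}
    exchanges = {word[:i] + c + word[i + 1:]
                 for i in range(len(word)) for c in alphabet}
    transpositions = {word[:i] + word[i + 1] + word[i] + word[i + 2:]
                      for i in range(len(word) - 1)}
    return (deletions | exchanges | transpositions) - {word}


def suspect_typos(tokens, low=1, high=3):
    """Map each low-frequency word to the high-frequency words in the same
    document that are one edit away. The thresholds are hypothetical."""
    counts = Counter(tokens)
    frequent = {w for w, c in counts.items() if c >= high}
    suspects = {}
    for word, count in counts.items():
        if count <= low:
            matches = sorted(variants(word) & frequent)
            if matches:
                suspects[word] = matches
    return suspects
```

A word that occurs once and is one edit away from a word used repeatedly in the same document is flagged, even if a morphological analyzer would have accepted it as well-formed.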

Emotion Prediction of Paragraph using Big Data Analysis (빅데이터 분석을 이용한 문단 내의 감정 예측)

  • Kim, Jin-su
    • Journal of Digital Convergence / v.14 no.11 / pp.267-273 / 2016
  • The creation and sharing of information, including structured data as well as various unstructured data, is progressing actively with the spread of mobile devices. Recently, big data analysis has been used to extract semantic information from SNS, and data mining is one of the big data techniques. In particular, general emotion analysis, which expresses the collective intelligence of the masses, makes use of large and varied materials. In this paper, we propose an emotion prediction system architecture that extracts significant keywords from social network paragraphs using n-grams and a Korean morphological analyzer, and predicts emotion using an SVM and the extracted emotion features. The proposed system showed a recall rate improved by 82.25% on average compared with previous systems, and it will help extract semantic keywords using morphological analysis.