• Title/Summary/Keyword: Korean morphological analyzer


Probabilistic Segmentation and Tagging of Unknown Words (확률 기반 미등록 단어 분리 및 태깅)

  • Kim, Bogyum;Lee, Jae Sung
    • Journal of KIISE / v.43 no.4 / pp.430-436 / 2016
  • Processing unknown words such as proper nouns and newly coined words is important for a morphological analyzer that must handle documents from various domains. This study proposes a segmentation and tagging method for unknown Korean words within a three-step probabilistic morphological analysis. To guess unknown words, the method uses the rich suffixes that attach to open-class words such as general nouns and proper nouns. We propose a method that learns suffix patterns from a morpheme-tagged corpus and calculates their probabilities for segmenting and tagging unknown open-class words in the probabilistic morphological analysis model. Experimental results showed that unknown-word processing improved greatly on documents containing many unregistered words.
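
The suffix-based guessing described in this abstract can be illustrated with a minimal sketch. It is not the authors' three-step model: it only estimates P(tag | suffix) by relative frequency from a tiny, hypothetical tagged list and ranks candidate splits of an unknown word. The romanized words, the `TAGGED` data, and the function names are all assumptions made for illustration; `NNG` and `NNP` stand for general noun and proper noun.

```python
from collections import Counter

# Hypothetical morpheme-tagged examples: (stem, stem_tag, suffix).
# Romanized toy data, not taken from the paper's corpus.
TAGGED = [
    ("hakgyo", "NNG", "eseo"),
    ("jip", "NNG", "eseo"),
    ("seoul", "NNP", "eseo"),
    ("seoul", "NNP", "eun"),
    ("chaek", "NNG", "eun"),
]

def learn_suffix_model(tagged):
    """Estimate P(stem_tag | suffix) by relative frequency."""
    pair_counts = Counter((suffix, tag) for _, tag, suffix in tagged)
    suffix_counts = Counter(suffix for _, _, suffix in tagged)
    return {(s, t): c / suffix_counts[s] for (s, t), c in pair_counts.items()}

def guess(word, model):
    """Return (stem, tag, suffix, prob) candidates for an unknown word, best first."""
    candidates = []
    for (suffix, tag), prob in model.items():
        # Only consider splits that leave a non-empty stem.
        if word.endswith(suffix) and len(word) > len(suffix):
            candidates.append((word[: -len(suffix)], tag, suffix, prob))
    return sorted(candidates, key=lambda c: c[3], reverse=True)

model = learn_suffix_model(TAGGED)
print(guess("busaneseo", model))
```

With this toy data, the unseen word `busaneseo` is split into the stem `busan` plus the known suffix `eseo`, and the general-noun reading ranks above the proper-noun reading simply because it is more frequent in the sample.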

Crawlers and Morphological Analyzers Utilize to Identify Personal Information Leaks on the Web System (크롤러와 형태소 분석기를 활용한 웹상 개인정보 유출 판별 시스템)

  • Lee, Hyeongseon;Park, Jaehee;Na, Cheolhun;Jung, Hoekyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.559-560 / 2017
  • As personal information leakage has become a growing problem, data collection and web document classification have been studied. Existing systems judge only whether personal information is present, and because they do not classify documents published under the same name or by the same user, unnecessary data is not filtered out. In this paper, we propose a system that uses a crawler and a morphological analyzer to identify the types of data or homonyms and so solve this problem. The user collects personal information on the web through the crawler. The collected data is classified by the morphological analyzer, after which leaked data can be confirmed. If the system is reused, more accurate results can be obtained. Users are expected to be provided with customized data.


Measurement Technique of Particle Sizing in Spray Flow (분무 유동의 입경 계측 기법에 관한 연구)

  • Yang, Chang-Jo;Kim, Jeong-Hwan;Oh, Jong-Hwan;Kim, Mann-Eung;Lee, Young-Ho
    • Proceedings of the Korean Society of Marine Engineers Conference / 2005.06a / pp.534-539 / 2005
  • A particle image analyzer for measuring droplet size has been developed. An image processing technique was used together with a relaxation method. A morphological method based on partial curvature information from pre-processed images was adopted to recognize and separate overlapping particles. The measurement results show that the present method may be reliable for analyzing the size and distribution of droplets produced by water mist spray flow.


Implementing an Inflection Analyzer Program for English Verbs in a Word-and-Paradigm Morphology. (낱말.패러다임 형태이론에 입각한 영어동사 굴절 해석 프로그램의 구현)

  • No, Yong-Kyoon
    • Language and Information / v.2 no.2 / pp.121-154 / 1998
  • The morphological analyzer is expected to tell attested word forms from imaginable yet unattested ones. An account of the inflectional morphology of English verbs is given in the framework of Word-and-Paradigm morphology, developed mainly by Matthews (1972, 1974, 1991) and further by Aronoff (1994) and Zwicky (1985, 1988), which is free of overrecognition. Thirteen inflectional classes are identified according to the patterns each of them exhibits in filling the slots in the paradigm. Peculiarity in orthography is also considered in assigning each verb lexeme to a class. Modules of a C program which gives associated morphosyntactic properties to all and only attested verb forms are written so that details of this framework can be evaluated explicitly. This program is shown to be superior to existing programs in economy and in the generality it achieves.


The Agglutination of the Korean Language and the Implementation of Korean Morphological Analyzer (국어의 교착성과 형태소 분석기의 구현)

  • Lee, Min-Haeng;Kim, Seong-Moo
    • Annual Conference on Human and Language Technology / 1992.10a / pp.105-117 / 1992
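  • In an agglutinating language, diverse syntactic information is carried by independent morphemes. In Korean, morphological analysis must precede syntactic analysis precisely because of the language's agglutinative nature. The first half of this paper uses Head-driven Phrase Structure Grammar (HPSG) to analyze coordinate constructions, which clearly exhibit the agglutinative character of Korean. The second half introduces a Korean morphological analyzer and a syntactic parser (PARSER) implemented in PROLOG.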


A Study on the Development of a Practical Morphological Analysis System Based on Word Analysis (어절 분석 기반 형태소 분석 시스템 개발에 관한 연구)

  • 조현양;최성필;최재황
    • Journal of the Korean Society for Information Management / v.18 no.2 / pp.105-124 / 2001
  • The purpose of this study is to develop a Korean word analysis system, based on various methods of word analysis, that can improve the performance of an information retrieval system (IRS). We focused on maximizing the speed of Korean word analysis, modularizing each functional subsystem, and analyzing Korean morphemes precisely. The system developed in this study implements an optimal algorithm to increase the speed of word analysis and to verify the speed and performance of each subsystem. In addition, numeral analysis based on numeral pattern recognition was implemented to reduce system load by avoiding recursive analysis of compound nouns.


The Word Structure of the North Korean Morphological Analyzer (북한 문화어 형태소 분석기(NKMA)의 어절 구조)

  • Choi, Woon-Ho;Chung, Hoi-Sun
    • Annual Conference on Human and Language Technology / 1998.10c / pp.49-55 / 1998
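  • Since the division of Korea, North Korea has pursued a language policy different from that of the South, and as a result the language policies of the two Koreas now differ considerably. This paper presents the word (eojeol) structure used to build a morphological analysis system for Munhwaeo, the North Korean standard language (NKMA). Because the eojeol structure used for morpheme segmentation and analysis of Munhwaeo largely coincides with the prosodic unit known as maltomak, we expect that it could also be applied to segmentation methods for spoken-language recognition.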


Residual Stress Behavior of High Temperature Polyimide Thin Films depending on the Structural Isomers of Diamine (Diamine의 구조적 이성질체에 따른 내열성 폴리이미드 박막의 잔류응력거동)

  • 임창호;정현수;한학수
    • Journal of the Microelectronics and Packaging Society / v.6 no.2 / pp.23-30 / 1999
  • The relationships between the morphological structures and residual stress behaviors of polyimide thin films depending on isomeric diamines were investigated. For this study, poly(phenylene biphenyltetracarboximide) (BPDA-PDA) and poly(oxydiphenylene biphenyltetracarboximide) (BPDA-ODA) films were prepared from their isomeric diamines: 1,3-phenylene diamine (1,3-PDA), 1,4-phenylene diamine (1,4-PDA), 3,4'-oxydiphenylene diamine (3,4'-ODA), and 4,4'-oxydiphenylene diamine (4,4'-ODA), respectively. For those films, residual stresses were detected in situ during thermal imidization of the isomeric polyimides as a function of processing temperature over the range of 25 to $400^{\circ}C$ using a Thin Film Stress Analyzer (TFSA). In comparison, the residual stress of BPDA-1,4-PDA, which has better in-plane orientation and chain order, was the lowest at 7 MPa, whereas those of BPDA-1,3-PDA, BPDA-3,4'-ODA, and BPDA-4,4'-ODA were in the range of 40-50 MPa. In conclusion, the effect of morphological nature (chain rigidity, chain order, orientation) and of chain mobility related to the glass transition behavior on the residual stress of isomeric polyimide thin films was analyzed.


Predictive Morphological Analysis of Korean with Dynamic Programming (동적 프로그래밍기법에 근거한 예측중심의 한국어 형태소 분석)

  • 김덕봉;최기선
    • Korean Journal of Cognitive Science / v.4 no.2 / pp.145-180 / 1994
  • In this paper, we present an efficient morphological analysis model for Korean that produces, from an input word, all the feasible sequences of morphemes in the word. The model is deterministic in applying spelling rules and performs few redundant computations when processing complex and ambiguous words. This results from three new techniques: first, a new method for interpreting spelling rules; second, predictive rule application, which restricts processing to the spelling rules suitable for the input word; third, the use of dynamic programming, which lets the analyzer avoid recomputing already analyzed substrings when the input word is morphologically ambiguous. Our model was tested on 413,975 words randomly selected from a corpus of Korean elementary school textbooks. Experimental results show that the model provides fast and reliable processing.
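
The dynamic programming idea mentioned in this abstract, reusing the analysis of a substring instead of recomputing it, can be sketched as follows. This is a simplified illustration, not the authors' model: it omits spelling rules and predictive rule application entirely, and the toy `LEXICON` and the function name `all_segmentations` are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy lexicon of "morphemes"; not from the paper.
LEXICON = {"a", "ab", "b", "bc", "c"}

def all_segmentations(word, lexicon):
    """Return every way to split `word` into lexicon entries."""

    @lru_cache(maxsize=None)
    def analyze(start):
        # The remainder word[start:] is analyzed once; later requests
        # for the same position reuse the cached result.
        if start == len(word):
            return ((),)
        results = []
        for end in range(start + 1, len(word) + 1):
            piece = word[start:end]
            if piece in lexicon:
                for rest in analyze(end):
                    results.append((piece,) + rest)
        return tuple(results)

    return [list(seg) for seg in analyze(0)]

print(all_segmentations("abc", LEXICON))
# → [['a', 'b', 'c'], ['a', 'bc'], ['ab', 'c']]
```

In this sketch the remainder `c` is reached both after `a` + `b` and after `ab`, but it is analyzed only once; that reuse is what keeps ambiguous inputs from causing repeated work.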

Chemical Constitution, Morphological Characteristics, and Biological Properties of ProRoot Mineral Trioxide Aggregate and Ortho Mineral Trioxide Aggregate

  • Kum, Kee Yeon;Yoo, Yeon Jee;Chang, Seok Woo
    • Journal of Korean Dental Science / v.6 no.2 / pp.41-49 / 2013
  • Purpose: This study sought to compare the elemental constitution, morphological characteristics, particle size distribution, biocompatibility, and mineralization potential of Ortho MTA (OMTA) and ProRoot MTA (PMTA). Materials and Methods: OMTA and PMTA were compared using energy-dispersive spectrometry, particle size analysis, and scanning electron microscopy. The biocompatibility and mineralization-related gene expression (osteonectin and osteopontin) of both MTAs were also compared using methylthiazol tetrazolium assay and reverse transcription-polymerase chain reaction analysis, respectively. The results were analyzed by the Kruskal-Wallis test with Bonferroni correction. A P-value of <0.05 was considered significant. Results: The morphology of OMTA powders was similar to that of PMTA. The constituent elements of both MTAs were calcium, silicon, and aluminum. The mean particle sizes of OMTA and PMTA were 4.60 and 3.34 mm, respectively. Both MTAs had equally favorable in vitro biocompatibility and affected the messenger RNA expression of osteonectin and osteopontin. Conclusion: Within the limitations of this study, OMTA could be a promising biomaterial in clinical endodontics.