• Title/Summary/Keyword: Morpheme Recovery

Search Result 6, Processing Time 0.02 seconds

Error Correction in Korean Morpheme Recovery using Deep Learning (딥 러닝을 이용한 한국어 형태소의 원형 복원 오류 수정)

  • Hwang, Hyunsun;Lee, Changki
    • Journal of KIISE
    • /
    • v.42 no.11
    • /
    • pp.1452-1458
    • /
    • 2015
  • Korean Morphological Analysis is a difficult process. Because Korean is an agglutinative language, one of the most important processes in Morphological Analysis is Morpheme Recovery. There are some methods using Heuristic rules and Pre-Analyzed Partial Words that were examined for this process. These methods have performance limits as a result of not using contextual information. In this study, we built a Korean morpheme recovery system using deep learning, and this system used word embedding for the utilization of contextual information. In '들/VV' and '듣/VV' morpheme recovery, the system showed 97.97% accuracy, a better performance than with SVM(Support Vector Machine) which showed 96.22% accuracy.

Morpheme Recovery Based on Naïve Bayes Model (NB 모델을 이용한 형태소 복원)

  • Kim, Jae-Hoon;Jeon, Kil-Ho
    • The KIPS Transactions:PartB
    • /
    • v.19B no.3
    • /
    • pp.195-200
    • /
    • 2012
  • In Korean, spelling change in various forms must be recovered into base forms in morphological analysis as well as part-of-speech (POS) tagging is difficult without morphological analysis because Korean is agglutinative. This is one of notorious problems in Korean morphological analysis and has been solved by morpheme recovery rules, which generate morphological ambiguity resolved by POS tagging. In this paper, we propose a morpheme recovery scheme based on machine learning methods like Na$\ddot{i}$ve Bayes models. Input features of the models are the surrounding context of the syllable which the spelling change is occurred and categories of the models are the recovered syllables. The POS tagging system with the proposed model has demonstrated the $F_1$-score of 97.5% for the ETRI tree-tagged corpus. Thus it can be decided that the proposed model is very useful to handle morpheme recovery in Korean.

Language Model Smoothing for Korean Morpheme Recovery (한국어 형태소 복원을 위한 언어모델의 평탄화(smoothing))

  • Lee, Daniel;Kim, Bo-Gyum;Lee, Jae-Sung
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2012.06b
    • /
    • pp.309-311
    • /
    • 2012
  • 형태소 복원은 형태소 분석의 한 단계로 문장에 나타난 형태소의 변형 현상을 분석하여 규칙화하고 이를 이용하여 형태소 원형을 복원하는 것이다. 본 논문에서는 형태소 품사 부착 말뭉치로부터 다양한 형태소 변화 규칙을 학습하여 효과적으로 형태소 원형을 복원하기 위한 계산 방법을 비교한다. 이를 위해 계산 모델, 한글 코드, 학습 자료를 다르게 하여 학습하고 그에 따른 성능을 비교 분석한다.

Syllable-based Probabilistic Models for Korean Morphological Analysis (한국어 형태소 분석을 위한 음절 단위 확률 모델)

  • Shim, Kwangseob
    • Journal of KIISE
    • /
    • v.41 no.9
    • /
    • pp.642-651
    • /
    • 2014
  • This paper proposes three probabilistic models for syllable-based Korean morphological analysis, and presents the performance of proposed probabilistic models. Probabilities for the models are acquired from POS-tagged corpus. The result of 10-fold cross-validation experiments shows that 98.3% answer inclusion rate is achieved when trained with Sejong POS-tagged corpus of 10 million eojeols. In our models, POS tags are assigned to each syllable before spelling recovery and morpheme generation, which enables more efficient morphological analysis than the previous probabilistic models where spelling recovery is performed at the first stage. This efficiency gains the speed-up of morphological analysis. Experiments show that morphological analysis is performed at the rate of 147K eojeols per second, which is almost 174 times faster than the previous probabilistic models for Korean morphology.

Comparison of Calculation Methods for Probabilistic Korean Morpheme Recovery Model (한국어 형태소 복원 확률 모델의 계산 방법 비교)

  • Lee, Daniel;Kim, Bogyum;Lee, Jae Sung
    • Annual Conference on Human and Language Technology
    • /
    • 2011.10a
    • /
    • pp.130-132
    • /
    • 2011
  • 형태소 복원은 형태소 분석의 한 단계로 문장에 나타난 형태소의 변형 현상을 분석하여 규칙화하고 이를 이용하여 형태소 원형을 복원하는 것이다. 본 논문에서는 형태소 품사 부착 말뭉치로부터 다양한 형태소 변화 규칙을 학습하여 효과적으로 형태소 원형을 복원하기 위한 계산 방법을 비교한다. 이를 위해 계산 모델, 한글 코드, 학습 자료를 다르게 하여 학습하고 그에 따른 성능을 비교 분석한다.

  • PDF

Development of the video-based smart utterance deep analyser (SUDA) application (동영상 기반 자동 발화 심층 분석(SUDA) 어플리케이션 개발)

  • Lee, Soo-Bok;Kwak, Hyo-Jung;Yun, Jae-Min;Shin, Dong-Chun;Sim, Hyun-Sub
    • Phonetics and Speech Sciences
    • /
    • v.12 no.2
    • /
    • pp.63-72
    • /
    • 2020
  • This study aims to develop a video-based smart utterance deep analyser (SUDA) application that analyzes semiautomatically the utterances that child and mother produce during interactions over time. SUDA runs on the platform of Android, iPhones, and tablet PCs, and allows video recording and uploading to server. In this device, user modes are divided into three modes: expert mode, general mode and manager mode. In the expert mode which is useful for speech and language evaluation, the subject's utterances are analyzed semi-automatically by measuring speech and language factors such as disfluency, morpheme, syllable, word, articulation rate and response time, etc. In the general mode, the outcome of utterance analysis is provided in a graph form, and the manger mode is accessed only to the administrator controlling the entire system, such as utterance analysis and video deletion. SUDA helps to reduce clinicians' and researchers' work burden by saving time for utterance analysis. It also helps parents to receive detailed information about speech and language development of their child easily. Further, this device will contribute to building a big longitudinal data enough to explore predictors of stuttering recovery and persistence.