A Machine Learning Approach to Korean Language Stemming

Cho, Se-hyeong

Journal of the Korean Institute of Intelligent Systems (한국지능시스템학회논문지)

Volume 11 Issue 6 / Pages 549-557 / 2001 / 1976-9172 (pISSN) / 2288-2324 (eISSN)

Korean Institute of Intelligent Systems (한국지능시스템학회)

Cho, Se-hyeong (Myongji University)

Published : 2001.12.01

Abstract

Conventional morphological analysis and POS tagging require a dictionary for the language at hand. With this approach, however, it is impossible to analyze a language without a dictionary. Difficulties also arise when a significant portion of the vocabulary is new or unknown. This paper explores the possibility of learning the morphology of an agglutinative language, in particular Korean, without any prior lexical knowledge of the language. The learning is unsupervised in that there is no instructor to guide the outcome of the learner, nor any tagged corpus. The main characteristics of the approach are as follows. First, we use only a raw corpus, without any tags attached and without any dictionary. Second, unlike many heuristics that are theoretically ungrounded, this method is based on widely accepted statistical methods. The method is currently applied only to Korean, but since it is essentially language-neutral, it can easily be adapted to other agglutinative languages.
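
The abstract does not spell out the statistical procedure, so the following is only an illustrative sketch of what dictionary-free stemming from a raw corpus can look like, not the paper's method. It assumes a simple frequency heuristic: a word is split where both the candidate stem and the candidate ending recur across several distinct word types. The function names `count_splits` and `segment`, the `min_types` threshold, and the toy corpus are all hypothetical.

```python
from collections import Counter


def count_splits(words):
    """Count, over distinct word types, how often each proper prefix
    (candidate stem) and proper suffix (candidate ending) occurs."""
    stem_counts, suffix_counts = Counter(), Counter()
    for word in set(words):
        for i in range(1, len(word)):
            stem_counts[word[:i]] += 1
            suffix_counts[word[i:]] += 1
    return stem_counts, suffix_counts


def segment(word, stem_counts, suffix_counts, min_types=2):
    """Return (stem, ending) for the best-scoring split, or (word, "")
    when no split has a stem and an ending seen in >= min_types words."""
    best, best_score = (word, ""), 0
    for i in range(1, len(word)):
        stem, ending = word[:i], word[i:]
        if stem_counts[stem] < min_types or suffix_counts[ending] < min_types:
            continue  # one side is too rare to count as evidence
        # Assumed heuristic: product of the two type frequencies.
        score = stem_counts[stem] * suffix_counts[ending]
        if score > best_score:
            best, best_score = (stem, ending), score
    return best


# Hypothetical raw corpus: untagged word forms, no dictionary.
corpus = ["학교가", "학교는", "학교를", "친구가", "친구는", "친구를", "나무가", "나무를"]
stem_counts, suffix_counts = count_splits(corpus)
print(segment("학교가", stem_counts, suffix_counts))  # ('학교', '가')
```

The sketch splits only at syllable boundaries, because Python treats each precomposed Hangul syllable as one character. Korean morpheme boundaries can also fall inside a syllable, so a realistic system would have to work at a finer granularity and with a better-founded statistic than this toy score.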
