http://dx.doi.org/10.5573/ieie.2014.51.8.066

Analysis of Korean Language Parsing System and Speed Improvement of Machine Learning using Feature Module  

Kim, Seong-Jin (Dept. of electrical engineering, Ulsan Univ.)
Ock, Cheol-Young (Dept. of electrical engineering, Ulsan Univ.)
Publication Information
Journal of the Institute of Electronics and Information Engineers / v.51, no.8, 2014, pp. 66-74
Abstract
Korean parsing systems have recently been studied by many software engineers and linguists. These systems mainly follow either a machine learning or a symbol processing paradigm. Parsing systems based on machine learning, however, take a long time to train because the Korean sentence data is very large, and their recognition rate is limited because the data itself contains errors. In this paper we design a system using a feature module that reduces training time, and we analyze the recognition rate as a function of the number of training sentences and the number of training repetitions. The designed system uses separated modules and a sorted table for binary search. We use 36,090 refined sentences extracted from the Sejong Corpus. Training time is reduced by about three hours, and the recognition rate is highest, at 84.54%, when 10,000 sentences are trained 50 times. When all 32,481 training sentences are trained 10 times, the recognition rate is 82.99%. These results indicate that it is more efficient to use refined data and to repeat training until the system reaches a steady state.
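The abstract mentions a sorted table used for binary search but does not describe its layout. The following is only a minimal sketch of that general idea, assuming string-valued features mapped to numeric weights; the class name `SortedFeatureTable`, the feature strings, and the weights are hypothetical illustrations and are not taken from the paper.

```python
from bisect import bisect_left


class SortedFeatureTable:
    """Hypothetical feature-weight table searched by binary search."""

    def __init__(self, weights):
        # Sort the (feature, weight) pairs once so later lookups can bisect.
        items = sorted(weights.items())
        self.keys = [k for k, _ in items]
        self.values = [v for _, v in items]

    def lookup(self, feature, default=0.0):
        # Binary search: O(log n) comparisons per feature.
        i = bisect_left(self.keys, feature)
        if i < len(self.keys) and self.keys[i] == feature:
            return self.values[i]
        return default  # unseen features contribute nothing

    def score(self, features):
        # Sum the weights of all features present in one candidate.
        return sum(self.lookup(f) for f in features)


# Illustrative feature names and weights only (assumed, not from the paper).
table = SortedFeatureTable({
    "head_pos=VV": 1.5,
    "dep_pos=NNG": 0.5,
    "dist=1": 0.25,
})
print(table.score(["dep_pos=NNG", "dist=1", "unseen"]))  # 0.75
```

Sorting once and bisecting afterwards keeps each lookup logarithmic in the table size, which is one plausible way a feature table could help with training time; whether the authors' system works this way cannot be confirmed from the abstract alone.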
Keywords
Korean parsing; Korean dependency relations; machine learning; feature set
Citations & Related Records
Times Cited By KSCI: 5