http://dx.doi.org/10.9709/JKSS.2010.19.1.103

High Speed Korean Dependency Analysis Using Cascaded Chunking  

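Oh, Jin-Young (Department of Computer Engineering, Changwon National University)
Cha, Jeong-Won (Department of Computer Engineering, Changwon National University)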
Abstract
Syntactic analysis is an important step in natural language processing. However, existing Korean syntactic analyzers are difficult to use in practice because of their low performance and lack of robustness. We propose a new robust, high-speed, and high-performance Korean syntactic analyzer based on conditional random fields (CRFs). We treat parsing as a labeling problem and apply cascaded chunking to Korean: at each step, CRFs assign syntactic information to each Eojeol, using part-of-speech tags and Eojeol syntactic tags as features. Experimental results with 10-fold cross-validation show significant improvements in robustness, speed, and performance on long Korean sentences.
Keywords
Korean dependency analysis; Cascaded chunking; CRFs (Conditional Random Fields)
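The abstract describes parsing as repeated labeling: at each step every remaining Eojeol receives a tag, and the sentence shrinks until the dependency structure is complete. The following is a minimal sketch of that general cascaded-chunking loop, not the authors' implementation. It assumes a head-final structure in which an Eojeol can only depend on a later one. The CRF is replaced by a hypothetical rule-based `toy_labeler`, and the tag names (`N`, `ADN`, `V`), the `D`/`O` label set, and the fallback step are illustrative assumptions.

```python
from typing import Callable, List, Sequence

# A labeler sees the tags of the segments that are still unattached and returns
# one label per segment: "D" (depends on its right neighbour) or "O" (undecided).
# In the paper this role is played by CRFs; here it is any callable (assumption).
Labeler = Callable[[Sequence[str]], List[str]]


def cascaded_chunking(tags: Sequence[str], labeler: Labeler) -> List[int]:
    """Return heads[i], the index of the head of segment i (-1 for the root)."""
    heads = [-1] * len(tags)
    alive = list(range(len(tags)))  # indices of segments not yet attached
    while len(alive) > 1:
        labels = list(labeler([tags[i] for i in alive]))
        if len(labels) != len(alive):
            raise ValueError("labeler must return one label per segment")
        labels[-1] = "O"  # the last segment is the root and has no right neighbour
        removed = set()
        for pos in range(len(alive) - 1):
            left_is_d = pos > 0 and labels[pos - 1] == "D"
            # Attach and remove a D segment only if its left neighbour is not D,
            # so a segment that is still needed as a head stays for the next pass.
            if labels[pos] == "D" and not left_is_d:
                heads[alive[pos]] = alive[pos + 1]
                removed.add(alive[pos])
        if not removed:
            # Fallback (assumption, for termination): attach the segment just
            # before the last one to the last one.
            heads[alive[-2]] = alive[-1]
            removed.add(alive[-2])
        alive = [i for i in alive if i not in removed]
    return heads


def toy_labeler(tags: Sequence[str]) -> List[str]:
    """Hypothetical stand-in for a trained CRF, using two hand-written rules."""
    labels = []
    for pos, tag in enumerate(tags):
        nxt = tags[pos + 1] if pos + 1 < len(tags) else None
        depends = nxt == "V" or (tag == "ADN" and nxt == "N")
        labels.append("D" if depends else "O")
    return labels


# Four segments: noun, adnominal modifier, noun, predicate (hypothetical tags).
heads = cascaded_chunking(["N", "ADN", "N", "V"], toy_labeler)
print(heads)  # [3, 2, 3, -1]
```

In this toy run the adnominal segment attaches to the following noun in the first pass, the noun attaches to the predicate in the second, and the first noun attaches to the predicate in the third. Each pass is a plain sequence-labeling call, which is what makes the approach fast.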