
Range Detection of Wa/Kwa Parallel Noun Phrase using a Probabilistic Model and Modification Information  

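Choi, Yong-Seok (Department of Computer Science, Korea Advanced Institute of Science and Technology)
Shin, Ji-Ae (School of Engineering, Information and Communications University)
Choi, Key-Sun (Department of Computer Science, Korea Advanced Institute of Science and Technology)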
Abstract
Recognizing parallel structures at an early stage of sentence parsing can reduce the complexity of parsing. In this paper, we propose an unsupervised, language-independent probabilistic model for the recognition of parallel noun structures. The proposed model is based on the idea of swapping constituents, which relies on two properties of parallel structures: symmetry (two or more identical constituents are repeated) and reversibility (the order of the constituents is interchangeable). Non-symmetric patterns that cannot be captured by the general symmetry rule are additionally resolved using modifier information. In particular, this paper shows how the proposed model is applied to recognize Korean parallel noun phrases connected by the particle "wa/kwa". Our model is compared with other models, including supervised models, and performs better on the recognition of parallel noun phrases.
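The swapping idea can be illustrated with a small sketch. The code below is not the model from the paper. It is a toy under stated assumptions: an add-one smoothed bigram model stands in for the probabilistic model, English tokens with "and" stand in for Korean noun phrases joined by "wa/kwa", and the names `train_bigram`, `log_prob`, `swap`, and `best_range` are hypothetical. For each candidate range, the two conjuncts are exchanged and the resulting sentence is scored; by reversibility, the correct range should yield the most plausible swapped sentence.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count context unigrams and bigrams over tokenized sentences (toy model)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent + ["</s>"]
        unigrams.update(toks[:-1])           # every token that serves as a context
        bigrams.update(zip(toks, toks[1:]))
    vocab = {t for s in corpus for t in s} | {"<s>", "</s>"}
    return unigrams, bigrams, len(vocab)

def log_prob(sent, model):
    """Add-one smoothed bigram log-probability of a sentence."""
    unigrams, bigrams, v = model
    toks = ["<s>"] + sent + ["</s>"]
    return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + v))
               for a, b in zip(toks, toks[1:]))

def swap(tokens, i, c, j):
    """Exchange the left conjunct tokens[i:c] and the right conjunct
    tokens[c+1:j+1] around the conjunction at index c."""
    return (tokens[:i] + tokens[c + 1:j + 1] + [tokens[c]]
            + tokens[i:c] + tokens[j + 1:])

def best_range(tokens, c, model):
    """Return (i, j): the candidate range whose swapped sentence scores highest."""
    candidates = [(i, j) for i in range(c) for j in range(c + 1, len(tokens))]
    return max(candidates,
               key=lambda ij: log_prob(swap(tokens, ij[0], c, ij[1]), model))

# Hypothetical toy data, for illustration only.
corpus = [
    ["red", "apples", "and", "pears", "fell"],
    ["pears", "and", "red", "apples", "fell"],
    ["i", "ate", "pears"],
    ["i", "ate", "red", "apples"],
]
model = train_bigram(corpus)
tokens = ["i", "ate", "red", "apples", "and", "pears"]
i, j = best_range(tokens, tokens.index("and"), model)
print(tokens[i:j + 1])
```

Because every swapped candidate is a permutation of the same tokens, the scores are directly comparable without length normalization. The modifier information that the abstract mentions for non-symmetric patterns is not modeled in this sketch.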
Keywords
Korean parsing; Natural Language Processing; Unsupervised Probabilistic Model; Language-Independent Model; Parallel Structure;