
Generation of Natural Referring Expressions by Syntactic Information and Cost-based Centering Model  

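Roh Ji-Eun (Department of Computer Science and Engineering, Pohang University of Science and Technology)
Lee Jong-Hyeok (Department of Computer Science and Engineering, Pohang University of Science and Technology)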
Abstract
Text generation is the process of producing comprehensible text in a human language from an underlying non-linguistic representation of information. Among the sub-processes needed to generate coherent text, this paper addresses referring expression generation, which selects among different types of expressions to refer to entities previously mentioned in a discourse. Specifically, we focus on pronominalization by zero pronouns, which occur frequently in Korean. To build a generation model of referring expressions for Korean, we identify several features based on grammatical information and a cost-based centering model and apply them with various machine learning techniques. Using 95 texts from three genres (descriptive texts, news, and short Aesop's fables), we demonstrate that the proposed features account well for pronominalization, especially pronominalization by zero pronouns in Korean. We also show that our model significantly outperforms previous ones at a 99.9% confidence level according to a t-test.
Keywords
text generation; referring expression generation; pronominalization; zero pronoun; feature; cost-based centering model; machine learning