Text Structuring Using Centering Theory

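  • 노지은 (Department of Computer Engineering, Pohang University of Science and Technology)
  • 나승훈 (Department of Computer Engineering, Pohang University of Science and Technology)
  • 이종혁 (Department of Computer Engineering, Pohang University of Science and Technology)
  • Published: 2007.06.15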

Abstract

This paper investigates Centering-based metrics for evaluating the ordering of utterances in text structuring. We point out a problem with the MIN.NOCB metric, which has been regarded as the simplest and best measure of ordering coherence within the Centering framework, and propose a new Centering-based metric, MAX.CPS, as an alternative or supplement. We also introduce a framework that pre-estimates the effectiveness of a metric on a given input and selects an applicable metric according to that estimate. Using this framework, we propose a new policy that can generate better orderings within the Centering framework. Moreover, we evaluate several Cf-ranking methods in terms of Centering-based metrics and find that simply ranking entities by their linear order is generally the most suitable, owing to characteristics of Korean.

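This paper concerns text structuring, the step that determines sentence order among the processes involved in generating natural text, and discusses various evaluation metrics, based on centering theory, for judging how natural a sentence ordering is. First, we point out a problem that can arise when MIN.NOCB, known from previous studies as the most effective centering-based metric for evaluating sentence order, is applied to text structuring, and we propose MAX.CPS, a new metric that can serve as an alternative. We also propose a framework that first predicts the expected value a given metric can attain on the input sentences and then applies a different metric depending on that prediction, thereby seeking a new methodology for finding the best sentence ordering within centering theory. In addition, we analyze, from the standpoint of centering-based ordering metrics, various ways of ranking the salience of nouns (Cf-ranking), which is central to applying centering theory. The results demonstrate that, as far as text structuring is concerned, ranking the salience of nouns simply by the order in which they are realized in the sentence is the simplest and most efficient approach, owing to the characteristics of Korean.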
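
To make the MIN.NOCB idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual reading of a NOCB transition in the centering literature: two adjacent utterances whose Cf lists share no entity. Under that reading, MIN.NOCB prefers the ordering with the fewest such transitions. The function names, the exhaustive search, and the example entities are all hypothetical. Entities are listed in linear order of mention, but Cf ranking does not affect a NOCB count. MAX.CPS is not sketched because the abstract does not define it.

```python
from itertools import permutations


def count_nocb(ordering):
    """Count adjacent utterance pairs that share no entity (NOCB transitions)."""
    return sum(
        1
        for prev, curr in zip(ordering, ordering[1:])
        if not set(prev) & set(curr)
    )


def min_nocb_orderings(utterances):
    """Exhaustively score every permutation; feasible only for small inputs.

    Returns the lowest NOCB count and every ordering that attains it.
    """
    scored = [(count_nocb(p), p) for p in permutations(utterances)]
    best = min(score for score, _ in scored)
    return best, [list(p) for score, p in scored if score == best]


# Hypothetical Cf lists (illustrative entities only), one tuple per utterance.
utterances = [
    ("museum", "vase"),
    ("vase", "painter"),
    ("painter", "athens"),
    ("coin",),
]

best, orderings = min_nocb_orderings(utterances)
print(best, len(orderings))
```

In this toy input several orderings attain the same minimum count, so a NOCB count on its own does not always single out one ordering.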
