Context-Weighted Metrics for Example Matching  

Kim, Dong-Joo (Dept. of Com. Sci. & Eng., Hanyang University)
Kim, Han-Woo (Dept. of Com. Sci. & Eng., Hanyang University)
Abstract
This paper proposes a metric for example matching in example-based machine translation for English-Korean translation. The metric serves as a similarity measure based on the edit-distance algorithm and is used to retrieve the example sentences most similar to a given query. It relies on simple information, namely the lemma and part-of-speech of words that do not match typographically. The edit-distance algorithm, however, cannot fully reflect the context of matched word units: as long as the matched word units appear in the same order, a fully contiguous match contributes exactly as much to similarity as a partial match in which mismatched word units intervene between the matched ones. To overcome this drawback, we propose a context-weighting scheme that uses the contiguity of matched word units to capture the full context. We normalize the metric in order to convert edit distance, which represents dissimilarity, into a similarity metric, to apply the context-weighted metric to the example matching problem, and to rank examples by similarity. In addition, we generalize previous methods that use linguistic information into one representative system. To verify the proposed context-weighted metric, we carry out an experiment comparing it with the generalized previous methods.
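The drawback described in the abstract can be illustrated with a minimal sketch. It is not the paper's formula, which the abstract does not give: the function names, the `bonus` parameter, and the normalization by the best achievable score are all assumptions made for illustration. The sketch compares a normalized word-level edit-distance similarity with a hypothetical contiguity-weighted score in which each matched word earns 1 point, plus an extra `bonus` when the preceding word pair was also matched.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cost = 0 if wa == wb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]


def edit_similarity(a, b):
    """Turn the distance into a similarity in [0, 1] by normalizing."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def context_similarity(a, b, bonus=0.5):
    """Hypothetical contiguity-weighted similarity (illustrative only).

    match[i][j]: best score of an in-order alignment that ends by
                 matching a[i-1] with b[j-1] (-inf if they differ).
    best[i][j]:  best score over the prefixes a[:i] and b[:j].
    """
    n, m = len(a), len(b)
    longest = max(n, m)
    if longest == 0:
        return 1.0
    neg = float("-inf")
    match = [[neg] * (m + 1) for _ in range(n + 1)]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                # Either start a new run of matches, or extend a run that
                # ended at the previous word pair and collect the bonus.
                match[i][j] = 1 + max(best[i - 1][j - 1],
                                      match[i - 1][j - 1] + bonus)
            best[i][j] = max(best[i - 1][j], best[i][j - 1], match[i][j])
    # Score of a perfect, fully contiguous match of the longer sentence.
    max_score = longest + (longest - 1) * bonus
    return best[n][m] / max_score


query = "he bought a new car yesterday".split()
contiguous = "he bought a new house today".split()  # matches form one run
scattered = "he sold a new bike yesterday".split()  # matches are interrupted

for example in (contiguous, scattered):
    print(" ".join(example),
          edit_similarity(query, example),
          context_similarity(query, example))
```

Both example sentences differ from the query by two substitutions, so the plain edit-distance similarity cannot separate them, while the contiguity-weighted score ranks the sentence with the uninterrupted run of four matched words higher.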
Keywords
example matching; context-weighted metrics; edit-distance; translation memory; example-based machine translation
References
1 E. W. Myers and W. Miller, 'Row replacement algorithms for screen editors,' ACM Transactions on Programming Languages and Systems (TOPLAS), vol.11, no.1, pp.33-56, 1989
2 S. Needleman and D. Wunsch, 'A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins,' Journal of Molecular Biology, vol.48, no.3, pp.443-453, 1970
3 J. Zobel and P. Dart, 'Phonetic String Matching: Lessons from Information Retrieval,' In Proc. of the 19th Annual International ACM SIGIR Conf., pp.166-172, 1996
4 V. I. Levenshtein, 'Binary Codes Capable of Correcting Deletions, Insertions, and Reversals,' Soviet Physics-Doklady, vol.10, no.8, pp.707-710, 1966, Translated from Doklady Akademii Nauk SSSR, vol.163, no.4, pp.845-848, 1965
5 F. J. Damerau, 'A Technique for Computer Detection and Correction of Spelling Errors,' Communications of the ACM, vol.7, no.3, pp.171-176, 1964
6 E. Sumita, 'Example-based machine translation using DP-matching between word sequences,' In Proc. of the ACL Workshop on Data-Driven Methods in MT, pp.1-8, 2001
7 L. Cranias, H. Papageorgiou and S. Piperidis, 'A matching technique in example-based machine translation,' In Proc. 15th Int. Conf. on Computational Linguistics, pp.100-104, 1994
8 E. Sumita and H. Iida, 'Experiments and Prospects of Example-Based Machine Translation,' In Proc. of the 29th Annual Meeting of the ACL, pp.185-192, 1991
9 S. Nirenburg, C. Domashnev and D. Grannes, 'Two approaches to matching in example-based machine translation,' In Proc. 5th Int. Conf. on Theoretical and Methodological Issues in Machine Translation, pp.47-57, 1993
10 T. Baldwin and H. Tanaka, 'The Effects of Word Order and Segmentation on Translation Retrieval Performance,' In Proc. of the 18th Int. Conf. on Computational Linguistics, pp.35-41, 2000
11 L. Cranias, H. Papageorgiou and S. Piperidis, 'Clustering: A technique for search space reduction in example-based machine translation,' In Proc. Int. Conf. on Systems, Man, and Cybernetics, pp.1-6, 1994
12 T. Doi, H. Yamamoto and E. Sumita, 'Graph-based retrieval for example-based machine translation using edit-distance,' In Proc. Workshop Example-Based Machine Translation at MT Summit X, pp.51-58, 2005
13 M. Kay, 'The Proper Place of Men and Machines in Language Translation,' Research Report CSL-80-11, Xerox Palo Alto Research Center, Palo Alto, Calif., 1980, Reprinted in Machine Translation, vol.12, pp.3-23, 1997
14 M. Nagao, 'A Framework of a Mechanical Translation between Japanese and English by Analogy Principle,' In Artificial and Human Intelligence, A. Elithorn and R. Banerji (Eds.), Amsterdam: North-Holland, pp.173-180, 1984
15 F. Mandreoli, R. Martoglia and P. Tiberio, 'Searching similar (sub)sentences for example-based machine translation,' In Atti del Decimo Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD), pp.208-221, Isola d'Elba, Italy, 2002
16 H. L. Somers, '"New Paradigms" in MT: the State of the Play now that the Dust has Settled,' In 10th European Summer School in Logic, Language and Information, Workshop on Machine Translation, pp.22-33, 1998