Discriminative Models for Automatic Acquisition of Translation Equivalences

Zhang, Chun-Xiang; Li, Sheng; Zhao, Tie-Jun

International Journal of Control, Automation, and Systems

Volume 5, Issue 1 / Pages 99-103 / 2007 / 1598-6446 (pISSN) / 2005-4092 (eISSN)

Institute of Control, Robotics and Systems (제어로봇시스템학회)

Zhang, Chun-Xiang (School of Computer Science and Technology, Harbin Institute of Technology) ;
Li, Sheng (School of Computer Science and Technology, Harbin Institute of Technology) ;
Zhao, Tie-Jun (School of Computer Science and Technology, Harbin Institute of Technology)

Published : 2007.02.28

Abstract

Translation equivalences are important for bilingual lexicography, machine translation systems, and cross-lingual information retrieval. Extracting equivalences from bilingual sentence pairs is a data mining problem. In this paper, discriminative learning methods are employed to filter translation equivalences. Discriminative features, including translation literality, phrase alignment probability, and phrase length ratio, are used to evaluate equivalences. 1000 randomly selected equivalences are filtered and then evaluated. Experimental results indicate that the support vector machine achieves a precision of 87.8% and a recall of 89.8%.
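
The filtering step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the feature values, the toy phrase pairs, the definition of the length ratio (shorter token count over longer), and all names such as `Candidate` and `train_perceptron` are assumptions made for illustration. A plain perceptron stands in for the support vector machine to keep the example dependency-free.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    source: str        # source-language phrase (toy example)
    target: str        # target-language phrase (toy example)
    literality: float  # assumed precomputed translation-literality score
    align_prob: float  # assumed precomputed phrase alignment probability
    label: int         # toy gold label: +1 correct, -1 incorrect


def length_ratio(c):
    """Shorter token count over longer one (an assumed definition)."""
    s, t = len(c.source.split()), len(c.target.split())
    return min(s, t) / max(s, t)


def features(c):
    # The trailing 1.0 acts as a bias term.
    return [c.literality, c.align_prob, length_ratio(c), 1.0]


def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_perceptron(data, max_epochs=1000):
    """Plain perceptron, used here as a simple stand-in for an SVM."""
    w = [0.0] * 4
    for _ in range(max_epochs):
        mistakes = 0
        for c in data:
            x = features(c)
            if c.label * score(w, x) <= 0:
                # Move the weights toward the misclassified example.
                w = [wi + c.label * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w


def predict(w, c):
    return 1 if score(w, features(c)) > 0 else -1


def precision_recall(gold, pred):
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == -1 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical candidates; scores and labels are invented for the sketch.
TOY = [
    Candidate("machine translation", "traduction automatique", 0.9, 0.8, 1),
    Candidate("information retrieval", "recherche d'information", 0.8, 0.7, 1),
    Candidate("data mining", "le", 0.2, 0.1, -1),
    Candidate("the", "extraction de connaissances", 0.3, 0.2, -1),
]

weights = train_perceptron(TOY)
kept = [c for c in TOY if predict(weights, c) == 1]  # filtered equivalences
gold = [c.label for c in TOY]
pred = [predict(weights, c) for c in TOY]
print(precision_recall(gold, pred))
```

Precision here is the fraction of kept candidates that are correct, and recall is the fraction of correct candidates that are kept, which is how the reported figures would normally be read. A real evaluation would score held-out candidates rather than the training set used in this sketch.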
