http://dx.doi.org/10.4218/etrij.13.0112.0016

Multicriteria-Based Computer-Aided Pronunciation Quality Evaluation of Sentences  

Yoma, Nestor Becerra (Department of Electrical Engineering, Universidad de Chile)
Berrios, Leopoldo Benavides (Department of Electrical Engineering, Universidad de Chile)
Sepulveda, Jorge Wuth (Department of Electrical Engineering, Universidad de Chile)
Torres, Hiram Vivanco (Department of Linguistics, Universidad de Chile)
Publication Information
ETRI Journal / v.35, no.1, 2013, pp. 89-99
Abstract
The sentence-based pronunciation evaluation task is defined in terms of subjective criteria. Three subjective criteria (the minimum subjective word score, the mean subjective word score, and the first impression) are proposed and modeled by combining word-based assessments. The subjective criteria are then approximated with objective sentence pronunciation scores obtained by combining word-based metrics. No a priori studies of common mistakes are required, and class-based language models are used to incorporate both incorrect and correct pronunciations. Incorrect pronunciations are incorporated automatically by making use of a competitive lexicon and the phonetic rules of the students' native and target languages. This procedure is applicable to any second language learning context, and subjective-objective sentence score correlations greater than or equal to 0.5 can be achieved when the proposed sentence-based pronunciation criteria are approximated with combinations of word-based scores. Finally, the subjective-objective sentence score correlations reported here are comparable to those published elsewhere for methods that require a priori studies of pronunciation errors.
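As a rough illustration of the idea of combining word-based scores into a sentence score and comparing it with a subjective score, the following minimal Python sketch computes the minimum and mean word score per sentence and the Pearson correlation between an objective and a subjective sentence score. It is not the authors' implementation: the function names, the score scale, and all data values are hypothetical, and the first-impression criterion is omitted because the abstract does not say how it is modeled.

```python
from statistics import mean


def sentence_scores(word_scores):
    """Combine word-level scores into two sentence-level scores.

    Hypothetical helper: 'min' and 'mean' mirror two of the three
    criteria named in the abstract; the scale is an assumption.
    """
    return {"min": min(word_scores), "mean": mean(word_scores)}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Invented example data (NOT from the paper): objective word scores for
# four sentences, and a subjective minimum-word-score rating per sentence.
objective_word_scores = [[4, 5, 3], [2, 2, 1], [5, 5, 4], [3, 1, 4]]
subjective_min = [3, 1, 4, 2]

# Objective sentence score under the "minimum word score" criterion.
objective_min = [sentence_scores(ws)["min"] for ws in objective_word_scores]

# Subjective-objective correlation for this criterion.
r = pearson(objective_min, subjective_min)
print(f"subjective-objective correlation (min criterion): {r:.2f}")
```

The same pattern applies to the mean criterion by replacing `"min"` with `"mean"`; with real data, the correlation would be computed over the sentences of an evaluation set.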
Keywords
Computer-aided pronunciation training; subjective criteria; second language learning; ASR