[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7776/ASK.2006.25.6.277

Automatic Recognition of Pitch Accent Using Distributed Time-Delay Recursive Neural Network

Kim Sung-Suk (Department of Computer and Information Science, Yongin University)

Publication Information

The Journal of the Acoustical Society of Korea / v.25, no.6, 2006, pp. 277-281

Abstract

This paper presents a method for the automatic recognition of pitch accents over syllables. The proposed method is based on the time-delay recursive neural network (TDRNN), a neural network classifier with two different representations of dynamic context: the delayed input nodes represent an explicit trajectory F0(t) over time, while the recursive nodes provide long-term context information that reflects the characteristics of pitch accentuation in spoken English. We apply the TDRNN to pitch accent recognition in two forms. In the normal TDRNN, all of the prosodic features (pitch, energy, duration) are used as an entire set in a single TDRNN. In the distributed TDRNN, the network consists of several TDRNNs, each taking a single prosodic feature as its input, and the final output is a weighted sum of the outputs of the individual TDRNNs. We used the Boston Radio News Corpus (BRNC) for experiments on speaker-independent pitch accent recognition. The experimental results show that the distributed TDRNN achieves an average recognition accuracy of 83.64% over both pitch events and non-events.
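The two mechanisms named in the abstract, delayed input nodes and the weighted-sum combination of per-feature networks, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names `delayed_inputs` and `fuse_outputs`, the zero padding, the example scores, the weights, and the 0.5 decision threshold are all assumptions made for illustration, and the recursive (context) nodes and the training procedure are omitted.

```python
from typing import List, Sequence


def delayed_inputs(track: Sequence[float], num_delays: int) -> List[List[float]]:
    """Pair each frame with its `num_delays` predecessors.

    Frames before the start of the track are zero-padded (an assumption),
    so every output row has num_delays + 1 values, oldest first.
    """
    padded = [0.0] * num_delays + list(track)
    return [padded[t:t + num_delays + 1] for t in range(len(track))]


def fuse_outputs(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the outputs of the individual sub-networks."""
    if len(scores) != len(weights):
        raise ValueError("one weight per sub-network is required")
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical F0 values for three frames, windowed with two delays.
f0_windows = delayed_inputs([1.0, 2.0, 3.0], num_delays=2)

# Hypothetical accent scores from the pitch, energy, and duration sub-networks.
scores = [0.9, 0.6, 0.4]
weights = [0.5, 0.25, 0.25]  # illustrative values, not taken from the paper
fused = fuse_outputs(scores, weights)
is_accented = fused > 0.5    # assumed decision threshold
```

In this sketch each row of `f0_windows` plays the role of the delayed input nodes for one time step, and `fused` stands in for the final output of the distributed network. How the actual weights are chosen is not stated in the abstract.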

Keywords

Pitch accent; Prosody; TDRNN; Distributed TDRNN

