
Word Boundary Detection of Voice Signal Using Recurrent Fuzzy Associative Memory  

Ma Chang-Su (Handysoft Co., Ltd.)
Kim Gye-Young (School of Computing, Soongsil University)
Abstract
We describe a word boundary detection method that locates the boundary between speech and non-speech segments. The proposed method uses two features. The first is the normalized root mean square (RMS) of the speech signal, which is insensitive to white noise and carries temporal information. The second is the normalized mel-frequency band energy of the voice signal, which carries frequency information. The method detects word boundaries with a recurrent fuzzy associative memory (RFAM), which extends the fuzzy associative memory (FAM) by adding recurrent nodes. Hebbian learning establishes the degree of association between input and output, and an error back-propagation algorithm is used to learn the weights between the consequent layer and the recurrent layer. To confirm its effectiveness, we applied the proposed system to voice data obtained from KAIST.
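As a rough illustration of the first feature only, the sketch below computes a per-frame RMS value and scales it into [0, 1]. The abstract does not say how the RMS is normalized, so dividing by the utterance maximum is an assumption, as are the function names, the frame length, and the toy signal. It does not reproduce the mel-frequency feature or the RFAM itself.

```python
import math

def frame_rms(signal, frame_len, hop):
    """Return the RMS value of each full frame of `signal`."""
    values = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return values

def normalize(values):
    """Scale values into [0, 1] by the utterance maximum.

    Assumption: the abstract does not specify the normalization scheme.
    """
    peak = max(values, default=0.0)
    if peak == 0.0:
        return [0.0 for _ in values]
    return [v / peak for v in values]

# Toy signal (hypothetical): silence, a 200 Hz tone burst, then silence,
# sampled at 8 kHz.
rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(1600)]
signal = silence + tone + silence

# Non-overlapping 25 ms frames (200 samples each).
nrms = normalize(frame_rms(signal, frame_len=200, hop=200))
```

On this toy signal the normalized RMS is zero in the silent frames and close to one in the tone frames, which is the kind of temporal contrast a boundary detector could take as input alongside the frequency feature.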
Keywords
word boundary detection; mel-frequency; RFAM; Hebbian learning