Robust Person Identification Using Optimal Reliability in Audio-Visual Information Fusion  

Tariquzzaman, Md. (School of Electronics & Computer Engineering, Chonnam National University)
Kim, Jin-Young (School of Electronics & Computer Engineering, Chonnam National University)
Na, Seung-You (School of Electronics & Computer Engineering, Chonnam National University)
Choi, Seung-Ho (Dept. of Computer Eng, Dongshin University)
Abstract
Reliable identity recognition in real environments is a key issue in human-computer interaction (HCI). In this paper, we present a robust person identification system built on a score-based optimal reliability measure for the audio and visual modalities. We propose an extension of the modified reliability function that introduces optimizing parameters for both modalities. To degrade the visual signals, we applied JPEG compression to the test images. In addition, to create a mismatch between the enrollment and test sessions, acoustic babble noise and artificial illumination were added to the test audio and visual signals, respectively. Local PCA was applied to both modalities to reduce the dimension of the feature vectors. We applied a swarm intelligence algorithm, particle swarm optimization, to optimize the parameters of the modified convection function. All person identification experiments were performed on the VidTIMIT database. Experimental results show that the proposed optimal reliability measures improved identification accuracy by 7.73% and 8.18% under different illumination directions for the visual signal and the accompanying babble noise for the audio signal, respectively, compared with the best classifier in the fusion system. The measures also maintained the modality reliability statistics in terms of performance, which verifies the consistency of the proposed extension.
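To make the idea of tuning fusion parameters with particle swarm optimization concrete, the following is a minimal sketch and not the authors' method. It assumes a plain weighted-sum fusion with a single audio weight `w` in place of the paper's reliability function, and it uses synthetic scores rather than VidTIMIT data. All names (`fuse`, `accuracy`, `pso_weight`, `make_samples`) and the PSO constants are hypothetical choices made for illustration.

```python
import random


def fuse(audio, visual, w):
    """Weighted-sum fusion of per-class scores (assumed stand-in for a reliability function)."""
    return [w * a + (1.0 - w) * v for a, v in zip(audio, visual)]


def accuracy(samples, w):
    """Fraction of samples whose highest fused score belongs to the true identity."""
    hits = 0
    for audio, visual, label in samples:
        fused = fuse(audio, visual, w)
        if max(range(len(fused)), key=fused.__getitem__) == label:
            hits += 1
    return hits / len(samples)


def pso_weight(samples, n_particles=10, n_iters=30, seed=0):
    """Search for the audio weight w in [0, 1] that maximizes accuracy using a basic PSO."""
    rng = random.Random(seed)
    # Seed the swarm with both endpoints so the result is never worse
    # than using the audio scores alone (w=1) or the visual scores alone (w=0).
    pos = [0.0, 1.0] + [rng.random() for _ in range(n_particles - 2)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_fit = [accuracy(samples, p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g], pbest_fit[g]
    inertia, c1, c2 = 0.7, 1.5, 1.5  # assumed, commonly used PSO constants
    for _ in range(n_iters):
        for i in range(n_particles):
            vel[i] = (inertia * vel[i]
                      + c1 * rng.random() * (pbest[i] - pos[i])
                      + c2 * rng.random() * (gbest - pos[i]))
            pos[i] = min(1.0, max(0.0, pos[i] + vel[i]))  # clamp to [0, 1]
            fit = accuracy(samples, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i], fit
    return gbest, gbest_fit


def make_samples(n=200, n_classes=5, seed=1):
    """Synthetic (audio_scores, visual_scores, label) triples; not real data."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        label = rng.randrange(n_classes)
        audio = [rng.gauss(0.0, 1.0) for _ in range(n_classes)]
        visual = [rng.gauss(0.0, 1.0) for _ in range(n_classes)]
        audio[label] += 1.0   # weaker evidence, e.g. a noisy audio channel
        visual[label] += 1.5  # stronger evidence from the visual channel
        samples.append((audio, visual, label))
    return samples


samples = make_samples()
w, acc = pso_weight(samples)
print(f"audio weight={w:.3f}, fused accuracy={acc:.3f}")
```

Because the swarm starts with particles at `w=0` and `w=1`, the returned accuracy on these samples can never fall below either single-modality accuracy. A real system would tune the parameters on held-out enrollment data instead of on the test scores.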
Keywords
Person Identification; Local PCA; Reliability Measures; Particle Swarm Optimization