A Study on the Improvement of DTW with Speech Silence Detection

음성의 묵음구간 검출을 통한 DTW의 성능개선에 관한 연구

  • 김종국 (숭실대학교 정보통신공학과 음성통신연구실) ;
  • 조왕래 (숭실대학교 정보통신공학과 음성통신연구실) ;
  • 배명진 (숭실대학교 정보통신공학과 음성통신연구실)
  • Published : 2003.12.01

Abstract

Speaker recognition is the technology that confirms the identification of speaker by using the characteristic of speech. Such technique is classified into speaker identification and speaker verification: The first method discriminates the speaker from the preregistered group and recognize the word, the second verifies the speaker who claims the identification. This method that extracts the information of speaker from the speech and confirms the individual identification becomes one of the most efficient technology as the service via telephone network is popularized. Some problems, however, must be solved for the real application as follows; The first thing is concerning that the safe method is necessary to reject the imposter because the recognition is not performed for the only preregistered customer. The second thing is about the fact that the characteristic of speech is changed as time goes by, So this fact causes the severe degradation of recognition rate and the inconvenience of users as the number of times to utter the text increases. The last thing is relating to the fact that the common characteristic among speakers causes the wrong recognition result. The silence parts being included the center of speech cause that identification rate is decreased. In this paper, to make improvement, We proposed identification rate can be improved by removing silence part before processing identification algorithm. The methods detecting speech area are zero crossing rate, energy of signal detect end point and starting point of the speech and process DTW algorithm by using two methods in this paper. As a result, the proposed method is obtained about 3% of improved recognition rate compare with the conventional methods.

Keywords