• Title/Summary/Keyword: reference speaker

Text-Dependent Speaker Recognition Using DTW and State-Dependent Parameter Weighting Method of HMM (DTW 와 HMM의 상태별 파라미터 가중 기법을 이용한 문맥 종속형 화자인식)

  • 이철희;정성환;김종교
    • Proceedings of the IEEK Conference / 2000.06d / pp.77-80 / 2000
  • In this paper, speaker recognition based on both DTW and discrete HMM is performed using a method that estimates state-dependent parameter weights from the training data, so that each speaker's individual acoustic characteristics are well reflected. In the proposed method, the optimal state sequence is found using the Viterbi algorithm. The optimal path is then evaluated by comparing the state sequence of the existing base pattern with those of the other patterns. Next, the frames whose patterns match the base pattern in the same state are identified, and the reference pattern is obtained by weighting according to the number of matched frames.
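As a rough illustration of the DTW matching step mentioned in this abstract, the sketch below computes a textbook dynamic time warping cost between two sequences of feature vectors. It is not the authors' method: the state-dependent weighting and the Viterbi-based frame matching are not reproduced, and the function name `dtw_distance`, the Euclidean local distance, and the three-way step pattern are assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of equal-dimension feature vectors.

    Illustrative only: plain Euclidean frame distance, no path constraints,
    and no state-dependent weighting as described in the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local distance between two frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

For example, `dtw_distance([[0], [1], [2]], [[0], [1], [1], [2]])` gives a cost of zero, because the repeated frame in the second sequence is absorbed by the warping path.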


A Study on Number sounds Speaker recognition using the Pitch detection and the Fuzzified pattern (피치 검출과 퍼지화 패턴을 이용한 숫자음 화자 인식에 관한 연구)

  • 김연숙;김희주;김경재
    • Journal of the Korea Society of Computer and Information / v.8 no.3 / pp.73-79 / 2003
  • This paper proposes a speaker recognition algorithm that combines pitch detection with fuzzified pattern matching. The study uses a pitch pattern derived from the pitch, and a binary spectrum as the speech parameter. The reference pattern is built using a fuzzy membership function in order to accommodate the time variation width for non-utterance time, and vocal tract recognition of common characters is performed using fuzzified pattern matching.
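The abstract does not say which membership function is used, so the following is only a minimal sketch of the general idea of fuzzified pattern matching. It assumes a triangular membership function centred on each reference value and scores a test pattern by its mean membership grade; the names `triangular_membership` and `fuzzy_match_score` and the `width` parameter are hypothetical.

```python
def triangular_membership(x, center, width):
    """Degree (0..1) to which x belongs to a triangular fuzzy set around center."""
    if width <= 0:
        raise ValueError("width must be positive")
    return max(0.0, 1.0 - abs(x - center) / width)


def fuzzy_match_score(test, reference, width):
    """Mean membership grade of each test value in the fuzzy set of its reference value.

    Assumption for illustration: both patterns are already time-aligned and
    have the same length; width sets how much deviation is tolerated.
    """
    if not test or len(test) != len(reference):
        raise ValueError("patterns must be non-empty and of equal length")
    grades = [triangular_membership(t, r, width) for t, r in zip(test, reference)]
    return sum(grades) / len(grades)
```

With a width of 20, a test pitch pattern `[100, 110]` scored against the reference `[100, 120]` receives grades 1.0 and 0.5, so the match score is 0.75. A wider fuzzy set tolerates larger deviations from the reference.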


A Study on Korean, English and Japanese Speaker Recognitions Using the Peak and Valley Pitch Detection and the Fuzzy Theory (PVPF방법과 퍼지 이론을 이용한 한국어, 영어 및 일본어 화자 인식에 관한 연구)

  • Kim, Yeon-Suk
    • The Transactions of the Korea Information Processing Society / v.6 no.2 / pp.522-533 / 1999
  • This paper proposes a speaker recognition algorithm that combines the pitch parameter with fuzzy inference. The study proposes a pitch detection method, PVPF (peak and valley pitch detection function), which compares spectra by exploiting the transform characteristics between time and frequency. The reference pattern is built using a membership function, and vocal tract recognition of common characters is performed using fuzzy pattern matching in order to accommodate the time variation width for non-linear utterance time.


A Study on Korean and English Speaker Recognitions using the Fuzzy Theory (퍼지 이론을 이용한 한국어 및 영어 화자 인식에 관한 연구)

  • 김연숙;김희주;김경재
    • Journal of the Korea Society of Computer and Information / v.7 no.3 / pp.49-55 / 2002
  • This paper proposes a speaker recognition algorithm that combines the pitch parameter with fuzzy theory. The study proposes a pitch detection method using the peak and valley pitch detection function, which compares spectra by exploiting the transform characteristics between time and frequency. It measures the similarity to the original spectrum while arbitrarily varying the period in the time domain. It heavily weights the error due to the changing characteristics of the phonemes, while remaining robust against noise. The reference pattern is built using a membership function, and vocal tract recognition of common characters is performed using fuzzy pattern matching in order to accommodate the time variation width for non-linear utterance time.


A Study on Korean and Japanese Speaker Recognitions using the Fuzzy Theory (퍼지 이론을 이용한 한국어 및 일어 화자 인식에 관한 연구)

  • 김연숙;김창완
    • Journal of the Korea Society of Computer and Information / v.5 no.3 / pp.51-57 / 2000
  • This paper proposes a speaker recognition algorithm that combines the pitch with fuzzy theory. The study proposes a pitch detection method using the peak and valley pitch detection function, which compares spectra by exploiting the transform characteristics between time and frequency. It measures the similarity to the original spectrum while arbitrarily varying the period in the time domain. It heavily weights the error due to the changing characteristics of the phonemes, while remaining robust against noise. The reference pattern is built using a membership function, and vocal tract recognition of common characters is performed using fuzzy pattern matching in order to accommodate the time variation width for non-linear utterance time.


Optimally Weighted Cepstral Distance Measure for Speech Recognition (음성 인식을 위한 최적 가중 켑스트랄 거리 측정 방법)

  • 김원구
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.133-137 / 1994
  • In this paper, a method for designing an optimal weight function for the weighted cepstral distance measure is proposed. A conventional weight function, or cepstral lifter, is obtained experimentally depending on the spectral components to be emphasized. The proposed method minimizes the error between the word reference patterns and the training data. To compare the proposed optimal weight function with conventional functions, speech recognition systems based on Dynamic Time Warping and Hidden Markov Models were constructed and used in speaker-independent isolated word recognition experiments. The results show that the proposed method gives better performance than conventional weight functions.
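To make the notion of a weighted cepstral distance concrete, the sketch below evaluates the common form d = Σ_k (w_k (c_k − c'_k))², where w_k is the weight (lifter) applied to the k-th cepstral coefficient. The paper's optimal weights are not given in the abstract and are not reproduced here; the index-proportional weights in `index_lifter` are just one simple weighting chosen for illustration, and both function names are hypothetical.

```python
def weighted_cepstral_distance(c1, c2, weights):
    """Squared distance between two cepstral vectors after per-coefficient weighting."""
    if not (len(c1) == len(c2) == len(weights)):
        raise ValueError("cepstra and weights must have the same length")
    return sum((w * (x - y)) ** 2 for w, x, y in zip(weights, c1, c2))


def index_lifter(order):
    """Illustrative lifter w_k = k for k = 1..order (an assumption, not the paper's weights)."""
    return [float(k) for k in range(1, order + 1)]
```

With uniform weights the measure reduces to the squared Euclidean cepstral distance, for example `weighted_cepstral_distance([1.0, 2.0], [0.0, 0.0], [1.0, 1.0])` is 5.0, whereas `index_lifter(2)` raises it to 17.0 by emphasizing the higher-order coefficient.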


On a updating reference pattern of speaker recognition using F1/F0 in the WINDOWS environment (위도우즈 환경에서 F1/F0 율을 이용한 화자인식의 기준패턴 형성에 관한연구)

  • 정종순;이윤주;배재옥;배명진
    • Proceedings of the IEEK Conference / 1998.06a / pp.611-614 / 1998
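  • In multimedia environments such as Windows 95, personal identity verification has relied on entering a password from the keyboard. This paper instead uses speech, and addresses a shortcoming of existing methods, which fail to compensate for the time-varying characteristics of the reference pattern. To this end, it describes a method of forming the reference pattern using the ratio of the fundamental frequency to the first formant, both of which are features of the speech signal. Experiments with the proposed method yielded an overall recognition rate of 98% and showed the feasibility of using speech in place of passwords in the Windows environment.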


Using Corpora for the Study of Word-Formation: A Case Study in English Negative Prefixation

  • Kwon, Heok-Seung
    • Korean Journal of English Language and Linguistics / v.1 no.3 / pp.369-386 / 2001
  • This paper will show that traditional approaches to the derivation of different negative words have been of an essentially hypothetical nature, based on either linguists' intuitions or rather scant evidence, and that native-speaker dictionary entries show meaning potentials (rather than meanings) which are in fact linguistic and cognitive prototypes. The purpose of this paper is to demonstrate that using a large corpus of natural language can provide better answers to questions about word-formation (i.e., with particular reference to negative prefixation) than any other source of information.


Speech Recognition in the Car Noise Environment (자동차 소음 환경에서 음성 인식)

  • 김완구;차일환;윤대희
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.51-58 / 1993
  • This paper describes the development of a speaker-dependent isolated word recognizer applied to voice dialing in a car noise environment. For this purpose, several methods for improving performance under such conditions are evaluated using a database collected in a small car moving at 100 km/h. The main features of the recognizer are as follows. The endpoint detection error can be reduced by using the magnitude of the signal inverse-filtered by the AR model of the background noise, and it can be compensated for by using variants of the DTW algorithm. To remove the noise, an autocorrelation subtraction method is used, with the constraint that the residual energy obtainable by linear predictive analysis should be positive. By using a noise-robust distance measure, distortion of the feature vector is minimized. The speech recognizer is implemented on the Motorola DSP56001 (a 24-bit general-purpose digital signal processor). The recognition database is composed of 50 Korean names spoken by 3 male speakers. The recognition error rate of the system is reduced to 4.3% using a single reference pattern for each word and to 1.5% using 2 reference patterns for each word.


Stress Effects on Korean Vowels with Reference to Rhythm

  • Yun, Il-Sung
    • MALSORI / no.67 / pp.1-16 / 2008
  • Stress effects on Korean vowels were investigated with reference to rhythm. We measured three acoustic correlates of stress (duration: VOT and vowel duration; F0; intensity) in seven pairs of stressed vs. unstressed Korean vowels /i, ${\varepsilon}(e)$, a, o, u, i, e/. The results revealed that stress had only weak and inconsistent effects on duration, which supports the view that Korean is not a stress-timed language, insofar as strong stress effects on duration are still considered crucial to stress-timing. On the other hand, Korean stressed vowels were characterized most by higher F0 and next by stronger intensity. However, speakers generally tended to use F0 and intensity inversely when stressing an utterance, rather than strengthening both acoustic correlates proportionately. Great inter-speaker variability was found, especially in the variations of duration.
