• Title/Summary/Keyword: Speech rate


Automatic Speaker Identification by Sustained Vowel Phonation (지속적으로 발성한 모음에 의한 화자인식)

  • Bae, Geon-Seong
    • The Journal of the Acoustical Society of Korea / v.11 no.1 / pp.35-41 / 1992
  • A speaker identification scheme using the speaker-based VQ codebook of a sustained vowel is proposed and tested. With the pitch-synchronous LPC vector of the sustained vowel /i/ as a feature vector, a VQ codebook size of 4 was found to be suitable to characterize each speaker's feature space. For 40 normal speakers (20 males, 20 females), we achieved a correct identification rate of 99.4% with a training data set, and 89.4% with a test data set, using speech samples of only 50 pitch periods.

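The codebook-based identification described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: plain k-means stands in for the unspecified VQ training procedure, random 2-D toy vectors stand in for pitch-synchronous LPC vectors, and all names (train_codebook, avg_distortion, identify, toy_frames) are hypothetical. Only the codebook size of 4 and the 50 frames per sample are taken from the abstract.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(vectors, size=4, iters=20, seed=0):
    """Plain k-means (assumed stand-in for the paper's VQ training)."""
    rng = random.Random(seed)
    codebook = rng.sample(vectors, size)  # initialise from training vectors
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for v in vectors:
            nearest = min(range(size), key=lambda k: sq_dist(v, codebook[k]))
            cells[nearest].append(v)
        for k, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[k] = [sum(c) / len(cell) for c in zip(*cell)]
    return codebook

def avg_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)

def identify(vectors, codebooks):
    """Pick the speaker whose codebook quantises the vectors with least distortion."""
    return min(codebooks, key=lambda spk: avg_distortion(vectors, codebooks[spk]))

def toy_frames(center, n, rng):
    """Toy 2-D features clustered around a speaker-specific centre (not real LPC data)."""
    return [[c + rng.gauss(0, 0.1) for c in center] for _ in range(n)]

rng = random.Random(1)
train = {"A": toy_frames([0.0, 0.0], 50, rng), "B": toy_frames([3.0, 3.0], 50, rng)}
codebooks = {spk: train_codebook(frames) for spk, frames in train.items()}
test_frames = toy_frames([3.0, 3.0], 50, rng)
predicted = identify(test_frames, codebooks)
print(predicted)
```

With real data, each speaker's training vectors would come from LPC analysis of the sustained vowel, and the decision rule would stay the same: minimum average quantisation distortion over the speaker codebooks.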

A double talk detector using the reflection coefficients (반사계수를 이용한 동시통화 검출기)

  • 유재하;조성호;윤대희
    • Journal of the Korean Institute of Telematics and Electronics S / v.34S no.10 / pp.141-150 / 1997
  • In this paper, we propose an intelligent double talk detector that can enhance the performance of an acoustic echo cancellation system. Conventional double talk detection methods often mistake echo path changes for double talk. Although several detection methods can distinguish echo path changes from double talk, they show poor tracking performance because of the excessive decision delay needed for the discrimination, and they can only be used after the adaptive digital filter converges. We propose a new and more effective detection algorithm, in which detection is performed by observing the rate of change of the reflection coefficients of two lattice predictors placed at the near-end and far-end terminals. The effectiveness of the proposed method is verified by extensive computer simulations using real speech signals.


Developments of Glove-based Input Device. (장갑형 입력장치의 개발)

  • 원대희;이호길;김진영;박종현
    • Proceedings of the Korean Society of Precision Engineering Conference / 2001.04a / pp.211-216 / 2001
  • Recently, mobile computing technologies such as PDAs, palm PCs, and wearable computers have been under active development, especially their input devices. Mobile input methods include speech recognition, handwriting recognition, and chording-type input. However, these methods suffer from limitations of the input apparatus, such as input speed and recognition rate. This paper presents a glove-based input device that could solve this data input problem. The experimental results suggest that the proposed input method, which uses the hand's movement, is appropriate for effective mobile input devices.


Noise Robust Speaker Identification using Reliable Sub-Band Selection in Multi-Band Approach (신뢰성 높은 서브밴드 선택을 이용한 잡음에 강인한 화자식별)

  • Kim, Sung-Tak;Ji, Mi-Gyeong;Kim, Hoi-Rin
    • Proceedings of the KSPS conference / 2007.05a / pp.127-130 / 2007
  • The conventional feature recombination technique is very effective in band-limited noise conditions, but in broad-band noise conditions it does not produce a notable performance improvement over the full-band system. To cope with this drawback, we introduce a new technique of sub-band likelihood computation in the feature recombination, and propose a new feature recombination method that uses this sub-band likelihood computation. Furthermore, reliable sub-band selection based on the signal-to-noise ratio is used to improve the performance of the proposed feature recombination. Experimental results show that the average error reduction rate in various noise conditions is more than 27% compared with the conventional full-band speaker identification system.

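The idea of SNR-based reliable sub-band selection mentioned in the abstract above can be illustrated with a rough sketch. The abstract does not give the selection rule or the way sub-band likelihoods are combined, so everything here is an assumption: a fixed SNR threshold (5 dB), summing per-band log-likelihoods, a fallback to the single best band, and the hypothetical names select_reliable_bands and identify_speaker. The numbers are made up for illustration.

```python
def select_reliable_bands(band_snr_db, threshold_db=5.0):
    """Return indices of sub-bands whose estimated SNR meets the threshold."""
    reliable = [i for i, snr in enumerate(band_snr_db) if snr >= threshold_db]
    # Fallback (assumption): if every band is noisy, keep the single best one.
    if not reliable:
        reliable = [max(range(len(band_snr_db)), key=lambda i: band_snr_db[i])]
    return reliable

def identify_speaker(band_loglik, band_snr_db, threshold_db=5.0):
    """Score each speaker by summing log-likelihoods over reliable bands only."""
    bands = select_reliable_bands(band_snr_db, threshold_db)
    scores = {spk: sum(ll[i] for i in bands) for spk, ll in band_loglik.items()}
    return max(scores, key=scores.get), bands

# Made-up example: bands 2 and 3 are corrupted by noise.
band_snr_db = [12.0, 9.0, -3.0, 1.0]
band_loglik = {
    "spk1": [-10.0, -11.0, -40.0, -35.0],  # true speaker, hurt by noisy bands
    "spk2": [-14.0, -15.0, -12.0, -13.0],
}
best, used = identify_speaker(band_loglik, band_snr_db)
print(best, used)
```

In this toy case, scoring over all four bands would favour spk2, while restricting the score to the two high-SNR bands favours spk1, which is the kind of effect band selection is meant to exploit.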

Fast Speaker Adaptation in Noisy Environment using Environment Clustering (잡음 환경하에서 환경 군집화를 이용한 고속화자 적응)

  • Kim, Young-Kuk;Song, Hwa-Jeon;Kim, Hyung-Soon
    • Proceedings of the KSPS conference / 2007.05a / pp.33-36 / 2007
  • In this paper, we investigate a fast speaker adaptation method based on eigenvoice in several noisy environments. To overcome its weakness against noise, we propose a noisy environment clustering method that divides the noisy adaptation utterances into groups with similar environments by vector-quantization-based clustering, using the cepstral mean as a feature vector. Each utterance group is then used for adaptation to build an environment-dependent model. In our experiment, we obtained a 19-37% relative improvement in error rate compared with the simultaneous speaker adaptation and environmental compensation method.


Trends of Low Bit-Rate Speech Coding (저 전송율 음성 부호화 연구 동향)

  • 최용수
    • Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.113-120 / 1998
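  • As the information age advances, speech communication and storage systems are becoming ever more deeply embedded in daily life. Research has therefore been conducted to cope more effectively with rapidly growing demand. One example is the research and standardization work on coding methods that greatly increase the compression ratio while maintaining the speech quality of existing speech coding systems. This paper describes in relative detail the recently finalized speech coder standards, namely the US DoD 2.4 kbps MELP, MPEG-4 HVXC, and the IS-127 EVRC speech coder for CDMA, and examines trends in the coding methods proposed for the ITU-T 4 kbps standard currently under development. It also introduces research trends in two new research areas: Internet telephony and very low bit-rate speech coders based on recognition-synthesis techniques.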


Efficient Triphone Clustering Using Monophone Distance (모노폰 거리를 이용한 트라이폰 클러스터링 방법 연구)

  • Bang Kyu-Seop;Yook Dong-Suk
    • Proceedings of the KSPS conference / 2006.05a / pp.41-44 / 2006
  • The purpose of state tying is to reduce the number of models and to use relatively reliable output probability distributions. There are two approaches: top-down clustering and bottom-up clustering. For seen data, the bottom-up approach performs better than the top-down approach. In this paper, we propose a new clustering technique that can improve clustering performance for undertrained triphones. The basic idea is to tie unreliable triphones before clustering. An unreliable triphone is one that appears in the training data too infrequently to train the model accurately. We propose to use monophone distance to preprocess these unreliable triphones. A pilot experiment showed that the proposed method reduces the error rate significantly.


A Study on Improvement of Bit Rate using Duration Control of Speech in G.723.1 Vocoder (Duration Control 의한 G.723.1 보코더 전송률 개선에 관한 연구)

  • 장경아;유영민;배명진
    • Proceedings of the IEEK Conference / 2003.07e / pp.2475-2478 / 2003
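  • We propose a new coding method, based on the G.723.1 5.3 kbps ACELP coder of the CELP family, that lowers the bit rate while maintaining speech quality. The method aims to reduce the bit rate of a CELP-type vocoder by changing the duration, which is used as a parameter in speech synthesis. Before the speech is fed to the vocoder input, its duration is shortened without changing the timbre by exploiting properties of the FFT, which reduces computation time; the amplitude and phase are each interpolated and decimated by a factor of 1/2ⁿ and then encoded. The encoded data pass through G.723.1 decoding, and an IFFT is then performed with a point count 1/2ⁿ times the FFT point count, so that the duration is changed without altering the spectrum and the original speech is synthesized. In an experiment that restored the waveform after it passed through the G.723.1 vocoder, the result showed a reduction of about 46% relative to the conventional 5.3 kbps ACELP.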


A Transcoding Algorithm for the Next Generation Speech Communication System (차세대 음성통신 시스템을 위한 상호부호화 알고리듬)

  • 이문근;강홍구;박영철;윤대희
    • Proceedings of the IEEK Conference / 2003.07e / pp.2224-2227 / 2003
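  • This paper proposes a transcoding algorithm for efficient interworking between AMR (Adaptive Multi-Rate)[1], the standard speech coder of WCDMA, the asynchronous third-generation mobile network, and ITU-T 8 kbit/s G.729A[2], which has recently become widely used in VoIP (Voice over Internet Protocol) applications. AMR varies its rate from 4.75 kbit/s to 12.2 kbit/s according to the channel conditions to guarantee call quality. Accordingly, the proposed transcoding algorithm supports 16 modes in total: 8 in the forward direction and 8 in the reverse direction. To evaluate the proposed algorithm, we carried out delay estimation, computational complexity measurement, and subjective and objective speech quality evaluation. The results confirm that, compared with the conventional tandem approach, the proposed algorithm provides decoded speech of good quality with at least 5 ms less delay and, on average, 50.2% less computation.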


Discriminative Training of Predictive Neural Network Models (예측신경회로망 모델의 변별력 있는 학습)

  • Na, Kyung-Min;Rheem, Jae-Yeol;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.13 no.1E / pp.64-70 / 1994
  • Predictive neural network models are powerful speech recognition models based on nonlinear pattern prediction, but they suffer from poor discrimination between acoustically similar words. In this paper, we propose a discriminative training algorithm for predictive neural network models. This algorithm is derived from the GPD (Generalized Probabilistic Descent) algorithm coupled with MCEF (Minimum Classification Error Formulation). It allows direct minimization of the recognition error rate. Evaluation of our training algorithm on ten Korean digits shows its effectiveness, with a 30% reduction in recognition error.
