• Title/Summary/Keyword: Distant Speech

The Robot Speech Recognition using TMS320VC5510 DSK (TMS320VC5510 DSK를 이용한 음성인식 로봇)

  • Choi, Ji-Hyun;Chung, Ik-Joo
    • Journal of Industrial Technology / v.27 no.A / pp.211-218 / 2007
  • As demand for interaction between humans and robots increases, robots are expected to have human-like intelligence. Hearing is essential for natural communication, so speech recognition technology for robots is becoming more important. In this paper, we implement a speech recognizer suitable for robot applications. One of the major problems in robot speech recognition is the poor quality of speech captured when the speaker talks at a distance from the microphone mounted on the robot. To cope with this problem, we used wireless transmission of the commands recognized by a speech recognizer implemented on the TMS320VC5510 DSK. In addition, since the TMS320VC5510 DSP is a fixed-point device, we present an efficient realization of the HMM algorithm using fixed-point arithmetic.

ARMA Filtering of Speech Features Using Energy Based Weights (에너지 기반 가중치를 이용한 음성 특징의 자동회귀 이동평균 필터링)

  • Ban, Sung-Min;Kim, Hyung-Soon
    • The Journal of the Acoustical Society of Korea / v.31 no.2 / pp.87-92 / 2012
  • In this paper, a robust feature compensation method to deal with environmental mismatch is proposed. The proposed method applies energy-based weights, set according to the degree of speech presence, to Mean subtraction, Variance normalization, and ARMA filtering (MVA) processing. The weights are further smoothed by moving average and maximum filters. The proposed feature compensation algorithm is evaluated on the AURORA 2 task and on a distant-talking experiment using a robot platform. Compared with MVA processing, it reduces the error rate by 14.4 % on the AURORA 2 task and by 44.9 % in the distant-talking experiment.
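
The baseline MVA processing named in this abstract can be sketched as follows. This is a minimal illustration of plain, unweighted MVA only: the abstract does not specify how the energy-based weights are computed or applied, so they are not modeled here. The function name `mva`, the `order` parameter, and the handling of edge frames are assumptions made for this sketch, not details taken from the paper.

```python
import math


def mva(features, order=2):
    """Apply mean subtraction, variance normalization and ARMA smoothing.

    features: list of frames, each frame a list of floats (e.g. cepstra).
    order:    ARMA filter order M (assumed value; not given in the abstract).
    """
    n_frames = len(features)
    n_dims = len(features[0])

    # Steps 1 and 2: per-dimension mean subtraction and variance normalization.
    normalized = [[0.0] * n_dims for _ in range(n_frames)]
    for d in range(n_dims):
        column = [frame[d] for frame in features]
        mean = sum(column) / n_frames
        var = sum((v - mean) ** 2 for v in column) / n_frames
        std = math.sqrt(var) or 1.0  # guard against constant dimensions
        for t in range(n_frames):
            normalized[t][d] = (column[t] - mean) / std

    # Step 3: ARMA smoothing along time. Each output averages the previous
    # `order` filtered frames with the current and next `order` input frames.
    # Edge frames without full context are left as normalized values.
    out = [row[:] for row in normalized]
    for t in range(order, n_frames - order):
        for d in range(n_dims):
            past = sum(out[t - i][d] for i in range(1, order + 1))
            future = sum(normalized[t + j][d] for j in range(order + 1))
            out[t][d] = (past + future) / (2 * order + 1)
    return out
```

A weighted variant along the lines of the abstract would presumably scale each frame's contribution to the mean, variance, and smoothing sums by a speech-presence weight, but the exact form would have to be taken from the paper itself.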

Recognition Performance Improvement of Unsupervised Limabeam Algorithm using Post Filtering Technique

  • Nguyen, Dinh Cuong;Choi, Suk-Nam;Chung, Hyun-Yeol
    • IEMEK Journal of Embedded Systems and Applications / v.8 no.4 / pp.185-194 / 2013
  • In distant-talking environments, speech recognition performance degrades significantly due to noise and reverberation. Recent work by Michael L. Seltzer shows that, in microphone array speech recognition, the word error rate can be significantly reduced by adapting the beamformer weights to generate a sequence of features that maximizes the likelihood of the correct hypothesis. One way to implement this approach, called the Likelihood Maximizing Beamforming (Limabeam) algorithm, is Unsupervised Limabeam (USL), which can improve recognition performance in any environment. Our investigation of USL showed that, because the optimization depends strongly on the transcription output of the first recognition step, the output becomes unstable, which may lead to lower performance. To improve the recognition performance of USL, post-filtering techniques can be employed to obtain a more accurate transcription from the first step. In this work, as a post-filtering technique for the first recognition step of USL, we propose adding a Wiener filter combined with a Feature Weighted Mahalanobis Distance. We also suggest an alternative, more efficient way to implement the Limabeam algorithm for a Hidden Markov Network (HM-Net) speech recognizer. Speech recognition experiments performed in a real distant-talking environment confirm the efficacy of the Limabeam algorithm in the HM-Net speech recognition system and the improved performance of the proposed method.

Interference Suppression Using Principal Subspace Modification in Multichannel Wiener Filter and Its Application to Speech Recognition

  • Kim, Gi-Bak
    • ETRI Journal / v.32 no.6 / pp.921-931 / 2010
  • It has been shown that the principal subspace-based multichannel Wiener filter (MWF) provides better performance than the conventional MWF for suppressing interference in the case of a single target source. It can efficiently estimate the target speech component in the principal subspace, which estimates the acoustic transfer function up to a scaling factor. However, as the input signal-to-interference ratio (SIR) becomes lower, the principal subspace method incurs larger errors in estimating the acoustic transfer function, which degrades interference suppression. To alleviate this problem, a principal subspace modification method was proposed in previous work. The principal subspace modification reduces the estimation error of the acoustic transfer function vector at low SIRs. In this work, a frequency-band dependent interpolation technique is further employed for the principal subspace modification. A speech recognition test conducted with the Sphinx-4 system demonstrates the practical usefulness of the proposed method as front-end processing for a speech recognizer in a distant-talking environment with an interferer present.

A Phonetic Study of Korean Intervocalic Laryngeal Consonants

  • Oh, Mi-Ra;Johnson, Keith
    • Speech Sciences / v.1 / pp.83-101 / 1997
  • This paper explores a putative positional neutralization produced at the phonetics/phonology interface. The study was designed to determine whether Korean intervocalic laryngeal consonants are phonetically distant from geminates, plain consonants, or laryngeal consonants in consonant clusters. It was found that the contrast between laryngeal singletons and geminates was neutralized intervocalically, and that both patterned with heterorganic consonant sequences rather than with plain singletons.

Hands-free Speech Recognition based on Echo Canceller and MAP Estimation (에코제거기와 MAP 추정에 기초한 핸즈프리 음성 인식)

  • Sung-ill Kim;Wee-jae Shin
    • Journal of the Institute of Convergence Signal Processing / v.4 no.3 / pp.15-20 / 2003
  • In applications such as teleconferencing or telecommunication systems that use a distant-talking hands-free microphone, the near-end speech signal to be transmitted is disturbed by ambient noise and by an echo caused by the coupling between the microphone and the loudspeaker. Furthermore, environmental noise, including channel distortion and additive noise, is assumed to affect the original input speech. In this paper, a new approach using an echo canceller and maximum a posteriori (MAP) estimation is introduced to improve the accuracy of hands-free speech recognition. The proposed system was shown to be effective for hands-free speech recognition in ambient noise environments that include echo. The experimental results also showed that the combination of the echo canceller and the MAP environmental adaptation technique adapted well to echo and noise environments.

Noise robust distant sound recognition (잡음 환경에 강인한 원거리 음향 정보 검출 기술 연구)

  • Yoo, In-Chul;Yook, Dong-Suk
    • Proceedings of the KSPS conference / 2007.05a / pp.37-38 / 2007
  • This paper reviews the issues in implementing sound recognizers in real environments. The first is signal corruption caused by background noise and reverberation. The second is the open-set problem, that is, the problem of rejecting out-of-vocabulary words and noises. Both issues must be solved to build noise-robust recognizers.

Performance Improvement in Distant-Talking Speech Recognition by an Integration of N-best results using Naive Bayesian Network (다채널 마이크 환경에서 Naive Bayesian Network의 Decision에 의한 음성인식 성능향상)

  • Ji, Mi-kyong;Kim, Hoi-Rin
    • Proceedings of the KSPS conference / 2005.11a / pp.151-154 / 2005
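  • We aim to control surrounding devices such as TVs and lighting by voice, using distant microphones distributed throughout a room in a multichannel microphone environment, which is essential for improving the recognition rate in distant speech recognition. To obtain the best result by integrating the recognition results of the individual channels, we train a Bayesian network on each channel's N-best results and on the frame-normalized likelihood values of the hypotheses in those N-best results. The network is then used to integrate the recognition results and decide on the best one, which improves the performance of distant speech recognition and suggests a direction for making hands-free applications practical.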

Efficient Acoustic Echo Cancellation System for Distant-Talking Automatic Speech Recognition (원거리 음성 인식을 위한 효율적인 에코제거 시스템)

  • Kim, Ki-Beom;Kim, Sang-Yoon;Lee, Woo-Jung;Kwon, Min-Seok;Ko, Byeong-Seob
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2014.10a / pp.150-155 / 2014
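  • In this paper, we propose a fast and efficient echo cancellation system based on subband filtering for distant speech recognition. The proposed system first applies spatial decorrelation to prevent the adaptive filter from malfunctioning when the inter-channel correlation is high. It then adopts a subband structure based on a tree-structured IIR filterbank, so that effective analysis and synthesis filtering can be performed with a low filter order. The spectral aliasing between subbands that inevitably arises in this process can be resolved by applying a notch filter. The improved proportionate normalized least-mean-square (IP-NLMS) algorithm is used as the adaptive filter and was confirmed to perform well in both convergence speed and echo cancellation. Finally, a residual echo suppressor based on decision-directed estimation is applied to remove the residual echo. This paper introduces the theoretical background behind each stage and demonstrates the effectiveness of the proposed echo cancellation system in an environment with real echo, in terms of ERLE, distant speech recognition rate, and computational complexity.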
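
The adaptive filtering stage and the ERLE metric mentioned in this abstract can be illustrated with a small full-band sketch. It shows only a generic IP-NLMS update in its commonly published form and an echo return loss enhancement (ERLE) calculation. The spatial decorrelation, IIR filterbank, notch filtering, and residual echo suppressor of the proposed system are not modeled. The function names, the tap count, and the values of `mu`, `alpha`, `delta`, and `eps` are all assumptions chosen for illustration, not settings from the paper.

```python
import math


def ipnlms_echo_canceller(far_end, mic, taps=8, mu=0.5, alpha=0.0,
                          delta=1e-6, eps=1e-8):
    """Estimate an echo path with IP-NLMS; return (weights, error signal)."""
    h = [0.0] * taps    # adaptive filter weights (echo path estimate)
    buf = [0.0] * taps  # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        echo_estimate = sum(hi * xi for hi, xi in zip(h, buf))
        e = d - echo_estimate
        # Proportionate gains: a uniform NLMS-like term plus a term that
        # grows with the magnitude of each coefficient.
        l1 = sum(abs(hi) for hi in h)
        gains = [(1 - alpha) / (2 * taps)
                 + (1 + alpha) * abs(hi) / (2 * l1 + eps) for hi in h]
        # Normalized, gain-weighted update.
        norm = sum(g * xi * xi for g, xi in zip(gains, buf)) + delta
        h = [hi + mu * e * g * xi / norm
             for hi, g, xi in zip(h, gains, buf)]
        errors.append(e)
    return h, errors


def erle_db(mic, errors):
    """ERLE in dB: microphone power over residual power after cancellation."""
    num = sum(d * d for d in mic)
    den = sum(e * e for e in errors) + 1e-12  # avoid division by zero
    return 10 * math.log10(num / den)
```

To try it, one could convolve a white-noise far-end signal with a short synthetic echo path to form `mic`, run `ipnlms_echo_canceller`, and evaluate `erle_db` over the final samples once the filter has converged.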

Treatment Outcome of Supraglottic Partial Laryngectomy and Neck Dissection for Supraglottic Carcinoma (성문상부암종에서 성문상 후두부분절제술과 경부청소술의 치료성적)

  • Tae, Kyung;Min, Hyun-Jung;Song, Mi-Na;Shin, Kwang-Soo;Lee, Seung-Hwan;Kim, Kyung-Rae;Lee, Hyung-Seok
    • Korean Journal of Head & Neck Oncology / v.23 no.1 / pp.15-20 / 2007
  • Background and Objectives: Supraglottic partial laryngectomy is an oncologically sound surgical procedure for selected cases of laryngeal cancer that maintains physiologic speech and swallowing without a permanent tracheostoma. The purpose of this study is to evaluate the oncologic and functional results of supraglottic partial laryngectomy and neck dissection for supraglottic cancer. Materials and Methods: Twenty-three supraglottic cancer patients who underwent supraglottic partial laryngectomy between 1991 and 2005 were studied retrospectively. There were 5 patients with cT1, 14 with cT2, and 4 with cT3 disease, and 11 patients with cN0, 1 with cN1, 10 with cN2, and 1 with cN3 disease. All patients underwent neck dissection, and postoperative radiotherapy was added in twenty patients. They were reviewed with respect to primary subsites, extended subsites, treatment result, survival rate, factors affecting the prognosis, postoperative complications, time to decannulation and oral diet, and postoperative voice. Results: Among the eleven patients with clinically negative nodes, six had pathologically positive nodes, giving an occult metastasis rate of 54.5%. Two patients had recurrence in the cervical lymph nodes and one had distant metastasis to the lung. Local and regional control rates were 100% and 91.3%, respectively. The overall 3-year and 5-year survival rates were 84% and 78%, respectively. Nineteen cases were squamous cell carcinomas and four were basaloid squamous cell carcinomas. The basaloid subtype significantly affected survival. Decannulation and oral feeding were possible in 100% of patients. Conclusions: Supraglottic partial laryngectomy is an oncologically safe and functionally good procedure for supraglottic cancers. Elective neck dissection is beneficial in the management of occult cervical metastasis.