• Title/Abstract/Keywords: Distant Speech


TMS320VC5510 DSK를 이용한 음성인식 로봇 (The Robot Speech Recognition using TMS320VC5510 DSK)

  • 최지현;정익주
    • 산업기술연구 / Vol. 27, No. A / pp. 211-218 / 2007
  • As demand for human-robot interaction increases, robots are expected to be equipped with human-like intelligence. For natural communication in particular, hearing is so essential that speech recognition technology for robots is becoming more important. In this paper, we implement a speech recognizer suitable for robot applications. One of the major problems in robot speech recognition is the poor quality of speech captured when a speaker talks at a distance from the microphone mounted on the robot. To cope with this problem, we use wireless transmission of commands recognized by a speech recognizer implemented on the TMS320VC5510 DSK. In addition, since the TMS320VC5510 DSP is a fixed-point device, we present an efficient realization of the HMM algorithm using fixed-point arithmetic.


에너지 기반 가중치를 이용한 음성 특징의 자동회귀 이동평균 필터링 (ARMA Filtering of Speech Features Using Energy Based Weights)

  • 반성민;김형순
    • 한국음향학회지 / Vol. 31, No. 2 / pp. 87-92 / 2012
  • In this paper, a robust feature compensation method for dealing with environmental mismatch is proposed. The proposed method applies energy-based weights, set according to the degree of speech presence, to Mean subtraction, Variance normalization, and ARMA filtering (MVA) processing. The weights are further smoothed by moving-average and maximum filters. The proposed feature compensation algorithm is evaluated on the AURORA 2 task and on a distant-talking experiment using a robot platform; compared with MVA processing, it reduces the error rate by 14.4% on the AURORA 2 task and by 44.9% in the distant-talking experiment.
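
  For orientation, the baseline MVA chain named in this abstract can be sketched as follows. This is a minimal illustration of plain MVA on a single feature track, assuming the commonly used ARMA form in which each output averages the previous filtered outputs with the current and upcoming inputs. The paper's energy-based weights are not reproduced, and the function name `mva` and parameter `order` are hypothetical.

```python
import math


def mva(track, order=2):
    """Mean subtraction, variance normalization, then ARMA smoothing.

    `track` is a non-empty list holding one feature dimension over time.
    `order` is the ARMA order (frames of context on each side).
    """
    n = len(track)
    mean = sum(track) / n
    var = sum((x - mean) ** 2 for x in track) / n
    std = math.sqrt(var) or 1.0  # a constant track would otherwise divide by zero
    norm = [(x - mean) / std for x in track]

    out = list(norm)  # the first and last `order` frames are left unfiltered
    for t in range(order, n - order):
        past = sum(out[t - order:t])          # already-filtered outputs (AR part)
        future = sum(norm[t:t + order + 1])   # current and upcoming inputs (MA part)
        out[t] = (past + future) / (2 * order + 1)
    return out
```

  Calling `mva(frames, order=2)` on each cepstral dimension separately gives the unweighted baseline against which the abstract reports its error rate reductions.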

Recognition Performance Improvement of Unsupervised Limabeam Algorithm using Post Filtering Technique

  • Nguyen, Dinh Cuong;Choi, Suk-Nam;Chung, Hyun-Yeol
    • 대한임베디드공학회논문지 / Vol. 8, No. 4 / pp. 185-194 / 2013
  • In distant-talking environments, speech recognition performance degrades significantly due to noise and reverberation. Recent work by Michael L. Seltzer shows that in microphone array speech recognition, the word error rate can be significantly reduced by adapting the beamformer weights to generate a sequence of features that maximizes the likelihood of the correct hypothesis. One way to implement this approach, called the Likelihood Maximizing Beamforming (Limabeam) algorithm, is Unsupervised Limabeam (USL), which can improve recognition performance in any environment. Our investigation of USL showed that, because the optimization depends strongly on the transcription output of the first recognition step, the output becomes unstable, which may lower performance. To improve the recognition performance of USL, post-filtering techniques can be employed to obtain a more accurate transcription from the first step. In this work, as a post-filtering technique for the first recognition step of USL, we propose adding a Wiener filter combined with Feature Weighted Mahalanobis Distance. We also suggest an alternative, more efficient way to implement the Limabeam algorithm for a Hidden Markov Network (HM-Net) speech recognizer. Speech recognition experiments performed in a real distant-talking environment confirm the efficacy of the Limabeam algorithm in an HM-Net speech recognition system and the improved performance of the proposed method.

Interference Suppression Using Principal Subspace Modification in Multichannel Wiener Filter and Its Application to Speech Recognition

  • Kim, Gi-Bak
    • ETRI Journal / Vol. 32, No. 6 / pp. 921-931 / 2010
  • It has been shown that the principal subspace-based multichannel Wiener filter (MWF) provides better performance than the conventional MWF for suppressing interference in the case of a single target source. It can efficiently estimate the target speech component in the principal subspace, which estimates the acoustic transfer function up to a scaling factor. However, as the input signal-to-interference ratio (SIR) becomes lower, the principal subspace method incurs larger errors in estimating the acoustic transfer function, degrading interference suppression performance. To alleviate this problem, a principal subspace modification method was proposed in previous work. The principal subspace modification reduces the estimation error of the acoustic transfer function vector at low SIRs. In this work, a frequency-band-dependent interpolation technique is further employed for the principal subspace modification. A speech recognition test conducted with the Sphinx-4 system demonstrates the practical usefulness of the proposed method as front-end processing for a speech recognizer in a distant-talking environment with an interferer present.

A Phonetic Study of Korean Intervocalic Laryngeal Consonants

  • Oh, Mi-Ra;Johnson, Keith
    • 음성과학 / Vol. 1 / pp. 83-101 / 1997
  • This paper explores a putative positional neutralization produced at the phonetics/phonology interface. The study was designed to determine whether Korean intervocalic laryngeal consonants are phonetically distant from geminates, plain consonants, or laryngeal consonants in consonant clusters. It was found that the contrast between laryngeal singletons and geminates was neutralized intervocalically, and that both patterned with heterorganic consonant sequences rather than with plain singletons.


에코제거기와 MAP 추정에 기초한 핸즈프리 음성 인식 (Hands-free Speech Recognition based on Echo Canceller and MAP Estimation)

  • Sung-ill Kim;Wee-jae Shin
    • 융합신호처리학회논문지 / Vol. 4, No. 3 / pp. 15-20 / 2003
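  • In some applications that use hands-free microphones, such as teleconferencing and telecommunication systems, the speech signal is easily distorted not only by ambient noise but also by echo caused by coupling between the microphone and the loudspeaker. Moreover, environmental noise, including channel distortion and additive noise, is considered to affect the original input speech signal. In this paper, we introduce a new approach that uses an echo canceller and maximum a posteriori (MAP) estimation to improve the recognition rate for such hands-free speech. We show that the proposed system is effective for hands-free speech recognition in ambient noise environments that include echo. Experimental results also show that the system combining the echo canceller with MAP environment adaptation adapts well to echo and noise environments.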


잡음 환경에 강인한 원거리 음향 정보 검출 기술 연구 (Noise robust distant sound recognition)

  • 유인철;육동석
    • 대한음성학회:학술대회논문집 / 대한음성학회 2007년도 한국음성과학회 공동학술대회 발표논문집 / pp. 37-38 / 2007
  • This paper reviews the issues in implementing sound recognizers in real environments. First is the signal corruption caused by background noises and reverberation. Second is the open-set problem which is the problem of rejecting out-of-vocabulary words and noises. These two issues must be solved for noise robust recognizers.


다채널 마이크 환경에서 Naive Bayesian Network의 Decision에 의한 음성인식 성능향상 (Performance Improvement in Distant-Talking Speech Recognition by an Integration of N-best results using Naive Bayesian Network)

  • 지미경;김희린
    • 대한음성학회:학술대회논문집 / 대한음성학회 2005년도 추계 학술대회 발표논문집 / pp. 151-154 / 2005
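  • In a multichannel microphone environment, which is essential for improving recognition rates in distant-talking speech recognition, we aim to control surroundings such as TVs and lighting by voice, using distant microphones distributed throughout the room. To obtain the best result by integrating the recognition results of each channel, we train a Bayesian network on each channel's N-best results and on the frame-normalized likelihood values of the hypotheses in those N-best results, and use it to integrate the recognition results and decide on the best one. This improves distant-talking speech recognition performance and suggests a direction for making hands-free applications practical.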


원거리 음성 인식을 위한 효율적인 에코제거 시스템 (Efficient Acoustic Echo Cancellation System for Distant-Talking Automatic Speech Recognition)

  • 김기범;김상윤;이우정;권민석;고병섭
    • 한국소음진동공학회:학술대회논문집 / 한국소음진동공학회 2014년도 추계학술대회 논문집 / pp. 150-155 / 2014
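  • In this paper, we propose a fast and efficient echo cancellation system based on subband filtering for distant-talking speech recognition. The proposed system first applies spatial decorrelation to prevent the adaptive filter from malfunctioning when inter-channel correlation is high. It then adopts a subband structure based on a tree-structured IIR filterbank, so that analysis and synthesis filtering can be performed effectively even at low filter order. The spectral aliasing between subbands that inevitably arises in this process can be resolved by applying a notch filter. As the adaptive filter, we use the improved proportionate normalized least-mean-square (IP-NLMS) algorithm and confirm that it is superior in convergence speed and echo cancellation performance. Finally, a residual echo suppressor based on decision-directed estimation is applied to remove the residual echo. We present the theoretical background behind each stage and demonstrate the effectiveness of the proposed system in an environment with real echo, in terms of ERLE, distant-talking speech recognition rate, and computational complexity.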
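
  The adaptive-filter stage named in this abstract can be illustrated with a small full-band sketch of the improved proportionate NLMS update. It is not the paper's system: the spatial decorrelation, IIR subband filterbank, notch filtering, and residual echo suppressor are all omitted. The function name `ipnlms`, its parameters, and the synthetic echo path below are assumptions made for illustration only.

```python
import random


def ipnlms(far_end, mic, taps=8, mu=0.5, alpha=0.0, eps=1e-6):
    """Return (estimated echo path, residual signal) for a full-band IPNLMS filter."""
    h = [0.0] * taps
    buf = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        e = d - sum(hi * xi for hi, xi in zip(h, buf))  # a priori error
        l1 = sum(abs(hi) for hi in h)
        # Proportionate gains: larger taps adapt faster; alpha = -1 reduces to NLMS.
        k = [(1 - alpha) / (2 * taps) + (1 + alpha) * abs(hi) / (2 * l1 + eps)
             for hi in h]
        norm = sum(ki * xi * xi for ki, xi in zip(k, buf)) + eps
        h = [hi + mu * e * ki * xi / norm for hi, ki, xi in zip(h, k, buf)]
        residual.append(e)
    return h, residual


# Synthetic, noise-free check with a hypothetical sparse echo path.
random.seed(0)
true_path = [0.0, 0.0, 0.8, 0.0, 0.0, -0.3, 0.0, 0.0]
far_end = [random.uniform(-1.0, 1.0) for _ in range(5000)]
mic = [sum(true_path[k] * far_end[n - k]
           for k in range(len(true_path)) if n - k >= 0)
       for n in range(len(far_end))]
h_est, residual = ipnlms(far_end, mic)
```

  In this toy setting the residual energy should shrink as `h_est` approaches `true_path`; a real evaluation would instead report ERLE and recognition rate on recorded echo, as the abstract describes.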


성문상부암종에서 성문상 후두부분절제술과 경부청소술의 치료성적 (Treatment Outcome of Supraglottic Partial Laryngectomy and Neck Dissection for Supraglottic Carcinoma)

  • 태경;민현정;송미나;신광수;이승환;김경래;이형석
    • 대한두경부종양학회지 / Vol. 23, No. 1 / pp. 15-20 / 2007
  • Background and Objectives: Supraglottic partial laryngectomy is an oncologically sound surgical procedure for selected cases of laryngeal cancer that maintains physiologic speech and swallowing without a permanent tracheostoma. The purpose of this study is to evaluate the oncologic and functional results of supraglottic partial laryngectomy and neck dissection for supraglottic cancer. Materials and Methods: Twenty-three patients with supraglottic cancer who underwent supraglottic partial laryngectomy between 1991 and 2005 were studied retrospectively. There were 5 patients with cT1, 14 with cT2, and 4 with cT3 disease; 11 patients had cN0, 1 had cN1, 10 had cN2, and 1 had cN3 disease. All patients underwent neck dissection, and postoperative radiotherapy was added in twenty patients. They were reviewed with respect to primary subsites, extended subsites, treatment result, survival rate, factors affecting the prognosis, postoperative complications, time to decannulation and oral diet, and postoperative voice. Results: Among the eleven patients with clinically negative nodes, six had pathologically positive nodes, giving an occult metastasis rate of 54.5%. Two patients had recurrence in the cervical lymph nodes and one had distant metastasis to the lung. Local and regional control rates were 100% and 91.3%, respectively. The overall 3-year and 5-year survival rates were 84% and 78%, respectively. Nineteen cases were squamous cell carcinomas and four were basaloid squamous cell carcinomas. The basaloid subtype significantly affected survival. Decannulation and oral feeding were possible in 100% of patients. Conclusions: Supraglottic partial laryngectomy is an oncologically safe and functionally good procedure for supraglottic cancers. Elective neck dissection is beneficial in the management of occult cervical metastasis.