• Title/Abstract/Keywords: Simulated speech

Search results: 70 items (processing time: 0.022 s)

난청인의 주파수 선택도와 비대칭적 청각 필터를 고려한 난청 시뮬레이터 개발에 관한 연구 (A Study on Development of a Hearing Impairment Simulator considering Frequency Selectivity and Asymmetrical Auditory Filter of the Hearing Impaired)

  • 주상익;강현덕;송영록;이상민
    • 전기학회논문지 / Vol. 59, No. 4 / pp. 831-840 / 2010
  • In this paper, we propose a hearing impairment simulator that accounts for the reduced frequency selectivity and asymmetrical auditory filters of the hearing impaired, and we verify through experiments that both affect speech perception. Reduced frequency selectivity is implemented by spectral smearing using LPC (linear predictive coding). The shape of the auditory filter is asymmetrical and differs at each center frequency. In a person with hearing loss, the filter shape changes differently from that of normal-hearing listeners, and speech passed through the filter differs in quality. The experiments comprised subjective and objective tests. The subjective experiments consisted of four tests: a pure tone test, an SRT (speech reception threshold) test, a WRS (word recognition score) test without spectral smearing, and a WRS test with spectral smearing. The simulator was tested on 9 subjects with normal hearing. The amount of spectral smearing was controlled by the LPC order. The asymmetrical auditory filter of the proposed simulator was then simulated, and tests were performed to evaluate the filter's performance objectively. The objective evaluation used PESQ (perceptual evaluation of speech quality) and LLR (log likelihood ratio) on speech passed through the auditory filter to assess the objective quality and distortion of the processed speech. When hearing loss was applied, the PESQ and LLR values differed greatly depending on the asymmetrical auditory filter in the hearing impairment simulator.
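
The abstract states that the amount of spectral smearing is controlled by the LPC order but gives no algorithm. As a rough illustration only, the sketch below computes LPC coefficients from an autocorrelation sequence with the Levinson-Durbin recursion; a lower `order` gives a smoother spectral envelope, which is one plausible reading of how the order controls smearing. The function name `levinson_durbin` and the toy autocorrelation values are assumptions, not taken from the paper.

```python
def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, err): a = [1, a1, ..., a_order] is the prediction-error
    filter and err is the final prediction error power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update a[1..i-1] from the previous stage's values, then append k
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err


# Toy autocorrelation of a first-order process with pole 0.5 (assumed values)
r = [1.0, 0.5, 0.25]
coeffs, err = levinson_durbin(r, order=2)
```

For this toy input the second-order coefficient comes out as zero, since a first-order predictor already explains the sequence.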

난청인의 난청 정도에 따른 비대칭 청각 필터 구현의 객관적 평가 (An Objective Estimation for Simulating of Asymmetrical Auditory Filter of the Hearing Impaired According to Hearing Loss Degree)

  • 주상익;전유용;송영록;이상민
    • 재활복지공학회논문지 / Vol. 3, No. 1 / pp. 27-34 / 2009
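  • Because hearing loss varies from one hearing-impaired person to another, the conventional approach of implementing symmetrical auditory filters for each frequency band cannot adequately simulate the various forms of hearing loss. The shape of the auditory filter changes asymmetrically with each center frequency and with the input level of the speech; in a person with hearing loss, the filter shape changes, according to the loss, into a form different from that of normal-hearing listeners, and the perceived sound quality differs as well. In this study, we implemented auditory filters that reflect the asymmetrical auditory characteristics that vary with the degree of hearing loss, and objectively evaluated the performance of each implemented filter through several experiments. The experiments used the perceptual evaluation of speech quality (PESQ) and the log likelihood ratio (LLR) of speech passed through the implemented filters, and these values were used to assess the objective quality and degree of distortion of the processed speech. When hearing loss was applied, the PESQ and LLR values showed a large difference between the symmetrical and asymmetrical auditory filters. These results show that speech quality is affected by whether the auditory filter shape is symmetrical or asymmetrical. In particular, when hearing loss is present, the asymmetrical change in the auditory filter shape at each center frequency affected the sound quality perceived by the hearing-impaired listener.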
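
The abstract does not give a formula for the LLR. For orientation, the definition commonly used in objective speech-quality evaluation is shown below; whether the paper uses exactly this form is an assumption. Here $\mathbf{a}_c$ and $\mathbf{a}_p$ are the LPC coefficient row vectors of the reference (unprocessed) and processed speech frames, and $\mathbf{R}_c$ is the autocorrelation matrix of the reference frame.

$$
d_{\mathrm{LLR}}(\mathbf{a}_p, \mathbf{a}_c) = \log \frac{\mathbf{a}_p \mathbf{R}_c \mathbf{a}_p^{T}}{\mathbf{a}_c \mathbf{R}_c \mathbf{a}_c^{T}}
$$

Under this definition a value of 0 means the two frames have identical LPC envelopes, and larger values indicate more spectral distortion.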

모의 지능로봇에서의 음성 감정인식 (Speech Emotion Recognition on a Simulated Intelligent Robot)

  • 장광동;김남;권오욱
    • 대한음성학회지:말소리 / No. 56 / pp. 173-183 / 2005
  • We propose a speech emotion recognition method for an affective human-robot interface. In the proposed method, emotion is classified into 6 classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We recorded speech commands and dialogs uttered 2 m away from the microphones in 5 different directions. Experimental results show that the proposed method yields 48% classification accuracy, while human classifiers give 71% accuracy.

16Kbps와 40Kbps의 Dual Rate G.723 ADPCM 음성 codec 구현 (Implementation of Dual Rate G.723 ADPCM Speech codec)

  • 김재오;한경호
    • 대한전기학회:학술대회논문집 / 대한전기학회 1998 Summer Conference Proceedings G / pp. 2480-2482 / 1998
  • This paper describes the implementation of a dual-rate ADPCM codec using the G.723 16 Kbps and 40 Kbps speech coding algorithms. For small signals, the low-rate 16 Kbps algorithm shows the same SNR as the high-rate 40 Kbps algorithm, whereas for large signals the 16 Kbps algorithm shows a lower SNR than the 40 Kbps algorithm. To obtain a good trade-off between data rate and synthesized speech quality, we applied the low 16 Kbps rate to small signals and the high 40 Kbps rate to large signals. Various threshold values for the rate decision were tested to find a good trade-off between data rate and speech quality. The low-pass filtering effect of speech input and output devices was also simulated at several cut-off frequencies. The simulation results show good speech quality at a low rate compared with the 16 Kbps and 40 Kbps rates.
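
The rate decision described in the abstract can be sketched as a per-frame threshold test. The abstract does not say how signal size is measured or what threshold values were tried, so the peak-amplitude measure, the function names `choose_rates` and `average_rate`, and the sample values below are all assumptions made for illustration.

```python
def choose_rates(frames, threshold, low_kbps=16, high_kbps=40):
    """Pick a coding rate per frame from its peak absolute amplitude."""
    rates = []
    for frame in frames:
        peak = max(abs(s) for s in frame)
        # Small signals get the low rate, large signals the high rate
        rates.append(low_kbps if peak < threshold else high_kbps)
    return rates


def average_rate(rates):
    """Mean bit rate over all frames, in Kbps."""
    return sum(rates) / len(rates)


# Four toy frames: only the second one exceeds the threshold
frames = [[10, -20, 15], [900, -1200, 400], [5, 3, -8], [30, -25, 12]]
rates = choose_rates(frames, threshold=100)
avg = average_rate(rates)
```

Raising `threshold` moves more frames to the low rate, which lowers the average rate at the cost of SNR on larger signals; this is the trade-off the abstract says was explored.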

Wiener Filtering을 이용한 잡음환경에서의 음성인식 (Speech Recognition in Noisy Environments using Wiener Filtering)

  • 김진영;엄기완;최홍섭
    • 음성과학 / Vol. 1 / pp. 277-283 / 1997
  • In this paper, we present a robust recognition algorithm based on the Wiener filtering method as a research tool for developing a Korean speech recognition system. We applied Wiener filtering in the cepstrum domain in particular, because the frequency-domain method is computationally expensive and complex. The effectiveness of this method was evaluated on speaker-independent isolated Korean digit recognition tasks using discrete HMM speech recognition systems. In these tasks, we used 12th-order weighted cepstral coefficients as the feature vector and added computer-simulated white Gaussian noise of different levels to clean speech signals for recognition experiments under noisy conditions. Experimental results show that the presented algorithm improves recognition by 5% to 20% in comparison with the spectral subtraction method.
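
The noisy test conditions mentioned in the abstract (simulated white Gaussian noise added to clean speech at different levels) can be reproduced in outline as follows. This sketch covers only the noise-mixing step, not the cepstrum-domain Wiener filter; the function names `add_white_noise` and `measure_snr_db`, the sine-wave stand-in for speech, and the 10 dB target are assumptions.

```python
import math
import random


def add_white_noise(clean, target_snr_db, seed=0):
    """Return clean + white Gaussian noise scaled to the requested SNR."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    p_signal = sum(s * s for s in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Choose gain so that p_signal / (gain^2 * p_noise) equals the target SNR
    gain = math.sqrt(p_signal / (p_noise * 10 ** (target_snr_db / 10)))
    return [s + gain * n for s, n in zip(clean, noise)]


def measure_snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    p_signal = sum(s * s for s in clean)
    p_noise = sum((y - s) ** 2 for s, y in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)


# A sine wave stands in for a clean speech signal
clean = [math.sin(2 * math.pi * 5 * n / 100) for n in range(1000)]
noisy = add_white_noise(clean, target_snr_db=10)
```

Because the gain is computed from the measured noise power rather than its nominal variance, the resulting SNR matches the target up to rounding.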

자동차 환경에서의 노이즈 DB 및 한국어 음성 DB 구축 (Creation and Assessment of Korean Speech and Noise DB in Car Environments)

  • 이광현;김봉완;이용주
    • 대한음성학회지:말소리 / No. 48 / pp. 141-153 / 2003
  • Research into robust recognition in noisy environments, especially in car environments, is being carried out actively in the speech community. In this paper we report on three types of corpora that SiTEC (Speech Information TEchnology & industry promotion Center) has created for research into speech recognition in car noise environments. The first consists of recordings of 900 native Korean speakers, distributed according to gender, age, and region, who uttered application words in car environments. The second is a collection of mixed noise recorded in 3 car types, by model, under various noise patterns obtained with the engine on or off, at different driving speeds, and in different road conditions with windows open or closed. The third consists of recordings of simulated speech produced by a HATS (Head and Torso Simulator) in car environments with internal and external noise factors added. All three types of recordings were made through 8 synchronized microphone channels fixed in the car. The creation and applications of these corpora are reported in detail.

배경잡음을 고려한 4배 가변 압축률을 갖는 ADPCM의 C6000 DSP 실시간 구현 (Implementation of Quad Variable Rates ADPCM Speech CODEC on C6000 DSP considering the Environmental Noise)

  • 김대성;한경호
    • 전력전자학회:학술대회논문집 / 전력전자학회 2002 Power Electronics Conference Proceedings / pp. 727-729 / 2002
  • In this paper, we propose a quad variable-rate ADPCM coding method and its implementation on a C6000 DSP. The method is modified from the standard ITU G.726 ADPCM to improve speech quality under environmental noise. Four coding rates, 16 Kbps, 24 Kbps, 32 Kbps, and 40 Kbps, are used for the speech window samples, and the rate decision threshold is set by the environmental noise level. The aim of the proposed method is to reduce the coding rate while retaining speech quality: under environmental noise, the speech quality is close to that of a 40 Kbps single-rate coder at a coding rate close to that of a 16 Kbps single-rate coder. The environmental noise level affects the coding rate and is calculated for every speech window. At high noise levels, more samples are coded at higher rates to enhance quality; at low noise levels, only large speech signals are coded at higher rates and more speech samples are coded at lower rates to reduce the coding rate. The influence of noise on the speech signal is considerably higher for small signals, and small signals have a higher ZCR (zero crossing rate). The method was simulated on a PC and is to be implemented on a C6000 floating-point DSP board for real-time operation.
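
The abstract mentions the zero crossing rate (ZCR) as a property of small signals but does not say how it enters the rate decision. The sketch below only shows how a per-window ZCR is typically computed; the function name `zero_crossing_rate` and the two toy windows are assumptions.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


# A slowly varying window crosses zero once; a noise-like one crosses every sample
smooth = [1, 2, 3, 2, 1, -1, -2, -3, -2, -1]
noise_like = [1, -1, 2, -2, 1, -1, 2, -1, 1, -2]
zcr_smooth = zero_crossing_rate(smooth)
zcr_noise_like = zero_crossing_rate(noise_like)
```

A window dominated by low-level noise tends toward the second pattern, which is consistent with the abstract's remark that small signals have a higher ZCR.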

음성인식 시스템에서의 원격 음성입력기의 성능평가 (A Performance of a Remote Speech Input Unit in Speech Recognition System)

  • 이광석
    • 한국정보통신학회:학술대회논문집 / 한국해양정보통신학회 2009 Fall Conference / pp. 723-726 / 2009
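  • In this study, we ran simulations to evaluate an error-reduction algorithm for speech signals, based on microphone-array beamforming in a speech recognition system, and analyzed its performance. We also obtained the maximum signal-to-noise ratio for each channel from the speech signals acquired by the microphone array, and compared the signal-to-noise ratios across speech signals. The speech recognition rate improved from 54.2% to 61.4% in case 1, and from 41.2% to 50.5% in case 2, which had a lower signal-to-noise ratio. Accordingly, the average error reduction rate in case 1 was 15.7%.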
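
The reported 15.7% is consistent with the usual relative error reduction computed from the case 1 recognition rates. The abstract does not state its formula, so the definition used below (reduction in error rate divided by the original error rate) is an assumption, and the function name `error_reduction_rate` is made up for this sketch.

```python
def error_reduction_rate(acc_before, acc_after):
    """Relative reduction in error rate (percent) from accuracies in percent."""
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return 100.0 * (err_before - err_after) / err_before


case1 = error_reduction_rate(54.2, 61.4)  # (45.8 - 38.6) / 45.8
case2 = error_reduction_rate(41.2, 50.5)  # (58.8 - 49.5) / 58.8
print(round(case1, 1), round(case2, 1))   # prints: 15.7 15.8
```

Under the same assumed definition, case 2 gives about 15.8%, a figure the abstract does not report.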

한국어 말하기 평가에서 '담화 능력' 등급 기술을 위한 기초 연구 -'부탁'에 대한 '거절하기' 과제를 중심으로- (A Basic Study on the Development of a Grading Scale of Discourse Competence in Korean Speaking Assessment -Focusing on the Scale of 'REFUSAL' Task)

  • 이혜용;이향
    • 한국어교육 / Vol. 29, No. 3 / pp. 255-292 / 2018
  • Most grading scales of Korean language proficiency tests are based on existing grading scales that are not empirically verified. The purpose of this study is to develop an empirically verified scale descriptor. The 'Performance data-driven approach' that is suggested by Fulcher (1987) was used to develop the detailed description of characteristics for each level of performance. This study is focused on the functional phase of speech samples analysis (coding data) to create explanatory categories of discourse skills into which individual observations of speech phenomena can be scored. The speech samples that were collected through this study demonstrated stages of speech that can be a foundation of a grading scale. The data used in the study was collected from 23 native speakers of Korean. Speech samples were recorded from simulated speaking tests using the 'REFUSAL' task, and transcribed for analysis. The transcript was analyzed using discourse analysis. The result showed that the 'REFUSAL' task needs to go through four functional phases in actual communication. Furthermore, this study found specific and detailed explanatory categories of discourse competence based on the actual native speaker's speech data. Such findings are expected to contribute to the development of more valid and reliable speaking assessment.

TMS320C30을 이용한 단일채널 적응잡음제거기 구현 (Implementation of the single channel adaptive noise canceller using TMS320C30)

  • 정성윤;우세정;손창희;배건성
    • 음성과학 / Vol. 8, No. 2 / pp. 73-81 / 2001
  • In this paper, we focus on the real-time implementation of a single-channel adaptive noise canceller (ANC) using the TMS320C30 EVM board. The implemented canceller is based on a reference paper [1], in which it is simulated using the recursive average magnitude difference function (AMDF), which yields a properly delayed input speech signal on a sample basis as the reference signal, together with the normalized least mean square (NLMS) algorithm. To verify the results of the real-time implementation, we measured the processing time of the ANC and the enhancement ratio at various signal-to-noise ratios (SNRs). Experimental results demonstrate that the processing time for a 32 ms speech signal, with delay estimation every 10 samples, is about 26.3 ms, and that the implemented system achieves almost the same performance as reported in [1].
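
The abstract names the NLMS algorithm without giving details. The following is a minimal generic NLMS sketch, applied here to a toy system-identification problem rather than to the paper's single-channel setup with an AMDF-delayed reference. The function name `nlms`, the step size `mu`, the regularizer `eps`, and the 2-tap test system are all assumptions.

```python
import random


def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR filter so that its output for input x tracks d."""
    w = [0.0] * num_taps      # filter weights
    buf = [0.0] * num_taps    # most recent input samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y
        # Normalizing by input energy makes the step size scale-independent
        norm = eps + sum(b * b for b in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors


# Toy problem: recover a known 2-tap system from noiseless input/output data
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_h = [0.6, -0.3]
d = [true_h[0] * x[n] + (true_h[1] * x[n - 1] if n > 0 else 0.0)
     for n in range(len(x))]
w, errors = nlms(x, d, num_taps=2)
```

In a noise canceller the error sequence, not the weights, is the quantity of interest; the weights are checked here only because the toy system is known.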
