• Title/Summary/Keyword: Noise speech data

Effective Feature Vector for Isolated-Word Recognizer using Vocal Cord Signal (성대신호 기반의 명령어인식기를 위한 특징벡터 연구)

  • Jung, Young-Giu;Han, Mun-Sung;Lee, Sang-Jo
    • Journal of KIISE: Software and Applications / v.34 no.3 / pp.226-234 / 2007
  • In this paper, we develop a speech recognition system using a throat microphone. This kind of microphone minimizes the impact of environmental noise. However, because high frequencies are absent and formant frequencies are partially lost, previous systems built on such devices have shown lower recognition rates than systems that use standard microphone signals. As a result, researchers have used throat microphone signals mainly as supplementary data supporting standard microphone signals. We present a high-performance ASR system developed using only a throat microphone, taking advantage of Korean Phonological Feature Theory and a detailed throat signal analysis. Analyzing the spectrum and the FFT of the throat microphone signal, we find that the conventional MFCC feature vector, which uses a critical pass filter, does not characterize throat microphone signals well. We also describe the conditions a feature extraction algorithm must meet to suit throat microphone signal analysis: (1) a sensitive band-pass filter and (2) a feature vector suitable for voice/non-voice classification. We experimentally show that the ZCPA algorithm, designed to meet these conditions, improves the recognizer's performance by approximately 16%, and that an additional noise-canceling algorithm such as RASTA yields a further 2% improvement.
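
The abstract names ZCPA (zero crossings with peak amplitudes) without detailing it. The following is a minimal single-band sketch of the general idea, not the authors' implementation: intervals between upward zero crossings give a frequency estimate, and the peak amplitude inside each interval weights a frequency histogram. The function name `zcpa_histogram`, the bin edges, and the test tone are assumptions made for illustration; a full front end would first split the signal with a bank of band-pass filters, which is omitted here.

```python
import math
from typing import List, Sequence


def zcpa_histogram(signal: Sequence[float], sample_rate: float,
                   bin_edges: Sequence[float]) -> List[float]:
    """Simplified single-band ZCPA-style feature (illustrative only)."""
    # Indices where the signal crosses zero going upward.
    crossings = [n for n in range(1, len(signal))
                 if signal[n - 1] < 0.0 <= signal[n]]
    histogram = [0.0] * (len(bin_edges) - 1)
    for start, end in zip(crossings, crossings[1:]):
        # The inverse of the interval length estimates the dominant frequency.
        frequency = sample_rate / (end - start)
        # The peak amplitude within the interval carries intensity information.
        peak = max(signal[start:end])
        for b in range(len(histogram)):
            if bin_edges[b] <= frequency < bin_edges[b + 1]:
                # Log compression of the peak is an assumed weighting function.
                histogram[b] += math.log(1.0 + peak)
                break
    return histogram


# Hypothetical input: a 100 Hz tone sampled at 8 kHz for 0.1 s.
sample_rate = 8000.0
tone = [math.sin(2 * math.pi * 100.0 * n / sample_rate + 0.3)
        for n in range(800)]
hist = zcpa_histogram(tone, sample_rate, [0.0, 50.0, 150.0, 400.0])
```

For this tone, all of the histogram weight falls in the 50 to 150 Hz bin, since every interval between upward crossings corresponds to 100 Hz.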

A Clinical Study on Binaural Hearing Aid (양이 보청효과에 관한 연구)

  • 김기령;김영명;심윤주
    • Proceedings of the KOR-BRONCHOESO Conference / 1978.06a / pp.9.2-9 / 1978
  • Monaural and binaural hearing aid performance under quiet and noisy conditions was compared with regard to (1) the degree of hearing impairment, (2) the symmetry of the pure tone audiogram, (3) the automatic gain control (AGC) of the hearing aid, (4) hearing impairment with recruitment, and word discrimination ability. Performance using binaural hearing aids was consistently superior to that using monaural hearing aids. The results were as follows. 1. Speech detection thresholds were enhanced by a mean of 4.25 dB when tested with the Danavox 747 PP stereo type hearing aid and by a mean of 4.12 dB when tested with hearing aids connected separately to the right and left ears. 2. Binaurally tested speech reception thresholds (SRT) were superior to monaurally tested thresholds by a mean of 3.56 dB when tested in quiet and by a mean of 5.56 dB when tested in noise. 3. Binaurally tested word discrimination scores were also superior by a mean of 17.09% in quiet and by a mean of 19.63% in noise. 4. Both SRT and word discrimination scores were best in subjects with moderately severe impairment. The performance of the one mildly impaired subject was the poorest of all. In descending order of performance, the groups were: moderately severe loss, severe loss, moderate loss, and mild loss. 5. Compared with linear amplification, results were very poor when AGC aids were worn in both ears, but good when an AGC aid was worn in one ear and linear amplification in the other. 6. The advantages of binaural hearing aids were obvious even in cases 1) with great differences in hearing thresholds between the right and left ears, 2) when the subject was unable to discriminate words without vision, and 3) when the subject had an extreme recruitment phenomenon.

Audio Stream Delivery Using AMR(Adaptive Multi-Rate) Coder with Forward Error Correction in the Internet (인터넷 환경에서 FEC 기능이 추가된 AMR음성 부호화기를 이용한 오디오 스트림 전송)

  • 김은중;이인성
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.12A / pp.2027-2035 / 2001
  • In this paper, we present an audio stream delivery scheme for the Internet using the AMR (Adaptive Multi-Rate) coder, which was adopted by ETSI and 3GPP as a standard vocoder for the next-generation IMT-2000 service. The scheme combines sender-based FEC with a receiver-based reconstruction technique. With the media-specific FEC scheme, repair data is added to the main data stream so that the contents of lost packets can be recovered, which greatly increases the chance of recovering lost packets. The AMR codec is based on the code-excited linear predictive (CELP) coding model, so we use frame erasure concealment for CELP-based coders. The proposed scheme is evaluated with the ITU-T G.729 (CS-ACELP) coder and AMR at 12.2 kbit/s through SNR (signal-to-noise ratio) and MOS (Mean Opinion Score) tests. At 10% packet loss, the proposed scheme scores 1.1 higher in MOS and 5.61 dB higher in SNR than AMR at 12.2 kbit/s, and it maintains communication-quality speech at frame erasure rates up to 20%.
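
The media-specific FEC idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's scheme: each packet piggybacks an identical copy of the previous frame (real systems often carry a lower-bitrate encoding instead), and the receiver falls back to repeating the last good frame as a crude stand-in for CELP frame erasure concealment. The names `packetize` and `reconstruct` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (sequence number, primary frame, redundant copy of the previous frame)
Packet = Tuple[int, bytes, Optional[bytes]]


def packetize(frames: List[bytes]) -> List[Packet]:
    """Sender side: attach the previous frame to each packet as repair data."""
    packets = []
    for seq, frame in enumerate(frames):
        redundant = frames[seq - 1] if seq > 0 else None
        packets.append((seq, frame, redundant))
    return packets


def reconstruct(received: List[Packet], total: int) -> List[Optional[bytes]]:
    """Receiver side: recover lost frames from repair data, then conceal."""
    frames: Dict[int, bytes] = {}
    for seq, primary, redundant in received:
        frames[seq] = primary
        if redundant is not None:
            # Use the repair copy only if the primary for seq - 1 is missing.
            frames.setdefault(seq - 1, redundant)
    output: List[Optional[bytes]] = []
    last_good: Optional[bytes] = None
    for seq in range(total):
        frame = frames.get(seq)
        if frame is None:
            # Assumed concealment: repeat the last good frame (None if none yet).
            frame = last_good
        else:
            last_good = frame
        output.append(frame)
    return output


# Hypothetical usage: five frames, with packet 1 lost in transit.
frames = [b"f0", b"f1", b"f2", b"f3", b"f4"]
packets = packetize(frames)
single_loss = reconstruct([p for p in packets if p[0] != 1], len(frames))
# Packets 2 and 3 lost back to back: frame 2 cannot be recovered from repair data.
burst_loss = reconstruct([p for p in packets if p[0] not in (2, 3)], len(frames))
```

A single isolated loss is fully repaired from the next packet, while a burst of two consecutive losses leaves one frame to the concealment step, which is why the abstract pairs FEC with receiver-side reconstruction.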

Cortical Network Activated by Korean Traditional Opera (Pansori): A Functional MR Study

  • Kim, Yun-Hee;Kim, Hyun-Gi;Kim, Seong-Yong;Kim, Hyoung-Ihl;Parrish, Todd B.;Hong, In-Ki;Sohn, Jin-Hun
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 2000.04a / pp.113-119 / 2000
  • Pansori is a form of Korean traditional vocal music with a unique story and melody that converts deep emotion into art. It has both verbal and emotional components, which can be coordinated by a large-scale neural network. The purpose of this study is to illustrate the cortical network activated by a Korean traditional opera, Pansori, with different emotional valence using functional MRI (fMRI). Nine right-handed volunteers participated. Their mean age was 25.3 and the mean modified Edinburgh score was +90.1. Activation tasks were designed for the subjects to passively listen to two parts of Pansori with sad or hilarious emotional valence. White noise was introduced during the control periods. Imaging was conducted on a 1.5T Siemens Vision scanner. Single-shot echoplanar fMRI scans (TR/TE 3840/40 ms, flip angle 90, FOV 220, 64 x 64 matrix, 6 mm thickness) were acquired in 20 contiguous slices. Imaging data were motion-corrected, coregistered, normalized, and smoothed using SPM-96 software. Bilateral posterior temporal regions were activated in both Pansori tasks, but the asymmetry differed between the tasks. The Pansori with sad emotion showed more activation in the right superior temporal regions as well as the right inferior frontal and the orbitofrontal areas than in the left side. In the Pansori with hilarious emotion, there was remarkable activation in the left hemisphere, especially in the posterior temporal and the temporooccipital regions as well as in the left inferior and the prefrontal areas. After subtraction between the two tasks, the sad Pansori showed more activation in the right temporoparietal and the orbitofrontal areas; in contrast, the one with hilarious emotion showed more activation in the left temporal and the prefrontal areas. These results suggest that different hemispheric asymmetry and cortical areas subserve the processing of the different emotional valences carried by Pansori.
