• Title/Summary/Keyword: noise masking

Speech recognition in car noise environments using multiple models according to noise masking levels (잡음 마스킹 레벨에 따른 복수 모델을 이용한 자동차 소음환경에서의 음성인식)

  • 정회인
    • Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.60-64 / 1998
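  • In bringing speech recognition systems into practical use, the degradation of recognition performance caused by the mismatch between training and test environments is a problem that must be overcome. In this paper, we propose a method that reduces this mismatch by combining two techniques: spectral subtraction, which estimates the noise level in the non-speech segments of noisy input speech and subtracts the estimated level from the speech spectrum, and noise masking, which raises any spectral energy value below a predetermined masking level up to that level. We also apply a multiple-model method in which models for several masking levels are built in advance and recognition is performed with the model whose masking level suits the estimated noise level. In speaker-independent isolated-word recognition experiments in car noise environments using models for two masking levels, the proposed method achieved recognition rates of 95.8% when parked with the engine off, 95.6% when parked with the engine running, 92.8% on a quiet road, 89.6% on a busy city road, and 74.4% on a highway, for an average of 90.7%.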
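
The three steps in this abstract (noise estimation from non-speech frames, spectral subtraction followed by noise masking, and model selection by noise level) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the use of the first frames as non-speech, and the "closest masking level to the mean noise" selection rule are all hypothetical.

```python
def estimate_noise(frames, n_noise_frames):
    """Average the first n_noise_frames per frequency bin.

    Assumption: those leading frames contain no speech.
    """
    n_bins = len(frames[0])
    return [sum(f[k] for f in frames[:n_noise_frames]) / n_noise_frames
            for k in range(n_bins)]


def subtract_and_mask(frame, noise, mask_level):
    """Spectral subtraction followed by noise masking on one spectral frame."""
    subtracted = [x - n for x, n in zip(frame, noise)]
    # Noise masking: any bin below the masking level is raised to that level.
    return [max(s, mask_level) for s in subtracted]


def pick_mask_level(noise, mask_levels):
    """Pick the pre-trained masking level closest to the mean noise estimate.

    Assumption: the abstract only says a "suitable" level is chosen;
    nearest-level selection is a stand-in rule for illustration.
    """
    mean_noise = sum(noise) / len(noise)
    return min(mask_levels, key=lambda m: abs(m - mean_noise))


# Toy spectra: two noise-only frames followed by one speech frame.
frames = [[1.0, 2.0, 1.0], [1.0, 2.0, 3.0], [9.0, 2.5, 12.0]]
noise = estimate_noise(frames, n_noise_frames=2)
level = pick_mask_level(noise, mask_levels=[2.0, 8.0])
cleaned = subtract_and_mask(frames[2], noise, level)
```

In a recognizer built this way, `level` would also decide which of the pre-trained models is used for decoding, so that training and test features share the same floor.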

Modified SNR-Normalization Technique for Robust Speech Recognition

  • Jung, Hoi-In;Shim, Kab-Jong;Kim, Hyung-Soon
    • The Journal of the Acoustical Society of Korea / v.16 no.3E / pp.14-18 / 1997
  • One of the major problems in speech recognition is the mismatch between training and testing environments. Recently, an SNR normalization technique, which normalizes the dynamic range of the frequency channels in a mel-scaled filterbank, was proposed [1]. While it showed improved robustness against additive noise, it requires a reliable speech detection mechanism and several adaptation parameters to be optimized. In this paper, we propose a modified SNR normalization technique in which we simply take the maximum of the filterbank output and a predetermined masking constant for each frequency band. In speaker-independent isolated word recognition experiments in car noise environments, the proposed modification yields better recognition performance than the original SNR normalization method, with reduced complexity.
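
The per-band operation described in this abstract can be written compactly. The symbols are introduced here for illustration and are not taken from the paper: $X_k(t)$ is the filterbank output of band $k$ at frame $t$, $c_k$ is the predetermined masking constant for that band (possibly the same value for every band), and $K$ is the number of bands.

```latex
\tilde{X}_k(t) = \max\bigl(X_k(t),\; c_k\bigr), \qquad k = 1, \dots, K
```

Because every value below $c_k$ is raised to $c_k$, the dynamic range of each channel is bounded from below without needing a speech detector.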

The Noise Characteristics and Appropriate Talk Distance in Dental Clinic (치과병원의 소음특성과 적절한 대화거리)

  • Ji, Dong-Ha;Choi, Mi-Suk
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.5 / pp.2516-2523 / 2013
  • Noise generated during treatment in a dental clinic affects patients. This study measured the noise level and frequency during examinations, evaluated the degree of indoor noise using the NR curve and NRN, and estimated the conversation distance between workers and patients using the PSIL. The noise level was 69.3 to 81.5 dB(A) and the frequency was very high (above 4 kHz). Analysis by NR curve showed that the noise exceeded the permitted level, and the PSIL-based conversation distance was less than 1 m. To ease patients' fear of the noise and make conversation satisfactory, the study suggests choosing low-noise, low-vibration equipment, using the masking effect, and setting aside a room for explanations. Such measures could improve the clinics' competitiveness.

A Post-processing for Binary Mask Estimation Toward Improving Speech Intelligibility in Noise (잡음환경 음성명료도 향상을 위한 이진 마스크 추정 후처리 알고리즘)

  • Kim, Gibak
    • Journal of Broadcast Engineering / v.18 no.2 / pp.311-318 / 2013
  • This paper deals with a noise reduction algorithm that uses binary masking in the time-frequency domain. To improve speech intelligibility in noise, noise-masked speech is decomposed into time-frequency units, and mask "0" is assigned to masker-dominant regions, removing the time-frequency units where noise dominates speech. In previous research, Gaussian mixture models were used to classify speech-dominant and noise-dominant regions, which correspond to mask "1" and mask "0", respectively. In each frequency band, data were collected and used to train the Gaussian mixture models, and a detection procedure was applied to the test data to decide whether each time-frequency unit belongs to the speech-dominant or the noise-dominant region. In this paper, we consider the correlation of masks in the frequency domain and propose a post-processing method that exploits the Viterbi algorithm.
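
As a rough illustration of how the Viterbi algorithm can exploit mask correlation across frequency, the sketch below decodes the most likely 0/1 mask sequence over the bands of a single time frame. It is not the paper's algorithm: the two-state chain, the uniform prior, and the `stay_prob` transition parameter are assumptions made here, and the per-band likelihoods (which the abstract obtains from Gaussian mixture models) are simply given as inputs.

```python
import math


def viterbi_mask(loglik, stay_prob=0.9):
    """Most likely 0/1 mask sequence across the frequency bands of one frame.

    loglik[b] = (log p(obs | mask 0), log p(obs | mask 1)) for band b.
    stay_prob is a hypothetical probability that adjacent bands share a mask.
    """
    log_stay, log_switch = math.log(stay_prob), math.log(1.0 - stay_prob)
    trans = [[log_stay, log_switch], [log_switch, log_stay]]
    # Uniform prior over the two states for the first band.
    score = [math.log(0.5) + loglik[0][s] for s in (0, 1)]
    back = []
    for obs in loglik[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            # Best predecessor state for landing in state s at this band.
            prev = max((0, 1), key=lambda p: score[p] + trans[p][s])
            pointers.append(prev)
            new_score.append(score[prev] + trans[prev][s] + obs[s])
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Band 2 weakly favours "noise", but its neighbours strongly favour "speech".
probs = [(0.2, 0.8), (0.2, 0.8), (0.55, 0.45), (0.2, 0.8)]
loglik = [(math.log(p0), math.log(p1)) for p0, p1 in probs]
independent = [0 if p0 > p1 else 1 for p0, p1 in probs]
smoothed = viterbi_mask(loglik)
```

Here the band-by-band decision flips band 2 to "0", while the decoded path keeps it at "1" because an isolated switch is penalized by the transition term.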

Speech Reinforcement Based on Soft Decision Under Far-End Noise Environments (원단 잡음 환경에서 Soft Decision에 기반한 새로운 음성 강화 기법)

  • Choi, Jae-Hun;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.27 no.7 / pp.379-385 / 2008
  • In this paper, we propose an effective speech reinforcement technique for near-end and far-end noise environments. Because the intelligibility of far-end speech for the near-end listener is significantly reduced under near-end noise, a far-end speech reinforcement approach is required to counter this phenomenon. Specifically, based on the estimated near-end background noise spectrum, we reinforce the far-end speech spectrum, covering the more general case in which background noise is present at the near end. We also propose a novel approach that reinforces only the actual speech component, and not the noise component, of the noisy far-end speech signal. The performance of the proposed algorithm is evaluated with the CCR (Comparison Category Rating) test, a method for subjective determination of transmission quality in ITU-T P.800, under various noise environments, and it shows better performance than the conventional method.

DR Image Enhancement Using Multiscale Non-Linear Gain Control For Laplacian Pyramid Transformation (라플라시안 피라미드에서의 다중스케일 비선형 이득 조절을 이용한 DR 영상 개선)

  • Shin, Dong-Kyu;Lee, Jin-Su;Kim, Sung-Hee;Park, In-Sung;Kim, Dong-Youn
    • Journal of Biomedical Engineering Research / v.28 no.2 / pp.199-204 / 2007
  • In digital radiography, the multiscale nonlinear amplification algorithm based on unsharp masking is one of the major image enhancement algorithms for improving image contrast. In this paper, we used the Laplacian pyramid to decompose a digital radiography (DR) image. In our simulation, the DR image was decomposed into seven layers, and the coefficients of each layer were amplified with a nonlinear function. We also incorporated a noise containment algorithm to limit noise amplification. To enhance image contrast, we proposed new adaptive nonlinear gain amplification coefficients. When the method was applied to clinical data, detail visibility improved significantly without unacceptable noise boosting. Images acquired with the proposed adaptive nonlinear gain coefficients showed superior quality to those produced by a similar gain control method and are expected to be accepted in clinical applications.
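
The decompose, amplify, and reconstruct structure described in this abstract can be illustrated on a 1-D signal. This is a simplified sketch rather than the paper's method: it uses pair averaging instead of a Gaussian filter, three layers instead of seven, and a hypothetical power-law gain curve (`nonlinear_gain`), and it leaves out the noise containment step entirely.

```python
def downsample(x):
    """Halve the length by averaging neighbouring pairs (a crude low-pass)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def upsample(x, n):
    """Stretch back to length n by repeating samples."""
    return [x[min(i // 2, len(x) - 1)] for i in range(n)]


def build_pyramid(signal, levels):
    """Return (detail layers, coarsest approximation).

    Every intermediate layer must keep at least two samples.
    """
    details, current = [], list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        up = upsample(coarse, len(current))
        # Detail layer = what the coarser layer cannot represent.
        details.append([c - u for c, u in zip(current, up)])
        current = coarse
    return details, current


def reconstruct(details, coarse):
    """Invert build_pyramid by adding the detail layers back, coarse to fine."""
    current = coarse
    for detail in reversed(details):
        up = upsample(current, len(detail))
        current = [d + u for d, u in zip(detail, up)]
    return current


def nonlinear_gain(d, gain=2.0, power=0.7):
    """Hypothetical gain curve: boosts small coefficients more than large ones."""
    sign = 1.0 if d >= 0 else -1.0
    return sign * gain * abs(d) ** power


signal = [10, 12, 11, 30, 31, 29, 12, 10]
details, coarse = build_pyramid(signal, levels=3)
restored = reconstruct(details, coarse)
enhanced = reconstruct(
    [[nonlinear_gain(d) for d in layer] for layer in details], coarse)
```

With unmodified detail layers the reconstruction returns the input, so any change in `enhanced` comes only from the gain applied to the coefficients.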

Application of Blind Deconvolution with Crest Factor for Recovery of Original Rolling Element Bearing Defect Signals (볼 베어링 결함신호 복원을 위한 파고율을 이용한 Blind Deconvolution의 응용)

  • Son, Jong-Duk;Yang, Bo-Suk;Tan, A.C.C.;Mathew, J.
    • Proceedings of the KSME Conference / 2004.11a / pp.585-590 / 2004
  • Many machine failures are not detected well in advance because of masking by background noise and attenuation of the source signal through the transmission media. Advanced signal processing techniques using adaptive filters and higher-order statistics have been used in attempts to extract the source signal from data measured at the machine surface. In this paper, blind deconvolution using the eigenvector algorithm (EVA) technique is used to recover a damaged bearing signal using only the signal measured at the machine surface. A damaged bearing signal corrupted by noise at varying signal-to-noise (s/n) ratios was used to determine the effectiveness of the technique in detecting an incipient signal and the optimum choice of filter length. The results show that the technique is effective in detecting the source signal with an s/n ratio as low as 0.21, but requires a relatively large filter length.

Effects of Talker Sidetone and Room Noise on the Speech Level of a Talker (송화측음 및 실내소음이 송화 음성레벨에 미치는 영향)

  • Kang, Kyeong-Ok;Kang, Seong-Hoon
    • The Journal of the Acoustical Society of Korea / v.11 no.1 / pp.52-59 / 1992
  • To quantify the effects of talker sidetone on a talker's speech level during a telephone conversation, we reviewed the algorithm for measuring speech level and assessed how speech level varies with the sidetone masking rating (STMR). We also measured the effects of room noise on speech level as STMR values were changed. Considering the effects of both talker sidetone and room noise, the experimental results suggest that a talker continuously tries to keep the psychological loudness of his own speech, heard by himself via the telephone handset, at a constant and comfortable level by adjusting his speaking level as the STMR value and room noise change. That is, because the amount of his speech masked by talker sidetone and room noise differs when the STMR value and room noise change, he tends to control his speaking level so that the perceived loudness of his own speech stays constant.

NDVI Noise Interpolation Using Harmonic Analysis (조화 분석을 이용한 식생지수 보정 기법에 관한 연구)

  • Park, Soo-Jae;Han, Kyung-Soo;Pi, Kyoung-Jin
    • Korean Journal of Remote Sensing / v.26 no.4 / pp.403-410 / 2010
  • NDVI (Normalized Difference Vegetation Index), which is widely used in the form of short-term data composites, is an important parameter for climate change and long-term land surface monitoring. Even after atmospheric correction, the long-term NDVI time series shows a number of sharp low-peak noise points. These are related to various sources of contamination, such as cloud masking problems and wet ground conditions. This study suggests a simple method based on harmonic analysis for reducing NDVI noise, using SPOT/VGT NDVI 10-day MVC data. The harmonic analysis method is compared with the previously suggested polynomial regression method, which overestimates the NDVI values in the time series. The proposed method improved the correction of both the low peaks and the overestimation.
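
One way a harmonic fit can suppress low-peak noise in a periodic series is sketched below. The abstract does not give the procedure, so everything here is an assumption made for illustration: one year of 36 evenly spaced 10-day composites, two harmonics, a fixed `threshold`, and an iterative "replace values far below the fit, then refit" rule.

```python
import math


def harmonic_fit(values, n_harmonics=2):
    """Fit mean + first n_harmonics Fourier terms to one evenly sampled period.

    Coefficients come from direct projection, which is exact for
    n_harmonics < len(values) / 2.
    """
    n = len(values)
    mean = sum(values) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        w = 2 * math.pi * k / n
        a = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(values))
        b = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(values))
        coeffs.append((w, a, b))
    return [mean + sum(a * math.cos(w * t) + b * math.sin(w * t)
                       for w, a, b in coeffs)
            for t in range(n)]


def fill_low_peaks(values, threshold=0.1, n_harmonics=2, iterations=3):
    """Replace samples lying more than `threshold` below the fit, then refit.

    Hypothetical rule: only downward outliers are treated as contamination,
    since clouds and wet ground lower NDVI.
    """
    current = list(values)
    for _ in range(iterations):
        fit = harmonic_fit(current, n_harmonics)
        current = [f if f - v > threshold else v for v, f in zip(current, fit)]
    return current


# Synthetic annual cycle with a single contaminated composite at index 10.
clean = [0.5 + 0.3 * math.cos(2 * math.pi * t / 36) for t in range(36)]
noisy = list(clean)
noisy[10] -= 0.4
restored = fill_low_peaks(noisy)
```

Because only samples well below the fitted curve are replaced, the uncontaminated composites pass through unchanged while the dip at index 10 is pulled back toward the seasonal curve.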

Extraction of Characteristics of Concrete Surface Cracks

  • Ahn, Sang-Ho
    • Journal of information and communication convergence engineering / v.5 no.2 / pp.126-130 / 2007
  • This paper proposes a method that uses image processing techniques to automatically extract characteristics of cracks, such as length, thickness, and direction, from a concrete surface image. First, the closing morphological operation is used to compensate for the effect of light across the whole concrete surface image. After applying high-pass filtering to sharpen the boundaries of cracks, we classify the intensity values of the image into 8 groups and remove the intensity values belonging to the highest-frequency group in order to remove the background. Then, we binarize the preprocessed image. The auxiliary lines used to measure cracks on the concrete surface are removed from the binarized image using position information extracted by the histogram operation. Next, cracks broken by the removal of the background are extended with the $5 \times 5$ masking operation to reconstruct the original cracks. We remove unnecessary information by applying three types of noise removal operations in succession and extract the crack areas from the binarized image. Finally, the opening morphological operation is applied to compensate the extracted cracks, and the characteristics of the cracks are measured on the compensated ones. Experiments using real images of concrete surfaces showed that the proposed method extracts cracks well and measures their characteristics precisely.