• Title/Summary/Keyword: masking threshold (마스킹 임계값)

Search Result 10, Processing Time 0.024 seconds

Enhanced Adjustment Strategy of Masking Threshold for Speech Signals in Low Bit-Rate Audio Coding (저전송률 오디오 부호화에서 음성 신호의 성능 개선을 위한 마스킹 임계값 적응기법 향상)

  • Lee, Chang-Heon;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.1
    • /
    • pp.62-68
    • /
    • 2010
  • This paper proposes a new masking threshold adjustment strategy to improve the quality of speech signals in low bit-rate audio coding. After the formant regions are determined, the masking threshold is adjusted using the ratio of each sub-band's energy to the average energy of each formant. More quantization noise is allowed in bands with relatively large energy, while more bits are allocated to spectral valley regions so that less distortion is allowed there, reflecting the concept of perceptual weighting widely used in speech coding. Results from an objective speech quality measure confirm that the proposed method improves quality for speech input signals compared with the conventional method.

Local-property aware masking method on hardware implementation (국부특성을 반영한 하드웨어 기반의 마스킹 방식)

  • 정영훈
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 2003.11a
    • /
    • pp.220-223
    • /
    • 2003
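  • Masking is one example of a binarization algorithm by which a binary output device takes a continuous-tone image and outputs binary values; this work addresses the shortcomings of the mask-based approach. Because the same mask is applied repeatedly, conventional masks fail to represent the local characteristics of an image. To compensate for this shortcoming, we propose a locally adaptive threshold and table-based adaptive parameters. As a result, visually important edge components are emphasized, and background regions, whose tone rendition tends to be insufficient under local processing, are also adequately represented.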


Adaptive Enhancement Algorithm of Perceptual Filter Using Variable Threshold (가변 임계값을 이용한 지각 필터의 적응적인 음질 개선 알고리즘)

  • 차형태
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.6
    • /
    • pp.446-453
    • /
    • 2004
  • In this paper, a new adaptive perceptual filter using a variable threshold is proposed to enhance audio signals degraded by additive nonstationary noise. The adaptive perceptual filter updates the variable threshold each time according to the signal power and the effect of noise variation, so the noisy audio signal is enhanced while residual noise is controlled effectively. The proposed algorithm uses a perceptual filter that transforms the time-domain signal into the frequency domain and calculates an intensity energy and an excitation energy in the Bark domain. The stage at which the filter response is updated is decided by the threshold. Using the variable threshold, the algorithm effectively controls residual noise based on the energy difference of audio signals degraded by additive nonstationary noise. The proposed method is tested with noisy audio signals degraded by nonstationary noise at various signal-to-noise ratios (SNR). NMR and MOS tests are carried out at input SNRs of 15 dB, 20 dB, 25 dB and 30 dB. Approximate improvements of 17.4 dB, 15.3 dB, 12.8 dB and 9.8 dB in NMR and enhancements of 2.9, 2.5, 2.3 and 1.7 in the MOS test are achieved, respectively.

Enhanced Pre echo Control Algorithm for MPEG Audio Coders (MPEG 오디오 부호화기를 위한 향상된 프리 에코 컨트롤 알고리듬)

  • Lee Chang-Joon;Lee Jae-Seong;Park Young-Cheol
    • Journal of Broadcast Engineering
    • /
    • v.11 no.2 s.31
    • /
    • pp.191-199
    • /
    • 2006
  • This paper presents an efficient pre-echo control scheme for MPEG audio coders based on psychoacoustic model II (PAM-II). Pre-echo control is the final step in calculating the masking threshold in PAM-II; its purpose is to minimize the spread of quantization error over the processing frame. In conventional encoders, pre-echo is reduced by restricting the estimated masking threshold so that it does not exceed the one obtained in the previous frame. The conventional method performs pre-echo control not only for short blocks but also for long blocks, which lowers the masking threshold in long blocks and, in turn, increases the quantization noise level of the corresponding blocks. This paper proposes an efficient pre-echo control process. Test results show a mean improvement of more than 0.4 on the ITU-R 5-point audio impairment scale, especially for complex signals.
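The conventional restriction described in this abstract can be illustrated with a minimal sketch. It only shows the per-band clamp against the previous frame's threshold; it is not the paper's proposed scheme, which the abstract does not detail. The function name `pre_echo_control` and the `limit` scaling factor are hypothetical.

```python
from typing import List


def pre_echo_control(curr: List[float], prev: List[float],
                     limit: float = 1.0) -> List[float]:
    """Clamp each band's masking threshold against the previous frame.

    `curr` and `prev` hold per-band thresholds for the current and previous
    frames. `limit` is a hypothetical scaling factor (1.0 means "do not
    exceed the previous frame's value"); it is not taken from the paper.
    """
    if len(curr) != len(prev):
        raise ValueError("both frames must have the same number of bands")
    return [min(c, limit * p) for c, p in zip(curr, prev)]


# Band 0 rose sharply (a possible attack), so it is pulled back to 3.0.
controlled = pre_echo_control([4.0, 2.0, 9.0], [3.0, 5.0, 9.0])
```

Because the clamp can only lower a threshold, applying it to every block type lowers thresholds in long blocks as well, which is the side effect the abstract attributes to the conventional method.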

Adaptive Spectral Subtraction Method Using SNR and Masking Effect for Robust Speech Recognition in Noisy Environments (잡음환경에 강인한 음성인식을 위해 SNR과 마스킹 효과를 이용한 적응 스펙트럼 차감법)

  • 김태준;김종훈;이경모;이정현
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2004.10b
    • /
    • pp.580-582
    • /
    • 2004
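  • Adaptive spectral subtraction, which uses a parameter, is one method for removing the residual noise produced during spectral subtraction. It reduces residual noise by increasing the parameter, but increasing the parameter excessively distorts the speech. Methods that use the SNR or the masking effect to select an appropriate parameter have therefore been proposed, but speech distortion caused by excessive noise removal and inaccurate parameter estimation at low SNR remain open problems. This paper proposes a modified adaptive spectral subtraction method that applies the masking effect to the conventional SNR-based method. In the proposed method, the noise estimate is recalculated using the masking threshold, which improves the SNR, and the parameter is then extracted from this improved SNR, improving performance. In the evaluation, applying speech processed by the proposed subtraction method to an isolated-word speech recognition system yielded a higher recognition rate than the conventional method.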
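The trade-off behind the subtraction parameter can be sketched with generic power spectral subtraction. This is a textbook form with an over-subtraction factor and a spectral floor, not the adaptive rule proposed in the paper; the names `alpha` and `beta` and their default values are assumptions made for illustration.

```python
from typing import List


def spectral_subtract(noisy_power: List[float], noise_power: List[float],
                      alpha: float = 2.0, beta: float = 0.1) -> List[float]:
    """Generic power spectral subtraction over frequency bins.

    alpha: over-subtraction factor (the "parameter"); larger values remove
           more residual noise but risk distorting the speech.
    beta:  spectral floor as a fraction of the noise power, which keeps the
           result from going negative. Both defaults are illustrative only.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    return [max(y - alpha * n, beta * n)
            for y, n in zip(noisy_power, noise_power)]


# Bin 0 is speech-dominated; bin 1 is noise-only and falls to the floor.
enhanced = spectral_subtract([10.0, 1.0], [2.0, 1.0], alpha=2.0, beta=0.1)
```

In an adaptive scheme, `alpha` would be chosen per frame or per bin, for example from an SNR estimate, rather than fixed as it is here.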


Bit Operation Optimization and DNN Application using GPU Acceleration (GPU 가속기를 통한 비트 연산 최적화 및 DNN 응용)

  • Kim, Sang Hyeok;Lee, Jae Heung
    • Journal of IKEEE
    • /
    • v.23 no.4
    • /
    • pp.1314-1320
    • /
    • 2019
  • In this paper, we propose a new method for optimizing bit operations and applying them to a DNN (Deep Neural Network) in a software environment. To this end, we propose a packing function for bitwise optimization and a masking matrix multiplication operation for application to the DNN. The packing function converts a 32-bit real value to a 2-bit quantized value through a threshold comparison operation. When this sequence is complete, four 32-bit real values have been converted into one 8-bit value. The masking matrix multiplication operation consists of a special operation for multiplying the packed weight values with the normal input values. Each operation is then processed in parallel using a GPU accelerator. In the experiment, memory usage was about 16 times smaller than that of the 32-bit DNN model, while accuracy remained within 1% of the 32-bit model.
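The packing step described in this abstract might look roughly like the following sketch. The abstract does not give the actual thresholds or bit layout, so the three cut points (`-t`, `0`, `t`), the low-bits-first ordering, and the function names are all assumptions.

```python
from typing import List, Sequence


def quantize_2bit(x: float, t: float = 0.5) -> int:
    """Map a real value to a 2-bit code (0..3) by threshold comparison.

    The cut points -t, 0 and t are hypothetical, not taken from the paper.
    """
    if x < -t:
        return 0
    if x < 0.0:
        return 1
    if x < t:
        return 2
    return 3


def pack4(values: Sequence[float], t: float = 0.5) -> int:
    """Pack four real values into one byte, two bits per value."""
    if len(values) != 4:
        raise ValueError("pack4 expects exactly four values")
    byte = 0
    for i, v in enumerate(values):
        byte |= quantize_2bit(v, t) << (2 * i)  # value i occupies bits 2i..2i+1
    return byte


def unpack4(byte: int) -> List[int]:
    """Recover the four 2-bit codes from a packed byte."""
    return [(byte >> (2 * i)) & 0b11 for i in range(4)]


packed = pack4([-1.0, -0.2, 0.1, 0.9])  # codes 0, 1, 2, 3
codes = unpack4(packed)
```

Four 32-bit values (128 bits) become one 8-bit value, which is consistent with the roughly 16-fold memory reduction the abstract reports.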

Design of the Noise Suppressor Using the Perceptual Model and Wavelet Packet Transform (인지 모델과 웨이블릿 패킷 변환을 이용한 잡음 제거기 설계)

  • Kim, Mi-Seon;Park, Seo-Young;Kim, Young-Ju;Lee, In-Sung
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.7
    • /
    • pp.325-332
    • /
    • 2006
  • In this paper, we propose a noise suppressor based on a perceptual model and the wavelet packet transform. The objective is to enhance speech corrupted by colored or non-stationary noise. If the corrupting noise is colored, a subband approach is more efficient than a whole-band one. To avoid serious residual noise and speech distortion, the Wavelet Coefficient Threshold (WCT) must be adjusted. In this paper, the subbands are designed to match the critical bands, and the WCT is adapted according to the noise masking threshold (NMT) and the segmental signal-to-noise ratio (seg_SNR). The resulting suppressor performs similarly to EVRC in PESQ-MOS and outperforms the wavelet packet transform with a universal threshold by about 0.289 in PESQ-MOS. Importantly, it is more useful than EVRC for coded speech, where its PESQ-MOS is about 0.23 higher than that of EVRC.
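For context, the universal-threshold baseline mentioned in this abstract is commonly defined as sigma * sqrt(2 ln N) and paired with soft thresholding. The sketch below shows that generic baseline only, assuming a known noise standard deviation `sigma`; it does not reproduce the paper's NMT- and seg_SNR-adapted threshold, and the function names are hypothetical.

```python
import math
from typing import List


def universal_threshold(sigma: float, n: int) -> float:
    """Universal threshold sigma * sqrt(2 ln n) for n coefficients (n >= 1)."""
    return sigma * math.sqrt(2.0 * math.log(n))


def soft_threshold(coeffs: List[float], thr: float) -> List[float]:
    """Shrink each coefficient toward zero by thr; zero out the small ones."""
    return [0.0 if abs(c) <= thr else math.copysign(abs(c) - thr, c)
            for c in coeffs]


# One subband of wavelet packet coefficients (made-up numbers).
shrunk = soft_threshold([3.0, -0.5, -2.0], 1.0)
```

A single global threshold treats every subband alike, which is one reason a per-subband, perceptually adapted threshold can behave differently under colored noise.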

New Speech Enhancement Method using Psychoacoustic Criteria (심리 음향 기준을 이용한 새로운 음질 개선 방법)

  • 김대경;박장식;손경식
    • Journal of Korea Multimedia Society
    • /
    • v.4 no.1
    • /
    • pp.56-66
    • /
    • 2001
  • A spectral subtraction algorithm using a criterion based on human perception has recently been developed. Speech processed with Virag's algorithm sounds more pleasant to a human listener than speech obtained by classical methods. However, Virag's algorithm requires a robust voice activity detector (VAD). In the ESS (extended spectral subtraction) algorithm without a VAD, the residual noise becomes more noticeable as the SNR decreases. In this paper we propose a new speech enhancement method that combines a Wiener filter and spectral subtraction based on the noise masking characteristics of the human auditory system. No VAD is needed because the noise can be updated successively, even during speech activity, using the Wiener filter. Adjusting the subtraction parameter based on the masking threshold makes the residual noise inaudible. The proposed method is compared with conventional spectral subtraction algorithms. Objective and subjective evaluations of the proposed system are performed with several noise types having different time-frequency distributions. Objective measures, speech spectrograms and subjective listening tests confirm that speech enhanced with the proposed algorithm is more pleasant to a human listener.


Multi-resolution SAR Image-based Agricultural Reservoir Monitoring (농업용 저수지 모니터링을 위한 다해상도 SAR 영상의 활용)

  • Lee, Seulchan;Jeong, Jaehwan;Oh, Seungcheol;Jeong, Hagyu;Choi, Minha
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.5_1
    • /
    • pp.497-510
    • /
    • 2022
  • Agricultural reservoirs are essential structures for water supply during dry periods on the Korean peninsula, where water resources are unevenly distributed over time. For efficient water management, systematic and effective monitoring of small and medium-sized reservoirs is required. Synthetic Aperture Radar (SAR), with its all-weather observation capability, provides a way to monitor them continuously. This study evaluates the applicability of SAR to monitoring small and medium-sized reservoirs using Sentinel-1 (10 m resolution) and Capella X-SAR (1 m resolution) at the Chari (CR), Galjeon (GJ) and Dwitgol (DG) reservoirs located in Ulsan, Korea. Water detection results obtained with a Z fuzzy function-based threshold (Z-thresh) and with Chan-Vese (CV), an object detection-based segmentation algorithm, are quantitatively evaluated against the UAV-detected water boundary (UWB). Accuracy metrics from Z-thresh were 0.87, 0.89 and 0.77 (at CR, GJ and DG, respectively) using Sentinel-1 and 0.78, 0.72 and 0.81 using Capella, and improvements were observed when CV was applied (Sentinel-1: 0.94, 0.89, 0.84; Capella: 0.92, 0.89, 0.93). Waterbody boundaries detected from Capella agreed relatively well with the UWB; however, because of its high resolution, speckle noise caused false detections and missed detections. When the results were masked with supplementary images from optical sensors, improvements of up to 13% were observed. More effective water resource management is expected to become possible through continuous monitoring of available water quantity once more accurate and precise SAR-based water detection techniques are developed.

Hybrid Tone Mapping Technique Considering Contrast and Texture Area Information for HDR Image Restoration (HDR 영상 복원을 위해 대비와 텍스쳐 영역 정보를 고려한 혼합 톤 매핑 기법)

  • Kang, Ju-Mi;Park, Dae-Jun;Jeong, Jechang
    • Journal of Broadcast Engineering
    • /
    • v.22 no.4
    • /
    • pp.496-508
    • /
    • 2017
  • In this paper, we propose a Tone Mapping Operator (TMO) that preserves global contrast and precisely preserves boundary information. To render a High Dynamic Range (HDR) image on a Low Dynamic Range (LDR) display, the method uses Threshold value vs. Intensity value (TVI), based on the Human Visual System (HVS), together with a contrast value; as a result, the global contrast of the image can be preserved. In addition, the perceived quality of the output image is improved by combining the boundary information detected using Guided Image Filtering (GIF) with the boundary information detected using the spatial masking of the Just Noticeable Difference (JND) model. Conventional TMOs are classified into Global Tone Mapping (GTM) and Local Tone Mapping (LTM). GTM preserves global contrast and has the advantages of simple implementation and fast execution, but it loses the boundary information of the image and does not preserve regional contrast. LTM, on the other hand, preserves the local contrast and boundary information of the image well, but some areas look unnatural, for example because of halo artifacts in boundary regions, and its computational complexity is higher than that of GTM. The proposed TMO preserves global contrast and combines the merits of GTM and LTM to preserve the boundary information of images. Experimental results show that the proposed tone mapping technique has superior performance in terms of perceptual quality.