• Title/Summary/Keyword: Audio enhancement

Adaptive Enhancement Algorithm of Perceptual Filter Using Variable Threshold (가변 임계값을 이용한 지각 필터의 적응적인 음질 개선 알고리즘)

  • 차형태
    • The Journal of the Acoustical Society of Korea / v.23 no.6 / pp.446-453 / 2004
  • In this paper, a new adaptive perceptual filter using a variable threshold is proposed to enhance audio signals degraded by additive nonstationary noise. The adaptive perceptual filter updates the variable threshold each time according to the signal power and the effect of noise variation, so the noisy audio signal is enhanced by a method that controls residual noise effectively. The proposed algorithm uses a perceptual filter that transforms a time-domain signal into the frequency domain and calculates an intensity energy and an excitation energy in the Bark domain. In this method, the stage at which the filter response is updated is decided by the threshold. Using the variable threshold, the proposed algorithm effectively controls residual noise based on the energy difference of audio signals degraded by additive nonstationary noise. The proposed method is tested with noisy audio signals degraded by nonstationary noise at various signal-to-noise ratios (SNR). NMR and MOS tests are carried out at input SNRs of 15 dB, 20 dB, 25 dB, and 30 dB. Approximate improvements of 17.4 dB, 15.3 dB, 12.8 dB, and 9.8 dB in NMR and of 2.9, 2.5, 2.3, and 1.7 in MOS are achieved, respectively.
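
The abstract above mentions computing energies in the Bark domain. As a rough illustration only, the sketch below maps frequency bins to Bark bands and sums their power. It assumes the Zwicker-Terhardt approximation of the Bark scale and integer band boundaries; the function names are hypothetical, and the paper's intensity/excitation computation and threshold rule are not given in the abstract and are not reproduced here.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale (an assumed choice)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def bark_band_energies(power_spectrum, sample_rate):
    """Sum one-sided power-spectrum bins (0 Hz .. Nyquist) into integer Bark bands.

    Returns a dict mapping band index -> summed power.
    Requires at least two bins so the bin spacing is defined.
    """
    n_bins = len(power_spectrum)
    if n_bins < 2:
        raise ValueError("need at least two spectrum bins")
    nyquist = sample_rate / 2.0
    energies = {}
    for k, power in enumerate(power_spectrum):
        freq = k * nyquist / (n_bins - 1)   # centre frequency of bin k
        band = int(hz_to_bark(freq))        # truncate to the band index
        energies[band] = energies.get(band, 0.0) + power
    return energies
```

Because every bin lands in exactly one band, the band energies add up to the total spectral power, which is a convenient sanity check when experimenting with band-wise thresholds.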

Signal Quality Enhancement using Perceptual Convolutional Noise Suppression (지각형 컨벌루션 잡음 제어를 통한 음질 개선 방법)

  • 김헌중;한헌수;홍민철;차형태
    • Journal of Broadcast Engineering / v.8 no.1 / pp.11-18 / 2003
  • In this paper, we introduce a novel signal quality enhancement algorithm based on perceptual interference analysis and perceptual convolutional noise suppression. Perceptual convolutional noise is reflected in the audible disturbance that can still be recognized after additive noise suppression and in the tonality change caused by noise energy excitation. The enhancement system consists of a perceptual additive noise suppression part and a perceptual convolutional noise suppression part. Experimental results show that the two parts provide equivalent quality enhancement performance.

Robust Speech Recognition in the Car Interior Environment having Car Noise and Audio Output (자동차 잡음 및 오디오 출력신호가 존재하는 자동차 실내 환경에서의 강인한 음성인식)

  • Park, Chul-Ho;Bae, Jae-Chul;Bae, Keun-Sung
    • MALSORI / no.62 / pp.85-96 / 2007
  • In this paper, we carried out recognition experiments on noisy speech containing various levels of car noise and audio system output, using a speech interface. The speech interface consists of three parts: pre-processing, an acoustic echo canceller, and post-processing. First, a high-pass filter is employed as the pre-processing part to remove some of the engine noise. Then, an echo canceller implemented as an FIR filter with an NLMS adaptive algorithm is used to remove the music or speech coming from the car's audio system. Finally, an MMSE-STSA based speech enhancement method is applied to the output of the echo canceller to further remove the residual noise. For the recognition experiments, we generated test signals by adding music to the car-noise speech from the Aurora 2 database. An HTK-based continuous HMM system was built as the recognizer. Experimental results show that the proposed speech interface is very promising for robust speech recognition in a noisy car environment.
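
The echo-cancelling stage described above (an FIR filter adapted with NLMS) can be sketched as follows. This is a minimal textbook NLMS loop, not the authors' implementation: the tap count, step size `mu`, and the synthetic two-tap echo path in the demo are all assumptions chosen for illustration.

```python
import random


def nlms_echo_cancel(reference, microphone, num_taps=8, mu=0.5, eps=1e-8):
    """Adapt an FIR filter so that filtered `reference` tracks the echo in `microphone`.

    Returns (residual, weights): the echo-cancelled signal and the final taps.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps            # most recent reference sample first
    residual = []
    for x, d in zip(reference, microphone):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate         # what remains after cancellation
        norm = sum(h * h for h in history) + eps
        # Normalised LMS update: step size scaled by the input energy.
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Demo with a hypothetical echo path: echo[n] = 0.6*x[n] + 0.3*x[n-1].
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
microphone = [
    0.6 * reference[n] + (0.3 * reference[n - 1] if n > 0 else 0.0)
    for n in range(2000)
]
residual, weights = nlms_echo_cancel(reference, microphone)
```

In the demo the microphone contains only echo, so the first two taps should approach the assumed echo path and the residual should decay toward zero. In a real car the microphone would also carry speech and noise, which is why the abstract follows this stage with a separate enhancement step.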

Bandwidth enhancement scheme for VoIP application based on H.323 (H.323 기반 VoIP 어플리케이션에서의 대역폭 향상을 위한 방법)

  • 김기훈;박동선;이승상;박종빈
    • Proceedings of the IEEK Conference / 2003.11c / pp.149-152 / 2003
  • In this paper, we propose a scheme that improves the bandwidth efficiency of VoIP applications based on the H.323 protocol. We multiplex the audio and video streams: in this scheme, each audio frame is carried within the video stream. We apply not only multiplexing but also RTP header compression to the real audio/video stream to increase bandwidth efficiency. With multiplexing and RTP header compression, we gain bandwidth efficiency. In a network environment with finite capacity, the saved bandwidth can be assigned to other VoIP users and to users of other services. If real-time network conditions can be reflected in our VoIP application, even more efficient performance can be obtained.
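
To make the bandwidth argument above concrete, the following back-of-the-envelope calculation compares separate audio and video packets, multiplexed packets, and multiplexed packets with a compressed header. The IPv4, UDP, and RTP header sizes are the standard minimum sizes; the payload sizes, packet rate, and the 4-byte compressed header are assumptions for illustration and are not figures from the paper. Any framing bytes needed to delimit the audio frame inside the video packet are ignored.

```python
IPV4_HEADER = 20   # bytes, without options
UDP_HEADER = 8     # bytes
RTP_HEADER = 12    # bytes, fixed part without CSRC entries
FULL_HEADER = IPV4_HEADER + UDP_HEADER + RTP_HEADER  # 40 bytes per packet


def stream_bitrate(payload_bytes, packets_per_second, header_bytes):
    """Bits per second on the wire for one packet stream (link layer ignored)."""
    return (payload_bytes + header_bytes) * packets_per_second * 8


# Assumed traffic: 20-byte audio frames and 500-byte video slices, 50 packets/s each.
AUDIO_PAYLOAD, VIDEO_PAYLOAD, RATE = 20, 500, 50
COMPRESSED_HEADER = 4  # assumed size after header compression

separate = (stream_bitrate(AUDIO_PAYLOAD, RATE, FULL_HEADER)
            + stream_bitrate(VIDEO_PAYLOAD, RATE, FULL_HEADER))
multiplexed = stream_bitrate(AUDIO_PAYLOAD + VIDEO_PAYLOAD, RATE, FULL_HEADER)
compressed = stream_bitrate(AUDIO_PAYLOAD + VIDEO_PAYLOAD, RATE, COMPRESSED_HEADER)

print(separate, multiplexed, compressed)  # 240000 224000 209600
```

Under these assumptions, multiplexing removes one 40-byte header per audio frame (16 kbit/s), and header compression removes most of the remaining header cost. The relative gain is largest when payloads are small compared with the headers, as is typical for audio frames.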

Sound Quality Enhancement in MPEG Surround by Using ILD Distortion (ILD DISTORTION을 이용한 MPEG SURROUND의 음질 개선)

  • Chon, Sang-Bae;Choi, In-Yong;Sung, Koeng-Mo
    • Proceedings of the IEEK Conference / 2006.06a / pp.241-242 / 2006
  • MPEG Surround is an audio coding technology that represents a multi-channel audio signal with downmixed audio signal(s) and very low bitrate side information based on Binaural Cue Coding. The side information consists of Inter-Channel Level Difference (ICLD), Inter-Channel Correlation (ICC), and payloads. The first two parameters correspond to the well-known spatial parameters in psychoacoustics, Inter-Aural Level Difference (ILD) and Inter-Aural Cross Correlation (IACC). Although ICLD is intended to provide a perceptually equivalent ILD to the listener, the ILD of the original multi-channel audio signal and that of the MPEG Surround encoded signal were found to differ. The difference between the two ILD values is defined as ILD Distortion (ILDD). This paper shows how ILDD can be applied to enhance sound quality in MPEG Surround and by how much ILDD is decreased.
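
The ILDD definition above can be illustrated with a small calculation. The sketch assumes ILD is the energy ratio of the left-ear and right-ear signals in decibels over one block of samples, and that the distortion is taken as an absolute difference; the abstract only says "difference", so the absolute value and the function names are assumptions. How the ear signals are obtained from the loudspeaker channels is outside this sketch.

```python
import math


def ild_db(left_ear, right_ear, eps=1e-12):
    """Inter-aural level difference in dB for one block of samples."""
    energy_left = sum(s * s for s in left_ear)
    energy_right = sum(s * s for s in right_ear)
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


def ild_distortion_db(orig_left, orig_right, coded_left, coded_right):
    """Absolute ILD difference between the original and the coded signal."""
    return abs(ild_db(coded_left, coded_right) - ild_db(orig_left, orig_right))
```

For example, a left signal with twice the amplitude of the right gives an ILD of about 6.02 dB; if the coded version has equal levels in both ears, the ILDD under this definition is also about 6.02 dB.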

The Implementation of Multi-Channel Audio Codec for Real-Time operation (실시간 처리를 위한 멀티채널 오디오 코덱의 구현)

  • Hong, Jin-Woo
    • The Journal of the Acoustical Society of Korea / v.14 no.2E / pp.91-97 / 1995
  • This paper describes the implementation of a multi-channel audio codec for HDTV. The codec features 3/2-stereo plus low frequency enhancement, downward compatibility with a smaller number of channels, backward compatibility with the existing 2/0-stereo system (MPEG-1 audio), and multilingual capability. The encoder consists of a 6-channel analog audio input part with a sampling rate of 48 kHz, a 4-channel digital audio input part, and three TMS320C40 DSPs. The encoder implements multi-channel audio compression using a human perceptual psychoacoustic model and reduces the bit rate to 384 kbit/s without impairing subjective quality. The decoder consists of a 6-channel analog audio output part, a 4-channel digital audio output part, and two TMS320C40 DSPs for the decoding procedure. The decoder analyzes the bit stream received from the encoder at 384 kbit/s and reproduces the multi-channel audio signals for the analog and digital outputs. Multi-processing across the DSPs is supported by high-speed data transfer between them, achieved by coordinating communication port activity with the DMA coprocessors. Finally, some technical considerations for solving the problem of real-time operation are suggested; these were identified while implementing the codec with the MPEG-2 Layer II audio coding algorithm on a hardware architecture built from multiple commercial DSPs.

Performance Improvement of Perceptual Filter Using Noise Energy Control (잡음 에너지 제어를 통한 지각 필터 성능 개선)

  • Seo Joung-Kook;Cha Hyung-Tai
    • The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.43-51 / 2005
  • In this paper, we propose an algorithm that improves the tone quality of a noisy audio signal by enhancing the performance of a perceptual filter through noise energy control. Most algorithms proposed by other researchers apply a filter using the noise energy acquired from a silent range. In this case, the improvement in tone quality decreases if the noise energy changes because of magnitude or environment variation within a signal frame. The proposed method instead provides a means of finding a good noise estimate by controlling the energy of the estimated noise obtained from a silent range. Unlike other methods, it also enhances tone quality in the low frequency band. To evaluate the proposed algorithm, various input signals with different signal-to-noise ratios (SNR) of 5 dB, 10 dB, 15 dB, and 20 dB were used. With the proposed algorithm, we could confirm the enhancement of tone quality in terms of segmental SNR (SSNR), noise-to-mask ratio (NMR), and mean opinion score (MOS) tests.
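
Segmental SNR, one of the measures named above, averages the per-frame SNR in decibels instead of computing one SNR over the whole signal. The sketch below uses non-overlapping frames and clamps each frame to the range -10 dB to 35 dB, which is a common convention; the frame length, the clamp limits, and the function name are assumptions and not details taken from the paper.

```python
import math


def segmental_snr_db(clean, enhanced, frame_len=256,
                     floor_db=-10.0, ceil_db=35.0, eps=1e-12):
    """Mean per-frame SNR in dB between a clean reference and an enhanced signal."""
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = enhanced[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        noise_energy = sum((r - e) ** 2 for r, e in zip(ref, est))
        snr = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        # Clamp so silent or perfectly matched frames do not dominate the mean.
        frame_snrs.append(min(max(snr, floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

For instance, a constant signal of amplitude 1.0 estimated as 0.9 has an error of 0.1 per sample, which gives 20 dB in every frame and therefore 20 dB overall.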

A Study on the Car Audio Sound Quality Enhancement under Vehicle Noise and Its Subjective Evaluation (차량 주행소음을 고려한 자동차 오디오 음질 개선 및 주관적 음질평가 연구)

    • The Journal of the Acoustical Society of Korea / v.18 no.8 / pp.108-115 / 1999
  • In this study, we suggest a digital filter method to enhance car audio sound quality against the sound distortion caused by the cabin's acoustic characteristics and car driving noise. The digital filters were designed based on the characteristics of car driving noise and the cabin's acoustic characteristics. Car driving noise was analyzed in two ways: an objective method, octave-band frequency analysis, and a subjective method, sensory evaluation using the NCB method. From these results, seven sets of modified coefficients for eleven-band digital filters were obtained. To find the optimum audio sound quality among nine sound samples, which were produced with the seven designed digital filters and mixed with car driving noise at 100 km/h, a subjective evaluation was carried out using the paired comparison method (Scheffé's seven-point method).

Estimation of speech feature vectors and enhancement of speech recognition performance using lip information (입술정보를 이용한 음성 특징 파라미터 추정 및 음성인식 성능향상)

  • Min So-Hee;Kim Jin-Young;Choi Seung-Ho
    • MALSORI / no.44 / pp.83-92 / 2002
  • Speech recognition performance is severely degraded in noisy environments. One approach to coping with this problem is audio-visual speech recognition. In this paper, we discuss the experimental results of bimodal speech recognition based on speech feature vectors enhanced using lip information. We try various kinds of speech features, such as linear prediction coefficients, cepstrum, and log area ratio, for transforming lip information into speech parameters. The experimental results show that the cepstrum parameter is the best feature in terms of recognition rate. We also present desirable weighting values for audio and visual information depending on the signal-to-noise ratio.
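
The SNR-dependent weighting of audio and visual information mentioned above is often realised as a weighted sum of per-word scores from the two modalities. The sketch below shows that generic late-fusion idea only; the scores, the weights, and the function names are hypothetical, and the abstract does not state which fusion rule or weight values the authors used.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight):
    """Combine per-word log-likelihoods; audio_weight lies in [0, 1]."""
    return {
        word: audio_weight * audio_scores[word]
        + (1.0 - audio_weight) * visual_scores[word]
        for word in audio_scores
    }


def recognise(audio_scores, visual_scores, audio_weight):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_scores, visual_scores, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical log-likelihoods for a two-word vocabulary.
audio = {"yes": -10.0, "no": -12.0}    # audio slightly favours "yes"
visual = {"yes": -9.0, "no": -5.0}     # lips clearly favour "no"

clean_result = recognise(audio, visual, audio_weight=0.9)   # trust audio at high SNR
noisy_result = recognise(audio, visual, audio_weight=0.3)   # trust lips at low SNR
```

With these made-up numbers the decision flips from "yes" to "no" as the audio weight drops, which is the behaviour an SNR-dependent weight is meant to produce.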

Compacted Codeword based Huffman Decoding for MPEG-2 AAC Audio (MPEG-2 AAC 오디오 코더를 위한 컴팩트화 코드워드 기반 허프만 디코딩 기법)

  • Lee, Jae-Sik;Lee, Eun-Seo;Chang, Tae-Gyu
    • Proceedings of the IEEK Conference / 2006.06a / pp.369-370 / 2006
  • This paper presents a new Huffman decoding method designed specifically for MPEG-2 AAC audio. The method significantly improves processing efficiency over conventional Huffman decoding realized with the ordinary binary tree search method. A new data structure is designed based on a numerical interpretation of the incoming bit stream and its use for offset-oriented node allocation. The experimental results show average performance enhancements of 54% and 665% compared with the conventional binary tree search method and the sequential search method, respectively.
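
For context, the sketch below shows the kind of baseline the abstract compares against: a bit-by-bit binary tree search, here stored in a flat array so that the next node is found by adding the incoming bit to an offset. It is a generic illustration with a toy prefix-free codebook, not the MPEG-2 AAC codebooks and not the paper's compacted-codeword structure, which the abstract does not describe in detail.

```python
def build_flat_table(codebook):
    """Store a prefix-free code as a flat list of (kind, value) slots.

    Each tree node owns two consecutive slots, one per bit value.
    A slot is ("leaf", symbol) or ("node", offset_of_child_pair).
    """
    table = [None, None]                      # root node at offset 0
    for symbol, code in codebook.items():
        offset = 0
        for bit in code[:-1]:
            slot = offset + int(bit)
            if table[slot] is None:           # allocate a new child pair
                table[slot] = ("node", len(table))
                table.extend([None, None])
            offset = table[slot][1]
        table[offset + int(code[-1])] = ("leaf", symbol)
    return table


def decode(bits, table):
    """Decode a string of '0'/'1' characters, one bit per table lookup."""
    symbols = []
    offset = 0
    for bit in bits:
        kind, value = table[offset + int(bit)]
        if kind == "leaf":
            symbols.append(value)
            offset = 0                        # restart at the root
        else:
            offset = value
    return symbols


# Toy codebook (hypothetical): shorter codes for more frequent symbols.
TOY_CODEBOOK = {"A": "0", "B": "10", "C": "110", "D": "111"}
TOY_TABLE = build_flat_table(TOY_CODEBOOK)
decoded = decode("010110111", TOY_TABLE)      # ['A', 'B', 'C', 'D']
```

This decoder still consumes one bit per step, so its cost grows with codeword length. Faster schemes generally aim to interpret several bits at once, which appears to be the direction the abstract describes.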
