Search | Korea Science

Audio Enhancement Algorithm Using Adaptive Perceptual Filter (적응 지각 필터를 이용한 오디오 음질 개선 알고리즘)

엄혜영;한헌수;홍민철;차형태
- The Journal of the Acoustical Society of Korea, v.22 no.8, pp.687-693, 2003
In this paper, a new adaptive audio signal enhancement algorithm is proposed. To remove broadband noise from a noisy signal, a filter is designed and applied adaptively to the noisy audio signal. The noisy signal is first transformed to the frequency domain and divided into Bark bands to calculate the excitation energy. A noise-removal filter is then calculated from the excitation energy and the noise energy, which is obtained from a silent segment. The filter is adaptively adjusted and repeatedly applied until a threshold is met. The algorithm also works well when the noise energy changes suddenly. SNR and NMR comparisons and a MOS test are performed to show the effectiveness of the proposed algorithm.
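The Bark-band step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' filter: the Zwicker-Terhardt Bark approximation, the Hann window, the 25-band split, and the spectral-subtraction-style gain with a floor are all assumptions chosen for the example, and the function names are hypothetical.

```python
import numpy as np

def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale (an assumed choice)."""
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)

def bark_band_energy(frame, sample_rate, n_bands=25):
    """Sum the power spectrum of one frame into integer Bark bands."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    power = np.abs(spectrum) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    # Map every FFT bin to a band index, clamping the top bins into the last band.
    band_index = np.minimum(hz_to_bark(freqs).astype(int), n_bands - 1)
    energy = np.bincount(band_index, weights=power, minlength=n_bands)
    return energy, band_index

def band_gains(noisy_energy, noise_energy, floor=0.1):
    """Per-band attenuation: 1 - noise/noisy, limited below by `floor`."""
    ratio = noise_energy / np.maximum(noisy_energy, 1e-12)
    return np.maximum(1.0 - ratio, floor)
```

In this sketch the noise energy would come from a frame known to be silent, and the gains would be applied to the spectrum bin by bin through `band_index` before the inverse transform.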

Design and Implementation of the low power and high quality audio encoder/decoder for voice synthesis (음성 합성용 저전력 고음질 부호기/복호기 설계 및 구현)

Park, Nho-Kyung;Park, Sang-Bong;Heo, Jeong-Hwa
- The Journal of the Institute of Internet, Broadcasting and Communication, v.13 no.6, pp.55-61, 2013
In this paper, we describe the design and implementation of an audio encoder/decoder for voice synthesis. It encodes the difference between successive samples instead of the original sample values and has a compression ratio of 4. The function is verified using an FPGA, and the performance is measured on a chip fabricated in a $0.35\,\mu\mathrm{m}$ standard CMOS process. The system clock is 16.384 MHz. The measured THD+N ranges from -40 dB to -80 dB depending on frequency, and the power consumption is about 80 mW. It is suited to mobile applications that require high audio quality and low power consumption.
https://doi.org/10.7236/JIIBC.2013.13.6.55
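The idea of encoding sample differences can be illustrated with a simple fixed-step DPCM sketch. The abstract does not give the actual codec design, so the 4-bit code width (which would give a ratio of 4 for 16-bit input), the fixed step size, and the function names are assumptions made for illustration only.

```python
import numpy as np

def dpcm_encode(samples, step=256, bits=4):
    """Quantize the difference between each sample and the running reconstruction."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    codes = np.empty(len(samples), dtype=np.int8)
    predicted = 0  # the value the decoder will have reconstructed so far
    for i, s in enumerate(samples):
        code = int(np.clip(round((int(s) - predicted) / step), lo, hi))
        codes[i] = code
        predicted += code * step
    return codes

def dpcm_decode(codes, step=256):
    """Rebuild the signal by accumulating the dequantized differences."""
    return np.cumsum(codes.astype(np.int64) * step)
```

Because the encoder tracks the decoder's reconstruction rather than the previous input sample, quantization errors do not accumulate as long as the signal slope stays within the range of the code.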

Representative Melodies Retrieval using Waveform and FFT Analysis of Audio (오디오의 파형과 FFT 분석을 이용한 대표 선율 검색)

Chung, Myoung-Bum;Ko, Il-Ju
- Journal of KIISE:Software and Applications, v.34 no.12, pp.1037-1044, 2007
Recently, content-based music retrieval systems have extracted the representative melody of a piece of music and indexed it to reduce search time. Existing studies have used MIDI data to extract a representative melody, which has the weakness that only MIDI data can be used. Therefore, this paper proposes a representative melody retrieval method that uses digital signal processing and can be applied to all audio file formats. First, we use the Fast Fourier Transform (FFT) to find the tempo and nodes for representative melody retrieval. We then measure the frequency of high values that appear in the PCM data of each node. The point where the high values are most concentrated is the starting point of the representative melody, and the eight nodes from the starting point form the representative melody section of the audio data. To verify the performance of the method, we chose a thousand songs and ran an experiment to extract a representative melody from each. As a result, the accuracy of the extracted representative melody was 79.5% among the 737 songs for which a tempo was found.

Low-Power and High-Efficiency Class-D Audio Amplifier Using Composite Interpolation Filter for Digital Modulators

Kang, Minchul;Kim, Hyungchul;Gu, Jehyeon;Lim, Wonseob;Ham, Junghyun;Jung, Hearyun;Yang, Youngoo
- JSTS:Journal of Semiconductor Technology and Science, v.14 no.1, pp.109-116, 2014
This paper presents a high-efficiency digital class-D audio amplifier using a composite interpolation filter for portable audio devices. The proposed audio amplifier is composed of an interpolation filter, a delta-sigma modulator, and a class-D output stage. To reduce power consumption, the designed interpolation filter has an optimized composite structure that uses direct-form symmetric and Lagrange FIR filters. Compared to filters with homogeneous structures, the optimization reduces the hardware cost and complexity by about half. The coefficients of the digital delta-sigma modulator are also optimized for low power consumption. The class-D output stage has gate driver circuits to reduce shoot-through current. The implemented class-D audio amplifier exhibited a high efficiency of 87.8% with an output power of 57 mW at a load impedance of $16\,\Omega$ and a power supply voltage of 1.8 V. A signal-to-noise ratio of 90 dB and a total harmonic distortion plus noise of 0.03% are achieved for a single-tone input signal with a frequency of 1 kHz.
https://doi.org/10.5573/JSTS.2014.14.1.109

Towards Low Complexity Model for Audio Event Detection

Saleem, Muhammad;Shah, Syed Muhammad Shehram;Saba, Erum;Pirzada, Nasrullah;Ahmed, Masood
- International Journal of Computer Science & Network Security, v.22 no.9, pp.175-182, 2022
In daily life, we come across different types of information, for example in multimedia and text formats. We all need different types of information for common routines such as watching or reading the news, listening to the radio, and watching different types of videos. However, problems can arise when a certain type of information is required. For example, someone listening to the radio wants to hear jazz, but all the radio channels play pop music mixed with advertisements. The listener gets stuck with pop music and gives up searching for jazz. This problem can be solved with an automatic audio classification system. Deep Learning (DL) models could make life easier through audio classification, but such models are expensive and difficult to deploy on edge devices such as the Nano BLE Sense or Raspberry Pi, because they require large computational power such as a graphics processing unit (GPU). To address this problem, we propose a low-complexity DL model for Audio Event Detection (AED). We extracted Mel-spectrograms of dimension 128×431×1 from audio signals and applied normalization. Three data augmentation methods were applied: frequency masking, time masking, and mixup. In addition, we designed a Convolutional Neural Network (CNN) with spatial dropout, batch normalization, and separable 2D convolutions, inspired by VGGnet [1]. We also reduced the model size by applying float16 quantization to the trained model. Experiments were conducted on the updated dataset provided by the Detection and Classification of Acoustic Events and Scenes (DCASE) 2020 challenge. Our model achieved a val_loss of 0.33 and an accuracy of 90.34% with a model size of 132.50 KB.
https://doi.org/10.22937/IJCSNS.2022.22.9.26
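The three augmentations named in this abstract can be sketched with NumPy as below. The mask widths, the Beta parameter `alpha`, and the function names are hypothetical; the abstract does not state the settings the authors used, so this only illustrates the general techniques on a 128×431×1 array.

```python
import numpy as np

def mask_axis(spec, axis, max_width, rng):
    """Zero one random contiguous band along `axis` (0 = mel bins, 1 = frames)."""
    out = spec.copy()
    width = int(rng.integers(0, max_width + 1))
    start = int(rng.integers(0, spec.shape[axis] - width + 1))
    index = [slice(None)] * spec.ndim
    index[axis] = slice(start, start + width)
    out[tuple(index)] = 0.0
    return out

def mixup(x1, y1, x2, y2, alpha, rng):
    """Blend two examples and their one-hot labels with a Beta-distributed weight."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

rng = np.random.default_rng(0)
spec = rng.random((128, 431, 1))        # stand-in for a normalized Mel-spectrogram
augmented = mask_axis(spec, 0, 16, rng)  # frequency masking
augmented = mask_axis(augmented, 1, 32, rng)  # time masking
```

Frequency and time masking share one helper here because they differ only in the axis being masked.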

The Implementation of Multi-Channel Audio Codec for Real-Time operation (실시간 처리를 위한 멀티채널 오디오 코덱의 구현)

Hong, Jin-Woo
- The Journal of the Acoustical Society of Korea, v.14 no.2E, pp.91-97, 1995
This paper describes the implementation of a multi-channel audio codec for HETV. The codec features 3/2-stereo plus low frequency enhancement, downward compatibility with smaller numbers of channels, backward compatibility with the existing 2/0-stereo system (MPEG-1 audio), and multilingual capability. The encoder consists of a 6-channel analog audio input part with a sampling rate of 48 kHz, a 4-channel digital audio input part, and three TMS320C40 DSPs. The encoder implements multi-channel audio compression using a human perceptual psychoacoustic model and reduces the bit rate to 384 kbit/s without impairing subjective quality. The decoder consists of a 6-channel analog audio output part, a 4-channel digital audio output part, and two TMS320C40 DSPs for the decoding procedure. The decoder analyzes the bit stream received from the encoder at 384 kbit/s and reproduces the multi-channel audio signals for the analog and digital outputs. Multi-processing across the DSPs is ensured by high-speed transfer of data between DSPs, achieved by coordinating communication port activities with the DMA coprocessors. Finally, some technical considerations for achieving real-time operation are suggested, drawn from implementing this codec with the MPEG-2 layer II audio coding algorithm on a hardware architecture with multiple commercial DSPs.

Application of Turbo Code for Digital Audio Broadcasting (DAB) System (디지털 오디오 방송을 위한 터보부호의 응용)

김한종
- The Journal of Korean Institute of Electromagnetic Engineering and Science, v.13 no.2, pp.176-187, 2002
The Digital Audio Broadcasting (DAB) system adopts Coded OFDM (COFDM) for channel coding. COFDM combines multicarrier transmission (OFDM) with punctured convolutional coding and Viterbi error correction. Because channel coding is an important topic for OFDM systems, this paper proposes a new turbo-coded OFDM system that replaces the existing RCPC codec with a turbo codec without modifying the puncturing procedure and puncturing vectors defined in the standard DAB system, for compatibility. The performance of the new system is compared with that of the conventional system under a frequency-selective Rician fading channel and a frequency-selective Rayleigh fading channel, in conjunction with DAB transmission mode I, which is suitable for terrestrial single frequency network (SFN) broadcasting. The turbo codec improved on the performance of the standard system.
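Puncturing with a fixed vector, as mentioned in this abstract, can be sketched generically as follows. The pattern in the usage lines is hypothetical and is not one of the puncturing vectors defined in the DAB standard; the function names are likewise illustrative.

```python
import numpy as np

def puncture(coded_bits, pattern):
    """Keep only the coded bits where the periodically repeated pattern is 1."""
    coded = np.asarray(coded_bits)
    mask = np.resize(np.asarray(pattern, dtype=bool), coded.shape)
    return coded[mask]

def depuncture(received, pattern, length, erasure=0.0):
    """Re-insert neutral erasure values at the punctured positions before decoding."""
    mask = np.resize(np.asarray(pattern, dtype=bool), length)
    out = np.full(length, erasure, dtype=float)
    out[mask] = received
    return out

coded = np.arange(1, 9)                  # stand-in for mother-code output
sent = puncture(coded, [1, 1, 1, 0])     # drops every fourth value
restored = depuncture(sent, [1, 1, 1, 0], len(coded))
```

Keeping the puncturing pattern unchanged while swapping the inner code is what lets a replacement codec reuse the same rate-matching step.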

Analyses on limitations of binaural sound based on the first order Ambisonics for virtual reality audio (1차 Ambisonics에 의해 생성되는 가상현실 오디오용 양이 사운드의 한계에 대한 분석)

Chang, Ji-Ho;Cho, Wan-Ho.
- The Journal of the Acoustical Society of Korea, v.38 no.6, pp.637-650, 2019
This paper analyzes the limitations of binaural sound that is reproduced over headphones based on Ambisonics for Virtual Reality (VR) audio. VR audio can be provided with binaural sound that compensates for the head rotation of a listener. Ambisonics is widely used for recording and reproducing ambient sound fields around a listener in VR audio, and First Order Ambisonics (FOA) is still used for VR audio because of its simplicity. However, the maximum frequency attainable with this order is too low to reproduce ear signals perfectly, and thus the binaural reproduction has inherent limitations in terms of spectrum and sound localization. This paper investigates these limitations by comparing the signals arriving at the ear positions in the reference field and in the reproduced field. An incident wave is defined as the reference field and reproduced over virtual loudspeakers. Frequency responses, inter-aural level differences, and inter-aural phase differences are compared. The results show that, in general, above the maximum cutoff frequency the reproduced levels decrease, and horizontal localization can be provided only around the forward direction.
https://doi.org/10.7776/ASK.2019.38.6.637
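To give a sense of why a first-order field runs out of bandwidth at the ears, the sketch below evaluates a commonly cited rule of thumb, order N ≈ kr, which gives f_max ≈ N·c / (2π·r). This is not necessarily the cutoff definition used in the paper, and the head radius and speed of sound are assumed values.

```python
import math

def ambisonics_max_frequency(order, radius_m, speed_of_sound=343.0):
    """Rule-of-thumb upper frequency (Hz) from N = k * r, with k = 2 * pi * f / c."""
    return order * speed_of_sound / (2.0 * math.pi * radius_m)

# Assumed head radius of 8.75 cm; first order (N = 1).
f_max_foa = ambisonics_max_frequency(order=1, radius_m=0.0875)
```

Under these assumptions the first-order limit falls in the hundreds of hertz, well below most of the audible range.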

FREQUENCY SELECTIVE RECURSIVE LP OF HARMONIC SPECTRA

SeungHyonNam
- Journal of the Korean Geophysical Society, v.4 no.4, pp.231-238, 2001
In this paper, an efficient LP method for discrete harmonic spectra is proposed and discussed. The new method is a combination of recursive and frequency selective LP. While the recursive LP provides better spectral matching in spectral hills, frequency selective LP eliminates numerical instability and improves spectral matching when the harmonics are confined to the low frequency region. The proposed LP method is applied to the HILN coder. Simulation results using verification model (VM) software for real audio signals show a definite trend of significant improvement.

Helmet Tracking Techniques Using Phase Difference between Acoustic Beating Envelope which Wave Length is Longer than Audio Frequency (고주파 맥놀이 신호의 포락선 위상차를 이용한 음향식 헬멧자세추정 기법)

Choi, Kyong-Sik;Kim, Sang-Seok;Park, Chan-Heum;Yang, Jun-Ho
- Journal of the Korea Institute of Military Science and Technology, v.16 no.1, pp.27-33, 2013
The Helmet Mounted Display (HMD) offers great advantages in presenting navigation and mission symbology on the pilot's forward-looking display and has therefore drawn considerable attention as the upcoming display for next-generation aircraft. The essential technology for processing Line of Sight-Forward (LOS-F) data in real time is accurate estimation of the helmet's attitude and position. In this paper, we study an acoustic helmet tracking technique. Because mechanical acoustic noise might interfere with the Helmet Tracking System (HTS), and unwanted acoustic noise is inevitable when using an acoustic technique, this approach has not been adopted so far. To overcome this problem, we propose using acoustic waves at frequencies above the audio band and, in particular, the envelope of a beating signal composed of two closely spaced high frequencies.
https://doi.org/10.9766/KIMST.2013.16.1.027
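The beating envelope mentioned in this abstract follows from the standard sum-to-product identity for two tones of equal amplitude at frequencies $f_1$ and $f_2$ (the equal amplitudes are an assumption made for simplicity):

$$\cos(2\pi f_1 t) + \cos(2\pi f_2 t) = 2\cos\big(\pi (f_1 - f_2)\, t\big)\,\cos\big(\pi (f_1 + f_2)\, t\big)$$

The slowly varying factor $\left|2\cos\big(\pi (f_1 - f_2)\, t\big)\right|$ is the envelope, which repeats at the beat frequency $|f_1 - f_2|$. As a hypothetical example, tones at 40.0 kHz and 40.5 kHz would carry a 500 Hz envelope even though both tones lie above the audio band.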
