• Title/Summary/Keyword: 3D Audio

Search Result 210, Processing Time 0.035 seconds

The Design of Digital Audio Interpolation Filter for Integrating Off-Chip Analog Low-Pass Filter (칩 외부의 아날로그 저역통과 필터를 집적시키기 위한 디지털 오디오용 보간 필터 설계)

  • Shin, Yun-Tae;Lee, Jung-Woong;Shin, Gun-Soon
    • Journal of IKEEE
    • /
    • v.3 no.1 s.4
    • /
    • pp.11-21
    • /
    • 1999
  • This paper has been proposed a structure composed of FIRs and IIR filters as digital interpolation filter to integrate the off-chip analog low-pass filter of audio DAC. The passband ripple (>$0.41{\times}fs$), passband attenuation(>at$0.41{\times}fs$) and stopband attenuation(<$0.59{\times}fs$) of the ${\Delta}{\Sigma}$ modulator output using the proposed digital interpolation filter had ${\pm}0.001[dB]$, -0.0025[dB] and -75[dB], respectively. Also the inband group delay was 30.07/fs[s] and the error of group delay was 0.1672%. Also, the attenuation of stopband has been increased -20[dB] approximately at 65[kHz], out-of-band. Therefore the RC products of analog low-pass filter on chip have been decreased compared with the conventional digital interpolation filter structure.

  • PDF

Improvement of Head Related Transfer Function to Create Realistic 3D Sound (현실감있는 입체음향 생성을 위한 머리전달함수의 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.18 no.3
    • /
    • pp.381-386
    • /
    • 2008
  • Virtual 3D audio methods that create 3D sound effects are researched highly for multimedia devices using 2 speakers or headphone. The most typical method to create 3D effects is a technology through use of head related transfer function (HRTF) which contains the information that sound arrives from a sound source to the ears of the listener. But it can decline some 3D effects by cone of confusion between front and back directions due to the non-individual HRTF depending on each listener. In this paper, we propose a new method to use psychoacoustic theory that creates realistic 3D audio. In order to improve 3D sound, we calculate the excitation energy of each symmetric HRTF and extract the ratio of energy of each bark range. Informal listening tests show that the proposed method improves the front-bach sound localization characteristics much better than the conventional methods.

Visual Object Tracking Fusing CNN and Color Histogram based Tracker and Depth Estimation for Automatic Immersive Audio Mixing

  • Park, Sung-Jun;Islam, Md. Mahbubul;Baek, Joong-Hwan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.3
    • /
    • pp.1121-1141
    • /
    • 2020
  • We propose a robust visual object tracking algorithm fusing a convolutional neural network tracker trained offline from a large number of video repositories and a color histogram based tracker to track objects for mixing immersive audio. Our algorithm addresses the problem of occlusion and large movements of the CNN based GOTURN generic object tracker. The key idea is the offline training of a binary classifier with the color histogram similarity values estimated via both trackers used in this method to opt appropriate tracker for target tracking and update both trackers with the predicted bounding box position of the target to continue tracking. Furthermore, a histogram similarity constraint is applied before updating the trackers to maximize the tracking accuracy. Finally, we compute the depth(z) of the target object by one of the prominent unsupervised monocular depth estimation algorithms to ensure the necessary 3D position of the tracked object to mix the immersive audio into that object. Our proposed algorithm demonstrates about 2% improved accuracy over the outperforming GOTURN algorithm in the existing VOT2014 tracking benchmark. Additionally, our tracker also works well to track multiple objects utilizing the concept of single object tracker but no demonstrations on any MOT benchmark.

Adaptive TCX Windowing Technology for Unified Structure MPEG-D USAC

  • Lee, Tae-Jin;Beack, Seung-Kwon;Kang, Kyeong-Ok;Kim, Whan-Woo
    • ETRI Journal
    • /
    • v.34 no.3
    • /
    • pp.474-477
    • /
    • 2012
  • The MPEG-D unified speech and audio coding (USAC) standardization process was initiated by MPEG to develop an audio codec that is able to provide consistent quality for mixed speech and music contents. The current USAC reference model structure consists of frequency domain (FD) and linear prediction domain (LPD) core modules and is controlled using a signal classifier tool. In this letter, we propose an LPD single-mode USAC structure using an adaptive widowing-based transform-coded excitation module. We tested our system using official test items for all mono-evaluation modes. The results of the experiment show that the objective and subjective performances of the proposed single-mode USAC system are better than those of the FD/LPD dual-mode USAC system.

A data-flow oriented framework for video-based 3D reconstruction (삼차원 재구성을 위한 Data-Flow 기반의 프레임워크)

  • Kim, Albert
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.04a
    • /
    • pp.71-74
    • /
    • 2009
  • The data-flow paradigm has been employed in various application areas. It is particularly useful where large data-streams must be processed, for example in video and audio processing, or for scientific visualization. A video-based 3D reconstruction system should process multiple synchronized video streams. The system exhibits many properties that can be targeted using a data-flow approach that is naturally divided into a sequence of processing tasks. In this paper we introduce our concept to apply the data-flow approach to a multi-video 3D reconstruction system.

음성인식 기반 인터렉티브 미디어아트의 연구 - 소리-시각 인터렉티브 설치미술 "Water Music" 을 중심으로-

  • Lee, Myung-Hak;Jiang, Cheng-Ri;Kim, Bong-Hwa;Kim, Kyu-Jung
    • 한국HCI학회:학술대회논문집
    • /
    • 2008.02a
    • /
    • pp.354-359
    • /
    • 2008
  • This Audio-Visual Interactive Installation is composed of a video projection of a video Projection and digital Interface technology combining with the viewer's voice recognition. The Viewer can interact with the computer generated moving images growing on the screen by blowing his/her breathing or making sound. This symbiotic audio and visual installation environment allows the viewers to experience an illusionistic spacephysically as well as psychologically. The main programming technologies used to generate moving water waves which can interact with the viewer in this installation are visual C++ and DirectX SDK For making water waves, full-3D rendering technology and particle system were used.

  • PDF

Design of three stage decimation filter using CSD code (CSD 코드를 사용한 3단 Decimation Filter 설계)

  • Byun, San-Ho;Ryu, Seong-Young;Choi, Young-Kil;Roh, Hyung-Dong;Lee, Hyun-Tae;Kang, Kyoung-Sik;Roh, Jeong-Jin
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.511-512
    • /
    • 2006
  • Three stage(CIC-FIR-FIR) decimation filter in delta-sigma A/D converter for audio is designed. A canonical signed digit(CSD) code method is used to minimize area of multipliers. This filter is designed in 0.25um CMOS process and incorporates $1.36\;mm^2$ of active area. Measured results show that this decimation filter is suitable for digital audio A/D converters.

  • PDF

Enhancement of Super-wideband Coder by Considering Audio Feature in MDCT Domain (MDCT 도메인에서 오디오 신호 특징을 고려한 초광대역 코덱 개선)

  • Hong, Ki-Bong;Jeong, Gyu-Hyeok;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.5
    • /
    • pp.129-136
    • /
    • 2011
  • This paper presents the coding method that have multi-mode and efficiency of audio codecs using the feature of audio signal. Recently, the developed extension super-wideband codec based on G.718 wideband divides two mode between Generic and Sinusiodal. So codec efficently encode audio signal exist in super-wideband. But the codec is not as efficent coding for harmonic component of wind instrument and string instrument and individual-Line component of percussion instrument. The proposed method are modeling and encoding multiple pitch and individual-line feature using multi mode coding. For the performance evaluation, we used SNR in MDCT domain for objective test and MUSHRA test for subjective test. As a result, the performance of SNR and MUSHRA test of the proposed method have better performance than the G.718 super-wideband codec.

A Digital Input Class-D Audio Amplifier (디지털 입력 시그마-델타 변조 기반의 D급 오디오 증폭기)

  • Jo, Jun-Gi;Noh, Jin-Ho;Jeong, Tae-Seong;Yoo, Chang-Sik
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.11
    • /
    • pp.6-12
    • /
    • 2010
  • A sigma-delta modulator based class-D audio amplifier is presented. Parallel digital input is serialized to two-bit output by a fourth-order digital sigma-delta noise shaper. The output of the digital sigma-delta noise shaper is applied to a fourth-order analog sigma-delta modulator whose three-level output drives power switches. The pulse density modulated (PDM) output of the power switches is low-pass filtered by an LC-filter. The PDM output of the power switches is fed back to the input of the analog sigma-delta modulator. The first integrator of the analog sigma-delta modulator is a hybrid of continuous-time (CT) and switched-capacitor (SC) integrator. While the sampled input is applied to SC path, the continuous-time feedback signal is applied to CT path to suppress the noise of the PDM output. The class-D audio amplifier is fabricated in a standard $0.13-{\mu}m$ CMOS process and operates for the signal bandwidth from 100-Hz to 20-kHz. With 4-${\Omega}$ load, the maximum output power is 18.3-mW. The total harmonic distortion plus noise and dynamic range are 0.035-% and 80-dB, respectively. The modulator consumes 457-uW from 1.2-V power supply.

Design of a 94.8dB SNR 1-bit 4th-order high-performance delta-sigma Modulator (94.8dB의 SNR을 갖는 1-bit 4차 고성능 델타-시그마 모듈레이터 설계)

  • Choi, Young-Kil;Roh, Hyung-Dong;Byun, San-Ho;Lee, Hyun-Tae;Kang, Kyoung-Sik;Roh, Jeong-Jin
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.507-508
    • /
    • 2006
  • High performance delta-sigma modulator is developed for audio-codec applications(i.e.. 16-bit resolution at a 20kHz signal bandwidth). The modulator is realized with fully-differential switched capacitor integrators. All stages employ a single-stage folded-cascode amplifier. The presented delta-sigma modulator when clocked at 3.2MHz achieves 85.2dB peak-SNDR and 94.8dB SNR. This modulator is designed in a SAMSUNG $0.18{\mu}m$ CMOS process. Finally, this paper shows the test setup and FFT result gained from delta-sigma modulator chip designed for audio applications.

  • PDF