• Title/Summary/Keyword: Realistic Audio

Search Result 64, Processing Time 0.03 seconds

Real-time 3D Audio Downmixing System based on Sound Rendering for the Immersive Sound of Mobile Virtual Reality Applications

  • Hong, Dukki;Kwon, Hyuck-Joo;Kim, Cheong Ghil;Park, Woo-Chan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.12 / pp.5936-5954 / 2018
  • Since Facebook acquired Oculus, eight of the world's ten largest technology companies have become involved in some way with the coming mobile VR revolution. This trend has driven remarkable growth in mobile VR technology in both academia and industry. Reproducing acoustic cues so that users have a more realistic experience is therefore increasingly important, because auditory cues can enhance perception of a complicated surrounding environment in VR without relying on the visual system. This paper presents a hardware-based audio downmixing system for auralization, a stage of the sound rendering pipeline that can reproduce lifelike sound but has a high computational cost. The proposed system is verified on an FPGA platform, with a special focus on hardware architecture designs for low power consumption and real-time operation. The results show that the proposed system on an FPGA can downmix up to 5 sources at a real-time rate (52 FPS) with a low power consumption of 382 mW. Furthermore, a user evaluation found the sound quality of the 3D sound generated by the proposed system to be satisfactory.

The Design of Object-based 3D Audio Broadcasting System (객체기반 3차원 오디오 방송 시스템 설계)

  • 강경옥;장대영;서정일;정대권
    • The Journal of the Acoustical Society of Korea / v.22 no.7 / pp.592-602 / 2003
  • This paper describes the basic structure of a novel object-based 3D audio broadcasting system. To overcome the limits of current uni-directional audio broadcasting services, the system is designed, based on the MPEG-4 standard, to let users interact with important audio objects and to provide realistic 3D effects. The system is composed of 6 sub-modules. The audio input module collects the background sound object, which is recorded by a 3D microphone, and audio objects, which are recorded by a monaural microphone or extracted through a source separation method. The sound scene authoring module edits the 3D information of audio objects, such as acoustical characteristics, location, and directivity. It also defines the final sound scene with a 3D background sound, which the producer intends to deliver to a receiving terminal. The encoder module encodes scene descriptors and audio objects for efficient transmission. The decoder module extracts scene descriptors and audio objects by decoding the received bitstreams. The sound scene composition module reconstructs the 3D sound scene from the scene descriptors and audio objects. The 3D sound renderer module maximizes the 3D sound effects by adapting the final sound to the listener's acoustical environment. It also receives the user's controls on audio objects and sends them to the scene composition module to change the sound scene.

Introduction and Standard Status of High Order Multichannel Audio System for Realistic Audio Broadcasting (실감 오디오 방송을 위한 초다채널 오디오 시스템 및 표준화 동향)

  • Seo, J.I.;Kang, K.O.
    • Electronics and Telecommunications Trends / v.27 no.6 / pp.49-56 / 2012
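  • This article introduces recent research and development trends in high-order multichannel audio technology for providing realistic audio services in realistic broadcasting environments such as 3DTV and UHDTV (Ultra High Definition Television). Conventional audio technology, represented by stereo and 5.1 channels, is limited in that it can form a sound field only on a two-dimensional plane. To match the success of 3D movies and the ultra-high-definition video represented by UHDTV, audio must also be expressed in three-dimensional space, which inevitably requires more output channels. High-order multichannel audio of this kind requires research and development not only on technology for compressing large volumes of audio data, such as 22.2 channels, but also on technology for rendering audio content adaptively across diverse audio output environments.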

Latency Analysis of AVB Network and Optimization Design for Automotive

  • An, Byoungman;Kim, YoungSeop
    • Journal of the Semiconductor & Display Technology / v.18 no.3 / pp.127-132 / 2019
  • This paper presents an overview of automotive communication technologies, including related technology developments. We describe the latency of the Audio Video Bridge (AVB) network and propose an optimized design of the Ethernet network system for automotive use. Our design plays a significant role in reducing the delay between components. On realistic test cases, the proposed approach reduced delay by approximately 49.4%. The optimization method is expected to greatly shorten the design and development process for actual automotive environments. The measured delay times for each function are reliable because they are averages obtained from actual tests repeated over several months. The approach should greatly benefit the industry, since the ability to analyze the latency between functions in a short period of time is very important.

Implementation of a Person Tracking Based Multi-channel Audio Panning System for Multi-view Broadcasting Services (다시점 방송 서비스를 위한 사용자 위치추적 기반 다채널 오디오 패닝 시스템 구현)

  • Kim, Yong-Guk;Yang, Jong-Yeol;Lee, Young-Han;Kim, Hong-Kook
    • HCI Society of Korea: Conference Proceedings (한국HCI학회:학술대회논문집) / 2009.02a / pp.150-157 / 2009
  • In this paper, we propose a person tracking based multi-channel audio panning system for multi-view broadcasting services. Multi-view broadcasting renders video sequences captured from a set of cameras at different viewpoints, and multi-channel audio panning techniques are necessary for audio rendering in these services. To apply such a realistic audio technique to a multi-view broadcasting service, person tracking techniques that estimate the position of users are also necessary. For these reasons, the proposed method is composed of two parts. The first part is a person tracking method that uses ultrasonic satellites and a receiver. It obtains the user's coordinates at a high resolution of about 10 mm and a short interval of about 150 ms. The second part is an MPEG Surround parameter-based multi-channel audio panning method, which obtains panned multi-channel audio by controlling the MPEG Surround spatial parameters. A MUSHRA test is conducted to evaluate the perceptual quality, and localization performance is measured using a dummy head. The experiments show that the proposed method provides better perceptual quality and localization performance than the conventional parameter-based audio panning method. In addition, we implement a prototype of a person tracking based multi-view broadcasting system by integrating the proposed methods with a multi-view display system.

Multichannel Audio Reproduction Technology based on 10.2ch for UHDTV (UHDTV를 위한 10.2 채널 기반 다채널 오디오 재현 기술)

  • Lee, Tae-Jin;Yoo, Jae-Hyoun;Seo, Jeong-Il;Kang, Kyeong-Ok;Kim, Whan-Woo
    • Journal of Broadcast Engineering / v.17 no.5 / pp.827-837 / 2012
  • As broadcasting environments change rapidly to digital, user demand is growing for next-generation broadcasting services that surpass the current HDTV service. For high-quality realistic broadcasting, next-generation services are progressing from 2D to 3D, from HD to UHD, and from 5.1ch audio to audio with more than 10 channels. In this paper, we propose a 10.2ch-based multichannel audio reproduction system for UHDTV. The 10.2ch-based audio reproduction system adds two side loudspeakers to enhance the surround sound localization effect, and adds two height loudspeakers and one ceiling loudspeaker to enhance the elevation localization effect. To evaluate the proposed system, we used the APM (Auditory Process Model) for an objective localization test and also conducted a subjective localization test. In both the objective and subjective localization tests, the proposed system shows performance statistically equivalent to that of a 22.2ch audio system and significantly better than that of a 5.1ch audio system.

Real-Time Vision Based Speaker Location Detection for Realistic Audio Reproduction (실감 음향 재생을 위한 영상기반의 실시간 화자 위치 검출)

  • Lim Jaehyun;Lee Chulhee
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.143-146 / 2004
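  • In video conferencing, the speaker's location has generally been detected from acoustic signals. However, when the physical environment imposes constraints, or when noise exceeds the limits of the speaker detection system, the performance of the detection system degrades. In this paper, we propose a vision-based speaker detection algorithm that can be used independently of, or complementarily with, an audio-based detection system. Information about the speaker's location can be used for 3D audio reproduction, which adds further realism to video conferencing.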

'EVE-Sound™' Toolkit for Interactive Sound in Virtual Environment (가상환경의 인터랙티브 사운드를 위한 'EVE-Sound™' 툴킷)

  • Nam, Yang-Hee;Sung, Suk-Jeong
    • The KIPS Transactions:PartB / v.14B no.4 / pp.273-280 / 2007
  • This paper presents a new 3D sound toolkit called EVE-Sound™ that consists of a pre-processing tool, which simplifies the environment while preserving sound effects, and a 3D sound API for real-time rendering. It is designed to let users interact with complex 3D virtual environments through audio-visual modalities. The EVE-Sound™ toolkit serves two different types of users: high-level programmers who need an easy-to-use sound API for developing realistic 3D audio-visually rendered applications, and researchers in the 3D sound field who need to experiment with or develop new algorithms without rewriting all the required code from scratch. An interactive virtual environment application is created with a sound engine built using the EVE-Sound™ toolkit. It demonstrates the real-time audio-visual rendering performance and the applicability of the proposed EVE-Sound™ for building interactive applications with complex 3D environments.

Improvement of Head Related Transfer Function to Create Realistic 3D Sound (현실감있는 입체음향 생성을 위한 머리전달함수의 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.3 / pp.381-386 / 2008
  • Virtual 3D audio methods that create 3D sound effects are widely researched for multimedia devices that use 2 speakers or headphones. The most typical method for creating 3D effects uses the head related transfer function (HRTF), which captures how sound travels from a sound source to the ears of the listener. However, because a non-individualized HRTF does not match each listener, some 3D effects can degrade through front-back confusion within the cone of confusion. In this paper, we propose a new method that uses psychoacoustic theory to create realistic 3D audio. To improve the 3D sound, we calculate the excitation energy of each symmetric HRTF and extract the energy ratio in each Bark band. Informal listening tests show that the proposed method improves front-back sound localization considerably compared with conventional methods.
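
The per-band energy ratio mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes that "symmetric HRTF" refers to a front/back pair of magnitude responses, and it substitutes plain summed squared magnitude per band for the psychoacoustic excitation energy used in the paper. The function names are hypothetical; the band edges are the classical Zwicker critical-band edges.

```python
# Hypothetical sketch: energy ratio per Bark band between a front and a
# back HRTF magnitude response (plain band energy, not excitation energy).

# Classical Zwicker critical-band edges in Hz (24 bands, 25 edges).
BARK_EDGES_HZ = [
    0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
    2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500,
]


def band_energies(magnitudes, sample_rate):
    """Sum |H(f)|^2 in each Bark band.

    magnitudes[k] is bin k of a one-sided spectrum whose bins are evenly
    spaced from 0 Hz to sample_rate / 2 (at least two bins are required).
    Bins at or above the last band edge are ignored.
    """
    bin_hz = (sample_rate / 2) / (len(magnitudes) - 1)
    energies = [0.0] * (len(BARK_EDGES_HZ) - 1)
    for k, mag in enumerate(magnitudes):
        freq = k * bin_hz
        for band in range(len(energies)):
            if BARK_EDGES_HZ[band] <= freq < BARK_EDGES_HZ[band + 1]:
                energies[band] += mag * mag
                break
    return energies


def band_energy_ratios(front_mag, back_mag, sample_rate):
    """Return front/back energy ratio per band, or None where back energy is 0."""
    front = band_energies(front_mag, sample_rate)
    back = band_energies(back_mag, sample_rate)
    return [f / b if b > 0 else None for f, b in zip(front, back)]
```

For example, `band_energy_ratios(front, back, 32000)` with two 161-bin magnitude lists returns 24 ratios, one per band. How such ratios would then be applied to reshape the HRTF is not described in the abstract and is left out here.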

A Sound Externalization Method for Realistic Audio Rendering in a Headphone Listening Environment (헤드폰 청취환경에서의 실감 오디오 재현을 위한 음상 외재화 기법)

  • Kim, Yong-Guk;Chun, Chan-Jun;Kim, Hong-Kook;Lee, Yong-Ju;Jang, Dae-Young;Kang, Kyeong-Ok
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.1-8 / 2010
  • In this paper, a sound externalization method is proposed for out-of-the-head localization in a headphone listening environment. To reduce the timbre distortion caused by conventional methods that use a measured head-related transfer function (HRTF) or early reflections, the proposed method integrates a model-based HRTF with reverberation. In addition, techniques such as decorrelation and spectral notch filtering are included to improve frontal externalization performance. To evaluate the performance of the proposed externalization method, subjective listening tests are conducted using different types of sound sources, such as white noise, sound effects, speech, and music. The test results show that the proposed externalization method can localize sound sources farther outside the head than the conventional method.