• Title/Summary/Keyword: object audio

A Study on Vocal Removal Scheme of SAOC Using Harmonic Information (하모닉 정보를 이용한 SAOC의 보컬 신호 제거 방법에 관한 연구)

  • Park, Ji-Hoon;Jang, Dae-Geun;Hahn, Min-Soo
    • Journal of Korea Multimedia Society
    • /
    • v.16 no.10
    • /
    • pp.1171-1179
    • /
    • 2013
  • Interactive audio services provide audio generation and editing functionality according to the user's preference. Spatial audio object coding (SAOC) is an audio coding technology that can support interactive audio services at a relatively low bit rate. However, when the SAOC scheme removes a specific object, such as the vocal object for Karaoke mode, quality is poor because residue of the removed vocal object remains in the SAOC-decoded background music. We therefore propose a new SAOC vocal harmonic extraction and elimination technique to improve background music quality in the Karaoke service. Specifically, using the harmonic information of the vocal object, we remove the vocal harmonics that remain in the background music. As harmonic parameters, we use the pitch, the maximum voiced frequency (MVF), and the harmonic amplitude. We evaluate the proposed scheme with objective and subjective tests. The experimental results confirm that the proposed scheme improves background music quality compared with the SAOC scheme.
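The harmonic elimination idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' SAOC implementation: the function names are hypothetical, and the harmonic amplitudes are estimated here by projecting the frame onto sine and cosine components, whereas the paper treats harmonic amplitude as a parameter. The subtraction is exact only when the frame spans a whole number of periods of each harmonic.

```python
import math


def harmonic_frequencies(pitch_hz, mvf_hz):
    """Return the harmonics k * pitch that lie at or below the MVF."""
    count = int(mvf_hz // pitch_hz)
    return [k * pitch_hz for k in range(1, count + 1)]


def remove_harmonics(frame, sample_rate, pitch_hz, mvf_hz):
    """Subtract the pitch harmonics up to the MVF from one frame.

    Assumption: amplitude and phase of each harmonic are estimated by
    projection onto cos/sin, which stands in for transmitted parameters.
    """
    n = len(frame)
    residual = list(frame)
    for freq in harmonic_frequencies(pitch_hz, mvf_hz):
        w = 2.0 * math.pi * freq / sample_rate
        cos_t = [math.cos(w * i) for i in range(n)]
        sin_t = [math.sin(w * i) for i in range(n)]
        # Projection coefficients of the frame on this harmonic.
        a = 2.0 / n * sum(x * c for x, c in zip(frame, cos_t))
        b = 2.0 / n * sum(x * s for x, s in zip(frame, sin_t))
        residual = [r - a * c - b * s
                    for r, c, s in zip(residual, cos_t, sin_t)]
    return residual
```

With a pitch of 200 Hz and an MVF of 700 Hz, the sketch removes components at 200, 400, and 600 Hz and leaves content above the MVF untouched.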

Spatial Audio Technologies for Immersive Media Services (체감형 미디어 서비스를 위한 공간음향 기술 동향)

  • Lee, Y.J.;Yoo, J.;Jang, D.;Lee, M.;Lee, T.
    • Electronics and Telecommunications Trends
    • /
    • v.34 no.3
    • /
    • pp.13-22
    • /
    • 2019
  • Although virtual reality technology may not be deemed as having a satisfactory quality for all users, it tends to incite interest because of the expectation that the technology can allow one to experience something that they may never experience in real life. The most important aspect of this indirect experience is the provision of immersive 3D audio and video, which interacts naturally with every action of the user. The immersive audio faithfully reproduces an acoustic scene in a space corresponding to the position and movement of the listener, and this technology is also called spatial audio. In this paper, we briefly introduce the trend of spatial audio technology in view of acquisition, analysis, reproduction, and the concept of MPEG-I audio standard technology, which is being promoted for spatial audio services.

Non-uniform Linear Microphone Array Based Source Separation for Conversion from Channel-based to Object-based Audio Content (채널 기반에서 객체 기반의 오디오 콘텐츠로의 변환을 위한 비균등 선형 마이크로폰 어레이 기반의 음원분리 방법)

  • Chun, Chan Jun;Kim, Hong Kook
    • Journal of Broadcast Engineering
    • /
    • v.21 no.2
    • /
    • pp.169-179
    • /
    • 2016
  • MPEG-H has recently been undergoing standardization as a multimedia coder for UHDTV (Ultra-High-Definition TV). As a result, demand is growing not only for channel-based audio content but also for object-based audio content, which calls for new techniques to convert channel-based content into object-based content. In this paper, a source separation method based on a non-uniform linear microphone array is proposed to realize such a conversion. The proposed method first analyzes the arrival time differences of the input audio sources at each microphone, and then estimates the spectral magnitudes of each sound source in the horizontal directions from the analyzed time differences. To demonstrate its effectiveness, objective performance measures of the proposed method are compared with those of conventional methods such as an MVDR (Minimum Variance Distortionless Response) beamformer and an ICA (Independent Component Analysis) method. The results show that the proposed method achieves better separation performance than the conventional methods.
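The first step described above, analyzing arrival time differences between microphones, can be sketched generically in Python. This is an illustrative assumption rather than the authors' method: it handles a single microphone pair with brute-force cross-correlation and a far-field model, and the names `estimate_delay`, `arrival_angle_deg`, and the speed-of-sound constant are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) at which sig best matches ref.

    A positive lag means sig arrives later than ref.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(sig):
                score += r * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(delay_samples, sample_rate, mic_spacing_m):
    """Convert a pairwise delay into a far-field arrival angle."""
    sin_theta = SPEED_OF_SOUND * (delay_samples / sample_rate) / mic_spacing_m
    sin_theta = max(-1.0, min(1.0, sin_theta))  # guard against rounding
    return math.degrees(math.asin(sin_theta))
```

In a non-uniform array, each microphone pair has a different spacing, so the same source direction produces a different delay per pair; the sketch only shows the per-pair relationship.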

Interaction between Object and Audio in Augmented Reality (증강현실에서 객체와 오디오의 상호작용)

  • Cho, Hyun-Wook;Lee, Jong-Keun;Lee, Jong-Hyeok
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.12
    • /
    • pp.2705-2711
    • /
    • 2011
  • Recent developments in multimedia technology, including audio technology, call for high-quality audio systems. In particular, realistic audio technology needs to be developed to reproduce lifelike sound. To meet this demand, research is being conducted on 3-dimensional audio, which provides realistic audio effects in virtual reality and augmented reality. This paper investigates how to provide realistic audio effects in augmented reality by using improved audio technologies. In the study, the movements of a 3-dimensional model on the markers were used to convey a sense of reality across the virtual and real worlds. Specifically, the sound was modified according to the movement of the model: changes in the model's distance and angle affected the sound volume and pitch.
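How the model's distance and angle might drive the sound can be sketched as follows. The abstract does not specify the mapping, so everything here is an assumption: inverse-distance attenuation for volume and constant-power stereo panning for angle, with hypothetical function names. The pitch change is left out because the source gives no rule for it.

```python
import math


def distance_gain(distance_m, reference_m=1.0):
    """Inverse-distance attenuation, clamped to 1.0 inside the reference."""
    return reference_m / max(distance_m, reference_m)


def stereo_gains(azimuth_deg, distance_m):
    """Return (left, right) gains for an object at the given position.

    Assumption: azimuth runs from -90 (full left) to +90 (full right).
    """
    az = max(-90.0, min(90.0, azimuth_deg))
    pan = (az + 90.0) / 180.0 * (math.pi / 2.0)
    g = distance_gain(distance_m)
    # Constant-power panning: left^2 + right^2 equals g^2.
    return g * math.cos(pan), g * math.sin(pan)
```

An object straight ahead at the reference distance yields equal left and right gains, and doubling the distance halves both.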

MPEG-H 3D Audio Decoder Structure and Complexity Analysis (MPEG-H 3D 오디오 표준 복호화기 구조 및 연산량 분석)

  • Moon, Hyeongi;Park, Young-cheol;Lee, Yong Ju;Whang, Young-soo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.42 no.2
    • /
    • pp.432-443
    • /
    • 2017
  • The primary goal of the MPEG-H 3D Audio standard is to provide immersive audio environments for high-resolution broadcasting services such as UHDTV. The standard incorporates a wide range of technologies, including encoding/decoding of multi-channel, object, and scene-based signals, rendering for 3D audio in various playback environments, and post-processing. The reference software decoder combines several modules and can operate in various modes. Because each module is an independent executable and the modules run sequentially, real-time decoding is impossible. In this paper, we build DLLs of the core decoder, format converter, object renderer, and binaural renderer of the standard and integrate them to enable frame-based decoding. In addition, by measuring the computational complexity of each mode of the MPEG-H 3D Audio decoder, this paper provides a reference for selecting the appropriate decoding mode for various hardware platforms. The measurements show that the low complexity profiles included in the Korean broadcasting standard have a computational complexity of 2.8 to 12.4 times that of the QMF synthesis operation when rendering to channel signals, and 4.1 to 15.3 times that of the QMF synthesis operation when rendering to binaural signals.

Multi-View Point switch System Structure & Implementation of Video player in MPEG-4 based (MPEG-4 시스템 기반의 다시점 전환 시스템 구조 및 재생기 구현)

  • Lee, Jun-Cheol;Lee, Jung-Won;Chang, Yong-Seok;Kim, Sung-Ho
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.44 no.1
    • /
    • pp.80-93
    • /
    • 2007
  • This paper proposes structures for the Object Descriptor and the Elementary Stream Descriptor that provide multi-view video services within the 3-dimensional audio-video technical standards of the current MPEG-4. It first defines the structures of the Object Descriptor and the Elementary Stream Descriptor in the established MPEG-4 system, distributes them individually, and analyzes the result. However, extending the established system is inappropriate for multi-view audio-video services in which transmission and reception are linked. The paper therefore proposes a new Object Descriptor structure that can switch viewpoints and takes into account the correlation between viewpoints when multi-view video is transmitted. This makes it possible to switch viewpoints on user request in a multi-view video service and reduces the overhead of transmitting information about the required viewpoint.

The Design of Terrestrial DMB Media Processor for Multi-Channel Audio Services (멀티채널 오디오 서비스를 위한 지상파 DMB 미디어처리기 설계)

  • Kang Kyeongok;Hong Jaegeun;Seo Jeongil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.4
    • /
    • pp.186-193
    • /
    • 2005
  • The Terrestrial Digital Multimedia Broadcasting (T-DMB) system supplies high-quality video comparable to VCD on a 7-inch display and high-quality audio comparable to CD in a mobile reception environment. T-DMB will launch commercial service in the middle of 2005. However, in the current T-DMB standard the bandwidth for audio data and the number of channels are restricted to 128 kbps and 2, respectively, because of the limited bandwidth available for multimedia data. This paper proposes a novel media processor structure for providing multi-channel audio content over the T-DMB system while maintaining backward compatibility with legacy T-DMB receivers. Furthermore, we propose an adaptive receiver structure that supplies optimal audio content for the various speaker configurations of T-DMB receivers. To provide multi-channel audio content with backward compatibility, the additional data for multi-channel audio are defined as a dependent stream of the main audio stream. An OD structure for controlling the additional multi-channel audio elementary stream is proposed that does not change the BIFS of the legacy T-DMB system.

The Sensory-Motor Fusion System for Object Tracking (이동 물체를 추적하기 위한 감각 운동 융합 시스템 설계)

  • Lee, Sang-Hee;Wee, Jae-Woo;Lee, Chong-Ho
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.52 no.3
    • /
    • pp.181-187
    • /
    • 2003
  • For moving platforms with environmental sensors, such as an object-tracking mobile robot with audio and video sensors, the environmental information acquired from the sensors keeps changing as objects move. In such cases, conventional control schemes show limited control performance because of their lack of adaptability and their system complexity, so sensory-motor systems that can respond intuitively to various types of environmental information are desirable. To improve system robustness, it is also desirable to fuse two or more types of sensory information simultaneously. In this paper, based on Braitenberg's model, we propose a sensory-motor based fusion system that can track moving objects while adapting to environmental changes. Owing to its directly connected structure, the sensory-motor based fusion system can control each motor simultaneously, and neural networks are used to fuse information from various types of sensors. Even if the system receives noisy information from one sensor, it still works robustly because sensor fusion compensates for the noise with information from the other sensors. To examine its performance, the sensory-motor based fusion model is applied to an object-tracking four-legged robot equipped with audio and video sensors. The experimental results show that the sensory-motor based fusion system can track moving objects robustly with a simpler control mechanism than model-based control approaches.
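The direct sensor-to-motor wiring of a Braitenberg-style controller with fused audio and video inputs can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the system in the paper: a fixed weighted sum stands in for the neural-network fusion, a two-motor differential drive stands in for the four-legged robot, and all names and weights are hypothetical.

```python
def fuse(audio, video, w_audio=0.5, w_video=0.5):
    """Fuse two sensor readings; a fixed weighted sum is an assumption
    standing in for the neural-network fusion the abstract mentions."""
    return w_audio * audio + w_video * video


def motor_commands(left_audio, right_audio, left_video, right_video,
                   base_speed=0.2):
    """Crossed excitatory wiring: each side's stimulus drives the
    opposite motor, so the vehicle turns toward the stronger stimulus."""
    left_stimulus = fuse(left_audio, left_video)
    right_stimulus = fuse(right_audio, right_video)
    left_motor = base_speed + right_stimulus
    right_motor = base_speed + left_stimulus
    return left_motor, right_motor
```

When the target is on the right, the left motor runs faster and the vehicle turns right. If the audio reading is misleading but the video reading is clear, the fused stimulus still favors the correct side, which is the robustness argument made in the abstract.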

Design of a Format Converter from MPEG-4 Over MPEG-2 TS to MP4 (MPEG-4 Over MPEG-2 TS로부터 MP4 파일로의 포맷 변환기 설계)

  • 최재영;정제창
    • Journal of Broadcast Engineering
    • /
    • v.5 no.2
    • /
    • pp.176-187
    • /
    • 2000
  • MPEG-4 is a digital bit stream format and set of associated protocols for representing multimedia content consisting of natural and synthetic audio, video, and object data. This paper describes an application in which multiple audio/visual data streams are combined in MPEG-4 and transported via MPEG-2 transport streams (TS). It also describes how to convert MPEG-4 over MPEG-2 TS bit streams into an MP4 file, which is designed to contain the media information of an MPEG-4 presentation in a flexible, extensible format. MPEG-4 content is presented as audio-visual objects that are arranged into an audio-visual scene by means of a scene descriptor and composed by means of an object descriptor. These descriptor streams are not defined in MPEG-2 TS, so this paper focuses on handling these descriptors and parsing TS streams to obtain the MPEG-4 data. The MPEG-4 over MPEG-2 TS to MP4 format converter is implemented in the demonstrated systems.
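Parsing TS streams, the step this abstract focuses on, starts from the fixed 4-byte header of each 188-byte MPEG-2 transport stream packet. The Python sketch below shows only that header parsing and PID filtering; it is not the converter from the paper, the function names are hypothetical, and it assumes the input is already aligned on packet boundaries.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet: bytes) -> dict:
    """Decode selected fields of the 4-byte MPEG-2 TS packet header."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit PID: low 5 bits of byte 1 followed by all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }


def packets_for_pid(stream: bytes, pid: int):
    """Yield the packets of an aligned stream that carry the given PID."""
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        if parse_ts_header(packet)["pid"] == pid:
            yield packet
```

A converter would collect the packets of each relevant PID, reassemble their payloads, and then interpret the MPEG-4 descriptor and media data; those later stages are outside this sketch.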

A Study on Immersive Audio Improvement of FTV using an effective noise (유효 잡음을 활용한 FTV 입체음향 개선방안 연구)

  • Kim, Jong-Un;Cho, Hyun-Seok;Lee, Yoon-Bae;Yeo, Sung-Dae;Kim, Seong-Kweon
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.10 no.2
    • /
    • pp.233-238
    • /
    • 2015
  • In this paper, we propose an immersive audio effect method that uses effective noise to improve engagement in free-viewpoint TV (FTV) services. On a basketball court, we monitored frequency spectra by continuously acquiring audio from the players and the referee using shotgun and wireless microphones. By analyzing these spectra, we determined which frequencies are effective when users zoom in. We therefore propose that, when users of an FTV service zoom in toward an object, noise that would otherwise be considered unnecessary be used rather than removed. This can be useful for implementing immersive audio in FTV.