• Title/Summary/Keyword: 3D Audio

Improvement of front/back Sound Localization Characteristics using Psychoacoustics of Head Related Transfer Function (머리전달함수의 심리음향적 특성을 이용한 전/후 음상정위 특성 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • Journal of Broadcast Engineering / v.11 no.4 s.33 / pp.448-457 / 2006
  • An HRTF database, which contains information about sounds as they arrive at the ears, is generally used to create 3D sound. However, because non-individualized HRTFs do not match each listener, front-back confusion can weaken the three-dimensional effect. In this paper, we propose a method based on psychoacoustic theory that reduces this confusion in sound image localization. The method uses the excitation energy perceived by the auditory system and emphasizes the spectral characteristics of the HRTF by extracting energy ratios across Bark bands. Informal listening tests show that the proposed method improves front-back sound localization considerably compared with conventional methods.
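The Bark-band energy ratio mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' excitation-energy model, so the code below makes several assumptions: it uses the Zwicker-Terhardt approximation of the Bark scale, sums plain squared magnitudes per band (no spreading function), and compares a front and a back spectrum band by band. The function names and the toy spectra are hypothetical.

```python
import math

def hz_to_bark(freq_hz):
    """Zwicker-Terhardt approximation of the Bark scale."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)

def bark_band_energies(freqs_hz, magnitudes, n_bands=24):
    """Sum squared magnitudes of spectrum bins into n_bands Bark bands."""
    energies = [0.0] * n_bands
    for freq, mag in zip(freqs_hz, magnitudes):
        band = min(int(hz_to_bark(freq)), n_bands - 1)
        energies[band] += mag * mag
    return energies

def band_ratio_db(front, back, eps=1e-12):
    """Per-band front/back energy ratio in dB; eps avoids division by zero."""
    return [10.0 * math.log10((f + eps) / (b + eps)) for f, b in zip(front, back)]

# Toy magnitude spectra (made-up values, not measured HRTFs).
freqs = [100.0, 1000.0, 4000.0, 8000.0]
front_mag = [1.0, 2.0, 1.0, 0.5]
back_mag = [1.0, 1.0, 1.0, 1.0]

front = bark_band_energies(freqs, front_mag)
back = bark_band_energies(freqs, back_mag)
ratios = band_ratio_db(front, back)

for band, ratio in enumerate(ratios):
    if front[band] or back[band]:
        print(f"Bark band {band}: front/back = {ratio:.2f} dB")
```

Bands where the ratio is far from 0 dB are the ones in which the front and back spectra differ most, which is the kind of cue a spectral-emphasis method could exaggerate.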

Three Dimensional Audio Technologies for Realistic Broadcasting (실감방송을 위한 3차원 오디오 기술)

  • Jang, D.Y.;Seo, J.I.;Lee, T.J.;Park, G.Y.;Kang, K.O.
    • Electronics and Telecommunications Trends / v.19 no.4 s.88 / pp.53-62 / 2004
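  • Next-generation broadcasting services are expected to evolve into realistic broadcasting based on immersive 3D AV content and interactive content that interacts naturally with the user. Such services require 3D audio technologies, including sound image localization and sound field reproduction, that can convey a sense of presence efficiently, as well as object-based audio processing technologies for user interaction. This article surveys development trends in representative 3D audio technologies for providing services that approach virtual reality through this sense of presence and user interaction. It first describes the basic concepts and an overview of 3D audio technology, then reviews recent trends in the development of interactive 3D audio technology built on it, and finally gives a brief description of the object-based 3D audio technology being developed in Korea.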

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha;Song, Byung Cheol
    • Journal of Broadcast Engineering / v.23 no.3 / pp.351-360 / 2018
  • Human emotion recognition is a research topic that is receiving continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions in a wild environment using multiple neural networks that operate on multi-modal signals consisting of images, landmarks, and audio. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning that exploit the spatio-temporal characteristics of videos. Second, a model that converts one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, and a CNN-LSTM network based on this model is proposed for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism that is robust to those emotions. Finally, so-called emotion adaptive fusion is applied to enable synergy among the multiple networks. The proposed network improves emotion classification performance by appropriately integrating existing supervised learning and semi-supervised learning networks. In the fifth attempt on the given test set in the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.

Optimized DSP Implementation of Audio Decoders for Digital Multimedia Broadcasting (디지털 방송용 오디오 디코더의 DSP 최적화 구현)

  • Park, Nam-In;Cho, Choong-Sang;Kim, Hong-Kook
    • Journal of Broadcast Engineering / v.13 no.4 / pp.452-462 / 2008
  • In this paper, we address issues associated with the real-time implementation of the MPEG-1/2 Layer-II (or MUSICAM) and MPEG-4 ER-BSAC decoders for Digital Multimedia Broadcasting (DMB) on the TMS320C64x+, a fixed-point DSP with a clock speed of 330 MHz. To meet the real-time requirement, the decoders are optimized in several steps. First, a C-code level optimization is performed by sharing memory, adjusting data types, and unrolling loops. Next, an algorithm level optimization is carried out, including the reconfiguration of bitstream reading, the modification of synthesis filtering, and the rearrangement of the window coefficients for synthesis filtering. In addition, the C code of the synthesis filtering module of the MPEG-1/2 Layer-II decoder is rewritten in linear assembly, because this module requires the most processing time of all the decoder's processing modules. To show how the real-time implementation performs, we measure the percentage of processing time spent on decoding and calculate an RMS value between the audio signals decoded by the reference MPEG decoder and by the DSP version implemented in this paper. The results show that the MPEG-1/2 Layer-II and MPEG-4 ER-BSAC decoders occupy less than 3% and 11% of the DSP clock cycles, respectively, and that the RMS values of both decoders satisfy the criterion of -77.01 dB defined by the MPEG standards.
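The RMS comparison between the reference decoder output and the DSP output can be sketched as follows. This is an illustration only: it assumes both signals are normalized so that full scale is 1.0 and expresses the RMS of the sample-wise difference in dB relative to that full scale. The abstract does not spell out the exact measurement procedure, and the function name and sample values are hypothetical.

```python
import math

def rms_error_db(reference, decoded):
    """RMS of the sample-wise difference, in dB relative to full scale (1.0)."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    mean_sq = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mean_sq == 0.0:
        return float("-inf")  # bit-exact output, no error energy
    return 10.0 * math.log10(mean_sq)  # 10*log10(mean square) == 20*log10(RMS)

# Made-up samples: the "decoded" signal deviates by 1e-4 on every sample.
reference = [0.5, -0.5, 0.25, -0.25]
decoded = [0.5001, -0.5001, 0.2501, -0.2501]

error_db = rms_error_db(reference, decoded)
threshold_db = -77.01  # criterion quoted in the abstract
print(f"RMS error: {error_db:.2f} dB, meets criterion: {error_db <= threshold_db}")
```

A fixed-point port would typically run this check over a full set of test bitstreams rather than a handful of samples.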

3D Audio Processing Technology for Realistic Broadcasting (실감방송을 위한 3차원 오디오 처리 기술)

  • Lee Taejin;Kang Kyeongok
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2004.11a / pp.207-210 / 2004
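  • As digital broadcasting technology advances, users' demand for realistic broadcasting is also growing. The goal of realistic broadcasting is to use 3D AV technology to make users feel as if they were at the scene of the recording. A dummy head is often used to capture 3D audio signals that give listeners the impression of being in the recorded environment. Because a dummy head is shaped like a human head, signals captured with it convey a sense of spatial depth when played back over headphones. However, the shape and size of a dummy head make it hard to use in public places, the captured signals are difficult to convert to multichannel signals, and front-back confusion occurs frequently during playback. To overcome these drawbacks, this paper proposes a 3D audio acquisition and reproduction system that simplifies the head shape to a sphere, captures 3D audio with multiple microphones placed on the sphere, and uses post-processing to generate playback signals suited to a variety of reproduction environments. To evaluate the proposed system, a subjective directional evaluation was conducted in an anechoic chamber; the results show that front-back confusion, a weakness of dummy-head techniques, was reduced markedly. The proposed 3D audio system can be used to capture spatial audio content for 3D TV, realistic broadcasting, and similar applications.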

A study on the blind audio source separation based on multi-step NMF-EM algorithm (다중 단계 NMF-EM 알고리즘 기반의 오디오 소스 분리 방법에 대한 연구)

  • Cho, Choongsang;Kim, Jewoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2014.06a / pp.9-11 / 2014
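  • This paper describes nonnegative matrix factorization (NMF), which is useful for representing the characteristics of audio signals, along with NMF parameter estimation using expectation maximization (EM) and an audio source separation technique based on EM-NMF. It also proposes an algorithm that improves object separation performance by separating objects with a multi-step NMF-EM structure, and evaluates it on K-pop tracks using SDR (source distortion ratio). In the evaluation, the multi-step approach improves vocal separation by about 3 dB, and improves separation by about 5 dB on tracks with heavy use of the virtual audio effects common in commercial music production. The proposed method is therefore expected to be useful for audio object separation.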
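To make the NMF step concrete, here is a minimal sketch of factorizing a nonnegative matrix V (for example, a magnitude spectrogram) into W and H. Note the substitution: it uses the standard Lee-Seung multiplicative updates for the squared-error cost, not the EM-based parameter estimation or the multi-step structure that the paper proposes, since the abstract gives no details of those. The helper names and the toy matrix are hypothetical.

```python
import random

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize V ~= W H with multiplicative updates (squared-error cost)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        numer, denom = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * numer[k][j] / (denom[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        numer, denom = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * numer[i][k] / (denom[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h

# Toy "spectrogram": two sources with disjoint spectral shapes (made-up values).
V = [[1.0, 0.0, 2.0, 0.0],
     [2.0, 0.0, 4.0, 0.0],
     [0.0, 3.0, 0.0, 1.0],
     [0.0, 6.0, 0.0, 2.0]]

w0, h0 = nmf(V, rank=2, iterations=0)  # random initialization only
w, h = nmf(V, rank=2, iterations=200)
print("initial error:", frobenius_error(V, w0, h0))
print("final error:  ", frobenius_error(V, w, h))
```

In a separation setting, each column of W acts as a spectral template and each row of H as its activation over time; a source estimate is rebuilt from the subset of components assigned to that source.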

A Study on the N-Path SC Tracking Filter using PLL (PLL을 이용한 N-Path SC추적여파기에 관한 연구)

  • Jung, Sung-Hwan;Son, Hyun
    • The Journal of Korean Institute of Communications and Information Sciences / v.8 no.3 / pp.83-90 / 1983
  • The N-path SC tracking filter is studied beyond the audio frequency range. First, the SC filter cell, which determines the overall SC filter characteristics, is analyzed by two methods: the charge equation method and the difference equation method. Second, 4-path and 8-path SC filters consisting only of capacitors and switches are presented. Then, 4-path and 8-path SC tracking filters are constructed from an SC filter block and a PLL block. In the experiment, the maximum response shift is confirmed. Q and gain (dB) are examined with respect to the capacitor ratios and the number of paths. The tracking range is also measured.

Design and Implementation of Scent-Supported Educational Content using Arduino

  • Hye-kyung Kwon;Heesun Kim
    • International journal of advanced smart convergence / v.12 no.4 / pp.260-267 / 2023
  • Due to the development of science and technology in the 4th Industrial Revolution, a variety of content is being developed and utilized through educational courses linked to digital textbooks. Students use smart devices to engage in realistic virtual learning experiences, interacting with the content in digital textbooks. However, while many realistic contents offer visual and auditory effects like 3D VR, AR, and holograms, olfactory content that evokes actual sensations has not yet been introduced. Therefore, in this paper, we designed and implemented 4D educational content by adding the sense of smell to existing content. This implemented content was tested in classrooms through a curriculum-based evaluation. Classes taught with olfactory-enhanced content showed a higher percentage of correct answers compared to those using traditional audio-visual materials, indicating improved understanding.

Design and implementation of Distance Learning System using 3 Dimensional Animation Control Technology (3차원 애니메이션 제어 기술을 활용한 원격교육시스템 설계 및 개발)

  • Im, Choong-Jae
    • Journal of Korea Game Society / v.16 no.3 / pp.109-116 / 2016
  • Distance learning systems, in which the teacher and learners are in remote locations, have typically transmitted video and audio directly. To engage learners, improve educational effectiveness, or cope with poor network conditions, various methods that use computer graphics in distance learning systems have been attempted. This paper describes the design and implementation of a distance learning system using 3D animation control technology based on Kinect and network game technology. The system is a good example of combining education and game technology, and I expect it to be used for a variety of educational content in the future.

A Systematical Design of ADSL POTS Splitter Using Passive Devices (수동 소자를 이용한 ADSL POTS Splitter의 체계적인 설계)

  • 박지만;김진태;소운섭
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.6A / pp.913-919 / 1999
  • A systematic synthesis process is presented for the design of ADSL POTS splitters. Low-pass filters formed with a single-ended inductor, a balanced inductor, and a balanced tightly coupled transformer are considered. These three low-pass filters were simulated, and the simulation results show agreement in their frequency characteristics. POTS splitters using a commercial balanced tightly coupled transformer are therefore designed for ADSL applications. The experimental results show that the POTS splitter in the ADSL system has a ripple of less than $\pm$0.5 dB over the 0.2 kHz to 3.4 kHz range (the audio band) and a delay distortion of less than 130 $\mu\textrm{s}$ over the 0.6 kHz to 3.2 kHz range.
