• Title/Summary/Keyword: 3D Audio


A Study on Immersive Audio Improvement of FTV using an effective noise (유효 잡음을 활용한 FTV 입체음향 개선방안 연구)

  • Kim, Jong-Un;Cho, Hyun-Seok;Lee, Yoon-Bae;Yeo, Sung-Dae;Kim, Seong-Kweon
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.2 / pp.233-238 / 2015
  • In this paper, we propose an immersive audio effect method that uses effective noise to improve engagement in free-viewpoint TV (FTV) service. On a basketball court, we monitored the frequency spectra by acquiring continuous audio data from the players and the referee using shotgun and wireless microphones. By analyzing these spectra, we determined which frequencies are effective when users zoom in. We therefore propose that, when users of the FTV service zoom in toward an object, seemingly unnecessary noise should be utilized rather than removed. This can be useful for implementing immersive audio in FTV.

Digital Watermarking Using Psychoacoustic Model

  • Poomdaeng, S.;Toomnark, S.;Amornraksa, T.
    • Proceedings of the IEEK Conference / 2002.07b / pp.872-875 / 2002
  • A digital watermarking technique that applies a psychoacoustic model to audio signals is proposed in this paper. In the watermarking scheme, a pseudo-random bit stream used as the watermark signal is embedded into audio signals of both speech and music. The strength of the embedded signal is governed by the human auditory system so that the disturbances to the host audio signal are imperceptible to the human ear. The experimental results show that the quality of the watermarked audio signal, in terms of signal-to-noise ratio, can be improved by up to 3.2 dB.
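
The signal-to-noise ratio used as the quality measure above can be illustrated with a minimal sketch. It embeds a pseudo-random bit stream at a fixed strength `alpha` and treats the embedding as noise. The fixed `alpha`, the 440 Hz test tone, and the function names are assumptions made for illustration; the paper's scheme instead sets the strength from a psychoacoustic model, which is not reproduced here.

```python
import math
import random

def embed_watermark(host, bits, alpha):
    """Add a +/-alpha offset to each sample, one watermark bit per sample."""
    return [x + alpha * (1 if b else -1) for x, b in zip(host, bits)]

def snr_db(host, marked):
    """Signal-to-noise ratio in dB, treating the embedded signal as noise."""
    signal = sum(x * x for x in host)
    noise = sum((y - x) ** 2 for x, y in zip(host, marked))
    return 10 * math.log10(signal / noise)

rng = random.Random(0)                      # fixed seed: repeatable bit stream
host = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
bits = [rng.randint(0, 1) for _ in host]    # pseudo-random watermark bits
marked = embed_watermark(host, bits, alpha=0.01)
print(round(snr_db(host, marked), 2))       # about 37 dB for this toy setup
```

Lowering `alpha` raises the SNR but makes the watermark harder to detect, which is the trade-off a psychoacoustic model is meant to manage.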


Multichannel Audio Reproduction Technology based on 10.2ch for UHDTV (UHDTV를 위한 10.2 채널 기반 다채널 오디오 재현 기술)

  • Lee, Tae-Jin;Yoo, Jae-Hyoun;Seo, Jeong-Il;Kang, Kyeong-Ok;Kim, Whan-Woo
    • Journal of Broadcast Engineering / v.17 no.5 / pp.827-837 / 2012
  • As broadcasting environments rapidly go digital, user demand for next-generation broadcasting services that surpass current HDTV service keeps growing. Next-generation broadcasting is progressing from 2D to 3D, from HD to UHD, and from 5.1ch audio to more than 10 channels to deliver high-quality, realistic service. In this paper, we propose a 10.2ch-based multichannel audio reproduction system for UHDTV. The 10.2ch-based system adds two side loudspeakers to enhance surround sound localization, and two height loudspeakers and one ceiling loudspeaker to enhance elevation localization. To evaluate the proposed system, we used the APM (Auditory Process Model) for an objective localization test and also conducted a subjective localization test. In both tests, the proposed system showed performance statistically equivalent to that of a 22.2ch audio system and significantly better than that of a 5.1ch audio system.

Towards Low Complexity Model for Audio Event Detection

  • Saleem, Muhammad;Shah, Syed Muhammad Shehram;Saba, Erum;Pirzada, Nasrullah;Ahmed, Masood
    • International Journal of Computer Science & Network Security / v.22 no.9 / pp.175-182 / 2022
  • In daily life we encounter many types of information, for example multimedia and text. We rely on different types of information for common routines such as watching or reading the news, listening to the radio, and watching videos. However, problems can arise when a specific type of information is required. For example, someone listening to the radio wants to hear jazz, but all the radio channels are playing pop music mixed with advertisements; the listener gets stuck with pop music and gives up searching for jazz. This problem can be solved with an automatic audio classification system. Deep Learning (DL) models could make life easier through audio classification, but they are expensive and difficult to deploy on edge devices such as the Nano BLE Sense or Raspberry Pi because they require large computational resources such as a graphics processing unit (GPU). To address this, we propose a low-complexity DL model for Audio Event Detection (AED). We extracted Mel-spectrograms of dimension 128×431×1 from the audio signals and applied normalization. Three data augmentation methods were applied: frequency masking, time masking, and mixup. We designed a Convolutional Neural Network (CNN) with spatial dropout, batch normalization, and separable 2D convolutions, inspired by VGGnet [1]. We further reduced the model size by applying float16 quantization to the trained model. Experiments were conducted on the updated dataset provided by the Detection and Classification of Acoustic Events and Scenes (DCASE) 2020 challenge. Our model achieved a val_loss of 0.33 and an accuracy of 90.34% with a model size of 132.50 KB.
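
The three augmentation methods named above (frequency masking, time masking, and mixup) can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the function names, the toy 4×6 arrays standing in for the 128×431 Mel-spectrograms, and the fixed mask positions and mixing weight are all assumptions.

```python
def frequency_mask(spec, f0, width):
    """Zero `width` consecutive mel bins (rows) starting at row f0."""
    return [[0.0] * len(row) if f0 <= i < f0 + width else list(row)
            for i, row in enumerate(spec)]

def time_mask(spec, t0, width):
    """Zero `width` consecutive frames (columns) starting at column t0."""
    return [[0.0 if t0 <= j < t0 + width else v for j, v in enumerate(row)]
            for row in spec]

def mixup(spec_a, label_a, spec_b, label_b, lam):
    """Blend two examples and their one-hot labels with weight lam."""
    spec = [[lam * a + (1 - lam) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(spec_a, spec_b)]
    label = [lam * a + (1 - lam) * b for a, b in zip(label_a, label_b)]
    return spec, label

# Toy 4x6 "spectrograms" (rows = mel bins, columns = time frames).
ones = [[1.0] * 6 for _ in range(4)]
twos = [[2.0] * 6 for _ in range(4)]
masked = time_mask(frequency_mask(ones, f0=1, width=2), t0=4, width=2)
mixed, label = mixup(ones, [1.0, 0.0], twos, [0.0, 1.0], lam=0.5)
```

In training pipelines the mask positions and widths are usually drawn at random for each example, and the mixup weight is commonly sampled from a Beta distribution rather than fixed.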

A digital Audio Watermarking Algorithm using 2D Barcode (2차원 바코드를 이용한 오디오 워터마킹 알고리즘)

  • Bae, Kyoung-Yul
    • Journal of Intelligence and Information Systems / v.17 no.2 / pp.97-107 / 2011
  • Copyright infringement is a major issue on the Internet because digital content on the network can be copied and delivered easily, and a copy has the same quality as the original. Copyright owners and content providers therefore want a powerful solution to protect their content. The most popular solution was DRM (digital rights management), which is based on encryption technology and rights control. However, DRM-free services were launched after Steve Jobs, then CEO of Apple, proposed a new music service paradigm without DRM, and DRM has disappeared from the online music market. Although online music services decided not to adopt DRM, copyright owners and content providers are still searching for a way to protect their content. One technology that can replace DRM is digital audio watermarking, which embeds copyright information into the music. In this paper, the author proposes a new audio watermarking algorithm with two approaches. First, the watermark information is generated as a two-dimensional barcode with an error correction code, so the information can recover itself if the errors fall within the error tolerance. Second, the algorithm uses the chirp sequence of CDMA (code division multiple access). Together these make the algorithm robust to several malicious attacks. Among the many 2D barcodes, the QR code, a matrix barcode, can express information more freely than other matrix barcodes. A QR code has double square patterns at three corners that indicate the boundary of the symbol, a feature that makes it suitable for expressing watermark information. That is, because the QR code is a 2D, nonlinear matrix code, it can be modulated onto a spread spectrum and used in the watermarking algorithm.
The proposed algorithm assigns a different spread spectrum sequence to each individual user. When the assigned code sequences are orthogonal, the watermark information of an individual user can be identified from the audio content. The algorithm uses the Walsh code as the orthogonal code. The watermark information is rearranged from the 2D barcode into a 1D sequence and modulated by the Walsh code. The modulated watermark information is embedded into the DCT (discrete cosine transform) domain of the original audio content. For the performance evaluation, the author used three audio samples: "Amazing Grace", "Oh! Carol", and "Take me home country roads". The attacks for the robustness test were MP3 compression, an echo attack, and a subwoofer boost. The MP3 compression was performed with Cool Edit Pro 2.0, using CBR (constant bit rate) 128 kbps, 44,100 Hz, stereo. The echo attack used an initial volume of 70%, a decay of 75%, and a delay of 100 msec. The subwoofer boost attack modified the low-frequency part of the Fourier coefficients. The test results showed that the proposed algorithm is robust to these attacks. Under the MP3 attack, the strength of the watermark information was not affected, and the watermark could be detected in all of the audio samples. Under the subwoofer boost attack, the watermark was detected when the strength was 0.3. Under the echo attack, the watermark could be identified when the strength was greater than or equal to 0.5.
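
The reason orthogonal Walsh codes let each user's watermark be identified can be shown with a small sketch. It builds Walsh-Hadamard codes, spreads two users' bit sequences, sums them, and recovers each sequence by correlation. The function names and bit patterns are hypothetical, and the sketch leaves out the 2D barcode generation and the DCT-domain embedding described in the abstract.

```python
def walsh_codes(order):
    """Return the 2**order Walsh-Hadamard codes as lists of +1/-1."""
    codes = [[1]]
    for _ in range(order):
        # Sylvester construction: [[H, H], [H, -H]]
        codes = ([row + row for row in codes]
                 + [row + [-v for v in row] for row in codes])
    return codes

def spread(bits, code):
    """Map each bit to +/-1 and multiply it by every chip of the code."""
    return [(1 if b else -1) * c for b in bits for c in code]

def despread(signal, code):
    """Correlate each code-length chunk with the code to recover the bits."""
    n = len(code)
    bits = []
    for i in range(0, len(signal), n):
        corr = sum(s * c for s, c in zip(signal[i:i + n], code))
        bits.append(1 if corr > 0 else 0)
    return bits

codes = walsh_codes(3)                       # eight codes of length 8
user_a, user_b = codes[1], codes[2]          # one code per user
bits_a, bits_b = [1, 0, 1, 1], [0, 0, 1, 0]  # hypothetical watermark bits
# Sum of both users' spread watermarks, as if carried by the same content.
mixed = [a + b for a, b in zip(spread(bits_a, user_a), spread(bits_b, user_b))]
print(despread(mixed, user_a), despread(mixed, user_b))
# prints [1, 0, 1, 1] [0, 0, 1, 0]
```

Because distinct Walsh codes have zero cross-correlation, the other user's contribution cancels in each correlation sum, so only the target user's bits remain.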

A Web-based 3D Virtual Reality Pavilion of Korean Traditional Music (웹 기반의 가상현실 3D 국악 박물관 제작)

  • Choi, Ji Ae;Shim, Jae Sun;Kim, Yoon Sang
    • Journal of Korea Society of Digital Industry and Information Management / v.4 no.1 / pp.65-68 / 2008
  • In this paper, a web-based 3D virtual reality (VR) pavilion of Korean Traditional Music was implemented. The VR pavilion is used for virtual demonstration and experience of Korean Traditional Music, and provides users with information and a multimedia experience on eight instruments over the Internet. It provides eight web pages and one audio-visual classroom on the instruments.

A DNN-Based Personalized HRTF Estimation Method for 3D Immersive Audio

  • Son, Ji Su;Choi, Seung Ho
    • International Journal of Internet, Broadcasting and Communication / v.13 no.1 / pp.161-167 / 2021
  • This paper proposes a new personalized HRTF estimation method based on a deep neural network (DNN) model, with improved elevation reproduction using a notch filter. A previous study proposed a DNN model that estimates the magnitude of the HRTF from anthropometric measurements [1]. However, because that method uses zero phase without estimating the phase, it causes internalization (i.e., inside-the-head localization) of sound when listening to spatial sound. We devise a method to estimate both the magnitude and phase of the HRTF based on the DNN model. A personalized HRIR was estimated using anthropometric measurements, including detailed data on the head, torso, shoulders, and ears, as inputs to the DNN model. The estimated HRIR was then filtered with an appropriate notch filter to improve elevation reproduction. To evaluate performance, both objective and subjective evaluations were conducted. For the objective evaluation, the root mean square error (RMSE) and the log spectral distance (LSD) between the reference HRTF and the estimated HRTF were measured. For the subjective evaluation, a MUSHRA test and a preference test were conducted. The results indicate that the proposed method gives listeners a more immersive audio experience than previous methods.
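
The two objective measures named above can be sketched as follows. The abstract does not give the exact formulas, so this assumes one common definition of the log spectral distance (the RMS of the per-bin difference in dB); the function names and sample magnitudes are hypothetical.

```python
import math

def rmse(ref, est):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref))

def log_spectral_distance(ref_mag, est_mag):
    """LSD in dB: RMS of the per-bin log-magnitude difference."""
    diffs = [20 * math.log10(r / e) for r, e in zip(ref_mag, est_mag)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

ref = [1.0, 0.5, 0.25, 0.125]   # hypothetical reference HRTF magnitudes
est = [1.0, 0.5, 0.5, 0.125]    # estimate that is about 6 dB off in one bin
print(rmse(ref, est))                              # 0.125
print(round(log_spectral_distance(ref, est), 2))   # 3.01
```

RMSE weights errors on a linear scale, while LSD weights them on a logarithmic scale, so the two measures can rank the same pair of estimates differently.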

Design of AOD System for MP3 Copyright Protection (MP3 저작권 보호를 위한 AOD 시스템의 설계)

  • Kim, Yeong-Jun;Kim, Tae-Yun
    • The KIPS Transactions:PartD / v.9D no.2 / pp.323-328 / 2002
  • In recent years, e-commerce has become very active on the Internet, especially the World Wide Web, with the popularization of Internet access over high-speed networks. In particular, the distribution of multimedia content such as MP3 data is receiving wide attention as a research topic. However, existing models of AOD (Audio On Demand) systems lack substantial illegal copy protection and copyright protection. In this paper, we propose an AOD system that guarantees substantial illegal copy protection and copyright protection based on PKI (Public Key Infrastructure). By transmitting MP3 data using the user's public key, the proposed method prevents eavesdropping attacks during data transmission. It also guarantees the rights of users and distributors by prohibiting illegal users from using MP3 data.

Implementation and Performance Measurement of Personal Media Gateway for Applications over BcN Networks (BcN용 미디어 프로세서형 단말(PMG)의 구현 및 성능시험)

  • Jang, Seong-Hwan;Yang, Soo-Kyung;Cha, Young;Choi, Woo-Suk;Son, Seok-Bae;Kim, Jung-Joon
    • 한국정보통신설비학회:학술대회논문집 / 2005.08a / pp.329-332 / 2005
  • In this paper, we describe the implementation of a personal media gateway (PMG) for applications over BcN networks. The PMG is a TV-based set-top terminal that enables transmission of Full D1 high-quality video and audio at a maximum speed of 2 Mbps. It supports SIP and QoS for BcN networks. The PMG hardware consists of a host module, an audio/video codec processing module, a DTMF module, and a remote control I/O module. H.263 and MPEG4 software are implemented on a DSP as codecs for bi-directional communication and streaming, respectively. G.711 and Ogg-Vorbis are implemented as audio codecs. We examined the video quality using the Video Quality Test Equipment developed by KT Convergence Lab. The experimental results show a video quality of MOS 4.1 and an audio quality of MOS 4.3. We expect that the PMG will enable promising business models and create new customer value.
