• Title/Summary/Keyword: Audio Data

Search Result 879, Processing Time 0.046 seconds

An User Controllable Object Audio File Format and Audio Scene Description (사용자 기반 실감 객체 오디오 파일 포맷 및 오디오 장면 묘사 기법)

  • Cho, Choong-Sang;Kim, Je-Woo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.47 no.5
    • /
    • pp.25-33
    • /
    • 2010
  • Multi-media service has been changed into user based audio services, which service supports actively user's preference and interaction with the users. In the market, multi-media products which can support the highest audio-quality by using lossless audio technology have been released and object audio music which user can select the objects has been serviced. In this paper, we design user's preference information based object audio file format and audio scene description for storage and transmission media. The designed file format is designed based on MPEG-4 file format because high-quality audio codecs in MPEG-4 audio can be easily used and the track of file format can be flexibly controlled depend on the number of the instrument in music. The encoded audio data of each objects and encoded audio scene description by binary encoding that has independent track are packed in a file. The scene description for storage media is consist of full and object scene description, the scene description for transmission media has an essential description for object audio operation and a specific description for real audio sound. The designed file format based simulator is developed and it generates an object audio file with several scene descriptions. Also, the real audio sound is serviced by the interaction with user and the unpacked scene description.

Energy-Aware Data-Preprocessing Scheme for Efficient Audio Deep Learning in Solar-Powered IoT Edge Computing Environments (태양 에너지 수집형 IoT 엣지 컴퓨팅 환경에서 효율적인 오디오 딥러닝을 위한 에너지 적응형 데이터 전처리 기법)

  • Yeontae Yoo;Dong Kun Noh
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.18 no.4
    • /
    • pp.159-164
    • /
    • 2023
  • Solar energy harvesting IoT devices prioritize maximizing the utilization of collected energy due to the periodic recharging nature of solar energy, rather than minimizing energy consumption. Meanwhile, research on edge AI, which performs machine learning near the data source instead of the cloud, is actively conducted for reasons such as data confidentiality and privacy, response time, and cost. One such research area involves performing various audio AI applications using audio data collected from multiple IoT devices in an IoT edge computing environment. However, in most studies, IoT devices only perform sensing data transmission to the edge server, and all processes, including data preprocessing, are performed on the edge server. In this case, it not only leads to overload issues on the edge server but also causes network congestion by transmitting unnecessary data for learning. On the other way, if data preprocessing is delegated to each IoT device to address this issue, it leads to another problem of increased blackout time due to energy shortages in the devices. In this paper, we aim to alleviate the problem of increased blackout time in devices while mitigating issues in server-centric edge AI environments by determining where the data preprocessed based on the energy state of each IoT device. In the proposed method, IoT devices only perform the preprocessing process, which includes sound discrimination and noise removal, and transmit to the server if there is more energy available than the energy threshold required for the basic operation of the device.

A design of dual AC-3 and MPEG-2 audio decoder (AC-3와 MPEG-2 오디오 공용 복호화기의 설계)

  • Ko, Woo-Suk;Yoo, Sun-Kook;Park, Sung-Wook;Jung, Nam-Hoon;Kim, Joon-Seok;Lee, Keun-Sup;Youn, Dae-Hee
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.23 no.6
    • /
    • pp.1433-1442
    • /
    • 1998
  • The thesis presents a dual audio decoder which can decode both AC-3 and MPEG-2 bitstream. The MPEG-2 synthesis processi s optimized via FFT to establish the common data path with AC-'3s. A dual audio decoder consists of a DSP core which performs the control-intensive part of each algorithm and a common synthesis filter which perfomrs the computation-intensive part. All the components of the dual audio decoder have been described in VHDL and simulated with a SYNOPSYS tool. The software modeling of the DSP core was used for functional validation. After being synthesized using 0.6 .mu.m-3ML technology standard cell, the dual audio decoder was simulated at gate-level with a COMPASS tool for hardware validation.

  • PDF

A Design of Multi-Format Audio Decoder (복수 포멧 지원 오디오 복호화기 설계)

  • Park, Sung-Wook
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.4
    • /
    • pp.477-482
    • /
    • 2007
  • This paper presents an audio decoder architecture which can decode AC-3 and MPEG-2 audio bit-streams efficiently. MPEG-2 synthesis filtering is modified by the 32-point FFT to share the common data path with the AC-3's. A programmable Audio DSP core and a hardwired common synthesis tilter are incorporated for effective decoding of two different formats.

A Synchronization Scheme Based on Moving Average for Robust Audio Watermarking

  • Zhang, Jinquan;Han, Bin
    • Journal of Information Processing Systems
    • /
    • v.15 no.2
    • /
    • pp.271-287
    • /
    • 2019
  • The synchronization scheme based on moving average is robust and suitable for the same rule to be adopted in embedding watermark and synchronization code, but the imperceptibility and search efficiency is seldom reported. The study aims to improve the original scheme for robust audio watermarking. Firstly, the survival of the algorithm from desynchronization attacks is improved. Secondly, the scheme is improved in inaudibility. Objective difference grade (ODG) of the marked audio is significantly changed. Thirdly, the imperceptibility of the scheme is analyzed and the derived result is close to experimental result. Fourthly, the selection of parameters is optimized based on experimental data. Fifthly, the search efficiency of the scheme is compared with those of other synchronization code schemes. The experimental results show that the proposed watermarking scheme allows the high audio quality and is robust to common attacks such as additive white Gaussian noise, requantization, resampling, low-pass filtering, random cropping, MP3 compression, jitter attack, and time scale modification. Moreover, the algorithm has the high search efficiency and low false alarm rate.

Sound Enhancement of low Sample rate Audio Using LMS in DWT Domain (DWT영역에서 LMS를 이용한 저 샘플링 비율 오디오 신호의 음질 향상)

  • 백수진;윤원중;박규식
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.1
    • /
    • pp.54-60
    • /
    • 2004
  • In order to mitigate the problems in storage space and network bandwidth for the full CD quality audio, current digital audio is always restricted by sampling rate and bandwidth. This restriction normally results in low sample rate audio or calls for the data compression scheme such as MP3. However, they can only reproduce a lower frequency range than a regular CD quality because of the Nyquist sampling theory. Consequently they lose rich spatial information embedded in high frequency. The propose of this paper is to propose efficient high frequency enhancement of low sample rate audio using n adaptive filtering and DWT analysis and synthesis. The proposed algorithm uses the LMS adaptive algorithm to estimate the missing high frequency contents in DWT domain and it then reconstructs the spectrally enhanced audio by using the DWT synthesis procedure. Several experiments with real speech and audio are performed and compared with other algorithm. From the experimental results of spectrogram and sonic test, we confirm that the proposed algorithm outperforms the other algorithm and reasonably works well for the most of audio cases.

An Advanced Coding for Video Streaming System: Hardware and Software Video Coding

  • Le, Tuan Thanh;Ryu, Eun-Seok
    • Journal of Internet Computing and Services
    • /
    • v.21 no.4
    • /
    • pp.51-57
    • /
    • 2020
  • Currently, High-efficient video coding (HEVC) has become the most promising video coding technology. However, the implementation of HEVC in video streaming systems is restricted by factors such as cost, design complexity, and compatibility with existing systems. While HEVC is considering deploying to various systems with different reached methods, H264/AVC can be one of the best choices for current video streaming systems. This paper presents an adaptive method for manipulating video streams using video coding on an integrated circuit (IC) designed with a private network processor. The proposed system allows to transfer multimedia data from cameras or other video sources to client. For this work, a series of video or audio packages from the video source are forwarded to the designed IC via HDMI cable, called Tx transmitter. The Tx processes input data into a real-time stream using its own protocol according to the Real-Time Transmission Protocol for both video and audio, then Tx transmits output packages to the video client though internet. The client includes hardware or software video/audio decoders to decode the received packages. Tx uses H264/AVC or HEVC video coding to encode video data, and its audio coding is PCM format. By handling the message exchanges between Tx and the client, the transmitted session can be set up quickly. Output results show that transmission's throughput can be achieved about 50 Mbps with approximately 80 msec latency.

Implementation of an Audio Broadcasting Service over the Internet (인터넷상의 실시간 오디오 방송 서비스 구현)

  • 박준석;고대식
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.23 no.6
    • /
    • pp.1496-1502
    • /
    • 1998
  • In this paper, a real-time audio broadcasting service system which is robust to loaded traffic on the Internet is developed. For implementing reliable real-time data transfer, the transfer characteristics of TCP/IP and UDP/IP was compared and analyzed. For lost packet recovery, redundant audio data algorithm was used and interleaving technique was applied for scattering consecutive packet loss. Test results showed, when using TCP/IP, pause occurred during playback, and when using UDP/IP, a stable receive rate was noticeable but the quality of the sound was lower than that of uisng TCP/IP. The recovery rate using redundant audio data and interleaving technique is shown in Fig. 9 and the delay is shown in Fig 4.

  • PDF

XCRAB : A Content and Annotation-based Multimedia Indexing and Retrieval System (XCRAB :내용 및 주석 기반의 멀티미디어 인덱싱과 검색 시스템)

  • Lee, Soo-Chelo;Rho, Seung-Min;Hwang, Een-Jun
    • The KIPS Transactions:PartB
    • /
    • v.11B no.5
    • /
    • pp.587-596
    • /
    • 2004
  • During recent years, a new framework, which aims to bring a unified and global approach in indexing, browsing and querying various digital multimedia data such as audio, video and image has been developed. This new system partitions each media stream into smaller units based on actual physical events. These physical events within oath media stream can then be effectively indexed for retrieval. In this paper, we present a new approach that exploits audio, image and video features to segment and analyze the audio-visual data. Integration of audio and visual analysis can overcome the weakness of previous approach that was based on the image or video analysis only. We Implement a web-based multi media data retrieval system called XCRAB and report on its experiment result.

Audio Source Separation Method based on Beamspace-domain Multichannel Non-negative Matrix Factorization, Part II: A Study on the Beamspace Transform Algorithms (빔공간-영역 다채널 비음수 행렬 분해 알고리즘을 이용한 음원 분리 기법 Part II: 빔공간-변환 기법에 대한 고찰)

  • Lee, Seok-Jin;Park, Sang-Ha;Sung, Koeng-Mo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.31 no.5
    • /
    • pp.332-339
    • /
    • 2012
  • Beamspace transform algorithm transforms spatial-domain data - such as x, y, z dimension - into incidence-angle-domain data, which is called beamspace-domain data. The beamspace transform method is generally used in source localization and tracking, and adaptive beamforming problem. When the beamspace transform method is used in multichannel audio source separation, the inverse beamspace transform is also important because the source image have to be reconstructed. This paper studies the beamspace transform and inverse transform algorithms for multichannel audio source separation system, especially for the beamspace-domain multichannel NMF algorithm.