• Title/Summary/Keyword: Audio Data


A Fast IFFT Algorithm for IMDCT of AAC Decoder (AAC 디코더의 IMDCT를 위한 고속 IFFT 알고리즘)

  • Chi, Hua-Jun;Kim, Tae-Hoon;Park, Ju-Sung
    • The Journal of the Acoustical Society of Korea / v.26 no.5 / pp.214-219 / 2007
  • This paper proposes a new IFFT (Inverse Fast Fourier Transform) algorithm suited to the IMDCT (Inverse Modified Discrete Cosine Transform) of an MPEG-2 AAC (Advanced Audio Coding) decoder. The $2^n$ (N-point) type IMDCT is the most powerful among many IMDCT algorithms; however, it includes an IFFT that requires many calculation cycles. The IFFT used in the $2^n$ (N-point) type IMDCT employs a bit-reversed arrangement of the input data and an N/4-point complex IFFT to reduce the calculation cycles. We devised a new data arrangement method for the IFFT input and an $N/4^{n+1}$-type IFFT, and thus reduce multiplication cycles, addition cycles, and ROM size.
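
The bit-reversed input arrangement mentioned in the abstract can be illustrated with a minimal Python sketch. This shows only the conventional bit-reversal permutation used by radix-2 FFT/IFFT implementations, not the new arrangement the paper proposes; the function names `bit_reverse_indices` and `bit_reverse_permute` are hypothetical.

```python
def bit_reverse_indices(n):
    """Return the bit-reversed index order for a power-of-two length n."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1          # number of index bits, log2(n)
    order = []
    for i in range(n):
        rev = 0
        for b in range(bits):
            if i & (1 << b):           # bit b of i becomes bit (bits-1-b) of rev
                rev |= 1 << (bits - 1 - b)
        order.append(rev)
    return order


def bit_reverse_permute(data):
    """Reorder a sequence so that element i is taken from its bit-reversed index."""
    return [data[j] for j in bit_reverse_indices(len(data))]


print(bit_reverse_indices(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

A conventional implementation either computes this order at run time or stores it as a lookup table, which is one reason the input arrangement affects both cycle count and ROM size.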

Effect on Audio Play Latency for Real-Time HMD-Based Headphone Listening (HMD를 이용한 오디오 재생 기술에서 Latency의 영향 분석)

  • Son, Sangmo;Jo, Hyun;Kim, Sunmin
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2014.10a / pp.141-145 / 2014
  • The acceptable time delay of audio data processing is investigated for rendering virtual sound source directions in a real-time head-tracking environment under headphone listening. An angular mismatch of less than 3.7 degrees should be maintained in order to keep the desired sound source directions virtually fixed while listeners rotate their heads in the horizontal plane. The angular mismatch is proportional to the speed of head rotation and the data processing delay. For a head rotation of 20 degrees/s, which is a relatively slow head movement, a total data processing delay of less than 63 ms should be considered.


Implementation and evaluation of lost packet recovery using low-bitrate redundant audio data (저비트율 잉여오디오 정보를 이용한 손실 패킷 복구 방법의 구현 및 성능 평가)

  • 박준석;고대식
    • Journal of the Korean Institute of Telematics and Electronics S / v.35S no.7 / pp.1-5 / 1998
  • In this paper, a recovery method using a high-bitrate coder and a low-bitrate coder was implemented in order to recover consecutive packet losses over the Internet. LPC was used as the redundant audio data for recovery of lost packets, and the RTP packet format was modified to accommodate the redundant data. Measurements using a random packet loss rate with three units of redundant data in every packet showed a recovery rate of 80% at a loss rate of 50%. Since the processing delay for recovery of a lost packet was 200 ms, this recovery method can be applied to real-time Internet services such as Internet phone.
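
The redundancy scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that packet k carries its own high-bitrate frame plus low-bitrate copies of the three preceding frames, and the names `classify_frames` and `recovery_rate` are hypothetical.

```python
def classify_frames(received, redundancy=3):
    """Classify each frame given which packets arrived.

    received[i] is True if packet i arrived. Assumption: packet k also
    carries low-bitrate copies of frames k-1 .. k-redundancy.
    """
    n = len(received)
    status = []
    for i in range(n):
        if received[i]:
            status.append("primary")        # full-quality frame arrived
        elif any(received[j] for j in range(i + 1, min(i + redundancy + 1, n))):
            status.append("recovered")      # rebuilt from a later packet's copy
        else:
            status.append("lost")           # no surviving copy
    return status


def recovery_rate(status):
    """Fraction of missing primary frames rebuilt from redundant copies."""
    missing = [s for s in status if s != "primary"]
    if not missing:
        return 1.0
    return missing.count("recovered") / len(missing)


arrived = [True, False, False, True, False, False, False, False, True]
print(classify_frames(arrived))
```

In this example a run of two losses is fully recovered, while the first frame of a run of four losses is not, because no packet within three positions after it arrived. Waiting for later packets is also what introduces the recovery delay the abstract mentions.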


Korean Emotional Speech and Facial Expression Database for Emotional Audio-Visual Speech Generation (대화 영상 생성을 위한 한국어 감정음성 및 얼굴 표정 데이터베이스)

  • Baek, Ji-Young;Kim, Sera;Lee, Seok-Pil
    • Journal of Internet Computing and Services / v.23 no.2 / pp.71-77 / 2022
  • In this paper, a database is collected for extending a speech synthesis model to a model that synthesizes speech according to emotion and generates facial expressions. The database is divided into male and female data, and consists of emotional speech and facial expressions. Two professional actors of different genders speak sentences in Korean. The sentences are divided into four emotions: happiness, sadness, anger, and neutrality. Each actor performs about 3,300 sentences per emotion. The 26,468 sentences collected in total by filming these performances do not overlap and contain expressions similar to the corresponding emotion. Since building a high-quality database is important for the performance of future research, the database is assessed on emotional category, intensity, and genuineness. In order to determine the accuracy according to the modality of the data, the database is divided into audio-video data, audio data, and video data.

Video Summarization Using Eye Tracking and Electroencephalogram (EEG) Data (시선추적-뇌파 기반의 비디오 요약 생성 방안 연구)

  • Kim, Hyun-Hee;Kim, Yong-Ho
    • Journal of the Korean Society for Library and Information Science / v.56 no.1 / pp.95-117 / 2022
  • This study developed and evaluated audio-visual (AV) semantics-based video summarization methods using eye tracking and electroencephalography (EEG) data. Twenty-seven university students participated in the eye tracking and EEG experiments. The evaluation results showed that the average recall rate (0.73) of using both EEG and pupil diameter data to construct a video summary was higher than that of using EEG data alone (0.50) or pupil diameter data alone (0.68). In addition, this study reported the reasons why the average recall (0.57) of the AV semantics-based personalized video summaries was lower than that (0.69) of the AV semantics-based generic video summaries. The differences and characteristics of the AV semantics-based and the text semantics-based video summarization methods were compared and analyzed.

Efficient DSP Architecture For High-Quality Audio Algorithms (고음질 오디오 알고리즘을 위한 효율적인 DSP 설계)

  • Moon, Jong-Ha;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.5 / pp.112-117 / 2007
  • This paper presents specialized DSP instructions and their hardware architecture for audio coding algorithms such as MPEG-2/4 Advanced Audio Coding (AAC), Dolby AC-3, and MPEG-2 Backward Compatible (BC). The proposed architecture is specially designed and optimized for the MDCT/IMDCT (Inverse Modified Discrete Cosine Transform) and the Huffman decoding of the AAC decoding algorithm. Performance comparisons show a significant improvement over the TMS320C62x and ADSP-21060 for the MDCT/IMDCT computation. In addition, the dedicated Huffman decoding accelerator performs decoding and operand preparation in only one cycle. The proposed DPU (Data Processing Unit) consists of 107,860 gates and achieves 150 MIPS.

The Android-based Bluetooth Device Application Design and Implementation (안드로이드 기반의 블루투스 디바이스 응용 설계 및 구현)

  • Cho, Hyo-Sung;Lee, Hyuk-Joon
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.11 no.1 / pp.72-85 / 2012
  • Today, most Bluetooth hands-free devices in vehicles provide telephone service functions such as voice communication, caller ID display, and SMS message display, but they do not provide a function that displays Internet-based text data. Because demand for Internet services in vehicles has been increasing, a scheme is needed that displays Internet-based text data in addition to the existing hands-free functions. The proposed Bluetooth device application includes advanced functions such as SNS message arrival notification and message display. We chose Android as the implementation platform, considering that most SNS applications run on Android and that the platform is easily embedded into small devices. The smartphone or tablet PC connected to the proposed Bluetooth device is an Android-based device, and we designed an Android app to implement the functions on it. When the audio-text gateway app receives SNS text data, it extracts the title and sender information from the message header as text data and sends them via an ACL (Asynchronous Connection-Less) link to the Bluetooth device, which shows the data on its screen. Android-based Bluetooth devices cannot play voice through a speaker because the Bluetooth hands-free or headset profile ported to the Android platform normally includes only the audio gateway's function. The proposed Bluetooth device application therefore applies a streaming scheme that sends data via the ACL link instead of via an SCO (Synchronous Connection-Oriented) link.

The Content Based Analysis According to the Composition of the Feature Parameters for the Auditory Data (오디오 데이터의 특징 파라메터 구성에 따른 내용기반 분석)

  • 한학용;허강인;김수훈
    • The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.182-189 / 2002
  • In this paper, we study content-based analysis and classification according to the composition of the feature parameter pool for auditory signals, in order to implement an auditory indexing and searching system. Auditory data is classified into various primitive auditory types. We describe the analysis and feature extraction methods for the feature parameters available for auditory data classification. We then compose the feature parameter pool in units of indexing groups, and compare and analyze the auditory data with a focus on the inclusion level and the indexing criterion for the audio categories. Based on this result, we compose the classification procedure and simulate the auditory data classification.

A VLSI Design of CD Signal Processor for High-Speed CD-ROM

  • Kim, Jae-Won;Kim, Jae-Seok;Lee, Jaeshin
    • Proceedings of the IEEK Conference / 2002.07b / pp.1296-1299 / 2002
  • We implemented a CD signal processor for a CAV 48-speed CD-ROM drive as a VLSI chip. The CD signal processor is a mixed-mode monolithic IC including a servo processor, data recovery, a data processor, and a 1-bit DAC. For servo signal processing, we included a DSP core, while for CAV-mode playback we adopted a PLL with a wide recovery range. The data processor (DP) was designed to meet the Yellow Book specification [2]. The DP block therefore consists of an EFM demodulator, a C1/C2 ECC block, an audio processor, and a block transferring data to an ATAPI chip. A modified Euclid's algorithm was used as the key equation solver for the ECC block. To achieve high-speed decoding, the RS decoder operates in a pipelined manner. Audio playability is increased by playing a CD-DA disc at a speed of 12X or 16X. For this, subcode sync and data are processed in the same way as the main data. The overall performance of the IC is verified by measuring the transfer rate from the innermost area of the disc to the outermost area. At 48-speed, the operating frequency is 210 MHz, and the chip is fabricated with the 0.35 µm STD90 cell library of Samsung Electronics.
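
Why the transfer rate is measured from the innermost to the outermost area can be seen from a rough Python estimate. In CAV mode the disc spins at a constant angular velocity, so the data rate grows with the radius. The sketch below assumes that the 48-speed rating applies at the outer edge, that the data area spans radii of about 25 mm to 58 mm, and that data is read as Mode 1 sectors (2048 user bytes, 75 sectors per second at 1X); none of these figures come from the paper, and `cav_transfer_rate` is a hypothetical helper.

```python
SECTORS_PER_SECOND_1X = 75   # CD sector rate at 1X speed
MODE1_USER_BYTES = 2048      # user data bytes per CD-ROM Mode 1 sector


def cav_transfer_rate(radius_mm, max_speed=48, outer_radius_mm=58.0):
    """Estimate user-data bytes per second at a given radius on a CAV drive.

    Assumption: the rated max_speed is reached at outer_radius_mm and the
    effective speed scales linearly with radius.
    """
    speed = max_speed * radius_mm / outer_radius_mm
    return speed * SECTORS_PER_SECOND_1X * MODE1_USER_BYTES


inner = cav_transfer_rate(25.0)   # assumed innermost data radius
outer = cav_transfer_rate(58.0)   # assumed outermost data radius
print(f"inner: {inner / 1e6:.2f} MB/s, outer: {outer / 1e6:.2f} MB/s")
```

Under these assumptions the rate at the inner radius is well under half of the rate at the outer edge, which is why a PLL with a wide recovery range is needed for CAV playback.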


VLSI Design of a 2048 Point FFT/IFFT by Sequential Data Processing for Digital Audio Broadcasting System (순차적 데이터 처리방식을 이용한 디지틀 오디오 방송용 2048 Point FFT/IFFT의 VLSI 설계)

  • Choe, Jun-Rim
    • Journal of the Institute of Electronics Engineers of Korea SD / v.39 no.5 / pp.65-73 / 2002
  • In this paper, we propose and verify an implementation method for a single-chip 2048-point complex FFT/IFFT based on sequential data processing. Sequential processing of 2048 complex data requires buffers to store the input data, so a DRAM-like pipelined commutator architecture is used as the buffer. With this design method, the proposed structure reduces the chip size by 60% compared with the conventional approach. The 16-point FFT is the basic building block of the entire FFT chip, and the 2048-point FFT consists of cascaded blocks with five stages of radix-4 and one stage of radix-2. Since each stage requires rounding of the resulting bits while maintaining the proper S/N ratio, the convergent block floating point (CBFP) algorithm is used for effective internal bit rounding, and this method contributed to a single-chip design of a digital audio broadcasting system.
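
The stage structure and the idea behind block floating point can be sketched in Python. This is a generic block floating point scaler, in which one block of values shares a single exponent. It is not the CBFP algorithm of the paper, whose details the abstract does not give. The names `STAGE_RADICES` and `block_scale` are hypothetical.

```python
import math

# Five radix-4 stages followed by one radix-2 stage give 2048 points.
STAGE_RADICES = [4, 4, 4, 4, 4, 2]
assert math.prod(STAGE_RADICES) == 2048


def block_scale(values, word_bits):
    """Fit a block of integers into signed word_bits words with one shared shift.

    Returns (mantissas, shift) such that value is approximately mantissa << shift.
    """
    limit = (1 << (word_bits - 1)) - 1       # largest positive mantissa
    peak = max(abs(v) for v in values)
    shift = 0
    while (peak >> shift) > limit:           # smallest shift that fits the peak
        shift += 1
    if shift == 0:
        return list(values), 0
    half = 1 << (shift - 1)                  # add half an LSB to round to nearest
    mantissas = [min(limit, (v + half) >> shift) for v in values]
    return mantissas, shift


mantissas, shift = block_scale([1000, -300, 45, 7], word_bits=8)
print(mantissas, shift)  # [125, -37, 6, 1] 3
```

Sharing one exponent per block keeps the datapath at a fixed word width after each stage, while small values in a block with a large peak lose precision, which is the S/N trade-off the abstract refers to.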