• Title/Summary/Keyword: Audio Codec

Search Result 96, Processing Time 0.025 seconds

Novel harmonic coding method for parametric audio codec (하모닉 보상방법에 기반한 파라메트릭 코덱 구현에 관한 연구)

  • Jeong, Jong-Hoon;Lee, Nam-Suk;Lee, Geon-Hyoung
    • Proceedings of the KIEE Conference / 2008.10b / pp.143-144 / 2008
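  • This paper describes a way to improve the compression ratio of audio signals by exploiting harmonic properties during compression. Harmonic coding takes advantage of the complex-tone nature of audio signals: components appear at integer multiples of a frequency in the frequency domain and, because the components are sinusoidal, temporally adjacent signals are highly similar, which improves compression efficiency. In real audio signals, however, the harmonic stretch of instruments, distortion introduced during transmission, and external noise make inaccurate harmonic decisions likely when the captured signal is analyzed, and this causes severe degradation of sound quality during compression. This paper therefore describes a signal processing method that predicts harmonic changes by judging the trend of change between frames and transmits a compensation value for the prediction error, enabling stable compression and reconstruction of audio signals.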
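
The predict-then-compensate idea in the abstract above can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' method: the linear extrapolation in `predict_f0`, the uniform quantizer step `step`, and all function names are hypothetical choices made only to show how a frame-to-frame prediction plus a transmitted compensation value keeps the decoder in step with the encoder.

```python
def predict_f0(history):
    """Extrapolate the next fundamental frequency from the frame-to-frame trend.

    Assumption: a simple linear extrapolation over the last two decoded frames.
    """
    if len(history) < 2:
        return history[-1]
    return 2 * history[-1] - history[-2]


def encode(f0_track, step=0.5):
    """Return the first f0 and one quantized compensation index per later frame."""
    decoded = [f0_track[0]]          # the encoder mirrors the decoder state
    indices = []
    for f0 in f0_track[1:]:
        pred = predict_f0(decoded)
        idx = round((f0 - pred) / step)   # compensation for the prediction error
        indices.append(idx)
        decoded.append(pred + idx * step)
    return f0_track[0], indices


def decode(first_f0, indices, step=0.5):
    """Rebuild the f0 track from the first value and the compensation indices."""
    decoded = [first_f0]
    for idx in indices:
        decoded.append(predict_f0(decoded) + idx * step)
    return decoded


def harmonic_frequencies(f0, count):
    """Ideal harmonics at integer multiples of f0 (ignores harmonic stretch)."""
    return [k * f0 for k in range(1, count + 1)]


if __name__ == "__main__":
    track = [220.0, 221.0, 222.3, 223.1, 222.0]
    first, idx = encode(track)
    print(idx, decode(first, idx))
```

Because the encoder predicts from the decoded history rather than from the original values, the quantization error in this sketch stays within half a step per frame instead of accumulating.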


Trends of Speech-Based Audio Convergence Codec Technology (음성기반 오디오 융합코덱 기술동향)

  • Kim, D.Y.;Sung, J.M.;Lee, M.S.;Bae, H.J.;Lee, B.S.
    • Electronics and Telecommunications Trends / v.24 no.5 / pp.10-19 / 2009
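  • This paper describes technology trends in speech-based audio convergence codecs, which aim to minimize the number of codecs as communication and broadcasting services are combined in a single device or terminal and as the devices inside the terminal are integrated. However, no technical model or technique has yet been developed that can truly merge speech and audio codecs, which have entirely different technical origins. As part of such efforts, this article analyzes technology trends with a focus on the embedded variable-bit-rate codec technology being pursued mainly in ITU-T SG16, which applies progressive bandwidth extension to a speech-based codec so that it can cover wideband speech, super-wideband, and eventually the full audio band.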

Near-field Noise-emission Modeling for Monitoring Multimedia Operations in Mobile Devices

  • Song, Eakhwan;Choi, Jieun;Lee, Young-Jun
    • IEIE Transactions on Smart Processing and Computing / v.5 no.6 / pp.440-444 / 2016
  • In this paper, an equivalent circuit model for near-field noise emission is proposed to implement a multimedia operation-monitoring system for mobile devices. The proposed model includes a magnetic field probe that captures noise emissions from multimedia components, and a transfer function for near-field noise coupling from a transmission line source to a magnetic field probe. The proposed model was empirically verified with transfer function measurements of near-field noise emissions from 10 kHz to 500 MHz. With the proposed model, a magnetic field probe was optimally designed for noise measurement on a camera module and an audio codec in a mobile device. It was demonstrated that the probe successfully captured the near-field noise emissions, which depend on the operating conditions of the multimedia components, with enhanced sensitivity compared with a conventional reference probe.

An effective video multiplexing method for the DMB multimedia services (DMB 멀티미디어 서비스를 위한 효율적인 비디오 다중화 방식)

  • 나남웅;백선혜;홍성훈
    • Proceedings of the IEEK Conference / 2003.11a / pp.267-270 / 2003
  • DMB, recently standardized in Korea, is a Eureka-147 DAB (Digital Audio Broadcasting)-based standard that can provide multimedia services including moving pictures, still images, and text. It adds the MPEG media codec and the MPEG system, namely a video multiplexer, to the DAB system. In this paper, we analyze the video multiplexer of the DMB standard and propose a new multiplexer, M4GM (MPEG-4 General Mux), for inclusion in the DMB video multiplexer to improve transmission efficiency and extensibility. In addition, we simulate the two video multiplexers and compare and evaluate their overall performance.


Design and Implementation of a Bluetooth Baseband Module (블루투스 기저대역 모듈의 설계 및 구현)

  • 천익재;오종환;임지숙;김보관
    • Proceedings of the IEEK Conference / 2001.06a / pp.21-24 / 2001
  • Bluetooth wireless technology is a publicly available specification proposed for short-range, point-to-multipoint radio frequency (RF) voice and data transfer. It operates in the 2.4 GHz ISM (Industrial, Scientific and Medical) band and offers the potential for low-cost, broadband wireless access for various mobile and portable devices at a range of about 10 meters. In this paper, we describe the structure and the test results of the Bluetooth baseband module we have developed. This module has a UART interface for HCI and an audio codec for voice. The interface between the controller and this module supports a common control interface. An FPGA implementation of this module is tested with file and bit-stream transfers between PCs.


H.263-Based Scalable Video Codec (H.263을 기반으로 한 확장 가능한 비디오 코덱)

  • 노경택
    • Journal of the Korea Society of Computer and Information / v.5 no.3 / pp.29-32 / 2000
  • Layered video coding schemes allow video information to be transmitted in multiple video bitstreams to achieve scalability. They are attractive in theory for two reasons. First, they naturally accommodate heterogeneity in networks and receivers in terms of client processing capability and network bandwidth. Second, they correspond to optimal utilization of available bandwidth when several video quality levels are desired. In this paper, we propose a scalable video codec architecture with motion estimation that is suitable for real-time audio and video communication over packet networks. The coding algorithm is compatible with ITU-T Recommendation H.263+ and includes various techniques to reduce complexity. Fast motion estimation is performed at the H.263-compatible base layer and reused at higher layers, and perceptual macroblock skipping is performed at all layers before motion estimation. Error propagation from packet loss is avoided by periodically rebuilding a valid predictor in Intra mode at each layer.


Voice Synthesis Detection Using Language Model-Based Speech Feature Extraction (언어 모델 기반 음성 특징 추출을 활용한 생성 음성 탐지)

  • Seung-min Kim;So-hee Park;Dae-seon Choi
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.3 / pp.439-449 / 2024
  • Recent rapid advancements in voice generation technology have enabled the natural synthesis of voices using text alone. However, this progress has led to an increase in malicious activities, such as voice phishing (voishing), where generated voices are exploited for criminal purposes. Numerous models have been developed to detect the presence of synthesized voices, typically by extracting features from the voice and using these features to determine the likelihood of voice generation. This paper proposes a new model for extracting voice features to address misuse cases arising from generated voices. It utilizes a deep learning-based audio codec model and the pre-trained natural language processing model BERT to extract novel voice features. To assess the suitability of the proposed voice feature extraction model for voice detection, four generated voice detection models were created using the extracted features, and performance evaluations were conducted. For performance comparison, three voice detection models based on Deepfeature proposed in previous studies were evaluated against other models in terms of accuracy and EER. The model proposed in this paper achieved an accuracy of 88.08% and a low EER of 11.79%, outperforming the existing models. These results confirm that the voice feature extraction method introduced in this paper can be an effective tool for distinguishing between generated and real voices.
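
For readers unfamiliar with the EER (equal error rate) metric cited in the abstract above, the following is a minimal sketch of how it is commonly estimated from detector scores. It is not the paper's evaluation code: the function name `equal_error_rate`, the score convention (higher means more likely generated), and the threshold sweep over observed scores are all assumptions made for illustration.

```python
def equal_error_rate(scores, labels):
    """Estimate the EER of a detector.

    scores: detector outputs, higher = more likely a generated voice (assumed).
    labels: 1 for generated voices, 0 for real voices.
    Returns the error rate at the threshold where the false-acceptance and
    false-rejection rates are closest, taken as their average.
    """
    generated = [s for s, y in zip(scores, labels) if y == 1]
    real = [s for s, y in zip(scores, labels) if y == 0]
    if not generated or not real:
        raise ValueError("both classes must be present")

    best_gap, eer = None, None
    for threshold in sorted(set(scores)):
        # Real voices flagged as generated.
        far = sum(s >= threshold for s in real) / len(real)
        # Generated voices that slip under the threshold.
        frr = sum(s < threshold for s in generated) / len(generated)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    labels = [1, 1, 0, 1, 0, 1, 0, 0]
    print(equal_error_rate(scores, labels))
```

A lower EER means the two error types balance at a smaller value, which is why the abstract reports it alongside accuracy.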

Real-time Implementation of the AMR-WB+ Audio Coder using ARM Core(R) (ARM Core(R)를 이용한 AMR-WB+ 오디오 부호화기의 실시간 구현)

  • Won, Yang-Hee;Lee, Hyung-Il;Kang, Sang-Won
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.3 / pp.119-124 / 2009
  • In this paper, the AMR-WB+ audio coder is implemented in real time on a 400 MHz Intel XScale PXA250 with a 32-bit RISC ARM9E-J(R) core. The assembly code for the ARM9E-J(R) core is developed through a sequence of steps: C code optimization, cross-compilation, manual optimization of the assembly code, and adaptation of the optimized code to the Embedded Visual C++ platform. The C code is trimmed on the Visual C++ platform. Cross-compilation and manual assembly optimization are performed in CodeWarrior with the ARM compiler. Through these stages, code for both an ARM EVM board and a PDA is implemented. The average complexity of the code is 160.75 MHz for the encoder and 33.05 MHz for the decoder. When built as a static link library (SLL), the required memory is 65.21 Kbyte, 32.01 Kbyte, and 279.81 Kbyte for the encoder, decoder, and common sources, respectively. The implemented coder is evaluated using 16 test vectors provided by 3GPP to verify its bit-exactness.

Studies on Joint Source/Channel Coding for MPEG-4 Scalable Video Transmission in Mobile Broadcast Receiving Environments (이동방송수신환경에서 MPEG-4 계층적 비디오 전송을 위한 결합 소스/채널 부호화에 관한 연구)

  • Lee Woon-Moon;Sohn Won
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.3 s.303 / pp.31-40 / 2005
  • In this paper, we develop a JSC (joint source-channel coding) method for MPEG-4 FGS (Fine Granular Scalability) video coding and transmission in fixed and mobile receiving environments (Digital Audio Broadcasting, DAB). The source coder is an MPEG-4 FGS video codec, the channel coder uses RCPC (rate-compatible punctured convolutional) codes, and the modulation is QPSK. We consider an AWGN channel and a mobile receiving environment. The study determines the optimum trade-off point between source bit rate and channel coding rate under varying channel states. We compare the FGS-JSC method with conventional single-layer CBR (constant bit rate) transmission. The results show that FGS-JSC performs better than CBR transmission.

MPEG Audio New Standard: USAC Technology (MPEG 오디오 최신 표준: USAC 기술)

  • Lee, Tae-Jin;Kang, Kyeong-Ok;Kim, Whan-Woo
    • Journal of Broadcast Engineering / v.16 no.5 / pp.693-704 / 2011
  • As mobile devices become multi-functional and converge into a single platform, there is a strong need for a codec that can provide consistent quality for both speech and music content. MPEG-D USAC standardization began at the 82nd MPEG meeting with a CfP, and the Study on DIS was approved at the 96th MPEG meeting. MPEG-D USAC is a converged technology combining AMR-WB+ and HE-AAC v2. Specifically, USAC uses three core codecs (AAC, ACELP, and TCX) for low-frequency regions, SBR for high-frequency regions, MPEG Surround for stereo information, and window transition technology to smooth transitions between the core coders. USAC can provide consistent sound quality for both speech and music content and can be applied to applications such as multimedia download to mobile devices, digital radio, mobile TV, and audio books.