• Title/Summary/Keyword: codecs

Search results: 114

Enhancement of Super-wideband Coder by Considering Audio Feature in MDCT Domain (MDCT 도메인에서 오디오 신호 특징을 고려한 초광대역 코덱 개선)

  • Hong, Ki-Bong;Jeong, Gyu-Hyeok;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.5 / pp.129-136 / 2011
  • This paper presents a multi-mode coding method that improves the efficiency of audio codecs by exploiting features of the audio signal. The recently developed super-wideband extension codec based on G.718 wideband switches between two modes, Generic and Sinusoidal, and thereby encodes audio signals in the super-wideband range efficiently. However, the codec is less efficient at coding the harmonic components of wind and string instruments and the individual-line components of percussion instruments. The proposed method models and encodes multiple-pitch and individual-line features using multi-mode coding. For the performance evaluation, we used SNR in the MDCT domain as the objective test and the MUSHRA test as the subjective test. The proposed method outperformed the G.718 super-wideband codec in both the SNR and MUSHRA tests.
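The objective measure mentioned in this abstract, SNR computed on MDCT coefficients, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's exact SNR definition, frame size, or window, so the function names `mdct` and `snr_db`, the unwindowed direct-form MDCT, and the energy-ratio SNR are all choices made here, not the authors' procedure.

```python
import math

def mdct(frame):
    """Direct O(N^2) MDCT of a frame of 2N samples, returning N coefficients.

    Assumption: no analysis window is applied (a real codec would use one).
    """
    n_half = len(frame) // 2
    coeffs = []
    for k in range(n_half):
        acc = 0.0
        for n, sample in enumerate(frame):
            acc += sample * math.cos(
                math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)
            )
        coeffs.append(acc)
    return coeffs

def snr_db(reference, decoded):
    """SNR in dB between reference and decoded coefficient vectors."""
    signal = sum(r * r for r in reference)
    noise = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    if noise == 0:
        return math.inf   # identical vectors: no coding error
    if signal == 0:
        return -math.inf  # silent reference but non-zero error
    return 10.0 * math.log10(signal / noise)

# Hypothetical usage: compare an original frame with a coarsely quantized copy.
original = [math.sin(2 * math.pi * 3 * n / 16) for n in range(16)]
quantized = [round(s * 4) / 4 for s in original]
print(snr_db(mdct(original), mdct(quantized)))
```

Measuring the error on the coefficients rather than on time-domain samples keeps the metric in the same domain in which the codec quantizes.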

An Algorithm with Low Complexity for Fast Motion Estimation in Digital Video Coding (디지털 비디오 부호화에서의 고속 움직임 추정을 위한 저복잡도 알고리즘)

  • Lee, Seung-Chul;Kim, Min-Ki;Jeong, Je-Chang
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.12C / pp.1232-1239 / 2006
  • In video standards such as MPEG-1/2/4 and H.264/AVC, the motion estimation/compensation (ME/MC) process accounts for most of the encoding complexity of a video encoder. The full search method, which is used in general video codecs, consumes much encoding time because it compares the current macroblock with the blocks at all positions within the search window to find a match. To alleviate this problem, fast search methods such as TSS, NTSS, DS and HEXBS were introduced first. Thereafter, the DS-based MVFAST, PMVFAST, MAS and FAME, which exploit the temporal or spatial correlation of motion vectors, were developed. However, the problems of image quality degradation and increased algorithm complexity remain. The algorithm proposed in this paper maximizes search speed and minimizes image quality degradation by determining the initial search point accurately and using simple one-dimensional search patterns that consider the motion characteristics of each frame.
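The cost of the full search baseline described in this abstract is easy to see in a minimal sketch. The code below is not the paper's proposed algorithm; it is plain exhaustive block matching with a sum-of-absolute-differences (SAD) cost, and the names `sad` and `full_search` as well as the list-of-lists frame layout are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by the candidate motion vector (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement in the search window; return (cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block leaves the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Hypothetical usage: a 16x16 ramp frame and a copy shifted by (dx, dy) = (2, -1).
ref = [[y * 16 + x for x in range(16)] for y in range(16)]
cur = [[ref[(y - 1) % 16][(x + 2) % 16] for x in range(16)] for y in range(16)]
print(full_search(cur, ref, bx=4, by=4, n=4, search_range=3))  # (0, 2, -1)
```

With a search range of ±R, each block requires (2R+1)² SAD evaluations, which is the cost the fast search methods listed above try to avoid.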

MPEG Audio New Standard: USAC Technology (MPEG 오디오 최신 표준: USAC 기술)

  • Lee, Tae-Jin;Kang, Kyeong-Ok;Kim, Whan-Woo
    • Journal of Broadcast Engineering / v.16 no.5 / pp.693-704 / 2011
  • As mobile devices become multi-functional and converge into a single platform, there is a strong need for a codec that can provide consistent quality for both speech and music content. MPEG-D USAC standardization activities started at the 82nd MPEG meeting with a CfP, and the Study on DIS was approved at the 96th MPEG meeting. MPEG-D USAC is a convergence of the AMR-WB+ and HE-AAC v2 technologies. Specifically, USAC uses three core codecs (AAC, ACELP, and TCX) for low-frequency regions, SBR for high-frequency regions, MPEG Surround for stereo information, and window transition technology for smooth transitions between the various core coders. USAC can provide consistent sound quality for both speech and music content and can be applied to various applications such as multimedia download to mobile devices, digital radio, mobile TV and audio books.

Optimal Coding Model for Screen Contents Applications from the Coding Performance Analysis of High Efficient Coding Tools in HEVC (HEVC 고성능 압축 도구들의 성능 분석을 통한 스크린 콘텐츠 응용 최적 부호화 모델)

  • Han, Chan-Hee;Lee, Si-Woong
    • The Journal of the Korea Contents Association / v.12 no.12 / pp.544-554 / 2012
  • Screen content refers to images or videos generated by various electronic devices such as computers or mobile phones, whereas natural content refers to images captured by cameras. Screen content shows different statistical characteristics from natural images, so conventional video codecs, which were developed mainly for coding natural video, cannot guarantee good coding performance for screen content. Recently, research on efficient SCC (Screen Content Coding) has been actively pursued, and SCC issues are being discussed steadily at the ongoing JCT-VC (Joint Collaborative Team on Video Coding) meetings for the HEVC (High Efficiency Video Coding) standard. In this paper, we analyze the performance of the high-efficiency coding tools in HM (HEVC Test Model) on SCC and present an optimized SCC model based on the analysis results. We also present the characteristics of screen content and future research issues.

Implementation of Embedded Speech Recognition System for Supporting Voice Commander to Control an Audio and a Video on Telematics Terminals (텔레메틱스 단말기 내의 오디오/비디오 명령처리를 위한 임베디드용 음성인식 시스템의 구현)

  • Kwon, Oh-Il;Lee, Heung-Kyu
    • Journal of the Institute of Electronics Engineers of Korea TC / v.42 no.11 / pp.93-100 / 2005
  • In this paper, we implement an embedded speech recognition system to support various application services, such as audio and video control, through a speech recognition interface in cars. The embedded speech recognition system is implemented and ported to a DSP board. Because the microphone type and speech codecs affect the accuracy of speech recognition, we optimize the simulation and test environment to effectively remove real in-car noise. We applied a noise suppression and feature compensation algorithm to increase the accuracy of speech recognition in a car, and we used context-dependent tied-mixture acoustic modeling. The performance evaluation showed high accuracy of the proposed system in an office environment and even in a real car environment.

An Efficient Partial Distortion Search Algorithm using the Spatial and Temporal Correlations for Fast Motion Estimation (고속 움직임 추정을 위한 시공간적 상관관계 기반의 효율적인 부분 왜곡 탐색 알고리즘)

  • Ha, Dong-Won;Cho, Hyo-Moon;Lee, Jong-Hwa
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.1 / pp.79-85 / 2010
  • In video standards such as H.264/AVC, motion estimation (ME) / compensation (MC) is regarded as a vital component of a video coder because it consumes a large amount of computational resources. The full search technique, which is used in general video codecs, gives the highest visual quality but imposes a significant computational load. To solve this problem, many fast algorithms have been proposed. Among them, NPDS maintains video quality very close to that of the full search technique while reducing computation by using a halfway-stop technique in the calculation of the block distortion measure. In this paper, we propose an algorithm that determines the minimum distortion measure using a predictive motion vector and uses a new search order. As a result, the proposed algorithm reduces the computational load by 95% on average compared with the full search, with a PSNR loss of about 0.04 dB.
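The halfway-stop idea referred to in this abstract can be sketched as a partial distortion search: the distortion of a candidate is accumulated row by row and abandoned once it can no longer beat the best match so far. The sketch below is neither NPDS nor the paper's algorithm (whose search order is not given in the abstract); it is a plain partial distortion search in which a hypothetical `predicted` motion vector is simply tested first so that a tight bound is available early. All function and parameter names are assumptions.

```python
def partial_sad(cur, ref, bx, by, dx, dy, n, limit):
    """Row-wise SAD that gives up (returns None) once the partial sum
    reaches `limit`, i.e. once the candidate cannot beat the current best."""
    total = 0
    for y in range(n):
        cur_row = cur[by + y]
        ref_row = ref[by + dy + y]
        for x in range(n):
            total += abs(cur_row[bx + x] - ref_row[bx + dx + x])
        if total >= limit:
            return None  # halfway stop
    return total

def pds_search(cur, ref, bx, by, n, search_range, predicted=(0, 0)):
    """Partial distortion search; returns (cost, (dx, dy))."""
    height, width = len(ref), len(ref[0])
    window = [(dx, dy)
              for dy in range(-search_range, search_range + 1)
              for dx in range(-search_range, search_range + 1)
              if (dx, dy) != predicted]
    best_cost, best_mv = float("inf"), None
    # Assumption: testing the predicted vector first tightens the bound early.
    for dx, dy in [predicted] + window:
        if bx + dx < 0 or by + dy < 0:
            continue
        if bx + dx + n > width or by + dy + n > height:
            continue
        cost = partial_sad(cur, ref, bx, by, dx, dy, n, best_cost)
        if cost is not None:
            best_cost, best_mv = cost, (dx, dy)
    return best_cost, best_mv

# Hypothetical usage: a 16x16 ramp frame and a copy shifted by (2, -1).
ref = [[y * 16 + x for x in range(16)] for y in range(16)]
cur = [[ref[(y - 1) % 16][(x + 2) % 16] for x in range(16)] for y in range(16)]
print(pds_search(cur, ref, bx=4, by=4, n=4, search_range=3, predicted=(2, -1)))
# (0, (2, -1))
```

Because a candidate is dropped only when its partial sum already equals or exceeds the current minimum, this variant finds the same minimum cost as an exhaustive search; the better the first candidate, the earlier the remaining ones are rejected.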

Matching Pursuit Sinusoidal Modeling with Damping Factor (Damping 요소를 첨가한 매칭 퍼슈잇 정현파 모델링)

  • Jeong, Gyu-Hyeok;Kim, Jong-Hark;Lim, Joung-Woo;Joo, Gi-Ho;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.1 / pp.105-113 / 2007
  • In this paper, we propose matching pursuit with damping factors, a new sinusoidal model that improves on matching pursuit, for codecs based on the sinusoidal model. The proposed model defines damping factors by using the correlation of parameters between the current and adjacent frames, estimates the sinusoidal parameters in the analysis frame more accurately by applying matching pursuit according to the damping factor, and synthesizes the final signal. This makes efficient modeling possible without interpolation schemes. The proposed sinusoidal model shows better speech quality, without additional delay, than the conventional sinusoidal model with interpolation methods. We compare the performance of our model with that of matching pursuit using interpolation methods in terms of SNR (signal-to-noise ratio), MOS (mean opinion score), LR (Itakura-Saito likelihood ratio), and CD (cepstral distance).

Design and Implementation of High-quality Video Service with Adaptive Transport for Multi-party Collaborative Environments (다자간 원격 협업을 위한 적응형 전송 기능을 가진 고화질 영상 서비스의 설계 및 구현)

  • Han, Sang-Woo;Kim, Jong-Won
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.1B / pp.26-38 / 2006
  • To construct seamless collaborative environments, the intent of every participant should be conveyed, and visual elements such as gestures, facial expressions, and ambiance should be shared with all participants. In this paper, we propose a high-quality video service that supports DV (digital video) and HDV (high-definition DV) based on Access Grid (AG), a prevalent collaborative system. The proposed service is designed to employ versatile media tools and codecs with SDP (session description protocol) and SAP (session announcement protocol). We also design a network-adaptive video transmission module to mitigate the impact of network fluctuation. This module periodically monitors multicast performance and controls the frame rate on the sender side according to network conditions. The experimental results over the test bed show that the proposed service enhances the quality of the AG video service and provides seamless high-quality video transport by mitigating the impact of network fluctuation.

The Efficient 32×32 Inverse Transform Design for High Performance HEVC Decoder (고성능 HEVC 복호기를 위한 효율적인 32×32 역변환기 설계)

  • Han, Geumhee;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.4 / pp.953-958 / 2013
  • In this paper, an efficient hardware architecture is proposed for the $32{\times}32$ inverse transform of an HEVC decoder. HEVC is a new image compression standard designed to handle much larger image sizes, such as 4K and 8K, than conventional image codecs. To process such large image data effectively, it adopts various new block structures. These blocks comprise $4{\times}4$, $8{\times}8$, $16{\times}16$, and $32{\times}32$ blocks. This paper suggests an effective structure for processing the $32{\times}32$ inverse transform. The structure uses the $16{\times}16$ matrices decomposed from the $32{\times}32$ matrix and simplifies the operations by implementing multiplication with shifters and adders. Additionally, the operating frequency is lowered by using multicycle paths. The structure can also be easily adapted to a multi-size transform or a forward transform block in an HEVC codec.
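Two ideas in this abstract, splitting a transform matrix into smaller even and odd parts and replacing constant multiplications with shifts and adds, can be illustrated in software on the smallest HEVC transform size. The sketch below is not the paper's $32{\times}32$ architecture: it uses the 4-point core transform coefficients (64, 83, 36), omits the scaling and rounding stages of a real decoder, and all function names are hypothetical.

```python
def mul64(x):
    return x << 6                               # 64 = 2^6

def mul83(x):
    return (x << 6) + (x << 4) + (x << 1) + x   # 83 = 64 + 16 + 2 + 1

def mul36(x):
    return (x << 5) + (x << 2)                  # 36 = 32 + 4

def inverse_transform_4(coeffs):
    """4-point inverse core transform via even/odd (butterfly) decomposition.

    Assumption: scaling and rounding shifts are left out for clarity.
    """
    c0, c1, c2, c3 = coeffs
    # Even part: depends only on the even-indexed coefficients.
    e0 = mul64(c0) + mul64(c2)
    e1 = mul64(c0) - mul64(c2)
    # Odd part: depends only on the odd-indexed coefficients.
    o0 = mul83(c1) + mul36(c3)
    o1 = mul36(c1) - mul83(c3)
    # Butterfly: outputs are sums and differences of the two halves.
    return [e0 + o0, e1 + o1, e1 - o1, e0 - o0]

# Hypothetical usage: a DC-only coefficient vector gives a flat output.
print(inverse_transform_4([1, 0, 0, 0]))  # [64, 64, 64, 64]
```

The same even/odd split applies recursively at larger sizes, which is how a $32{\times}32$ transform reduces to $16{\times}16$ sub-matrices, and each constant multiplier becomes a small fixed network of shifters and adders in hardware.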

Design and Implementation of a H.264 Video player based on DirectShow via Bluetooth (블루투스를 이용한 DirectShow기반의 H.264 동영상 플레이어의 설계 및 구현)

  • Park, Tae-Jun;Cho, Tai-Hoon
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.4 / pp.493-498 / 2009
  • Bluetooth is a popular wireless data transmission method with low power consumption, but it has a low data transmission rate. Thus, although many players exist for video streams from local or network files, there have been few players for video streams transmitted via Bluetooth. The MPEG-4 AVC/H.264 codec is one of the available video codecs with the best compression rates for a given quality, so an H.264 encoder seems adequate for video streams to be transmitted via Bluetooth. In this paper, we present a DirectShow filter-based player for video streams encoded with the H.264 codec and transmitted via Bluetooth. Details of the design and implementation of this program are described. Experimental results using various video samples are shown to demonstrate the validity of the implemented program.