• Title/Summary/Keyword: fixed-point arithmetic (고정 소수점 연산)


Implementation of Optimized MPEG-4 BSAC Audio Based on the Embedded System (임베디드 시스템 기반 MPEG-4 BSAC 오디오 최적화 구현)

  • Hwang, Jin-Yong;Park, Jong-Soon;Oh, Hwa-Yong;Kim, Byoung-Ii;Chang, Tae-Gyu
    • Proceedings of the KIEE Conference / 2005.10b / pp.361-363 / 2005
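  • In this paper, we developed an MPEG-4 BSAC audio decoder based on the MPEG-4 Version 2 Audio standard, applying a proprietary algorithm with low computational load. The decoder was optimized, implemented, and evaluated on a system based on the Intel XScale processor, which has a 32-bit RISC architecture. To increase execution speed and improve computational precision, we studied the function and implementation principle of each functional block, analyzed the 32-bit arithmetic structure, and implemented the decoder in fixed-point arithmetic. To limit the effect of finite word length, we studied the representable range of the data and minimized the approximation error. Functional blocks with relatively high computational load, such as the nonlinear quantizer and the filter bank, were optimized with table look-up, interpolation, elimination of exponential operations, and pre/post scrambling techniques. The final decoder was ported to a target system consisting of a development board with a 32-bit XScale processor and the Windows CE OS, where performance evaluation confirmed high computational precision and fast execution. A subjective listening test also confirmed that the output is almost indistinguishable from that of the MPEG-4 reference decoder.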
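
The table look-up and interpolation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the nonlinear quantizer is the usual AAC-family inverse quantization sign(q)·|q|^(4/3); the table size, the Q-format (`FRAC_BITS`), and the name `dequantize` are illustrative assumptions, not details taken from the paper.

```python
# Sketch only: table size, Q-format and names are assumptions, not the paper's design.
FRAC_BITS = 8                      # fixed-point value = integer / 2**FRAC_BITS
TABLE_SIZE = 257                   # entries for 0..256 (one extra for interpolation)

# Built once (offline on a real target), so no pow() call is needed at run time.
POW43_TABLE = [round((i ** (4.0 / 3.0)) * (1 << FRAC_BITS)) for i in range(TABLE_SIZE)]


def dequantize(q: int) -> int:
    """Return sign(q) * |q|**(4/3) as an integer with FRAC_BITS fraction bits."""
    sign = -1 if q < 0 else 1
    mag = abs(q)
    if mag < 256:
        result = POW43_TABLE[mag]                      # direct table look-up
    elif mag < 2048:
        # x**(4/3) == 16 * (x/8)**(4/3): look up x/8 and interpolate linearly
        idx, frac = mag >> 3, mag & 7
        base, nxt = POW43_TABLE[idx], POW43_TABLE[idx + 1]
        result = (base + (((nxt - base) * frac) >> 3)) << 4
    else:
        raise ValueError("magnitude outside the range covered by this sketch")
    return sign * result
```

Only integer adds, shifts, and one small multiply remain in the run-time path, which is the kind of saving the abstract attributes to removing exponential operations.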


FPGA-based Artificial Neural Network Accelerator Optimization Using Approximate Computing (Approximate computing 기법을 이용한 FPGA 기반 인공 신경망 가속기 최적화)

  • Park, Sangwoo;Kim, Hanyee;Suh, Taeweon
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.479-481 / 2019
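  • In this study, we optimized an artificial neural network accelerator for image classification, implemented it, and compared its performance with that of an existing neural network accelerator. The accelerator was implemented on an FPGA (Field Programmable Gate Array) board, and the memory system was built by organizing BRAM, the board's internal memory, as a FIFO (First In First Out) structure. To apply approximate computing effectively, we analyzed the optimal FWL (Fractional Word Length) and, based on that analysis, converted the accelerator's floating-point operations to fixed-point operations. The implemented accelerator showed about 7.4% more efficient power consumption than the existing artificial neural network.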
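
One way to picture an FWL analysis is to quantize a set of values at several fractional word lengths and compare the resulting error. The sketch below does only that; the 16-bit word length, the example weights, and the helper names `quantize` and `max_error` are assumptions for illustration and do not reproduce the paper's procedure.

```python
def quantize(x: float, fwl: int, word_len: int = 16) -> float:
    """Round x to a signed fixed-point grid with `fwl` fractional bits, saturating on overflow."""
    scale = 1 << fwl
    lo, hi = -(1 << (word_len - 1)), (1 << (word_len - 1)) - 1
    code = max(lo, min(hi, round(x * scale)))   # integer code a datapath would store
    return code / scale


def max_error(values, fwl: int) -> float:
    """Worst-case absolute quantization error over `values` for a given FWL."""
    return max(abs(v - quantize(v, fwl)) for v in values)


weights = [0.7071, -0.3333, 0.125, 0.9, -0.0421]   # made-up example weights
for fwl in (2, 4, 8, 12):
    print(fwl, max_error(weights, fwl))
```

A real analysis would track classification accuracy and resource or power cost rather than raw weight error, then pick the smallest FWL that keeps accuracy acceptable.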

Loop unrolling and type casting operation for performance improvement in embedded system (임베디드 시스템에서의 성능 향상을 위한 루프 펼침과 형변환)

  • Sung, Woon;Shin, Dong-Young;Park, Joon-Seok
    • Proceedings of the Korean Society of Computer Information Conference / 2012.01a / pp.1-4 / 2012
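  • In embedded systems, the effectiveness of an optimization technique depends on the performance of the cross compiler, the execution conditions, the characteristics of the target hardware, and other factors. In this paper, we applied two optimization techniques, loop unrolling and type casting, to image processing code and measured the performance. The result was a 55% performance improvement relative to the code without these techniques.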


Real-Time Implementation of the EHSX Speech Coder Using a Floating Point DSP (부동 소수점 DSP를 이용한 4kbps EHSX 음성 부호화기의 실시간 구현)

  • 이인성;박동원;김정호
    • The Journal of the Acoustical Society of Korea / v.23 no.5 / pp.420-427 / 2004
  • This paper presents a real-time implementation of the 4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder, which combines harmonic vector excitation coding with time-separated transition coding. Harmonic vector excitation coding uses harmonic excitation coding for voiced frames and vector excitation coding with an analysis-by-synthesis structure for unvoiced frames. For transition frames, which mix voiced and unvoiced signals, we use time-separated transition coding. We present methods for optimizing the implementation of the speech coder on the TMS320C6701® DSP. To reduce complexity for real-time operation, we optimize the algorithm by replacing the complex sinusoidal synthesis method with an IFFT, and we apply fully pipelined hand assembly coding after converting the code from floating-point source to fixed-point source. To generate more efficient code, we also make use of the available TMS320C6701® resources, such as the FastRTS67x library and memory organization.

A Fast Background Subtraction Method Robust to High Traffic and Rapid Illumination Changes (많은 통행량과 조명 변화에 강인한 빠른 배경 모델링 방법)

  • Lee, Gwang-Gook;Kim, Jae-Jun;Kim, Whoi-Yul
    • Journal of Korea Multimedia Society / v.13 no.3 / pp.417-429 / 2010
  • Though background subtraction has been widely studied over the last decades, it remains a poorly solved problem, especially in real environments. In this paper, we first address some common problems for background subtraction that occur in real environments and then resolve them by improving an existing GMM-based background modeling method. First, to reduce computation, fixed-point operations are used. Because a background model usually does not require high-precision variables, we can reduce the computation time while maintaining accuracy by adopting fixed-point rather than floating-point operations. Second, to avoid erroneous backgrounds induced by high pedestrian traffic, the static level of each pixel is examined using short-time statistics of its history. By using a lower learning rate for non-static pixels, we can preserve valid backgrounds even in busy scenes where foregrounds dominate. Finally, to adapt to rapid illumination changes, we estimate the intensity change between two consecutive frames as a linear transform and compensate the learned background models according to the estimated transform. Applying fixed-point operations to the existing GMM-based method reduced the computation time to about 30% of the original processing time. Experiments on a real video with high pedestrian traffic also showed that the proposed method improves on previous background modeling methods by 20% in detection rate and 5~10% in false alarm rate.
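
A minimal sketch of how a fixed-point background update with a per-pixel learning rate might look. It tracks only a single running mean, not a full Gaussian mixture, and the Q8.8 format, the shift-based learning rates, and the name `update_mean` are assumptions rather than the paper's design.

```python
FRAC_BITS = 8          # background mean stored as Q8.8 (assumed format)
SHIFT_STATIC = 4       # learning rate 1/16 for static pixels (assumed)
SHIFT_BUSY = 7         # lower learning rate 1/128 for non-static pixels (assumed)


def update_mean(mean_q: int, pixel: int, is_static: bool) -> int:
    """One running-average step: mean += alpha * (pixel - mean), with alpha = 2**-shift."""
    shift = SHIFT_STATIC if is_static else SHIFT_BUSY
    diff = (pixel << FRAC_BITS) - mean_q
    return mean_q + (diff >> shift)     # arithmetic shift replaces a float multiply


# Usage: a background at gray level 100 sees a pixel of value 200.
fast = update_mean(100 << FRAC_BITS, 200, is_static=True)
slow = update_mean(100 << FRAC_BITS, 200, is_static=False)
```

Restricting the learning rate to powers of two is a simplification chosen here so that the update needs no multiplier at all.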

Real-Time Implementation of Acoustic Echo Canceller for Mobile Handset Using TeakLite DSP Core (Teaklite DSP Core 를 이용한 이동통신 단말기용 음향반향제거기의 실시간 구현)

  • Gwon, Hong-Seok;Kim, Si-Ho;Jang, Byeong-Uk;Bae, Geon-Seong
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.2 / pp.128-136 / 2002
  • In this paper, we developed a real-time acoustic echo canceller using the TeakLite DSP Core, to be placed in the vocoder chip of a mobile handset. Considering the limited computational capacity available to the acoustic echo canceller in a vocoder chip, we employed an FIR-type adaptive filter using a conventional NLMS algorithm. We first designed and implemented the echo canceller in floating-point C source code and then converted it to fixed-point format through integer simulation. We then programmed and optimized it at the assembly level to make it run in real time. After optimization, the implemented echo canceller occupies approximately 624 words of program memory and 811 words of data memory. With an 8 kHz sampling rate and 256 filter taps, which corresponds to 32 msec of echo delay, it requires 14.12 MIPS of computational capacity. To cover 16 msec of echo delay, i.e., 128 filter taps, 9 MIPS is required.
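
For orientation, here is a floating-point sketch of an NLMS adaptive FIR filter of the kind the abstract describes as its starting point. The step size, the regularization constant, the 4-tap echo path, and the function name `nlms` are made-up illustration values. The tap counts in the abstract follow from taps = delay × sampling rate, e.g. 0.032 s × 8000 Hz = 256.

```python
import random


def nlms(far_end, mic, taps, mu=0.5, eps=1e-8):
    """Adapt an FIR echo-path estimate; return (weights, residual error signal)."""
    w = [0.0] * taps
    x = [0.0] * taps                 # delay line, newest sample first
    residual = []
    for s, d in zip(far_end, mic):
        x = [s] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))      # estimated echo
        e = d - y                                      # echo-cancelled output
        norm = eps + sum(xi * xi for xi in x)          # input power normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        residual.append(e)
    return w, residual


random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]                      # made-up 4-tap echo response
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far))]
weights, err = nlms(far, mic, taps=len(echo_path))
```

A fixed-point port would replace the division by `norm` and the products with scaled integer operations, which is where the integer simulation step mentioned in the abstract comes in.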

A Real-Time JPEG2000 Codec Implementation on ARM9 Processor (ARM9 프로세서용 실시간 JPEG2000 코덱의 구현)

  • Kim, Young-Tae;Cho, Shi-Won;Lee, Dong-Wook
    • Journal of the Institute of Convergence Signal Processing / v.8 no.3 / pp.149-155 / 2007
  • In this paper, we propose a real-time implementation of a JPEG2000 codec on the ARM9 processor. The codec is designed to separate control code from data management code in order to use system resources such as the processor and memory effectively. In embedded settings such as cellular phones, it is especially important to provide good service with a limited processor and limited internal memory. Since ARM9 series processors do not provide floating-point support, operations that need highly repetitive floating-point computation, such as the DWT (discrete wavelet transform), take a large amount of computation time. The proposed codec was programmed in fixed-point arithmetic to overcome this weakness. Cache-aware code optimization was also applied to further improve the computational speed.


Real-time Implementation of Speech and Channel Coder on a DSP Chip for Radio Communication System (무선통신 적용을 위한 단일 DSP칩상의 음성/채널 부호화기 실시간 구현)

  • Kim Jae-Won;Sohn Dong-Chul
    • Journal of the Korea Institute of Information and Communication Engineering / v.9 no.6 / pp.1195-1201 / 2005
  • This paper describes the procedures and results for the real-time implementation of a G.729 speech coder and a channel coder, including a convolutional codec, Viterbi decoder, and interleaver, on a fixed-point DSP chip for radio communication systems. We describe the method for real-time implementation based on integer simulation results and report the implementation results in terms of quality and the complexity required for real-time operation. The required complexity was 24 MIPS and 9 MIPS in computational load, and 12K words and 4K words in execution code length, for the speech and channel coders respectively. The functional evaluation was performed in two steps: the first was a bit-exact comparison with fixed-point C code, and the second used actual speech samples and error test vectors. Unlike other results based on individual implementations, we implemented the speech and channel coders on a single DSP chip with 160 MIPS of computational capability and 64K words of on-chip memory. These results improve on conventional methods in terms of system complexity and implementation cost for radio communication systems.
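
Bit-exact comparison against fixed-point C code relies on every arithmetic primitive behaving identically on both sides, including at overflow. The sketch below shows saturating 16-bit primitives in the general style of the basic operators used in fixed-point speech codec reference code; the names `add`, `mult`, and `bit_exact` and the test vectors are illustrative assumptions, not the paper's code.

```python
MAX_16, MIN_16 = 0x7FFF, -0x8000


def saturate(v: int) -> int:
    """Clamp to the signed 16-bit range instead of wrapping around."""
    return max(MIN_16, min(MAX_16, v))


def add(a: int, b: int) -> int:
    """16-bit saturating addition."""
    return saturate(a + b)


def mult(a: int, b: int) -> int:
    """Q15 x Q15 -> Q15 fractional multiply with saturation."""
    return saturate((a * b) >> 15)


def bit_exact(output, reference) -> bool:
    """True only if every sample matches the reference exactly."""
    return list(output) == list(reference)


# Usage: compare a tiny computed vector against hand-derived reference values.
out = [mult(16384, 16384), add(30000, 10000), mult(-32768, -32768)]
print(bit_exact(out, [8192, 32767, 32767]))
```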

OpenGL ES 1.1 Implementation Using OpenGL (OpenGL을 이용한 OpenGL ES 1.1 구현)

  • Lee, Hwan-Yong;Baek, Nak-Hoon
    • The KIPS Transactions:PartA / v.16A no.3 / pp.159-168 / 2009
  • In this paper, we present an efficient way of implementing the OpenGL ES 1.1 standard for environments with a hardware-supported OpenGL API, such as desktop PCs. Although OpenGL ES started from existing OpenGL features, it has become a new three-dimensional graphics library customized for embedded systems by introducing fixed-point arithmetic operations, buffer management with fixed-point data type support, completely new texture mapping functionality, and other changes. It is currently the official three-dimensional graphics library for Google Android, Apple iPhone, PlayStation3, and other platforms. We improved the arithmetic operations on the fixed-point number representation, which is the most characteristic data type of OpenGL ES. For converting fixed-point data to the floating-point representation required by the underlying OpenGL, we show an efficient conversion process that still satisfies the OpenGL ES standard requirements. We also introduced a simple memory management scheme to manage the converted data for buffers containing fixed-point numbers. For texture processing, the requirements of the two standards are quite different, so we used completely new software implementations. Our final OpenGL ES library provides all of the more than 200 functions in the OpenGL ES 1.1 standard and completely passed its conformance test, demonstrating compliance with the standard. In terms of efficiency, we measured execution times for several OpenGL ES-specific application programs and achieved speedups of up to 33.147 times, making it the fastest among the OpenGL ES implementations in the same category.
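
The fixed-point type in OpenGL ES 1.1 (`GLfixed`) is a signed 16.16 format, so conversion to floating point is a scale by 2^16. The sketch below shows that conversion and a fixed-point multiply; the helper names are hypothetical, and it does not reproduce the paper's optimized conversion path.

```python
FIXED_ONE = 1 << 16            # 1.0 in signed 16.16 fixed-point


def float_to_fixed(x: float) -> int:
    """Convert a float to the nearest 16.16 fixed-point code."""
    return int(round(x * FIXED_ONE))


def fixed_to_float(f: int) -> float:
    """Convert a 16.16 fixed-point code back to a float."""
    return f / FIXED_ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two 16.16 values; the double-width product is shifted back by 16 bits."""
    return (a * b) >> 16


# Usage: convert a fixed-point vertex buffer before passing it to a float-based API.
fixed_vertices = [float_to_fixed(v) for v in (0.0, 0.5, -1.25)]
float_vertices = [fixed_to_float(f) for f in fixed_vertices]
```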

High Level Design and Performance Evaluation for the Implementation of WCDMA Base Station Modem (WCDMA 기지국 모뎀의 구현을 위한 상위 레벨 설계 및 통합 성능 평가)

  • Do Joo-Hyun;Lee Young-Yong;Chung Sung-Hyun;Choi Hyung-Jin
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.1A / pp.10-27 / 2005
  • In this paper, we propose a high-level design architecture for a WCDMA (UMTS) base station modem, along with the synchronization algorithms applied to it. We also present an analysis of each synchronization algorithm and a performance evaluation of the fixed-point modem design. Since the target system is a base station modem, each synchronization algorithm is designed for stable operation. To minimize implementation complexity, we derive the optimum fixed-point design for the synchronization algorithms. We performed symbol-level link simulations with the fixed-point modem simulator at data rates of 12.2 kbps, 64 kbps, 144 kbps, and 384 kbps, and compared the results with the minimum requirements specified in 3GPP TS 25.104 (Release 5). Extensive computer simulation shows that the proposed modem architecture operates stably and outperforms the minimum requirement by 2 dB. The proposed architecture has been successfully applied in the implementation of a WCDMA reverse link receiver modem chip.