  • Title/Summary/Keyword: Software Pipelining


An Improved Implementation of Block Matching Algorithm on a VLIW-based DSP (VLIW 기반 DSP에서의 개선된 블록매칭 알고리즘 구현)

  • You, Hui-Jae;Chung, Sun-Tae;Jung, Sou-Hwan
    • Proceedings of the IEEK Conference / 2007.07a / pp.225-226 / 2007
  • In this paper, we present our study of optimizing the block matching algorithm on a VLIW-based DSP. The block matching algorithm is well known for its computational burden in motion picture encoding. As opposed to previous research, where optimization is achieved by optimizing SAD, the heaviest routine in block matching, we optimize the block matching algorithm by applying software pipelining to the whole routine of the algorithm. Experiments verify the efficiency of the proposed optimization.
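For orientation, the block matching routine mentioned in this abstract can be sketched as a full search that minimizes the sum of absolute differences (SAD). This is a minimal reference sketch, not the authors' DSP implementation: the function names, the frame layout (lists of rows), and the toy frames are assumptions made for illustration, and it shows only the plain loop nest that a compiler or programmer would then software-pipeline.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Return (best_sad, (dx, dy)) over all displacements that stay inside `ref`."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidate blocks that would fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# Toy frames (hypothetical data): `cur` is `ref` shifted by 2 pixels in x and 1 in y.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0 for x in range(8)]
       for y in range(8)]
best_cost, best_mv = full_search(cur, ref, bx=2, by=2, n=4, search_range=2)
```

The inner SAD loops are the usual optimization target; the abstract's point is that the outer search loops can be pipelined as well.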


Improving Software Pipelining Performance Using a Register Renaming Technique (소프트웨어 파이프라이닝에서 레지스터 변경을 통한 성능 개선)

  • Cho, Doosan
    • Annual Conference of KIPS / 2010.11a / pp.1642-1643 / 2010
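  • Because application programs in the multimedia domain contain abundant inherent parallelism, signal processors with a VLIW (Very Long Instruction Word) architecture are widely used. The utilization of the multiple processing units (PUs) that make up a VLIW processor is determined by the performance of the compiler's instruction scheduler, which analyzes the parallelism among instructions and optimizes the program code so that instructions that can execute simultaneously run on different PUs. However, existing instruction schedulers construct a complex data dependence graph (DDG) and therefore fail to make full use of the multiple PUs. Because the scheduler does not separately consider how long each register is in use, it builds a dependence graph more complex than the data dependences actually present, and so cannot produce a properly optimized code schedule. This study shows the potential for improving system performance by relaxing data dependence complexity, appropriately cutting register lifetimes by using other registers.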
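The effect described in this abstract can be illustrated with a small dependence analysis. The sketch below is an assumption-laden illustration, not the paper's algorithm: the instruction format `(dest, [sources])`, the register names, and the function `dependences` are all hypothetical. It shows that reusing one register creates WAR and WAW edges in addition to the true (RAW) dependences, and that renaming the second use removes them.

```python
def dependences(instrs):
    """Return the set of (kind, earlier, later) dependence edges for a
    straight-line block. Each instruction is (dest_register, [source_registers])."""
    edges = set()
    last_writer = {}    # register -> index of its most recent write
    readers_since = {}  # register -> indices that read it since that write
    for j, (dest, srcs) in enumerate(instrs):
        for reg in srcs:
            if reg in last_writer:
                edges.add(("RAW", last_writer[reg], j))  # true dependence
            readers_since.setdefault(reg, []).append(j)
        for i in readers_since.get(dest, []):
            if i != j:
                edges.add(("WAR", i, j))  # anti dependence caused by register reuse
        if dest in last_writer:
            edges.add(("WAW", last_writer[dest], j))  # output dependence
        last_writer[dest] = j
        readers_since[dest] = []
    return edges


# r1 is reused for two unrelated values (hypothetical example block).
original = [
    ("r1", ["r10", "r11"]),
    ("r2", ["r1", "r12"]),
    ("r1", ["r13", "r14"]),
    ("r3", ["r1", "r15"]),
]
# Same computation with the second value renamed to r4, cutting r1's lifetime.
renamed = [
    ("r1", ["r10", "r11"]),
    ("r2", ["r1", "r12"]),
    ("r4", ["r13", "r14"]),
    ("r3", ["r4", "r15"]),
]
edges_original = dependences(original)
edges_renamed = dependences(renamed)
```

After renaming, only the two RAW chains remain, so a scheduler is free to place the two chains on different processing units in the same cycles.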

A High Speed Code Dissemination Protocol for Software Update in Wireless Sensor Network (무선 센서 네트워크상의 소프트웨어 업데이트를 위한 고속 코드 전파 프로토콜)

  • Cha, Jeong-Woo;Kim, Il-Hyu;Kim, Chang-Hoon;Kwon, Young-Jik
    • Journal of Korea Society of Industrial Information Systems / v.13 no.5 / pp.168-177 / 2008
  • Code propagation is one of the most important techniques for software update in wireless sensor networks. This paper presents a new scheme for code propagation using network coding. Depending on the network environment, the proposed method shows roughly a 20-25% performance improvement, in terms of the number of data exchanges, over the previously proposed pipelining scheme. As a result, software updates can be performed efficiently from the viewpoint of speed, energy, and network congestion when the proposed code propagation system is applied. In addition, the proposed system uses a predefined message to solve the overhearing problems of network coding, such as the loss of original messages and decoding errors. Our system therefore allows a software update system to exchange data reliably in wireless sensor networks.
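The abstract does not say which coding operation the protocol uses. As a hedged illustration of why network coding can reduce the number of transmissions, the sketch below uses the simplest form, XOR coding of two equal-length code pages; the names `xor_bytes`, `page_a`, and `page_b` are hypothetical.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("pages must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Two code pages of an update image (hypothetical contents).
page_a = b"\x01\x02\x03\x04"
page_b = b"\x10\x20\x30\x40"

# A relay broadcasts one coded packet instead of two plain ones.
coded = xor_bytes(page_a, page_b)

# A node that already holds page_a recovers page_b, and vice versa.
recovered_b = xor_bytes(coded, page_a)
recovered_a = xor_bytes(coded, page_b)
```

A node that holds neither page cannot decode the packet, which is the kind of decoding failure the abstract's predefined message is meant to address.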


DSP Optimization for Rain Detection and Removal Algorithm (비 검출 및 제거 알고리즘의 DSP 최적화)

  • Choi, Dong Yoon;Seo, Seung Ji;Song, Byung Cheol
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.9 / pp.96-105 / 2015
  • This paper proposes a DSP optimization solution for a rain detection and removal algorithm. We propose rain detection and removal algorithms that take camera motion into account, and present optimization results at both the algorithm level and the DSP level. At the algorithm level, we use block-level binary pattern analysis and reduce operation time with a fast motion estimation algorithm. At the DSP level, the algorithm is optimized for real-time operation through internal memory optimization, EDMA, and software pipelining. Experimental results show that the proposed algorithm is superior to other algorithms in both visual quality and processing speed.

Design and Analysis of MPEG-2 MP@HL Decoder in Multi-Processor Environments

  • Yoo, Seung-Hwan;Lee, Hyun-Seung;Lee, Sang-Jo;Park, Rae-Hong;Kim, Do-Hyung
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.211-216 / 2009
  • As demand for high-definition television (HDTV) increases, real-time decoding of high-definition (HD) video becomes an important issue. The data size of HD video is so large that real-time processing is difficult to implement, especially in software. To implement a fast Moving Picture Experts Group-2 (MPEG-2) decoder for HDTV, we compose five scenarios that use parallel processing techniques such as data decomposition, task decomposition, and pipelining. Assuming a multi-DSP (digital signal processor) environment, we analyze each scenario in three respects: decoding speed, L1 memory size, and bandwidth. By comparing the scenarios, we determine the most suitable cases for different situations. We simulate the scenarios in dual-core and dual-CPU environments using OpenMP and analyze the simulation results.


Hardware Implementation of Genetic Algorithm and Its Analysis (유전알고리즘의 하드웨어 구현 및 실험과 분석)

  • Dong, Sung-Soo;Lee, Chong-Ho
    • Journal of the Institute of Electronics Engineers of Korea IE / v.46 no.2 / pp.7-10 / 2009
  • This paper presents the implementation of libraries of hardware modules for a genetic algorithm using VHDL. Evolvable hardware refers to hardware that can change its architecture and behavior dynamically and autonomously by interacting with its environment, so it is especially suited to applications where no hardware specification can be given in advance. Evolvable hardware is based on the idea of combining a reconfigurable hardware device with evolutionary computation, such as a genetic algorithm. Because of parallelism, the absence of function call overhead, and pipelining, a hardware genetic algorithm gives a speedup over a software genetic algorithm. This paper proposes a hardware genetic algorithm for an evolvable embedded system chip, and includes simulation results and analysis for several fitness functions. The results show that our design works well for the three examples.

Hardware Implementation of Genetic Algorithm for Evolvable Hardware (진화하드웨어 구현을 위한 유전알고리즘 설계)

  • Dong, Sung-Soo;Lee, Chong-Ho
    • Journal of the Institute of Electronics Engineers of Korea IE / v.45 no.4 / pp.27-32 / 2008
  • This paper presents the implementation of a simple genetic algorithm in a hardware description language for an evolvable hardware embedded system. Evolvable hardware refers to hardware that can change its architecture and behavior dynamically and autonomously by interacting with its environment, so it is especially suited to applications where no hardware specification can be given in advance. Evolvable hardware is based on the idea of combining a reconfigurable hardware device with evolutionary computation, such as a genetic algorithm. Because of parallelism, the absence of function call overhead, and pipelining, a hardware genetic algorithm gives a speedup over a software genetic algorithm. This paper proposes a hardware genetic algorithm for an evolvable embedded system chip, and includes simulation results for several fitness functions.
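For readers unfamiliar with the software baseline that a hardware genetic algorithm is compared against, here is a minimal software sketch of a simple genetic algorithm. Everything in it is an assumption for illustration: the OneMax fitness function, tournament selection, one-point crossover, the elitism step, and all parameter values are common textbook choices and are not taken from the paper.

```python
import random


def one_max(bits):
    """Toy fitness function: the number of 1 bits in the chromosome."""
    return sum(bits)


def evolve(fitness, length=16, pop_size=20, generations=30,
           mutation_rate=0.02, seed=0):
    """Run a simple GA and return (best_individual, best_fitness_per_generation)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = [max(fitness(ind) for ind in pop)]
    for _ in range(generations):
        # Elitism: carry the best individual over unchanged.
        next_pop = [max(pop, key=fitness)[:]]
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
        history.append(max(fitness(ind) for ind in pop))
    return max(pop, key=fitness), history


best, history = evolve(one_max)
```

In software, selection, crossover, and mutation run one after another for each child; a hardware version can run them as pipelined, parallel stages, which is the source of the speedup the abstract describes.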

Implementing Swing Modulo Scheduler for VLIW Processor (VLIW 프로세서를 위한 Swing Modulo Scheduler 구현)

  • Shin, Jangseop;Han, Sangjun;Jung, Hyungyun;Ahn, Minwook;Youn, Jonghee M.;Paek, Yunheung
    • Annual Conference of KIPS / 2014.04a / pp.12-14 / 2014
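  • To raise the performance of a multi-issue VLIW processor whose hardware does not support hazard detection, it is important for the compiler to exploit instruction-level parallelism (ILP) as much as possible while respecting instruction dependences and hardware resource constraints. Basic block scheduling does not schedule across control flow boundaries such as branches, so its effect is limited. Software pipelining breaks down loop boundaries so that instructions from multiple iterations execute simultaneously, and modulo scheduling refers to one category of such scheduling techniques. In this study, we implement one of them, the swing modulo scheduler [1], and examine its effect.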
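Modulo schedulers, including swing modulo scheduling, start from a lower bound on the initiation interval (II), the number of cycles between the starts of successive loop iterations. The sketch below computes that bound in the standard way, as the maximum of a resource bound and a recurrence bound. The function names, the resource classes, and the example numbers are assumptions for illustration and are not taken from the paper.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive integers."""
    return -(-a // b)


def res_mii(op_counts, unit_counts):
    """Resource bound: for each resource class, the operations per iteration
    divided by the number of units of that class, rounded up."""
    return max(ceil_div(op_counts[k], unit_counts[k]) for k in op_counts)


def rec_mii(recurrences):
    """Recurrence bound: for each dependence cycle, total latency divided by
    total iteration distance, rounded up. Returns 0 when the loop has no cycles."""
    return max((ceil_div(lat, dist) for lat, dist in recurrences), default=0)


def min_initiation_interval(op_counts, unit_counts, recurrences):
    """Lower bound on II that a modulo scheduler tries first."""
    return max(res_mii(op_counts, unit_counts), rec_mii(recurrences))


# Hypothetical loop body: 6 ALU ops, 3 memory ops, 2 multiplies per iteration
# on a machine with 2 ALUs, 1 memory unit, and 1 multiplier.
ops = {"alu": 6, "mem": 3, "mul": 2}
units = {"alu": 2, "mem": 1, "mul": 1}
# Two dependence cycles given as (total latency, total iteration distance).
cycles = [(4, 1), (5, 2)]
mii = min_initiation_interval(ops, units, cycles)
```

Here the resource bound is 3 and the recurrence bound is 4, so the scheduler would first try II = 4 and increase it only if no valid schedule is found.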

Optimization for H.264/AVC De-blocking Filter on the TMS320C64x+ DSP (TMS320C64x+ DSP에서의 H.264/AVC 디블록킹 필터 최적화)

  • Lee, Jin-Seop;Kang, Dae-Beom;Sim, Dong-Gyu;Lee, Soo-Youn
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.2 / pp.41-52 / 2011
  • Reducing the computational complexity of the de-blocking filter is important for real-time implementation, because the filter accounts for a large part of the decoder's total computational complexity. Because a decoding loop contains many conditional branches and memory accesses, the de-blocking filter is not easy to speed up. This paper therefore presents a new de-blocking filter algorithm that minimizes conditional branches and memory accesses. The proposed filter structure allows the filter operations to be parallelized by software pipelining. The proposed optimization method was implemented on a TMS320DM6467 EVM board and achieved approximately 46% cycle reduction compared with FFmpeg.

An Efficient Dissemination Protocol for Remote Update in 6LoWPAN Sensor Network (6LoWPAN상에서 원격 업데이트를 위한 효율적인 코드 전파 기법)

  • Kim, Il-Hyu;Cha, Jung-Woo;Kim, Chang-Hoon;Nam, In-Gil;Lee, Chae-Wook
    • Journal of the Institute of Convergence Signal Processing / v.12 no.2 / pp.133-138 / 2011
  • In IP-based wireless sensor networks (WSNs), it may be necessary to distribute application updates to the sensor nodes in order to fix bugs or add new functionality. However, physical access to nodes is in many cases extremely limited after deployment. Network reprogramming protocols have therefore recently emerged as a way to distribute application updates without requiring physical access to sensor nodes. To solve the network reprogramming problem over the air interface, this thesis presents a new scheme for update code propagation that uses a fragmentation scheme and network coding. The proposed code propagation method shows a performance improvement, in the form of a reduced number of data exchanges, over the previously proposed pipelining scheme. It also shows enhanced reliability for update code propagation and reduced overhead in terms of the number of data exchanges. As a result, software updates can be performed efficiently from the viewpoint of speed, energy, and network congestion when the proposed code propagation system is applied. In addition, the proposed system uses a predefined message to solve the overhearing problems of network coding, such as the loss of original messages and decoding errors. Our system therefore allows a software update system to exchange data reliably in wireless sensor networks.