Search | Korea Science

Software Pipeline-Based Partitioning Method with Trade-Off between Workload Balance and Communication Optimization

Huang, Kai;Xiu, Siwen;Yu, Min;Zhang, Xiaomeng;Yan, Rongjie;Yan, Xiaolang;Liu, Zhili
- ETRI Journal
- /
- v.37 no.3
- /
- pp.562-572
- /
- 2015
For a multiprocessor System-on-Chip (MPSoC) to achieve high performance via parallelism, we must consider how to partition a given application into different components and map the components onto multiple processors. In this paper, we propose a software pipeline-based partitioning method with cyclic dependent task management and communication optimization. During task partitioning, simultaneously considering computation load balance and communication optimization can cause interference, which leads to performance loss. To address this issue, we formulate their constraints and apply an integer linear programming approach to find an optimal partitioning result - one that requires a trade-off between these two factors. Experimental results on a reconfigurable MPSoC platform demonstrate the effectiveness of the proposed method, with 20% to 40% performance improvements compared to a traditional software pipeline-based partitioning method.
https://doi.org/10.4218/etrij.15.0114.0502 인용 PDF KSCI

New Design of Duty Cycle Controllable CMOS Voltage-Controlled Oscillator for Low Power Systems (Duty Cycle 조정이 가능한 새로운 저전력 시스템 CMOS Voltage-Controlled Oscillator 설계)

Cho, Won;Lee, Sung-chul;Moon, Gyu
- Proceedings of the IEEK Conference
- /
- 2006.06a
- /
- pp.605-606
- /
- 2006
Voltage Controlled Oscillator(VCO) plays an important role in today's communication systems. Especially, a Clock Generator(CG) in phase-locked loop(PLL) is usually realized by the VCO. This paper proposes a new VCO with a controllable duty cycle buffer, that can be adopted in low-power high-speed communication systems. Delay cell of the VCO is implemented with gilbert cell. Frequency dynamic range of the VCO is in the range of approximately $50MHz{\sim}500MHz$. Parameters with N-well CMOS 0.18-um process with 1.8V supply voltage was used for the simulations.
PDF

Deblocking Filter Based on Edge-Preserving Algorithm And an Efficient VLSI Architecture (경계선 보존 알고리즘 기반의 디블로킹 필터와 효율적인 VLSI 구조)

Vinh, Truong Quang;Kim, Ji-Hoon;Kim, Young-Chul
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.36 no.11C
- /
- pp.662-672
- /
- 2011
This paper presents a new edge-preserving algorithm and its VLSI architecture for block artifact reduction. Unlike previous approaches using block classification, our algorithm utilizes pixel classification to categorize each pixel into one of two classes, namely smooth region and edge region, which are described by the edge-preserving maps. Based on these maps, a two-step adaptive filter which includes offset filtering and edge-preserving filtering is used to remove block artifacts. A pipelined VLSI architecture of the proposed deblocking algorithm for HD video processing is also presented in this paper. A memory-reduced architecture for a block buffer is used to optimize memory usage. The architecture of the proposed deblocking filter is prototyped on FPGA Cyclone II, and then we estimated performance when the filter is synthesized on ANAM 0.25 ${\mu}m$ CMOS cell library using Synopsys Design Compiler. Our experimental results show that our proposed algorithm effectively reduces block artifacts while preserving the details.
https://doi.org/10.7840/KICS.2011.36C.11.662 인용 PDF KSCI

A VLSI Design for High-speed Data Processing of Differential Phase Detectors with Decision Feedback (결정 궤환 구조를 갖는 차동 위상 검출기의 고속 데이터 처리를 위한 VLSI 설계)

Kim, Chang-Gon;Jeong, Jeong-Hwa
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.39 no.5
- /
- pp.74-86
- /
- 2002
This paper proposes a VLSI architecture for high-speed data processing of the differential phase detectors with the decision feedback. To improve the BER performance of the conventional differential phase detection, DF-DPD, DPD-RGPR and DFDPD-SA have been proposed. These detection methods have the architecture feedbacking the detected phase to reduce the noise of the previous symbol as phase reference. However, the feedback of the detected phase results in lower data processing speed than that of the conventional differential phase detection. In this paper, the VLSI architecture was proposed for high-speed data processing of the differential phase detectors with decision feedback. The Proposed architecture has the pre-calculation method to previously calculate the results on 'N'th step at 'M-1'th step and the pre-decision feedback method to previously feedback the predicted phases at 'M-1'th step. The architecture proposed in this paper was implemented to RTL using VHDL. The simulation results show that the Proposed architecture obtains the high-speed data processing.
PDF KSCI

VLSI Design of H.263 Video Codec Based on Modular Architecture (모듈화된 구조에 기반한 H.263 비디오 코덱 VLSI의 설계)

Kim, Myung-Jin;Lee, Sang-Hee;Kim, Keun-Bae
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.39 no.5
- /
- pp.477-485
- /
- 2002
In this paper, we present an efficient hardware architecture for the H.263 video codec and its VLSI implementation. This architecture is based on the unified interface by which internal hardware engines and an internal RISC processor are connected one another. The unified interface enables the modular design of internal blocks, efficient hardware/software partitioning, and pipelined paralled operations. The developed VLSI supports the H.263 version 2 profile 3 @ level 10, and moreover, both the control protocol H.245 and the multiplexing protocol H.223. Therefore, it can be used for the complete ITU-T H.324 or 3GPP 3G 324M multimedia processor with the help of an external audio codec. Simultaneous encoding and decoding of QCIF format images at a rate greater than 15 frames per second is achieved at 40 MHz clock frequency.
PDF KSCI

VLSI Implementation of Neural Networks Using CMOS Technology (CMOS 기술을 이용한 신경회로망의 VLSI 구현)

Chung, Ho-Sun
- Journal of the Korean Institute of Telematics and Electronics
- /
- v.27 no.3
- /
- pp.137-144
- /
- 1990
We describe how single layer perceptrons and new nonsymmetry feedback type neural networks can be implemented by VLSI CMOS technology. The network described provides a flexible tool for evaluation of boolean expressions and arithmetic equations. About 50 CMOS VLSI chips with an architecture based on two neural networks have been designed and me being fabricated by 2-micrometer double metal design rules. These chips have been developed to study the potential of neural network models for the use in character recognition and for a neural compute.
PDF

Design and Implementation of Motion Estimation VLSI Processor using Block Matching Algorithm (완전탐색 블럭정합 알고리듬을 이용한 움직임 추정기의 VLSI 설계 및 구현)

이용훈;권용무;박호근;류근장;김형곤;이문기
- Journal of the Korean Institute of Telematics and Electronics B
- /
- v.31B no.9
- /
- pp.76-84
- /
- 1994
This paper presents a new high-performance VLSI architecture and VLSI implementation for full-search block matching algorithm. The proposed VLSI architecture has the feature of two directional parallel and pipeline processing, thereby reducing the PE idle time at which the direction of block matching operation within the search area is changed. Therfore, the proposed architecture is faster than the existing architectures under the same clock frequency. Based on HSPICE circuit simulation, it is verified that the implemented procesing element is operated successfully within 13 ns for 75 MHz operation.
PDF

VLSI Architecture for Computer-Generated Hologram (컴퓨터 생성 홀로그램을 위한 VLSI 구조)

Seo, Young-Ho;Choi, Hyun-Jun;Kim, Dong-Wook
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.33 no.7C
- /
- pp.540-547
- /
- 2008
In this paper, we proposed a new VLSI architecture which can generate computer-generated hologram (CGH) in real-time and implemented to hardware. The modified algorithm for high-performance CGH was introduced and re-analyzed (or designing hardware. from both numerical and visual analysis, the infernal number system of hardware was decided. CGH algorithm and precision analysis enabled to propose a new cell architecture for CGH. The operational sequence was analyzed with the architecture of CGH cell and the characteristics of the modified CGH algorithm, and finally the pipelined architecture and the operational timing were proposed.
PDF KSCI

Design of a High Performance Exponentiation VLSI in Galois Field through Effective Use of Systems Constants (시스템 상수의 효과적인 사용을 통한 Galois 필드에서의 고성능 지수제곱 연산 VLSI 설계)

Han, Young-Mo
- Journal of the Institute of Electronics Engineers of Korea SC
- /
- v.47 no.1
- /
- pp.42-46
- /
- 2010
Encapsulation for information security is often carried out in Galois field in the form of arithmetic operations. This paper proposes how to efficiently perform exponentiation of arithmetic information on Galois field. Especially, by improving an existing bit-parallel exponentiator to exclude elements with heavy gate counts and to take advantage of system constants, this paper proposes how to implement a VLSI architecture with high performance even for large m.
PDF KSCI

Minimizing Leakage of Sequential Circuits through Flip-Flop Skewing and Technology Mapping

Heo, Se-Wan;Shin, Young-Soo
- JSTS:Journal of Semiconductor Technology and Science
- /
- v.7 no.4
- /
- pp.215-220
- /
- 2007
Leakage current of CMOS circuits has become a major factor in VLSI design these days. Although many circuit-level techniques have been developed, most of them require significant amount of designers' effort and are not aligned well with traditional VLSI design process. In this paper, we focus on technology mapping, which is one of the steps of logic synthesis when gates are selected from a particular library to implement a circuit. We take a radical approach to push the limit of technology mapping in its capability of suppressing leakage current: we use a probabilistic leakage (together with delay) as a cost function that drives the mapping; we consider pin reordering as one of options in the mapping; we increase the library size by employing gates with larger gate length; we employ a new flipflop that is specifically designed for low-leakage through selective increase of gate length. When all techniques are applied to several benchmark circuits, leakage saving of 46% on average is achieved with 45-nm predictive model, compared to the conventional technology mapping.
https://doi.org/10.5573/JSTS.2007.7.4.215 인용 PDF KSCI

Search Result 488, Processing Time 0.026 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)