Search | Korea Science

Design of a Parallel Pipelined Processor Architecture (병렬 파이프라인 프로세서 아키덱처의 설계)

이상정;김광준
- Journal of the Korean Institute of Telematics and Electronics B / v.32B no.3 / pp.11-23 / 1995
This paper proposes a parallel pipelined processor model that acts as a small VLIW processor architecture, together with a scheduling algorithm for extracting instruction-level parallelism on this architecture. The proposed model has a dual-instruction mode in which up to four basic operations execute in parallel. By combining these basic operations, a variable instruction set can be designed for various applications. The scheduling algorithm schedules basic operations for parallel execution and removes pipeline hazards by examining data-dependency and resource-conflict relations. To examine the operation and evaluate the performance, a C compiler and a simulator were developed. The characteristics and performance of the proposed architecture were measured by simulating various test programs with the compiler and the simulator.
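The scheduling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Op` record, the unit names, the single-cycle latency, and the greedy in-order placement are all assumptions made for illustration. Only the limit of four basic operations per cycle comes from the abstract.

```python
from dataclasses import dataclass

MAX_OPS_PER_CYCLE = 4  # from the abstract: at most 4 basic operations in parallel


@dataclass(frozen=True)
class Op:
    """Hypothetical basic operation: one destination, some sources, one unit kind."""
    name: str
    dest: str
    srcs: tuple
    unit: str  # assumed functional-unit name, e.g. "alu" or "mem"


def schedule(ops, units_per_cycle):
    """Greedily pack ops (taken in program order) into cycles.

    Each op goes into the earliest cycle that
      * is later than every earlier op it depends on (RAW, WAR or WAW),
      * has fewer than MAX_OPS_PER_CYCLE ops in it, and
      * still has a free functional unit of the required kind.
    Every op is assumed to take one cycle.
    """
    cycles = []   # cycles[i] is the list of ops issued in cycle i
    placed = []   # (op, cycle index) for every op scheduled so far
    for op in ops:
        earliest = 0
        for prev, c in placed:
            raw = prev.dest in op.srcs    # op reads what prev writes
            war = op.dest in prev.srcs    # op overwrites what prev reads
            waw = op.dest == prev.dest    # both write the same register
            if raw or war or waw:
                earliest = max(earliest, c + 1)
        c = earliest
        while True:
            if c == len(cycles):
                cycles.append([])
            busy = sum(1 for o in cycles[c] if o.unit == op.unit)
            if len(cycles[c]) < MAX_OPS_PER_CYCLE and busy < units_per_cycle[op.unit]:
                break
            c += 1  # resource conflict: try the next cycle
        cycles[c].append(op)
        placed.append((op, c))
    return cycles


example_ops = [
    Op("load_a", "r1", ("r0",), "mem"),
    Op("load_b", "r2", ("r0",), "mem"),
    Op("add", "r3", ("r1", "r2"), "alu"),
    Op("mul", "r4", ("r1", "r1"), "alu"),
    Op("sub", "r5", ("r3", "r4"), "alu"),
]
example_units = {"mem": 1, "alu": 2}  # assumed resource mix
example_schedule = schedule(example_ops, example_units)
```

With one memory port, the two loads cannot share a cycle (a resource conflict), while `mul` depends only on `load_a` and can issue alongside `load_b`. The five operations therefore fit in four cycles.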

Implementation of High-Speed Reed-Solomon Decoder Using the Modified Euclid's Algorithm (개선된 수정 유클리드 알고리듬을 이용한 고속의 Reed-Solomon 복호기의 설계)

김동선;최종찬;정덕진
- The Transactions of the Korean Institute of Electrical Engineers A / v.48 no.7 / pp.909-915 / 1999
In this paper, we propose an efficient VLSI architecture for a Reed-Solomon (RS) decoder. To improve speed, we develop an architecture featuring parallel and pipelined processing. To implement it, we analyze the RS decoding algorithm and Horner's algorithm for parallel processing, and we modify Euclid's algorithm so that an efficient parallel structure can be applied in the RS decoder. Compared with Shao's time-domain decoder, the proposed RS decoder achieves a 10% improvement in area efficiency and is three times faster. In addition, we implemented the proposed RS decoder on an Altera Flex10K-50 FPGA.
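RS decoders commonly compute syndromes by evaluating the received polynomial with Horner's rule, one multiply-accumulate per received symbol. The sketch below shows that recurrence in software. The field GF(2^8), the primitive polynomial 0x11D, the generator element 2, and the first root alpha^1 are assumptions for illustration; the abstract does not give the code parameters, and the sketch does not model the paper's parallel structure.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1; an assumed, commonly used choice


def gf_mul(a, b):
    """Multiply two GF(2^8) elements: carry-less multiply reduced by PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a GF(2^8) element to a non-negative integer power."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def horner_eval(coeffs, x):
    """Evaluate a polynomial at x; coeffs are ordered highest degree first.

    Each step is acc = acc * x + c, where addition in GF(2^m) is XOR.
    """
    acc = 0
    for c in coeffs:
        acc = gf_mul(acc, x) ^ c
    return acc


def syndromes(received, count, alpha=2):
    """S_i = r(alpha^i) for i = 1..count (first root alpha^1 is an assumption)."""
    return [horner_eval(received, gf_pow(alpha, i)) for i in range(1, count + 1)]
```

For example, the polynomial x + alpha has alpha as a root, so `syndromes([1, 2], 1)` returns `[0]`. A nonzero syndrome signals that the received word is not a codeword.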

A Hardware Architecture of SEED Algorithm with 320 Mbps (320 Mbps SEED 알고리즘의 하드웨어 구조)

Lee Haeng-Woo;Ra Yoo-Chan
- Journal of the Korea Institute of Information and Communication Engineering / v.10 no.2 / pp.291-297 / 2006
This paper describes a hardware architecture that reduces circuit size and increases the computation rate of SEED, a 128-bit block cipher, and reports the results of the circuit design. To increase the computation rate, a pipelined systolic array architecture is used. The architecture is simple and requires no buffers at the input or output. The circuit achieves an encryption rate of 320 Mbps at a 10 MHz clock. The circuits were designed with the goals of high-speed computation and a simplified structure.
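The figures in this abstract can be cross-checked with simple arithmetic. The block size, clock, and rate are taken from the abstract. The interpretation that the pipeline completes one 128-bit block every four clocks in steady state is an inference from those numbers, not something the abstract states.

```python
def throughput_bps(block_bits, cycles_per_block, clock_hz):
    """Steady-state throughput of a block-cipher core in bits per second."""
    return block_bits * clock_hz // cycles_per_block


BLOCK_BITS = 128         # SEED block size (from the abstract)
CLOCK_HZ = 10_000_000    # 10 MHz clock (from the abstract)
RATE_BPS = 320_000_000   # reported encryption rate (from the abstract)

bits_per_cycle = RATE_BPS // CLOCK_HZ            # average bits produced per clock
cycles_per_block = BLOCK_BITS // bits_per_cycle  # clocks between finished blocks

# Consistency check: the inferred block interval reproduces the reported rate.
assert throughput_bps(BLOCK_BITS, cycles_per_block, CLOCK_HZ) == RATE_BPS
```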

Design of A Low-Voltage and High-Speed Pipelined A/D Converter Using Current-Mode Signals (저전압 고속 전류형 Pipelined A/D 변환기의 설계)

박승균;이희덕;한철희
- Journal of the Korean Institute of Telematics and Electronics A / v.31A no.3 / pp.18-27 / 1994
An 8-bit, 2-stage pipelined current-mode A/D converter is designed with a new architecture in which wideband track-and-hold amplifiers with two integrators in parallel sample the input signal twice per clock cycle. As a result, the conversion speed of the A/D converter is twice that of the conventional pipelined method. The converter is designed to operate at a supply voltage of 3.3 V with an input dynamic range of 0-256 $\mu$A. HSPICE simulations with the parameters of the ISRC 1.5 $\mu$m BiCMOS process show performance of up to 55 Msamples/s and power consumption of 150 mW. The chip area is $3 \times 4$ mm$^{2}$.
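As a rough behavioral illustration of two-stage pipelined conversion, the sketch below resolves a coarse code, amplifies the residue, and resolves a fine code. The even 4+4 bit split and the ideal, noise-free stages are assumptions. Only the 8-bit resolution and the 0-256 uA input range come from the abstract, and the sketch does not model the double-sampling track-and-hold that the paper introduces.

```python
def two_stage_adc(i_in_ua, bits_per_stage=4, full_scale_ua=256.0):
    """Ideal two-stage pipelined conversion of a current (in uA) to a digital code.

    Stage 1 resolves the upper bits; the residue is amplified back to full
    scale and stage 2 resolves the lower bits. The 4+4 split is hypothetical.
    """
    levels = 1 << bits_per_stage             # 16 codes per stage
    step = full_scale_ua / levels            # 16 uA per coarse code
    i = max(i_in_ua, 0.0)                    # clamp below the input range
    coarse = min(int(i // step), levels - 1)
    residue = i - coarse * step              # what stage 1 left unresolved
    amplified = residue * levels             # inter-stage gain of 16
    fine = min(int(amplified // step), levels - 1)
    return (coarse << bits_per_stage) | fine
```

With these assumptions one LSB corresponds to 256 uA / 2^8 = 1 uA, so an input of 100.5 uA converts to code 100. Inputs at or above full scale saturate at code 255.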

A New Pipelined Binary Search Architecture for IP Address Lookup (IP 어드레스 검색을 위한 새로운 pipelined binary 검색 구조)

Lim Hye-Sook;Lee Bo-Mi;Jung Yeo-Jin
- The Journal of Korean Institute of Communications and Information Sciences / v.29 no.1B / pp.18-28 / 2004
Efficient hardware implementation of address lookup is one of the most important design issues for Internet routers. Address lookup significantly affects router performance, since routers need to process tens to hundreds of millions of packets per second in real time. In this paper, we propose a practical IP address lookup structure based on a binary tree of prefixes of different lengths. The proposed structure produces multiple balanced trees and hence solves the problems caused by the unbalanced binary prefix tree of the existing scheme. It is implemented using pipelined binary search combined with a small TCAM. Performance evaluation shows that the proposed architecture requires a 2000-entry TCAM and a total of 245 Kbytes of SRAM to store about 30,000 prefixes sampled from the MAE-WEST router, and that an address lookup is completed with a single memory access. The proposed scheme scales well to both larger databases and longer addresses, as in IPv6.

A study on the Cycle-Accurate Retargetable Micro-Architecture Simulation Framework (사이클 정확도의 재목적화 가능한 마이크로아키텍쳐 시뮬레이션 프레임워크에 관한 연구)

Yang, Hoon-Mo;Lee, Moon-Key
- Proceedings of the IEEK Conference / 2005.11a / pp.643-646 / 2005
This paper presents CARMA (Cycle-Accurate Retargetable Micro-Architecture), an efficient simulation framework for SoC-centric pipelined instruction-set architectures. It is based on an ADL (Architecture Description Language) and provides concise, explicit semantics for describing instruction-set behavior, combining the efficiency of instruction-set simulators with the flexibility of RTL simulators. It uses a new timing model based on process scheduling, so it can support a general timing model with cycle accuracy for the large-scale architectures typically used in SoC multimedia chipsets. In experiments, the proposed framework simulated 5.5 times faster than HDL and 2.5 times faster than SystemC, so it is applicable to complex pipelined instruction-set architectures.

An Architecture Design of a Multi-Stage 12-bit High-Speed Pipelined A/D Converter (다단 12-비트 고속 파이프라인 A/D 변환기의 구조 설계)

임신일;이승훈
- Journal of the Korean Institute of Telematics and Electronics A / v.32A no.12 / pp.220-228 / 1995
An optimized 4-stage 12-bit pipelined CMOS analog-to-digital converter (ADC) architecture is proposed to obtain high linearity and high yield. The ADC, based on a multiplying digital-to-analog converter (MDAC), selectively employs a binary-weighted-capacitor (BWC) array in the front-end stage and a unit-capacitor (UC) array in the back-end stages to improve integral nonlinearity (INL) and differential nonlinearity (DNL) simultaneously while maintaining high yield. A digital-domain nonlinear error calibration technique is applied in the first stage of the ADC to improve its accuracy to the 12-bit level. The largest DNL error, at the mid-point code of the ADC, is reduced by avoiding the code-error symmetry observed in a conventional digitally calibrated ADC. The ADC is simulated to demonstrate the effectiveness of the proposed architecture.

A 3-stage Pipelined Architecture for Multi-View Images Decoder (3단계 파이프라인 구조를 갖는 Multi-View 영상 디코더)

Bae, Chang-Ho;Yang, Yeong-Yil
- Journal of the Institute of Electronics Engineers of Korea SD / v.39 no.4 / pp.104-111 / 2002
In this paper, we propose a decoder architecture that implements the multi-view image decoding algorithm. Hardware structures for multi-view image processing have not previously been studied. The proposed multi-view image decoder operates in a three-stage pipelined manner and extracts the depth of the pixels of the decoded image every clock cycle. The decoder consists of three modules: a Node Selector, which repeatedly transfers the node values; a Depth Extractor, which extracts the depth of each pixel from the four node values; and an Affine Transformer, which generates the projected position on the image plane from the pixel values and the specified viewpoint. The proposed architecture is designed and simulated with the Max+plus II design tool, and the operating frequency is 30 MHz. With the proposed architecture, the decoder can construct the image in real time.

Edge-Preserving Algorithm for Block Artifact Reduction and Its Pipelined Architecture

Vinh, Truong Quang;Kim, Young-Chul
- ETRI Journal / v.32 no.3 / pp.380-389 / 2010
This paper presents a new edge-protection algorithm and its very large scale integration (VLSI) architecture for block artifact reduction. Unlike previous approaches that use block classification, our algorithm uses pixel classification to assign each pixel to one of two classes, smooth region or edge region, which are described by edge-protection maps. Based on these maps, a two-step adaptive filter comprising offset filtering and edge-preserving filtering removes block artifacts. A pipelined VLSI architecture of the proposed deblocking algorithm for HD video processing is also presented. A memory-reduced architecture for the block buffer is used to optimize memory usage. The architecture of the proposed deblocking filter is verified on a Cyclone II FPGA and implemented using the ANAM 0.25 $\mu$m CMOS cell library. Our experimental results show that the proposed algorithm effectively reduces block artifacts while preserving details. The PSNR performance of our algorithm using pixel classification is better than that of previous algorithms using block classification.
https://doi.org/10.4218/etrij.10.0109.0290

A Pipelined Parallel Optimized Design for Convolution-based Non-Cascaded Architecture of JPEG2000 DWT (JPEG2000 이산웨이블릿변환의 컨볼루션기반 non-cascaded 아키텍처를 위한 pipelined parallel 최적화 설계)

Lee, Seung-Kwon;Kong, Jin-Hyeung
- Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.7 / pp.29-38 / 2009
In this paper, a high-performance pipelined design consisting of a parallel multiplier, a temporal buffer, and a parallel accumulator is presented for the convolution-based non-cascaded architecture, aiming at real-time Discrete Wavelet Transform (DWT) processing. The number of convolution multiplications in the DWT can be reduced to as little as 1/4 by exploiting the symmetry of the filter coefficients and the up/down sampling, and the computation can be made 3-5 times faster by LUT-based DA multiplication, in which the products of multiple filter coefficients with an image sample are computed in parallel. Further, computed product terms can be reused by storing them in the temporal buffer, which saves 50% of both computation and dynamic power. The product terms of image data and filter coefficients are realigned and stored in the temporal buffer for accumulation. The buffer then manages its parallel aligned storage to support high-speed sequential retrieval for the parallel accumulations. The convolution is pipelined through the parallel multiplier, temporal buffer, and parallel accumulator stages, and the parallelism of the temporal buffer and accumulator is optimized with respect to the performance of the parallel DA multiplier to improve pipelining performance. The back-end design of the proposed architecture uses a 0.18 $\mu$m library and achieves a throughput of 30 fps for SVGA (800$\times$600) images at 90 MHz.
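The claimed reduction in multiplications from coefficient symmetry and down-sampling can be illustrated with a small sketch. The integer filter taps and the input signal below are hypothetical, and the sketch covers neither the paper's LUT-based DA multiplier nor its temporal buffer. It only shows that adding mirrored samples before multiplying, and computing only every second output, yields the same decimated result with roughly a quarter of the multiplications.

```python
def direct_fir(x, h):
    """Full-rate FIR over the valid region: one multiply per tap per output."""
    taps = len(h)
    out, mults = [], 0
    for n in range(taps - 1, len(x)):
        acc = 0
        for k in range(taps):
            acc += h[k] * x[n - k]
            mults += 1
        out.append(acc)
    return out, mults


def folded_decimated_fir(x, h):
    """Same filter, computing only every second output (down-sampling by 2)
    and adding mirrored samples before the multiply (coefficient symmetry)."""
    taps = len(h)
    if taps % 2 != 1 or list(h) != list(h)[::-1]:
        raise ValueError("needs an odd-length symmetric filter")
    mid = taps // 2
    out, mults = [], 0
    for n in range(taps - 1, len(x), 2):
        acc = h[mid] * x[n - mid]          # centre tap has no mirror partner
        mults += 1
        for k in range(mid):
            acc += h[k] * (x[n - k] + x[n - (taps - 1 - k)])
            mults += 1
        out.append(acc)
    return out, mults


signal = list(range(1, 21))                  # hypothetical 20-sample input
taps9 = [1, 2, 3, 4, 5, 4, 3, 2, 1]          # hypothetical symmetric 9-tap filter
full, full_mults = direct_fir(signal, taps9)
half, half_mults = folded_decimated_fir(signal, taps9)
```

For this 9-tap example the direct form uses 108 multiplications and the folded, decimated form uses 30, a ratio of about 0.28. The ratio approaches 1/4 as the filter gets longer.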

Search Result 176, Processing Time 0.02 seconds
