Search | Korea Science

Implementation of a Compiler for VLIW Architecture (VLIW 구조를 위한 컴파일러의 구현)

Choe, Seong-Uk;Kim, Gyeong-Hun;Park, Myeong-Sun
- Journal of KIISE:Computing Practices and Letters / v.5 no.1 / pp.109-121 / 1999
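Processors based on VLIW (Very Long Instruction Word) technology have recently been expected to outperform any other type of processor. The compiler performs global analysis to exploit instruction-level parallelism, and many compilation techniques for VLIW architectures have been studied. To obtain more reliable results in compilation research, it is necessary to build a VLIW compiler and experimental environment as a base to which one's own new techniques can be added. In this paper, we developed a parallelizing compiler for VLIW processors that includes existing parallelism-enhancing optimizations, such as software pipelining based on GURPR, and tested it in a simulator environment. The experimental results showed that the execution time of some benchmarks can be reduced by up to 30%. This compiler system should be of great help in compilation research, for example in improving existing modules, and we expect that new research results and implementations can be added to this environment to measure their performance gains.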

A Design of High Performance Parallel CRC Using A Simple Logic Optimization (논리 최적화 기법을 이용한 병렬 CRC 회로 설계)

Yi Hyunbean;Kim Jusub;Park Sungju;Park Changwon
- Proceedings of the Korean Information Science Society Conference / 2005.07a / pp.460-462 / 2005
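This paper presents an optimization algorithm for the parallel implementation of the Cyclic Redundancy Check (CRC) circuit, which is widely used for error detection in communication systems. By finding and mapping as many shared terms as possible while keeping the number of logic levels to a minimum, it improves speed and reduces the gate count. We implemented the 32-bit Ethernet CRC in parallel and evaluated its performance. The design was synthesized with FPGA and standard cell libraries and shows improvements in both speed and area over existing methods.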
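As a software analogue of the idea of consuming several input bits per step, the sketch below compares a bit-serial CRC-32 with a byte-at-a-time table version. It is not the authors' logic-optimization algorithm: the function names are hypothetical, and the table lookup merely stands in for the XOR network that a parallel hardware CRC would evaluate in one clock cycle.

```python
import zlib

POLY = 0xEDB88320  # bit-reflected form of the Ethernet CRC-32 polynomial


def crc32_bit_serial(data: bytes) -> int:
    """Process one input bit per inner-loop step (serial LFSR behaviour)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def build_table() -> list:
    """Precompute the effect of 8 serial steps for every possible byte."""
    table = []
    for value in range(256):
        crc = value
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
        table.append(crc)
    return table


TABLE = build_table()


def crc32_byte_parallel(data: bytes) -> int:
    """Consume 8 input bits per step using the precomputed table."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    message = b"123456789"
    print(hex(crc32_bit_serial(message)), hex(crc32_byte_parallel(message)))
```

Both functions should agree with `zlib.crc32` on any input, which gives a quick check that the 8-bit step is equivalent to eight serial steps.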

Enhancement Broadcast Algorithm for Distributed Memory Multicomputers with Message Passing Environment (메시지 교환 방식의 분산 메모리 컴퓨터를 위한 개선된 방송 알고리즘)

Yun, Il-Heung;Kim, Dong-Seung
- Journal of KIISE:Computer Systems and Theory / v.26 no.5 / pp.549-554 / 1999
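In this paper, we aim to improve existing implementations of the broadcast function, whose time complexity is O(L log P) for a message of length L on a message-passing parallel computer with P processors. We propose a broadcast algorithm of complexity O(L) that splits the message evenly into P/2 pieces, distributes each piece, and transmits them in parallel. Experiments in an MPI environment on an IBM SP2 parallel computer, whose processors are connected by a multistage interconnection network, confirmed that it outperforms the existing method for relatively long messages. Because the method uses the built-in point-to-point communication and its broadcast latency does not depend on the number of processors, it runs fast, operates independently of the communication parameters of the machine, and can be widely applied to new environments such as MPI-2.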
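The difference between an O(L log P) and an O(L) broadcast can be illustrated with a simple cost model. The sketch below is an assumption-laden illustration, not the paper's algorithm: it uses the common latency/bandwidth model (`alpha` per message, `beta` per unit of data, both hypothetical parameters) and compares a tree broadcast with a scatter followed by a ring allgather, a different but well-known split-and-distribute scheme.

```python
import math


def tree_broadcast_time(length, procs, alpha, beta):
    """Tree broadcast: the whole message is resent in each of log2(P) rounds."""
    rounds = math.ceil(math.log2(procs))
    return rounds * (alpha + beta * length)


def scatter_allgather_time(length, procs, alpha, beta):
    """Split the message into P chunks, scatter them, then ring-allgather.

    Assumes procs is a power of two, so the root sends L*(P-1)/P in total
    during the scatter.
    """
    chunk = length / procs
    scatter = math.ceil(math.log2(procs)) * alpha + beta * (length - chunk)
    allgather = (procs - 1) * (alpha + beta * chunk)
    return scatter + allgather


if __name__ == "__main__":
    for procs in (4, 16, 64):
        t_tree = tree_broadcast_time(1024, procs, alpha=0, beta=1)
        t_split = scatter_allgather_time(1024, procs, alpha=0, beta=1)
        print(procs, t_tree, t_split)
```

In this model the bandwidth term of the split scheme stays below 2·beta·L regardless of P, while the tree broadcast grows with log P. The latency term of the ring still depends on P, which is one way this stand-in differs from the behaviour the abstract describes.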

Design of GF(2^16) Serial Multiplier Using GF(2^4) and its C Language Simulation (유한체 GF(2^4)를 이용한 GF(2^16)의 직렬 곱셈기 설계와 이의 C언어 시뮬레이션)

신원철;이명호
- Journal of the Korea Society of Computer and Information / v.6 no.3 / pp.56-63 / 2001
In this paper, a GF($2^{16}$) multiplier using its subfield GF($2^4$) is designed. The design can be used to construct a sequential logic multiplier from a bit-parallel multiplier for the subfield. A finite field serial multiplier built on a parallel subfield multiplier takes less time than a serial multiplier and has lower complexity than a parallel multiplier, which is its advantageous feature. The trade-off between circuit complexity and delay time is compared and simulated using the C language.

DQ Synchronous Reference Frame Model of A Series-Parallel Tuned Inductive Power Transfer System and Current Controller (직렬-병렬 무선 전력 전송 시스템의 DQ 동기 좌표계 모델 및 전류제어기)

Noh, Eunchong;Lee, Sangmin;Lee, Seung-Hwan
- Proceedings of the KIPE Conference / 2020.08a / pp.181-183 / 2020
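This paper proposes a synchronous reference frame model of a series-parallel resonant wireless power transfer system, obtained by applying the DQ transformation, and a current controller based on that model. Wireless power transfer systems are generally difficult to control because single-phase currents flow on both the transmitter and receiver sides. For this reason, the magnitudes of the voltage and current delivered to the load are often controlled using steady-state voltage and current equations. As a result, the transient dynamics of the voltage and current may differ from the desired behavior. In this paper, the single-phase voltages and currents of the series-parallel resonant wireless power transfer system are DQ-transformed to give an equivalent circuit model that can analyze both transient and steady-state dynamics, and a high-performance current controller for transient-state control is proposed using this model.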
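A minimal numerical sketch of why a DQ transformation helps with single-phase quantities is given below. It assumes the usual approach of pairing the measured signal with a fictitious orthogonal component (generated analytically here; in practice it might come from a quarter-period delay) and then rotating by the reference angle. The function name, the 85 kHz frequency, and the amplitude and phase values are illustrative assumptions, not values from the paper.

```python
import math


def single_phase_to_dq(amplitude, phase, omega, t):
    """Map v(t) = A*cos(omega*t + phase) to constant d and q components."""
    theta = omega * t                                # reference frame angle
    v_alpha = amplitude * math.cos(theta + phase)    # measured signal
    v_beta = amplitude * math.sin(theta + phase)     # fictitious orthogonal signal
    # Park rotation into the synchronous reference frame.
    d = v_alpha * math.cos(theta) + v_beta * math.sin(theta)
    q = -v_alpha * math.sin(theta) + v_beta * math.cos(theta)
    return d, q


if __name__ == "__main__":
    omega = 2 * math.pi * 85e3   # assumed operating frequency
    for t in (0.0, 1.3e-6, 7.7e-6):
        print(single_phase_to_dq(10.0, math.pi / 6, omega, t))
```

In steady state the sinusoid maps to the constant pair (A·cos(phase), A·sin(phase)) at every instant, which is what lets a conventional controller regulate DC-like quantities instead of an oscillating signal.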

Optimal Interference Rejection Weight for Multistage Parallel Nulling-Partial PIC Receiver for MIMO MC-CDMA Systems (MIMO MC-CDMA 시스템을 위한 다단계 병렬 널링 및 부분 간섭 제거 수신기를 위한 최적 가중치 결정)

구정회;김경연;심세준;이충용
- Journal of the Institute of Electronics Engineers of Korea TC / v.41 no.11 / pp.9-15 / 2004
We propose an optimal interference rejection weight for the multistage parallel nulling (MPN) partial parallel interference cancellation (PPIC) receiver, which was previously proposed to enhance the performance of V-BLAST in downlink multiple-input multiple-output (MIMO) multicarrier (MC) code division multiple access (CDMA) systems. The MPN-PPIC method proposed in [1] was based on parallel interference cancellation (PIC) with a fixed interference rejection weight obtained experimentally. However, a fixed weight cannot adapt efficiently to different systems, so we propose a method for determining the optimal interference rejection weight from the received signal-to-interference-plus-noise ratio (SINR). The performance of the proposed method was evaluated by computer simulation against the previous method, and we obtained performance gains of 2.5 dB to 5 dB at a BER of $10^{-3}$.

A Study on the Parallel Ternary Logic Circuit Design to DCG Property with $2^n$ nodes ($2^n$개의 노드를 갖는 DCG 특성에 대한 병렬3치 논리회로 설계에 관한 연구)

Byeon, Gi-Yeong;Park, Seung-Yong;Sim, Jae-Hwan;Kim, Heung-Su
- Journal of the Institute of Electronics Engineers of Korea SC / v.37 no.6 / pp.42-49 / 2000
In this paper, we propose a parallel ternary logic circuit design algorithm for a DCG property with $2^n$ nodes. One promising approach to increasing circuit integration is the use of multiple-valued logic (MVL). It can be useful for realizing compact integrated circuits, for speeding up signal processing through parallel signal transmission, and for design algorithms that optimize the circuit while satisfying its required properties. These features make it useful for implementing high-density integrated circuits. We introduce a matrix equation that satisfies a given DCG with $2^n$ nodes and propose a parallel ternary logic circuit design process based on it. We also propose a code assignment algorithm that satisfies the given DCG property. According to the simulation results, the proposed design algorithm has the following advantages: fewer circuit signal lines, shorter computation time, and lower cost.

Data Level Parallelism for H.264/AVC Decoder on a Multi-Core Processor and Performance Analysis (멀티코어 프로세서에서의 H.264/AVC 디코더를 위한 데이터 레벨 병렬화 성능 예측 및 분석)

Cho, Han-Wook;Jo, Song-Hyun;Song, Yong-Ho
- Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.8 / pp.102-116 / 2009
There has been much research on improving H.264/AVC performance on multi-core processors, mainly through parallelization. Parallelization methods can be classified into task-level and data-level methods. A task-level method for the H.264/AVC decoder divides the decoder algorithms into pipeline stages. However, it is not suitable for complex, large bitstreams because of poor load balancing. Considering load balancing and performance scalability, we propose a horizontal data-level parallelization method for the H.264/AVC decoder in which threads are assigned to macroblock lines. We develop a mathematical performance expectation model for the proposed method. To evaluate the model, we measured performance with the JM 13.2 reference software on an ARM11 MPCore Evaluation Board. Cycle-accurate measurement with the SoCDesigner Co-verification Environment showed that the expected performance and performance scalability of the proposed method were accurate to a relatively high degree.
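To make the idea of assigning threads to macroblock lines concrete, the sketch below estimates decoding time under a simple wavefront schedule. It is not the paper's performance model. It assumes that every macroblock takes one time unit, that a macroblock depends on its left neighbour and on the upper-right neighbour in the previous line, and that line r is handled by thread `r % threads`. The function name and all parameters are hypothetical.

```python
def wavefront_makespan(rows, cols, threads):
    """Finish time of the last macroblock under the assumed schedule."""
    finish = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        # A thread can start line r only after finishing its previous line.
        row_ready = finish[r - threads][cols - 1] if r >= threads else 0
        for c in range(cols):
            ready = row_ready
            if c > 0:
                ready = max(ready, finish[r][c - 1])       # left neighbour
            if r > 0:
                upper = min(c + 1, cols - 1)               # upper-right neighbour
                ready = max(ready, finish[r - 1][upper])
            finish[r][c] = ready + 1                       # unit decoding cost
    return max(max(line) for line in finish)


if __name__ == "__main__":
    for threads in (1, 2, 4):
        print(threads, wavefront_makespan(4, 4, threads))
```

With one thread the estimate equals the total number of macroblocks. With enough threads it approaches `cols + 2 * (rows - 1)`, which shows how the inter-line dependency limits scalability in this toy model.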

A Pipelined Parallel Optimized Design for Convolution-based Non-Cascaded Architecture of JPEG2000 DWT (JPEG2000 이산웨이블릿변환의 컨볼루션기반 non-cascaded 아키텍처를 위한 pipelined parallel 최적화 설계)

Lee, Seung-Kwon;Kong, Jin-Hyeung
- Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.7 / pp.29-38 / 2009
In this paper, a high-performance pipelined computing design consisting of a parallel multiplier, a temporal buffer, and a parallel accumulator is presented for the convolution-based non-cascaded architecture, aiming at real-time Discrete Wavelet Transform (DWT) processing. The convolved multiplications of the DWT could be reduced up to 1/4 by exploiting the symmetry of the filter coefficients and the up/down sampling, and computation could be 3 to 5 times faster with LUT-based DA multiplication, in which multiple filter coefficients are processed in parallel to form product terms with an image datum. Further, computed product terms can be reused by storing them in the temporal buffer, which saves computation as well as dynamic power by 50%. The convolved product terms of image data and filter coefficients are realigned and stored in the temporal buffer for accumulated addition. The buffer then manages the parallel aligned storage for high-speed sequential retrieval by the parallel accumulators. The convolution is pipelined through the parallel multiplier, temporal buffer, and parallel accumulator, and the parallelization of the temporal buffer and accumulator is optimized with respect to the performance of the parallel DA multiplier to improve pipelining performance. The back-end design of the proposed architecture with a 0.18um library verifies a throughput of 30 fps for SVGA (800$\times$600) images at 90 MHz.

A Study on Design of High-Speed Parallel Multiplier over GF(2^m) using VCG (VCG를 사용한 GF(2^m)상의 고속병렬 승산기 설계에 관한 연구)

Seong, Hyeon-Kyeong
- Journal of the Korea Institute of Information and Communication Engineering / v.14 no.3 / pp.628-636 / 2010
In this paper, we present a new type of high-speed parallel multiplier that multiplies two polynomials in the standard basis over the finite field GF($2^m$). Before constructing the multiplier circuit, we design the basic cell of the vector code generator (VCG), which performs the parallel multiplication of a multiplicand polynomial with an irreducible polynomial, and the partial product result cell (PPC), which generates the result of the bit-parallel multiplication of one coefficient of the multiplier polynomial with the VCG circuits. The presented multiplier performs high-speed parallel multiplication by connecting the PPCs with the VCGs. The basic cells of the VCG and the PPC each consist of one AND gate and one XOR gate. Extending this process, we show the design of generalized circuits for degree m and a simple example of constructing the multiplier circuit over the finite field GF($2^4$). The presented multiplier is also simulated with PSpice. Because it uses the VCGs and PPCs repeatedly, it is easy to extend to the multiplication of two polynomials over finite fields of very large degree m, and it is suitable for VLSI.
https://doi.org/10.6109/jkiice.2010.14.3.628
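Standard-basis multiplication over GF($2^m$) can be sketched in software as follows. The mapping to the paper's cells is an interpretation rather than the authors' circuit: the running `vector` plays the role of the VCG output (the multiplicand times successive powers of x, reduced by the irreducible polynomial), and the conditional XOR plays the role of a PPC that ANDs it with one multiplier coefficient. The function name is hypothetical. The example field uses x^4 + x + 1, which is irreducible over GF(2).

```python
def gf2m_multiply(a, b, poly, m):
    """Multiply a and b in GF(2^m); bit i of an operand is the x^i coefficient.

    poly includes the x^m term, e.g. 0b10011 for x^4 + x + 1.
    Both a and b must be smaller than 2**m.
    """
    vector = a        # a * x^0 mod p(x)
    product = 0
    for i in range(m):
        if (b >> i) & 1:          # AND with coefficient b_i, XOR-accumulate
            product ^= vector
        vector <<= 1              # multiply by x
        if (vector >> m) & 1:     # reduce when the degree reaches m
            vector ^= poly
    return product


if __name__ == "__main__":
    P4 = 0b10011  # x^4 + x + 1
    print(gf2m_multiply(0b0010, 0b1000, P4, 4))  # x * x^3 = x^4 = x + 1
```

In hardware all m partial products would be formed concurrently; the loop above only serializes what the parallel array computes at once.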
