• 제목/요약/키워드: Pipeline Structure

검색결과 273건 처리시간 0.024초

소면적 32-bit 2/3단 파이프라인 프로세서 설계 (Low-Gate-Count 32-Bit 2/3-Stage Pipelined Processor Design)

  • 이광민;박성경
    • 전자공학회논문지
    • /
    • 제53권4호
    • /
    • pp.59-67
    • /
    • 2016
  • 각종 계량기, 웨어러블 디바이스 등의 사물에 통신기능을 내장하여 인터넷에 연결하는 사물인터넷 (Internet of Things or IoT) 기술의 발전과 함께, 이에 사용 가능한 소면적 임베디드 프로세서에 대한 수요가 증가하고 있다. 본 논문에서는 이러한 사물인터넷 분야에 사용 가능한 소면적 32-bit 파이프라인 프로세서인 Juno를 소개한다. Juno는 즉치 값 확장이 편리한 EISC (extendable instruction set computer) 구조이며, 파이프라인의 데이터 의존성을 줄이기 위해 2/3단 파이프라인 구조를 택하였다. PC (program counter) 레지스터와 두 개의 파이프라인 레지스터만을 컨트롤함으로써 전체 파이프라인을 컨트롤할 수 있는 간단한 구조의 소면적 파이프라인 컨트롤러를 갖는다. 무선 통신에 필요한 암호화 등의 연산을 수행하기 위한 $32{\times}32=64$ 곱셈 연산, 64/32=32 나눗셈 연산, $32{\times}32+64=64$ MAC 연산, 32*32=64 Galois 필드 곱셈 연산을 모두 지원하지만, 모든 연산기를 선택적으로 구현하여 필요에 따라서는 면적을 줄이기 위해 일부 연산기를 제외하고도 프로세서를 재합성할 수 있다. 이 경우 정수 코어의 gate count는 12k~22k 수준이고, 0.57 DMIPS/MHz와 1.024 Coremark/MHz의 성능을 보인다.

The effect of nanoparticle in reduction of critical fluid velocity in pipes conveying fluid

  • Ghaitani, M.M.;Majidian, A.;Shokri, V.
    • Advances in concrete construction
    • /
    • 제9권1호
    • /
    • pp.103-113
    • /
    • 2020
  • This paper deal with the critical fluid velocity response of nanocomposite pipe conveying fluid based on numerical method. The pressure of fluid is obtained based on perturbation method. The motion equations are derived based on classical shell theory, energy method and Hamilton's principle. The shell is reinforced by nanoparticles and the distribution of them are functionally graded (FG). The mixture rule is applied for obtaining the equivalent material properties of the structure. Differential quadrature method (DQM) is utilized for solution of the motion equations in order to obtain the critical fluid velocity. The effects of different parameters such asCNT nanoparticles volume percent, boundary conditions, thickness to radius ratios, length to radius ratios and internal fluid are presented on the critical fluid velocity response structure. The results show that with increasing the CNT nanoparticles, the critical fluid velocity is increased. In addition, FGX distribution of nanoparticles is the best choice for reinforcement.

3중 DES와 DES 암호 알고리즘용 암호 프로세서와 VLSI 설계 (VLSI Design of Cryptographic Processor for Triple DES and DES Encryption Algorithm)

  • 정진욱;최병윤
    • 한국멀티미디어학회:학술대회논문집
    • /
    • 한국멀티미디어학회 2000년도 춘계학술발표논문집
    • /
    • pp.117-120
    • /
    • 2000
  • This paper describe VLSL design of crytographic processor which can execute triple DES and DES encryption algorithm. To satisfy flexible architecture and area-efficient structure, the processor has 1 unrolled loop structure without pipeline and can support four standard mode, such as ECB, CBC, CFB, and OFB modes. To reduce overhead of key computation , the key precomputation technique is used. Also to eliminate increase of processing time due to data input and output time, background I/O techniques is used which data input and output operation execute in parallel with encryption operation of cryptographic processor. The cryptographic processor is implemented using Altera EPF10K40RC208-4 devices and has peak performance of about 75 Mbps under 20 Mhz ECB DES mode and 25 Mbps uder 20 Mhz triple DES mode.

  • PDF

내용을 고려한 무방향 네트워크의 신뢰도 계산 (Reliability Evaluation of a Capacitated Two-Terminal Network)

  • 최명호;윤덕균
    • 산업경영시스템학회지
    • /
    • 제12권20호
    • /
    • pp.47-53
    • /
    • 1989
  • This paper presents an algorithm CAPFACT to evaluate the reliability of a capacitated two terminal network such as a communication network, a power distribution network, and a pipeline network. The network is good(working) if and only if it is possible to transmit successfully the required system capacity from one specified terminal to the other. This paper defines new Capacitated series-parallel reduction to be applied to a series-parallel structure of the network. New Capacitated factoring method is applied to a non-series-parallel structure. The method is based on the factoring theorem given by Agrawal and Barlow. According to the existing studies on the reliability evaluation of the network that the capacity is not considered, the factoring method using reduction is efficient. The CAPFACT is more efficient than Aggarwal algorithm which enumerated and combined the paths. The efficiency is proved by the result of testing the number of operations and cpu time on FORTRAN compiler of VAX-11/780 at Hanyang University.

  • PDF

고속 행렬 전치를 위한 효율적인 VLSI 구조 (An efficient VLSI architecture for high speed matrix transpositio)

  • 김견수;장순화;김재호;손경식
    • 한국통신학회논문지
    • /
    • 제21권12호
    • /
    • pp.3256-3264
    • /
    • 1996
  • This paper presents an efficient VLSI architecture for transposing matris in high speed. In the case of transposing N*N matrix, N$^{2}$ numbers of transposition cells are configured as regular and spuare shaped structure, and pipeline structure for operating each transposition cell in paralle. Transposition cell consists of register and input data selector. The characteristic of this architecture is that the data to be transposed are divided into several bundles of bits, then processed serially. Using the serial transposition of divided input data, hardware complexity of transpositioncell can be reduced, and routing between adjacent transposition cells can be simple. the proposed architecture is designed and implemented with 0.5 .mu.m VLSI library. As a result, it shows stable operation in 200 MHz and less hardware complexity than conventional architectures.

  • PDF

Systolic array 구조를 갖는 움직임 추정기 설계 (Design of a motion estimator with systolic array structure)

  • 정대호;최석준;김환영
    • 전자공학회논문지C
    • /
    • 제34C권10호
    • /
    • pp.36-42
    • /
    • 1997
  • In the whole world, the research about the VLSI implementation of motion estimation algorithm is progressed to actively full (brute force) search algorithm research with the development of systolic array possible to parallel and pipeline processing. But, because of processing time's limit in a field to handle a huge data quantily such as a high definition television, many problems are happened to full search algorithm. In the paper, as a fast processing to using parallel scheme for the serial input image data, motion estimator of systolic array structure verifying that processing time is improved in contrast to the conventional full search algorithm.

  • PDF

Implementation of Acoustic Echo Canceller with FPGA

  • Lim, Un-Cheon;Moon, Dai-Tchul
    • The Journal of the Acoustical Society of Korea
    • /
    • 제23권3E호
    • /
    • pp.79-84
    • /
    • 2004
  • In this paper, the AEC(acoustic echo canceller) is designed and implemented using VHDL(VHSIC hardware description language). The designed Echo Canceller employs the pipeline and the master-slave structure, and is realized with FPGA. As an adaptive algorithm, the Normalized LMS algorithm is used. For the coefficient adjustment, the Stochastic Iteration Algorithm(SIA) which uses only current residual values is used and the number of registers are evidently reduced and convergence speed is also much improved comparing to existing methods by using EAB of FPGA for FIR filter structure of transceiver. The designed Echo Canceller is verified with the test board implemented for this paper. From the timing simulation echo signals at about 1500 sampling data are converged and ERLE is improved by about 42-dB.

시프트 버퍼를 이용한 고속 가변길이 디코더 구현 (An Implementation on the High Speed VLD using Shift Buffer)

  • 노진수;백창희;이강현
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2006년도 하계종합학술대회
    • /
    • pp.759-760
    • /
    • 2006
  • In this paper, The author designed on high speed VLD(Variable Length Decoder) using shift buffer. Variable Length Decoder is received N bit data from input block and decode the input signal using Shifting Buffer, Length Decoder and Symbol Decoder blocks. The inner part of shifting buffer in proposed Variable Length Decoder is filled input data and then operating therefore, the proposed structure can improve the decoded speed. And in this paper we applying pipeline structure therefore data is decoded in every clock.

  • PDF

Hierarchical Multiplexing Interconnection Structure for Fault-Tolerant Reconfigurable Chip Multiprocessor

  • Kim, Yoon-Jin
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제11권4호
    • /
    • pp.318-328
    • /
    • 2011
  • Stage-level reconfigurable chip multiprocessor (CMP) aims to achieve highly reliable and fault tolerant computing by using interwoven pipeline stages and on-chip interconnect for communicating with each other. The existing crossbar-switch based stage-level reconfigurable CMPs offer high reliability at the cost of significant area/power overheads. These overheads make realizing large CMPs prohibitive due to the area and power consumed by heavy interconnection networks. On other hand, area/power-efficient architectures offer less reliability and inefficient stage-level resource utilization. In this paper, I propose a hierarchical multiplexing interconnection structure in lieu of crossbar interconnect to design area/power-efficient stage-level reconfigurable CMP. The proposed approach is able to keep the reliability offered by the crossbar-switch while reducing the area and power overheads. Experimental results show that the proposed approach reduces area by up to 21% and power by up to 32% when compared with the crossbar-switch based interconnection network.

멀티코어 GP-GPU를 이용한 지오메트리 처리 (Geometry Processing using Multi-Core GP-GPU)

  • 이광엽;김치용
    • 전기전자학회논문지
    • /
    • 제14권2호
    • /
    • pp.69-75
    • /
    • 2010
  • 3D 그래픽 처리 과정은 크게 지오메트리 단계와 렌더링 단계로 구분된다. 본 논문에서는 듀얼페이즈 멀티코어 GP-GPU에서 지오메트리 처리를 가속화시키기 위한 방법을 제안한다. GP-GPU의 SIMD, 듀얼페이즈 구조를 이용한 병렬적 데이터 처리와 메모리 프리패치를 이용하여, 지오메트리 처리를 가속화 시킬 수 있었으며, 모든 기능을 사용할 시 19%의 성능 향상을 나타내었다.