• Title/Summary/Keyword: ILP

An Analysis of Power Dissipation of Value Prediction in Superscalar Processors (슈퍼스칼라 프로세서에서의 값 예측의 전력 소모 측정 및 분석)

  • 이명근;이상정
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.688-690 / 2002
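  • High-performance superscalar processors use value predictors to execute instructions speculatively in order to overcome data dependences, one of the inter-instruction dependences that limit instruction-level parallelism (ILP). The table lookups needed for value prediction, and the wrong instructions executed after a value misprediction, cost the processor additional power. In this paper, a value predictor and the Cai-Lim power model are inserted into the SimpleScalar 3.0 toolset, a cycle-level superscalar processor simulator, to measure and analyze power consumption.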

Optimal Design for Heterogeneous Adder Organization Using Integer Linear Programming (정수 선형 프로그래밍을 이용한 혼합 가산기 구조의 최적 설계)

  • Lee, Deok-Young;Lee, Jeong-Gun;Lee, Jeong-A;Rhee, Sang-Min
    • Journal of KIISE:Computer Systems and Theory / v.34 no.8 / pp.327-336 / 2007
  • Considerable effort has gone into design optimization for cost-effective system design at levels ranging from transistors to RTL. In this paper, we propose a bit-level optimization of adder design that expands its design space. For the bit-level optimization, we propose a heterogeneous adder organization that mixes carry propagation schemes to produce delay-area efficient adders that are not available in the ordinary design space. We then develop an optimization method based on Integer Linear Programming to search the expanded design space of the heterogeneous adder. The novelty of the proposed architecture and optimization method lies in the bit-level reconstruction and recombination of IPs that have the same functionality but different speed and area characteristics, which yields a more fine-grained delay-area optimization.

Storage Assignment for Variables Considering Efficient Memory Access in Embedded System Design (임베디드 시스템 설계에서 효율적인 메모리 접근을 고려한 변수 저장 방법)

  • Choi Yoonseo;Kim Taewhan
    • Journal of KIISE:Computer Systems and Theory / v.32 no.2 / pp.85-94 / 2005
  • Many design experiences have reported and verified that judicious use of the page and burst access modes supported by DRAMs greatly reduces not only DRAM access latency but also DRAM energy consumption. Researchers have recently shown that a careful arrangement of data variables in memory leads directly to maximum utilization of the page and burst access modes for variable accesses. Unfortunately, they also found the underlying problems to be intractable and therefore resorted to simple (e.g., greedy) heuristics. In this paper, to improve the quality of existing solutions, we propose 0-1 ILP-based techniques that produce optimal or near-optimal solutions depending on the formulation parameters. The proposed techniques use on average 32.2%, 15.1%, and 3.5% more page accesses, and 84.0%, 113.5%, and 10.1% more burst accesses, than OFU (the order of first use), the technique in [1, 2], and the technique in [3], respectively.

Accurate Prediction of Polymorphic Indirect Branch Target (간접 분기의 타형태 타겟 주소의 정확한 예측)

  • 백경호;김은성
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.6 / pp.1-11 / 2004
  • Modern processors achieve high performance by exploiting the available Instruction Level Parallelism (ILP) through speculative techniques such as branch prediction. Traditionally, branch direction is predicted with very high accuracy by a 2-level predictor, and the branch target address is predicted by a Branch Target Buffer (BTB). Every branch other than an indirect branch has a unique target, so the BTB predicts it very accurately. An indirect branch, however, has dynamically polymorphic targets, which makes its target very difficult to predict. In general, branch direction prediction techniques are applied to indirect branch target prediction, and they achieve much better accuracy than a traditional BTB. We present a new indirect branch target prediction scheme that combines an indirect branch instruction with its data-dependent register, taken from an instruction executed earlier than the branch. Simulation of SPEC benchmarks on the SimpleScalar simulator shows that the proposed predictor achieves higher prediction accuracy than any other existing scheme.

Efficient Internet Traffic Engineering based on Shortest Path Routing (최단경로 라우팅을 이용한 효율적인 인터넷 트래픽 엔지니어링)

  • 이영석
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.2B / pp.183-191 / 2004
  • Single shortest path routing is known to perform poorly for Internet traffic engineering (TE), where the typical optimization objective is to minimize the maximum link load. Splitting traffic uniformly over equal-cost multiple shortest paths in OSPF and IS-IS does not always minimize the maximum link load when the multiple paths are not carefully selected for the global traffic demand matrix. However, among all the equal-cost multiple shortest paths in the network, a set of TE-aware shortest paths that significantly reduces the maximum link load can be found and used by IP routers without any change to existing routing protocols and without serious configuration overhead. When calculating TE-aware shortest paths, the destination-based forwarding constraint at each node must be satisfied, because an IP router forwards a packet to the next hop toward the destination by looking up the destination prefix. In this paper, we formulate the problem of finding a set of TE-aware shortest paths as an ILP and propose a simple heuristic for the problem. Simulation results show that TE-aware shortest path routing performs better than default shortest path routing and ECMP in terms of the maximum link load, with only the marginal configuration overhead of changing the next hops.
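
To make the comparison in this abstract concrete, here is a minimal sketch that computes the maximum link load under three policies on a toy network: default single shortest path, uniform ECMP splitting, and an exhaustively chosen TE-aware next-hop assignment. It is not the paper's ILP formulation or its heuristic; brute-force enumeration stands in for both. The topology, the demand values, unit link costs, and all function names are hypothetical, and the sketch assumes one next hop per (node, destination) pair to model destination-based forwarding.

```python
from collections import deque
from itertools import product


def hop_distances(graph, dest):
    """BFS hop counts from every node to dest (bidirectional links, unit cost)."""
    dist = {dest: 0}
    queue = deque([dest])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def shortest_next_hops(graph, dest):
    """For each node, the neighbours that lie on some shortest path to dest."""
    dist = hop_distances(graph, dest)
    return {u: sorted(v for v in graph[u] if dist[v] == dist[u] - 1)
            for u in graph if u != dest}


def max_link_load(demands, split):
    """split[(node, dest)] maps each next hop to the fraction of traffic it gets."""
    load = {}

    def push(u, dest, amount):
        if u == dest:
            return
        for v, frac in split[(u, dest)].items():
            load[(u, v)] = load.get((u, v), 0.0) + amount * frac
            push(v, dest, amount * frac)

    for (src, dest), volume in demands.items():
        push(src, dest, volume)
    return max(load.values())


def compare(graph, demands):
    """Return (default, ECMP, TE-aware) maximum link loads."""
    dests = {d for _, d in demands}
    hops = {(u, d): nh for d in dests
            for u, nh in shortest_next_hops(graph, d).items()}
    # Default: always take the first (lowest-named) shortest-path next hop.
    default = {k: {nh[0]: 1.0} for k, nh in hops.items()}
    # ECMP: split uniformly over all shortest-path next hops.
    ecmp = {k: {v: 1.0 / len(nh) for v in nh} for k, nh in hops.items()}
    # TE-aware (brute force): one next hop per (node, dest), best max load.
    keys = list(hops)
    te_aware = min(
        max_link_load(demands, {k: {v: 1.0} for k, v in zip(keys, choice)})
        for choice in product(*(hops[k] for k in keys))
    )
    return max_link_load(demands, default), max_link_load(demands, ecmp), te_aware


# Hypothetical topology: S1 reaches T via M1 or M2, S2 only via M1.
graph = {
    "S1": ["M1", "M2"],
    "S2": ["M1"],
    "M1": ["S1", "S2", "T"],
    "M2": ["S1", "T"],
    "T": ["M1", "M2"],
}
demands = {("S1", "T"): 10, ("S2", "T"): 10}
result = compare(graph, demands)
print(result)  # (20.0, 15.0, 10.0)
```

In this toy case ECMP still overloads the link M1→T, because S2 has no alternative path, while steering all of S1's traffic through M2 balances the two links. Exhaustive enumeration is only feasible for very small networks, which is consistent with the abstract's use of an ILP formulation and a heuristic.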

A Vectorization Technique at Object Code Level (목적 코드 레벨에서의 벡터화 기법)

  • Lee, Dong-Ho;Kim, Ki-Chang
    • The Transactions of the Korea Information Processing Society / v.5 no.5 / pp.1172-1184 / 1998
  • ILP (Instruction Level Parallelism) processors use code reordering algorithms to expose parallelism in a given sequential program. When applied to a loop, such an algorithm produces a software-pipelined loop. In a software-pipelined loop, each iteration contains a sequence of parallel instructions composed of data-independent instructions collected from several iterations. For vector loops, however, software pipelining cannot expose the maximum parallelism because it schedules the program based only on data dependences. This paper proposes a different scheduling approach for vector loops. We develop an algorithm that detects vector loops at the object code level and suggest a new vector scheduling algorithm for them. Our vector scheduling improves performance because it schedules based not only on data dependences but also on loop structure and iteration conditions at the object code level. We compare the performance of the resulting schedules with that of schedules produced by software pipelining.

Design of a Ship Backbone Network for Effective Performance and Construct Cost (효율적인 네트워크의 구축 비용 및 성능을 고려한 선박 백본 네트워크의 설계기법)

  • Kim, Hye-Jin;Tak, Sung-Woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.10a / pp.479-482 / 2011
  • This paper proposes a design for a ship backbone network based on the survivability and efficiency of the ship network. The IEC currently maintains the standard for ship networks, and the standard specification IEC 61162-410 governs network operation. IEC 61162-410 offers high stability of the ship network by using terminal equipment. However, current studies are incomplete because they assume that the ship's network will operate at double its current capacity. This paper analyzes the organization of the double ship backbone topology and then summarizes the minimum cost required to implement the ship backbone topology using an ILP. We also present effective traffic assignment techniques for the underlying ship backbone network, based on an ILP, a metaheuristic, and a heuristic algorithm. Experiments with the network design confirmed greater efficiency, stability, and cost-effectiveness of the ship network.

Glitch Reduction Through Path Balancing for Low-Power CMOS Digital Circuits (저전력 CMOS 디지털 회로 설계에서 경로 균등화에 의한 글리치 감소기법)

  • Yang, Jae-Seok;Kim, Seong-Jae;Kim, Ju-Ho;Hwang, Seon-Yeong
    • Journal of KIISE:Computer Systems and Theory / v.26 no.10 / pp.1275-1283 / 1999
  • This paper presents an efficient algorithm for reducing glitches in CMOS digital circuits. Signal transitions are the main cause of power consumption in such circuits, and glitches are the unnecessary transitions among them that do not directly affect circuit operation. The proposed algorithm reduces glitches by achieving path balancing through gate sizing and buffer insertion, without increasing circuit delay. Gate sizing is applied first; it reduces glitches and, by optimizing gate sizes, also reduces the overall capacitance of the circuit. Where gate sizing alone cannot balance the paths, buffers are inserted. Only buffers whose power savings from glitch reduction exceed the additional power they consume are selected for insertion. Because the power reduction from one buffer can vary greatly with the insertion state of other buffers, an ILP (Integer Linear Program) is used to determine the buffer locations so that power reduction is maximized with few inserted buffers. Tested on the LGSynth91 benchmark circuits, the proposed algorithm achieves an average power reduction of 30.4% without increasing circuit delay.

A Hybrid Value Predictor using Speculative Update in Superscalar Processors (슈퍼스칼라 프로세서에서 모험적 갱신을 사용한 하이브리드 결과값 예측기)

  • Park, Hong-Jun;Sin, Yeong-Ho;Jo, Yeong-Il
    • Journal of KIISE:Computer Systems and Theory / v.28 no.11 / pp.592-600 / 2001
  • To improve the performance of wide-issue superscalar microprocessors, it is essential to increase the instruction fetch width and issue rate. Data dependences are a major hurdle to exploiting ILP (Instruction-Level Parallelism) efficiently, so several related works have suggested that the limits imposed by data dependences can be overcome to some extent by data value prediction. However, the suggested mechanisms may access the same value prediction table entry again before it has been updated with the real data value. Using this stale data causes incorrect value predictions, which incur misprediction penalties and lower performance. In this paper, we propose a new hybrid value predictor that achieves high performance by reducing stale data. Because the proposed hybrid value predictor can update the prediction table speculatively, it efficiently reduces the number of instructions mispredicted due to stale data. For SPECint95 benchmark programs on a 16-issue superscalar processor, simulation results show that the average prediction accuracy increases from 59% with non-speculative update to 72% with speculative update.

Speculative Updates of a Stride Value Predictor in Wide-Issue Processors (와이드 이슈 프로세서를 위한 스트라이드 값 예측기의 모험적 갱신)

  • Jeon, Byeong-Chan;Lee, Sang-Jeong
    • Journal of KIISE:Computer Systems and Theory / v.28 no.11 / pp.601-612 / 2001
  • In superscalar processors, value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction in order to exploit instruction-level parallelism (ILP). A value predictor looks up the prediction table for the predicted value of an instruction in the instruction fetch stage and, after the instruction executes, updates the table with the prediction result and the resolved value for the next prediction. However, as instruction fetch and issue rates increase, the same instruction is likely to be fetched again before its entry in the predictor has been updated. The predictor then looks up a stale value in the table, which in most cases causes an incorrect value prediction. This paper proposes a stride value predictor capable of speculative updates, which updates the prediction table speculatively without waiting for the instruction to complete. The performance of the scheme is examined on SPECint95 benchmarks using the SimpleScalar simulator with our value predictor added.
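
The stale-value problem described in this abstract can be illustrated with a small toy model. The sketch below is not the authors' predictor or a SimpleScalar component. It assumes a single table entry per instruction address, a hypothetical `width` parameter giving how many instances of the same instruction are fetched before any of them resolves, and a simplified recovery rule: the speculative state is resynchronized once no predictions are in flight. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    last: int = 0       # last committed (resolved) value
    stride: int = 0     # difference between the two most recent committed values
    spec_last: int = 0  # last value as advanced by in-flight predictions
    inflight: int = 0   # predictions made but not yet resolved


class StridePredictor:
    def __init__(self, speculative):
        self.speculative = speculative
        self.table = {}

    def predict(self, pc):
        e = self.table.setdefault(pc, Entry())
        # Without speculative update, every in-flight instance sees the same
        # (possibly stale) committed value.
        base = e.spec_last if self.speculative else e.last
        pred = base + e.stride
        if self.speculative:
            e.spec_last = pred  # update the table entry at prediction time
        e.inflight += 1
        return pred

    def resolve(self, pc, actual):
        e = self.table[pc]
        e.stride = actual - e.last
        e.last = actual
        e.inflight -= 1
        if e.inflight == 0:
            e.spec_last = e.last  # resynchronize speculative state


def accuracy(speculative, values, width):
    """Fetch `width` instances of one instruction before resolving any of them."""
    predictor = StridePredictor(speculative)
    pc = 0x400100  # hypothetical instruction address
    correct = 0
    for i in range(0, len(values), width):
        group = values[i:i + width]
        preds = [predictor.predict(pc) for _ in group]
        for pred, actual in zip(preds, group):
            correct += pred == actual
            predictor.resolve(pc, actual)
    return correct / len(values)


values = [4 * i for i in range(8)]  # a loop counter advancing by 4
print(accuracy(False, values, width=2))  # 0.5
print(accuracy(True, values, width=2))   # 0.875
```

In this toy run, the second instance in each fetch group always reads a stale value when updates wait for resolution, so half the predictions miss. With speculative update, only the one prediction made before the stride is learned misses. The numbers depend entirely on the assumed value sequence and fetch width, and they are not comparable to the paper's measured results.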
