• Title/Summary/Keyword: Dynamic instruction scheduling

Search Result 5, Processing Time 0.023 seconds

Compiler Processor Trade-offs for Dynamic Scheduling of VLIW Instructions (VLIW명령어의 동적 스케줄링을 위한 컴파일러와 프로세서간 상호보완)

  • Sunghyun Jee
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.5_6
    • /
    • pp.279-287
    • /
    • 2004
  • This paper describes a processor architecture, named Dynamically Instruction Scheduled VLIW (DISVLIW). The DISVLIW Processor architecture is designed for dynamic scheduling VLIW instructions using dependency information. The DISVLIW instruction format is augmented to allow dependency bit vectors to be placed in the same VLIW word. The DISVLIW processor dynamically schedules each instruction in long instructions using functional unit and dynamic scheduler pairs. Features such as explicit parallelism, balanced scheduling effort, and dynamic scheduling of VLIW instructions can be used to provide a sound frustructure for supercomputing. We simulate the DISVLIW processor architecture and show that the DISVLIW processor performs significantly better than the VLIW processor for a wide range of cache sites and across numerical benchmark applications.

Performance Improvement Through Aggressive Instruction Packing (적극적인 명령어 압축을 통한 성능향상)

  • Ji, Seung-Hyeon;Kim, Seok-Il
    • The KIPS Transactions:PartA
    • /
    • v.9A no.2
    • /
    • pp.231-240
    • /
    • 2002
  • This paper proposes balancing scheduling effort more evenly between the compiler and the processor, by introducing independently scheduled VLIW instructions. Aggressively Packed VLIW (APVLIW) processor is aimed specifically at independent scheduling Very Long Instruction Word(VLIW) instructions with dependency information. The APVLIW processor independently schedules earth instruction within long instructions using functional unit and dynamic scheduler pairs. Every dynamic scheduler dynamically checks far data dependencies and resource collisions while scheduling each instruction. This scheduling is especially effective in applications containing loops. We simulate the architecture and show that the APVLIW processor performs significantly better than the VLIW processor for a wide range of cache sizes and across various numerical benchmark applications.

Fine Grain Real-Time Code Scheduling Using an Adaptive Genetic Algorithm (적합 유전자 알고리즘을 이용한 실시간 코드 스케쥴링)

  • Chung, Tai-Myoung
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.6
    • /
    • pp.1481-1494
    • /
    • 1997
  • In hard real-time systems, a timing fault may yield catastrophic results. Dynamic scheduling provides the flexibility to compensate for unexpected events at runtime; however, scheduling overhead at runtime is relatively large, constraining both the accuracy of the timing and the complexity of the scheduling analysis. In contrast, static scheduling need not have any runtime overhead. Thus, it has the potential to guarantee the precise time at which each instruction implementing a control action will execute. This paper presents a new approach to the problem of analyzing high-level language code, augmented by arbitrary before and after timing constraints, to provide a valid static schedule. Our technique is based on instruction-level complier code scheduling and timing analysis, and can ensure the timing of control operations to within a single instruction clock cycle. Because the search space for a valid static schedule is very large, a novel adaptive genetic search algorithm was developed.

  • PDF

Proposition and Evaluation of Parallelism-Independent Scheduling Algorithms for DAGs of Tasks with Non-Uniform Execution Time

  • Kirilka Nikolova;Atusi Maeda;Sowa, Masa-Hiro
    • Proceedings of the IEEK Conference
    • /
    • 2000.07a
    • /
    • pp.289-293
    • /
    • 2000
  • We propose two new algorithms for parallelism-independent scheduling. The machine code generated from the compiler using these algorithms in its scheduling phase is parallelism-independent code, executable in minimum time regardless of the number of the processors in the parallel computer. Our new algorithms have the following phases: finding the minimum number of processors on which the program can be executed in minimal time, scheduling by an heuristic algorithm for this predefined number of processors, and serialization of the parallel schedule according to the earliest start time of the tasks. At run time tasks are taken from the serialized schedule and assigned to the processor which allows the earliest start time of the task. The order of the tasks decided at compile time is not changed at run time regardless of the number of the available processors which means there is no out-of-order issue and execution. The scheduling is done predominantly at compile time and dynamic scheduling is minimized and diminished to allocation of the tasks to the processors. We evaluate the proposed algorithms by comparing them in terms of schedule length to the CP/MISF algorithm. For performance evaluation we use both randomly generated DAGs (directed acyclic graphs) and DACs representing real applications. From practical point of view, the algorithms we propose can be successfully used for scheduling programs for in-order superscalar processors and shared memory multiprocessor systems. Superscalar processors with any number of functional units can execute the parallelism-independent code in minimum time without necessity for dynamic scheduling and out-of-order issue hardware. This means that the use of our algorithms will lead to reducing the complexity of the hardware of the processors and the run-time overhead related to the dynamic scheduling.

  • PDF

GCC based Compiler Construction for Compact DSP32

  • Cho, Myeong-Jin;Lee, Ho-Kyoon;Huong, Giang Nguyen Thi;Kim, Seon-Wook;Han, Young-Sun;Um, Jung-Young
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2011.04a
    • /
    • pp.43-45
    • /
    • 2011
  • Very Long Instruction Word (VLIW) executes multiple instructions in parallel. In order to exploit higher performance, i.e., higher parallelism, VLIW compiler groups as many instructions into one word as possible. In this paper, we show how to construct a VLIW C compiler based on GCC for CDSP32 (Compact Digital Signal Processor 32-bit) which is an embedded DSP processor to issue two instructions in one VLIW. Also, we evaluated the compiler on EEMBC benchmark; the experiment result showed that the total number of dynamic instructions of the VLIW compiler was reduced by 18% on average over without VLIW instruction scheduling.