Study of an In-order SMT Architecture and Grouping Schemes  

Moon, Byung-In (SP Division of System IC, Hynix Semiconductor Inc.)
Kim, Moon-Gyung (Department of Electrical & Electronic Engineering, Yonsei University)
Hong, In-Pyo (Department of Electrical & Electronic Engineering, Yonsei University)
Kim, Ki-Chang (School of Information & Communication Engineering, Inha University)
Lee, Yong-Surk (Department of Electrical & Electronic Engineering, Yonsei University)
Publication Information
International Journal of Control, Automation, and Systems / v.1, no.3, 2003, pp. 339-350
Abstract
In this paper, we propose a simultaneous multithreading (SMT) architecture that improves instruction throughput by exploiting both instruction-level parallelism (ILP) and thread-level parallelism (TLP). The proposed architecture issues and completes instructions belonging to the same thread in exact program order. This in-order issue and completion policy greatly reduces the design complexity and hardware cost of our architecture compared with designs that employ out-of-order issue and completion. Instructions belonging to different threads, however, need not be issued or completed in the order in which they were fetched. The processor issues instructions from multiple threads to the functional units simultaneously, by exploiting ILP and TLP and by sharing resources dynamically. This parallel execution notably improves performance and resource utilization at minimal additional hardware cost over conventional superscalar processors. We propose an SMT architecture with grouping as well as one without grouping. Without grouping, all threads dynamically and flexibly share most resources. With grouping, resources and threads are divided into several groups to simplify the design, and each resource is shared only among the threads that belong to the same group as that resource. Simulation results show that our processors with four and eight threads improve performance by a factor of three or more over a conventional superscalar processor with comparable execution resources and policies, and that reasonable grouping reduces the design complexity of SMT processors with little negative effect on performance.
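The issue policy described in the abstract can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not the authors' design: the function name `issue_cycle`, the per-instruction `ready` flag (standing in for all hazard checks), the round-robin thread selection, and the idea of counting functional units per group are all hypothetical choices made for illustration. It shows two properties: within a thread, an instruction never issues ahead of an older one, and with grouping, a thread can use only the units of its own group.

```python
from collections import deque


def issue_cycle(queues, units_per_group, group_of):
    """Select the instructions issued in one cycle (illustrative model only).

    queues          -- thread id -> deque of (name, ready) in program order
    units_per_group -- group id -> functional units available this cycle
    group_of        -- thread id -> group id (one group means no grouping)
    """
    free = dict(units_per_group)  # remaining units per group
    issued = []
    progress = True
    while progress:
        progress = False
        # Assumed policy: visit threads round-robin, one instruction per pass.
        for tid, queue in queues.items():
            group = group_of[tid]
            # Only the oldest instruction of a thread may issue (in-order),
            # and only on a free unit belonging to the thread's own group.
            if queue and queue[0][1] and free[group] > 0:
                name, _ = queue.popleft()
                free[group] -= 1
                issued.append((tid, name))
                progress = True
    return issued


def make_queues():
    # Thread 0 is stalled on its oldest instruction; thread 1 is fully ready.
    return {
        0: deque([("a0", False), ("a1", True)]),
        1: deque([("b0", True), ("b1", True)]),
    }


# No grouping: both threads share a single pool of two units.
flat = issue_cycle(make_queues(), {0: 2}, {0: 0, 1: 0})

# Grouping: each thread has its own group with one unit.
grouped = issue_cycle(make_queues(), {0: 1, 1: 1}, {0: 0, 1: 1})

print(flat)     # [(1, 'b0'), (1, 'b1')]
print(grouped)  # [(1, 'b0')]
```

In both runs `a1` is ready but does not issue, because it sits behind the stalled `a0` in the same thread. Without grouping, thread 1 fills both units; with grouping, the unit in thread 0's group stays idle. In this toy model, that idle unit is the kind of utilization cost that grouping trades for a simpler design.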
Keywords
Multithreading; SMT; ILP; TLP; in-order issue and completion; grouping