Parallel Branch Instruction Extension for Thumb-2 Instruction Set Architecture

  • Kim, Dae-Hwan (Dept. of Computer Information, Suwon Science College)
  • Received : 2013.03.27
  • Accepted : 2013.07.14
  • Published : 2013.07.31

Abstract

This paper proposes parallel branch instructions, which execute a branch instruction and a frequently used instruction simultaneously, to improve the performance of the Thumb-2 instruction set architecture. The proposed approach introduces new 32-bit parallel branch instructions, each of which combines a 16-bit branch instruction with one of the frequently used 16-bit LOAD, ADD, MOV, STORE, and SUB instructions. To make encoding space for the new instructions, the register field of less frequently executed existing instructions is narrowed, and the freed bits are used to encode the parallel branch instructions. Experiments show that the proposed approach improves performance by an average of 8.0% over the conventional approach without increasing code size.
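
The pairing idea can be illustrated with a toy model. The following Python sketch is not the paper's implementation: it works on bare mnemonics and assumes that an eligible 16-bit instruction immediately followed by a 16-bit branch (written `B`) can always be merged, ignoring operands, condition flags, and the actual Thumb-2 encodings. The names `FUSIBLE`, `fuse_parallel_branches`, `code_bits`, and the `PB.` prefix are hypothetical and chosen only for illustration.

```python
# Toy model only: mnemonics without operands, no real Thumb-2 encodings.
# The five instruction classes named in the abstract.
FUSIBLE = {"LOAD", "ADD", "MOV", "STORE", "SUB"}


def fuse_parallel_branches(trace):
    """Merge each fusible instruction that is immediately followed by a branch "B".

    Assumption: such a pair is always legal to combine; a real design would
    also have to check operand and encoding constraints.
    """
    fused = []
    i = 0
    while i < len(trace):
        op = trace[i]
        nxt = trace[i + 1] if i + 1 < len(trace) else None
        if op in FUSIBLE and nxt == "B":
            fused.append("PB." + op)  # hypothetical name for the 32-bit combined form
            i += 2
        else:
            fused.append(op)
            i += 1
    return fused


def code_bits(instrs):
    """Code size in bits: 32 for a combined instruction, 16 for anything else."""
    return sum(32 if op.startswith("PB.") else 16 for op in instrs)


trace = ["LOAD", "ADD", "B", "MOV", "CMP", "B", "SUB", "B"]
fused = fuse_parallel_branches(trace)
print(fused)                               # fewer instructions to issue
print(code_bits(trace), code_bits(fused))  # same number of bits
```

In this toy trace the eight 16-bit instructions shrink to six issued instructions while the total stays at 128 bits, which mirrors the claim that the gain comes without a code-size increase. The 8.0% figure reported in the abstract comes from the authors' experiments, not from a model like this one.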
