The Compressed Instruction Set Architecture for the OpenRISC Processor

  • Kim, Dae-Hwan (Dept. of Computer Information, Suwon Science College)
  • Received: 2012.08.10
  • Reviewed: 2012.09.14
  • Published: 2012.10.31

Abstract

This paper proposes a new compressed instruction set architecture that reduces code size for the OpenRISC processor. The new instructions and their formats are derived from profiling information on how frequently, and in what way, the existing instructions are used. The technique introduces new 16-bit instructions that replace existing 32-bit instructions, and new 32-bit instructions that replace sequences of consecutive instructions. The proposed instructions fall into three types. The first type consists of new 16-bit instructions that replace frequently used 32-bit instructions such as add, load, store, branch, and jump. The second type consists of new 32-bit instructions that compress frequently occurring pairs of consecutive load instructions, pairs of consecutive store instructions, and 32-bit data move instructions. Finally, two new 32-bit instructions compress the function prolog code and the function epilog code into one instruction each. The OpenRISC hardware decoder is extended to decode the added instructions. Experiments on the OpenRISC 1200 (OR1200) processor show that code size is reduced by an average of 30.4% relative to the original OR1200 instruction set architecture, with no loss of execution performance.
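The profile-driven idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's tool or encoding: the function names, the choice of which mnemonics receive a 16-bit form, and the sample listing are all assumptions made for illustration. The sketch also ignores operands, so it treats every occurrence of a selected mnemonic as compressible, which a real encoding with limited register and immediate fields would not allow.

```python
from collections import Counter

WORD_BYTES = 4  # size of a standard 32-bit OpenRISC instruction
HALF_BYTES = 2  # size of a hypothetical 16-bit compressed instruction


def profile(mnemonics):
    """Count how often each mnemonic occurs in a program listing."""
    return Counter(mnemonics)


def estimate_size(mnemonics, short_forms, fusible_pairs):
    """Estimate code size in bytes after a greedy left-to-right pass.

    short_forms:   mnemonics assumed to have a 16-bit replacement.
    fusible_pairs: (first, second) mnemonic pairs assumed to be
                   replaceable by a single 32-bit instruction.
    """
    size = 0
    i = 0
    while i < len(mnemonics):
        cur = mnemonics[i]
        nxt = mnemonics[i + 1] if i + 1 < len(mnemonics) else None
        if (cur, nxt) in fusible_pairs:
            size += WORD_BYTES  # two 32-bit instructions become one
            i += 2
        elif cur in short_forms:
            size += HALF_BYTES  # one 32-bit instruction becomes 16-bit
            i += 1
        else:
            size += WORD_BYTES  # left uncompressed
            i += 1
    return size


# Hypothetical listing; only the mnemonics matter in this sketch.
program = ["l.addi", "l.sw", "l.sw", "l.lwz", "l.lwz", "l.add", "l.mul", "l.jr"]

# Assumed selections, standing in for what a real profile would suggest.
short_forms = {"l.addi", "l.add", "l.jr"}
fusible_pairs = {("l.lwz", "l.lwz"), ("l.sw", "l.sw")}

counts = profile(program)
original = WORD_BYTES * len(program)
compressed = estimate_size(program, short_forms, fusible_pairs)
print(counts.most_common(2))
print(original, compressed)
```

For this toy listing the estimate falls from 32 to 18 bytes. The figure only demonstrates the bookkeeping and says nothing about the 30.4% average reported in the abstract, which comes from the paper's own experiments.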
