http://dx.doi.org/10.6109/jkiice.2013.17.8.1891

Hardware Design of Pipelined Special Function Arithmetic Unit for Mobile Graphics Application  

Choi, Byeong-Yoon (Department of Computer Engineering, Dongeui University)
Abstract
To efficiently execute 3D graphics APIs such as OpenGL and Direct3D, a special function arithmetic unit (SFU) is designed that supports floating-point sine, cosine, reciprocal, inverse square root, base-two exponential, and logarithmic operations. The SFU combines a second-order minimax approximation with table lookup to achieve both an error of less than 2 ulp (units in the last place) and high-speed operation. The designed circuit has a delay of about 2.3 ns in a 65 nm CMOS standard cell library and consists of about 23,300 gates. With a maximum performance of 400 MFLOPS and high accuracy, it is well suited to mobile 3D graphics applications.
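As a rough illustration of the table-plus-quadratic idea, the following Python sketch approximates the reciprocal with one quadratic per table segment. It is not the circuit described in the paper: the table size of 64 segments, the function names, and the use of Chebyshev-node interpolation (a near-minimax stand-in for a true minimax fit) are all assumptions made for this example, and it runs in double-precision software rather than fixed-width hardware.

```python
import math

SEGMENTS = 64  # hypothetical table size; not taken from the paper


def build_table(f, lo, hi, segments=SEGMENTS):
    """Return per-segment quadratic coefficients (c0, c1, c2) for f on [lo, hi).

    Each quadratic interpolates f at the three Chebyshev nodes of its segment.
    This is a near-minimax stand-in, not a true minimax (Remez) fit.
    """
    width = (hi - lo) / segments
    table = []
    for i in range(segments):
        a = lo + i * width
        # Chebyshev nodes expressed as local offsets t in [0, width]
        ts = [width / 2 * (1 + math.cos((2 * k + 1) * math.pi / 6))
              for k in range(3)]
        ys = [f(a + t) for t in ts]
        # Newton divided differences
        d01 = (ys[1] - ys[0]) / (ts[1] - ts[0])
        d12 = (ys[2] - ys[1]) / (ts[2] - ts[1])
        d012 = (d12 - d01) / (ts[2] - ts[0])
        # Expand the Newton form into c0 + c1*t + c2*t**2
        c2 = d012
        c1 = d01 - d012 * (ts[0] + ts[1])
        c0 = ys[0] - d01 * ts[0] + d012 * ts[0] * ts[1]
        table.append((c0, c1, c2))
    return table, width


def approx_reciprocal(x, table, width):
    """Approximate 1/x for x > 0 using the segment table built on [1, 2)."""
    m, e = math.frexp(x)      # x = m * 2**e with 0.5 <= m < 1
    m, e = m * 2.0, e - 1     # renormalize so that 1 <= m < 2
    i = min(int((m - 1.0) / width), len(table) - 1)  # table index (upper bits)
    t = (m - 1.0) - i * width                        # offset (lower bits)
    c0, c1, c2 = table[i]
    r = c0 + t * (c1 + t * c2)                       # Horner evaluation
    return math.ldexp(r, -e)                         # 1/x = (1/m) * 2**(-e)


table, width = build_table(lambda v: 1.0 / v, 1.0, 2.0)
print(approx_reciprocal(3.0, table, width))
```

The split of the mantissa into a table index and a small offset mirrors how a hardware interpolator would use the upper mantissa bits to select coefficients and the lower bits as the polynomial argument; a larger table trades area for accuracy.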
Keywords
minimax algorithm; floating-point; arithmetic unit; 3-dimensional graphics SoC; multithreading