http://dx.doi.org/10.9708/jksci.2013.18.12.001

Low Power TLB Supporting Multiple Page Sizes without Operation System  

Jung, Bo-Sung (ERI, Dept. of Control & Instrumentation, Gyeongsang National University)
Lee, Jung-Hoon (ERI, Dept. of Control & Instrumentation, Gyeongsang National University)
Abstract
Although TLBs that support multiple page sizes are effective in improving performance, conventional methods that rely on OS support cannot exploit multiple page sizes in user applications. We therefore propose a new multiple-TLB structure that supports multiple page sizes, for high performance and low power consumption, without any operating system support. The proposed TLB consists of two parts: an S-TLB (Small TLB) with a small page size and an L-TLB (Large TLB) with a large page size. Both are designed as fully associative bank structures. The S-TLB stores small pages evicted from the L-TLB, and the L-TLB stores large pages that include the small page generated by the CPU. Only one bank module of the S-TLB and one bank module of the L-TLB are accessed at a time, selected by one and two particular bits, respectively, of the virtual address generated by the CPU. Energy savings are achieved by reducing the number of entries accessed at a time. This paper also proposes a simple 1-bit LRU policy to improve performance. The proposed policy marks recently referenced blocks using one additional bit in each TLB entry, which makes it simple to select a least recently used page from the L-TLB. According to the simulation results, the proposed TLB reduces the energy-delay product (Energy × Delay) by about 76%, 57%, and 6% compared with a fully associative TLB, an ARM TLB, and a Dual TLB, respectively.
Keywords
TLB; Low power; Memory management; Bank structure; Multiple page;
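
The bank selection and 1-bit LRU replacement described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class names, the page shifts (4 KB and 64 KB pages), the entries per bank, the use of the low virtual-page-number bits as the bank index, and the rule for clearing the reference bits are all assumptions made for illustration. The transfer of evicted pages from the L-TLB to the S-TLB is not modelled.

```python
class Bank:
    """One fully associative bank; each entry carries a 1-bit reference flag."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = []  # each entry is [vpn, ref_bit]

    def lookup(self, vpn):
        """Return True on a hit and mark the entry as recently referenced."""
        for entry in self.entries:
            if entry[0] == vpn:
                self._touch(entry)
                return True
        return False

    def insert(self, vpn):
        """Insert vpn; return the evicted vpn, or None if a slot was free."""
        evicted = None
        if len(self.entries) == self.num_entries:
            # Victim: first entry whose reference bit is clear
            # (falls back to the first entry in a one-entry bank).
            victim = next((e for e in self.entries if e[1] == 0),
                          self.entries[0])
            evicted = victim[0]
            self.entries.remove(victim)
        entry = [vpn, 0]
        self.entries.append(entry)
        self._touch(entry)
        return evicted

    def _touch(self, entry):
        entry[1] = 1
        # Assumed reset rule: once every bit is set, clear all other bits
        # so that a not-recently-used candidate always exists.
        if all(e[1] == 1 for e in self.entries):
            for e in self.entries:
                if e is not entry:
                    e[1] = 0


class BankedTLB:
    """TLB split into banks; only the selected bank is searched per access."""

    def __init__(self, num_banks, entries_per_bank, page_shift):
        # num_banks must be a power of two (2 for one select bit, 4 for two).
        self.page_shift = page_shift
        self.bank_mask = num_banks - 1
        self.banks = [Bank(entries_per_bank) for _ in range(num_banks)]

    def _select(self, vaddr):
        vpn = vaddr >> self.page_shift
        # Assumption: the low bits of the page number pick the bank.
        return vpn, self.banks[vpn & self.bank_mask]

    def lookup(self, vaddr):
        vpn, bank = self._select(vaddr)
        return bank.lookup(vpn)

    def insert(self, vaddr):
        vpn, bank = self._select(vaddr)
        return bank.insert(vpn)


# One select bit gives 2 banks (S-TLB); two select bits give 4 banks (L-TLB).
s_tlb = BankedTLB(num_banks=2, entries_per_bank=4, page_shift=12)
l_tlb = BankedTLB(num_banks=4, entries_per_bank=4, page_shift=16)
```

Because each access compares tags in a single bank only, the number of entries searched per lookup drops from the full TLB size to `entries_per_bank`, which is the source of the energy saving the abstract describes.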