http://dx.doi.org/10.5573/ieie.2016.53.12.020

Implementation of Hardware Data Prefetcher Adaptable for Various State-of-the-Art Workload  

Kim, KangHee (Dept. of Electronic Engineering, Inha University)
Park, TaeShin (Dept. of Electronic Engineering, Inha University)
Song, KyungHwan (Dept. of Electronic Engineering, Inha University)
Yoon, DongSung (Dept. of Electronic Engineering, Inha University)
Choi, SangBang (Dept. of Electronic Engineering, Inha University)
Publication Information
Journal of the Institute of Electronics and Information Engineers, v.53, no.12, 2016, pp. 20-35
Abstract
This paper proposes a tree architecture, composed of multi-operand decimal carry-save adders (CSAs) and an improved carry-lookahead adder (CLA), to reduce the delay and area of partial product accumulation (PPA) in a parallel decimal multiplier. The tree uses multi-operand CSAs to reduce the partial products quickly. Because the input range of each CSA's recoder is limited, the CSA logic can be simplified. In addition, the multi-operand decimal CSAs are applied at specific locations in the architecture where the decimal operands have a limited range, which reduces the partial products efficiently. The final BCD result is also obtained faster by improving the logic of the decimal CLA. To evaluate the proposed PPA, the design was synthesized with Design Compiler using a 180 nm CMOS technology library. Synthesis results show that the proposed PPA reduces delay by 15.6% and area by 16.2% compared with a design that uses the general method. The total delay and area are still reduced even though the delay and area of the CLA itself increase.
Keywords
Parallel decimal multiplication; IEEE 754-2008; Multi-operand