http://dx.doi.org/10.5626/JCSE.2015.9.1.20

Speculative Parallelism Characterization Profiling in General Purpose Computing Applications  

Wang, Yaobin (Department of Computer Science and Technology, Southwest University of Science and Technology)
An, Hong (Department of Computer Science and Technology, University of Science and Technology of China)
Liu, Zhiqin (Department of Computer Science and Technology, Southwest University of Science and Technology)
Li, Li (Department of Computer Science and Technology, Southwest University of Science and Technology)
Yu, Liang (Department of Computer Science and Technology, Southwest University of Science and Technology)
Zhen, Yilu (Department of Computer Science and Technology, Southwest University of Science and Technology)
Publication Information
Journal of Computing Science and Engineering / v.9, no.1, 2015, pp. 20-28
Abstract
Procedure-level speculation has not yet been thoroughly explored for general purpose computing applications, especially with lightweight profiling. This paper proposes a lightweight profiling mechanism to characterize speculative parallelism in several classic general purpose computing applications from the SPEC CPU2000 benchmark suite. By comparing the key performance factors in loop-level and procedure-level speculation, it reports new findings on the behavior of loop-level and procedure-level parallelism in these applications. In the experiments, the best-performing application under loop-level speculation, gzip, achieves only a 2.4X speedup, while the best-performing application under procedure-level speculation, mcf, achieves almost 3.5X. These results indicate that the lightweight profiling method is effective. Comparing loop-level and procedure-level thread-level speculation (TLS), the latter performs better in several cases, contrary to conventional perception. This is especially evident in applications whose 'hot' procedure bodies amount to 'hot' loops.
Keywords
Multicore; Thread level speculation; General purpose computing; Profiling; Data dependence;