
Effective Programming Methods for Intel's Next-Generation Processors Using AVX-512

Choe, Jae-Yeong (숭실대학교)
Kim, Rae-Hyeon (숭실대학교)
Im, Rok-Taek (숭실대학교)
Publication Information
Korea Information Processing Society Review, v.25, no.1, 2018, pp. 68-77
References
1 Goto, K., van de Geijn, R.A. "Anatomy of high-performance matrix multiplication", ACM Transactions on Mathematical Software (TOMS) 34(3), 12 (2008)
2 Gunnels, J.A., Henry, G.M., van de Geijn, R.A. "A family of high-performance matrix multiplication algorithms", In: International Conference on Computational Science, pp. 51-60. Springer (2001)
3 Heinecke, A., Vaidyanathan, K., Smelyanskiy, M., Kobotov, A., Dubtsov, R., Henry, G., Shet, A.G., Chrysos, G., Dubey, P. "Design and implementation of the linpack benchmark for single and multi-node systems based on Intel Xeon Phi Coprocessor", In: Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on, pp. 126-137. IEEE (2013)
4 "Intel Intrinsics Guide." Software.intel.com. (2018). [online] Available at: https://software.intel.com/sites/landingpage/IntrinsicsGuide/ [Accessed 22 Mar. 2018].
5 Jeffers, J., Reinders, J., Sodani, A.: Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition. Morgan Kaufmann (2016)
6 Lim, R., Lee, Y., Kim, R., Choi, J. "An Implementation of matrix-matrix multiplication on the Intel KNL processor with AVX-512." In: Cluster Computing (Submitted)
7 Peyton, J.L. "Programming dense linear algebra kernels on vectorized architectures." Master's thesis, The University of Tennessee, Knoxville (2013)
8 Van Zee, F. G., van de Geijn, R. A. "BLIS: A Framework for Rapidly Instantiating BLAS Functionality" In: ACM Trans. Math. Softw., 41(3), pp.1-33. ACM (2015)
9 Xianyi, Z., Qian, W., Yunquan, Z. "Model-driven level 3 BLAS performance optimization on Loongson 3A processor" In: Parallel and Distributed Systems, 2012 IEEE 18th International Conference, pp. 684-691. IEEE (2012)