Extended Three Region Partitioning Method of Loops with Irregular Dependences

  • Jeong, Sam-Jin (Division of Information and Communication, Baekseok University)
  • 정삼진 (백석대학교 정보통신학부)
  • Received : 2015.03.16
  • Accepted : 2015.06.20
  • Published : 2015.06.30

Abstract

This paper proposes an efficient method, the Extended Three Region Partitioning Method, for maximizing parallelism in nested loops with irregular dependences. Our approach is based on convex hull theory and builds on minimum dependence distance tiling, unique set oriented partitioning, and three region partitioning methods. In the proposed method, we first eliminate anti dependences from the nested loop by variable renaming. We then present an algorithm that selects one or more appropriate lines from four given lines: LMLH, RMLH, LMLT, and RMLT. If only one line is selected, the method divides the iteration space into two parallel regions along that line. Otherwise, we present another algorithm to find a serial region. The selected lines divide the iteration space into two parallel regions that are as large as possible and at most one serial region that is as small as possible. The proposed method gives better speedup and extracts more parallelism than existing three region partitioning methods.
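The renaming step mentioned in the abstract can be illustrated with a minimal sketch. It is a one-dimensional toy, not the paper's nested loops with irregular dependences, and the function names `run_original` and `run_renamed` and the arrays `a_old` and `a_new` are hypothetical names chosen for illustration. The sketch only shows why renaming removes an anti dependence: once reads and writes go to different arrays, the iteration order no longer matters.

```python
def run_original(a):
    """Serial loop with an anti dependence: iteration i reads a[i + 1],
    which iteration i + 1 later overwrites."""
    a = list(a)
    for i in range(len(a) - 1):
        a[i] = a[i + 1] + 1
    return a


def run_renamed(a, order):
    """Same computation after renaming (hypothetical helper): reads go to
    a_old, writes go to a_new, so iterations may run in any order."""
    a_old = list(a)  # renamed source array, never written in the loop
    a_new = list(a)  # destination array, never read in the loop
    for i in order:
        a_new[i] = a_old[i + 1] + 1
    return a_new


data = [5, 3, 8, 1, 9]
forward = list(range(len(data) - 1))
backward = forward[::-1]

expected = run_original(data)
# Both iteration orders reproduce the serial result once the arrays are renamed.
assert run_renamed(data, forward) == expected
assert run_renamed(data, backward) == expected
```

Running the original loop backward without renaming would give a different result, which is the ordering constraint that renaming removes.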

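This paper studies an effective loop partitioning method, called the Extended Three Region Partitioning Method, for improving the execution speed of nested loops with irregular dependences. The proposed method eliminates anti dependences from the nested loop by variable renaming and then applies an algorithm that selects one or more appropriate lines from among four lines. If one line is selected, it divides the whole iteration space into two parallel regions. If more than one line is selected, the lines divide the space into one serial region and two parallel regions. A performance analysis shows that the proposed partitioning method outperforms existing partitioning methods.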
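The geometric idea of dividing an iteration space by one or two lines can be sketched as follows. This is only a labeling of points, assuming a square n x n iteration space and lines given as hypothetical coefficient triples `(a, b, c)` for `a*i + b*j + c = 0`. It does not reproduce the paper's selection among LMLH, RMLH, LMLT, and RMLT, and whether the labeled regions can really run in parallel depends on the dependence analysis, which is not shown here.

```python
def side(line, point):
    """Signed value of a*i + b*j + c for line (a, b, c) at point (i, j)."""
    a, b, c = line
    i, j = point
    return a * i + b * j + c


def partition(n, first_line, second_line=None):
    """Label each point of an n x n iteration space (hypothetical helper).

    One line: points on its negative side go to "P1", all others to "P2".
    Two lines: negative side of the first line is "P1", positive side of
    the second line is "P2", and everything in between is "S".
    """
    regions = {"P1": [], "S": [], "P2": []}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            p = (i, j)
            if side(first_line, p) < 0:
                regions["P1"].append(p)
            elif second_line is None or side(second_line, p) > 0:
                regions["P2"].append(p)
            else:
                regions["S"].append(p)
    return regions


# Illustrative lines only: i + j = 4 and i + j = 6 on a 4 x 4 space.
two_lines = partition(4, (1, 1, -4), (1, 1, -6))
one_line = partition(4, (1, 1, -4))
print({name: len(points) for name, points in two_lines.items()})
print({name: len(points) for name, points in one_line.items()})
```

With a single line the "S" list stays empty, which mirrors the abstract's case of two parallel regions and no serial region.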
