• Title/Summary/Keyword: Parallelizing Compiler

An Efficient Loop Splitting Method on Single Loop with Non-uniform Dependences (비균일 단일루프에서의 효율적인 루프 분할 방법)

  • Jeong, Sam-Jin
    • The Journal of the Korea Contents Association / v.5 no.4 / pp.204-211 / 2005
  • This paper reviews three previously developed loop splitting methods for exploiting parallelism from single loops: the minimum dependence distance method, Polychronopoulos' method, and the first dependence method, and points out several problems with each. We extend the first dependence method, the most effective of the three, and propose a more powerful loop splitting method that enhances parallelism in single loops. The proposed algorithm resolves several problems that the first dependence method has, such as anti-flow dependences and the case g = gcd(a, c) > 1. (A minimal sketch of the shared splitting idea follows this entry.)

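The splitting idea these methods share can be pictured concretely. The C sketch below is an illustration of the minimum dependence distance idea, not the paper's extended algorithm; the loop, the array sizes, and the value of DMIN are assumptions worked out by hand for this hypothetical example. Once every dependence is known to span at least DMIN iterations, blocks of DMIN consecutive iterations carry no internal dependence, so the blocks run serially while the iterations inside each block run in parallel.

```c
/* Loop-splitting sketch (minimum dependence distance idea).  For the
 * hypothetical loop
 *     for (i = 0; i < N; i++)  A[3*i + 10] = A[i] + 1.0;
 * the only dependences are flow dependences from iteration i (write of
 * A[3*i+10]) to iteration 3*i + 10 (read of the same element), so the
 * distance is 2*i + 10 >= 10 = DMIN.
 * Build with: gcc -fopenmp split.c  (the pragma is ignored otherwise). */
#include <stdio.h>

#define N    100
#define DMIN 10            /* minimum dependence distance, derived above */

static double A[4 * N];    /* sized for the largest index 3*(N-1) + 10 */

int main(void) {
    for (int lo = 0; lo < N; lo += DMIN) {   /* blocks run serially */
        int hi = (lo + DMIN < N) ? lo + DMIN : N;
        #pragma omp parallel for             /* iterations inside a block
                                                are independent */
        for (int i = lo; i < hi; i++)
            A[3 * i + 10] = A[i] + 1.0;
    }
    printf("A[10] = %f\n", A[10]);
    return 0;
}
```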

Parallelism for Nested Loops with Simple Subscripts

  • Jeong, Sam-Jin
    • International Journal of Contents / v.4 no.4 / pp.1-6 / 2008
  • In this paper, we propose an improved loop splitting method for maximizing the parallelism of single loops with non-constant dependence distances. Using the iteration and distance of the source of the first dependence, together with the theorems we define, we present a generalized and optimal algorithm for single loops with non-uniform dependences (MPSL). By extending the MPSL method, we also exploit parallelism from nested loops with simple subscripts, based on cycle shrinking and loop interchanging. The algorithms show how to transform general single loops with non-uniform dependences, as well as nested loops with simple subscripts, into parallel loops. (A cycle shrinking sketch follows this entry.)
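Cycle shrinking, which this paper builds on, is easiest to see on a nested loop with constant dependence distances. The C sketch below is a simplified stand-in, not the MPSL algorithm itself; the loop body and the distance vector (2, 3) are assumptions chosen for illustration. Shrinking the outer loop by its distance yields serial bands of two i-values, and no dependence connects iterations inside a band, so a band's iterations run in parallel.

```c
/* Cycle shrinking sketch on a hypothetical doubly nested loop
 *     for (i) for (j)  A[i + 2][j + 3] = A[i][j] + 1.0;
 * whose single flow dependence has constant distance vector (2, 3).
 * Every dependence crosses from one band of two i-values into a later
 * band, so each band is internally dependence-free. */
#define N 100   /* assumed even, so bands of width 2 divide evenly */
#define M 200

static double A[N + 2][M + 3];   /* sized for the largest indices written */

void cycle_shrunk(void) {
    for (int i = 0; i < N; i += 2) {          /* serial: step = outer distance */
        #pragma omp parallel for collapse(2)  /* whole band in parallel */
        for (int ii = i; ii < i + 2; ii++)
            for (int j = 0; j < M; j++)
                A[ii + 2][j + 3] = A[ii][j] + 1.0;
    }
}
```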

Improving the speed of deep neural networks using the multi-core and single instruction multiple data technology (다중 코어 및 single instruction multiple data 기술을 이용한 심층 신경망 속도 향상)

  • Chung, Ik Joo; Kim, Seung Hi
    • The Journal of the Acoustical Society of Korea / v.36 no.6 / pp.425-435 / 2017
  • In this paper, we propose optimization methods for speeding up the feedforward pass of deep neural networks using NEON SIMD (Single Instruction Multiple Data) parallel instructions and multi-core parallelization on a multi-core ARM processor. We report the speed improvement and the arithmetic precision stage by stage as each optimization is applied. Through optimization with SIMD parallel instructions on a single core, we obtain a 2.6x speedup over the baseline implementation built with a C compiler. By further parallelizing the single-core implementation across multiple cores, we obtain a 5.7x to 7.7x speedup. These results show the feasibility of applying arithmetic-intensive deep neural network technology to applications on mobile devices. (A NEON/multi-core sketch of the core kernel follows this entry.)
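The combination of SIMD and multi-core parallelism described above can be sketched for the dominant kernel of a feedforward pass, the dense matrix-vector product. The C sketch below is an illustrative stand-in rather than the authors' implementation: each output neuron's dot product is vectorized four floats at a time with NEON intrinsics, and the independent neurons are split across cores with OpenMP. The function name, dimensions, and row-major layout are assumptions; it targets ARM with NEON (compile with -mfpu=neon -fopenmp, or any AArch64 toolchain with -fopenmp).

```c
#include <arm_neon.h>

/* One dense layer: y = W * x, where W is out_dim x in_dim, row-major. */
void dense_forward(const float *W, const float *x, float *y,
                   int out_dim, int in_dim)
{
    #pragma omp parallel for        /* output neurons are independent:
                                       distribute them across cores */
    for (int o = 0; o < out_dim; o++) {
        const float *w = W + (long)o * in_dim;
        float32x4_t acc = vdupq_n_f32(0.0f);
        int i = 0;
        for (; i + 4 <= in_dim; i += 4)   /* 4-wide multiply-accumulate */
            acc = vmlaq_f32(acc, vld1q_f32(w + i), vld1q_f32(x + i));
        /* horizontal sum of the four partial sums */
        float32x2_t s = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
        float sum = vget_lane_f32(vpadd_f32(s, s), 0);
        for (; i < in_dim; i++)           /* scalar remainder */
            sum += w[i] * x[i];
        y[o] = sum;
    }
}
```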