On Design for Elimination of the Merging Delay Time in the Multiple Vector Reduction (Inner Product)

다중벡터감출처리(내적처리)에서 합병지연시간의 제거를 위한 설계

  • Published : 2000.12.01

Abstract

Multiple vector reduction occurs during the vector inner product operation ([C] = [A] $\bigodot$,$\square$ [B]) and is carried out in a hardware dyadic pipeline unit. In the multiple vector reduction ($\bigodot$), every scalar result is generated only after a component merging delay. In this paper we propose a new design method that eliminates the component merging time from the multiple reduction, so that the scalar results of the reduction ($\bigodot$) are generated in nearly the same condensed time in which the input components are fed into the dyadic pipeline unit or the output components are drained out of the dyadic pipeline unit ($\square$). The result is a so-called dedicated chained pipeline unit used only for the inner product operation.

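Multiple vector reduction occurs in the vector inner product operation ([C] = [A] $\bigodot$,$\square$ [B]) and is processed in a pipeline unit with two input ports. In the multiple vector reduction ($\bigodot$), each scalar result can be generated only after the merging delay time of its components. This study proposes a design technique for a dedicated chained pipeline unit, intended solely for inner product processing, in which the component merging delay is eliminated from the multiple reduction and the scalar results of the reduction ($\bigodot$) are generated in nearly the same time as the pipeline ($\square$) input time.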
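The merging delay described in the abstract can be illustrated with a rough cycle-count model. The sketch below is not the design proposed in the paper. It assumes a conventional feedback accumulation on an adder pipeline of `depth` stages, which leaves `depth` interleaved partial sums that must then be merged pairwise. The function name `pipelined_inner_product`, the parameter `depth`, and the cost formula for each merge level are illustrative assumptions.

```python
def pipelined_inner_product(a, b, depth=4):
    """Toy model of an inner product on a pipelined adder with feedback.

    Returns (result, feed_cycles, merge_cycles). The cycle counts are a
    simplified assumption, not figures taken from the paper.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    # Feed phase: product i is added to the partial sum issued `depth`
    # cycles earlier, so `depth` independent partial sums build up.
    partial = [0] * depth
    for i, (x, y) in enumerate(zip(a, b)):
        partial[i % depth] += x * y
    feed_cycles = len(a)  # one pair of components enters per cycle

    # Merge phase: combine the partial sums pairwise. Each level must wait
    # for the previous one, so it costs the full pipeline latency plus one
    # cycle for every extra addition issued in that level.
    merge_cycles = 0
    sums = partial
    while len(sums) > 1:
        pairs = len(sums) // 2
        merged = [sums[2 * j] + sums[2 * j + 1] for j in range(pairs)]
        if len(sums) % 2:
            merged.append(sums[-1])  # odd element passes through unchanged
        sums = merged
        merge_cycles += depth + pairs - 1

    return sums[0], feed_cycles, merge_cycles


if __name__ == "__main__":
    result, feed, merge = pipelined_inner_product(list(range(1, 9)), [1] * 8)
    print(result, feed, merge)
```

In this model the merge phase adds cycles on top of the time needed to feed the components. That extra term is the overhead the abstract aims to eliminate, so that results emerge in roughly the feed time alone.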
