• Title/Summary/Keyword: parallel multiplier

Search Result 158, Processing Time 0.021 seconds

Parallelized Architecture of Serial Finite Field Multipliers for Fast Computation (유한체 상에서 고속 연산을 위한 직렬 곱셈기의 병렬화 구조)

  • Cho, Yong-Suk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.17 no.1
    • /
    • pp.33-39
    • /
    • 2007
  • Finite field multipliers are the basic building blocks in many applications such as error-control coding, cryptography and digital signal processing. Hence, the design of efficient dedicated finite field multiplier architectures can lead to dramatic improvement on the overall system performance. In this paper, a new bit serial structure for a multiplier with low latency in Galois field is presented. To speed up multiplication processing, we divide the product polynomial into several parts and then process them in parallel. The proposed multiplier operates standard basis of $GF(2^m)$ and is faster than bit serial ones but with lower area complexity than bit parallel ones. The most significant feature of the proposed architecture is that a trade-off between hardware complexity and delay time can be achieved.

Resource and Delay Efficient Polynomial Multiplier over Finite Fields GF (2m) (유한체상의 자원과 시간에 효율적인 다항식 곱셈기)

  • Lee, Keonjik
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.16 no.2
    • /
    • pp.1-9
    • /
    • 2020
  • Many cryptographic and error control coding algorithms rely on finite field GF(2m) arithmetic. Hardware implementation of these algorithms needs an efficient realization of finite field arithmetic operations. Finite field multiplication is complicated among the basic operations, and it is employed in field exponentiation and division operations. Various algorithms and architectures are proposed in the literature for hardware implementation of finite field multiplication to achieve a reduction in area and delay. In this paper, a low area and delay efficient semi-systolic multiplier over finite fields GF(2m) using the modified Montgomery modular multiplication (MMM) is presented. The least significant bit (LSB)-first multiplication and two-level parallel computing scheme are considered to improve the cell delay, latency, and area-time (AT) complexity. The proposed method has the features of regularity, modularity, and unidirectional data flow and offers a considerable improvement in AT complexity compared with related multipliers. The proposed multiplier can be used as a kernel circuit for exponentiation/division and multiplication.

High-throughput Low-complexity Mixed-radix FFT Processor using a Dual-path Shared Complex Constant Multiplier

  • Nguyen, Tram Thi Bao;Lee, Hanho
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.17 no.1
    • /
    • pp.101-109
    • /
    • 2017
  • This paper presents a high-throughput low-complexity 512-point eight-parallel mixed-radix multipath delay feedback (MDF) fast Fourier transform (FFT) processor architecture for orthogonal frequency division multiplexing (OFDM) applications. To decrease the number of twiddle factor (TF) multiplications, a mixed-radix $2^4/2^3$ FFT algorithm is adopted. Moreover, a dual-path shared canonical signed digit (CSD) complex constant multiplier using a multi-layer scheme is proposed for reducing the hardware complexity of the TF multiplication. The proposed FFT processor is implemented using TSMC 90-nm CMOS technology. The synthesis results demonstrate that the proposed FFT processor can lead to a 16% reduction in hardware complexity and higher throughput compared to conventional architectures.

A design of floating-point multiplier for superscalar microprocessor (수퍼스칼라 마이크로프로세서용 부동 소수점 승산기의 설계)

  • 최병윤;이문기
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.5
    • /
    • pp.1332-1344
    • /
    • 1996
  • This paper presents a pipelined floating point multiplier(FMUL) for superscalar microprocessors that conbines radix-16 recoding scheme based on signed-digit(SD) number system and new rouding and normalization scheme. The new rounding and normalization scheme enable the FMUL to compute sticky bit in parallel with multiple operation and elminate timing delay due to post-normalization. By expoliting SD radix-16 recoding scheme, we can achieves further reduction of silicon area and computation time. The FMUL can execute signle-precision or double-precision floating-point multiply operation through three-stage pipelined datapath and support IEEE standard 754. The algorithm andstructure of the designed multiplier have been successfully verified through Verilog HOL modeling and simulation.

  • PDF

Low Complexity Systolic Montgomery Multiplication over Finite Fields GF(2m) (유한체상의 낮은 복잡도를 갖는 시스톨릭 몽고메리 곱셈)

  • Lee, Keonjik
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.18 no.1
    • /
    • pp.1-9
    • /
    • 2022
  • Galois field arithmetic is important in error correcting codes and public-key cryptography schemes. Hardware realization of these schemes requires an efficient implementation of Galois field arithmetic operations. Multiplication is the main finite field operation and designing efficient multiplier can clearly affect the performance of compute-intensive applications. Diverse algorithms and hardware architectures are presented in the literature for hardware realization of Galois field multiplication to acquire a reduction in time and area. This paper presents a low complexity semi-systolic multiplier to facilitate parallel processing by partitioning Montgomery modular multiplication (MMM) into two independent and identical units and two-level systolic computation scheme. Analytical results indicate that the proposed multiplier achieves lower area-time (AT) complexity compared to related multipliers. Moreover, the proposed method has regularity, concurrency, and modularity, and thus is well suited for VLSI implementation. It can be applied as a core circuit for multiplication and division/exponentiation.

Design of a Low-Power Multiplier Using MOS Current Mode Logic Circuit (MOS 전류모드 논리회로를 이용한 저 전력 곱셈기 설계)

  • Lee, Yoon-Sang;Kim, Jeong-Beom
    • Journal of IKEEE
    • /
    • v.11 no.2
    • /
    • pp.83-88
    • /
    • 2007
  • This paper proposes an 8${\times}$8 bit parallel multiplier using MOS current-mode logic (MCML) circuit for low power consumption. The 8${\times}$8 multiplier is designed with proposed MCML full adders and conventional full adders. The designed multiplier is achieved to reduce the power consumption by 9.4% and the power-delay-product by 11.7% compared with the conventional circuit. This circuit is designed with Samsung 0.35${\mu}m$ standard CMOS process. The validity and effectiveness are verified through the HSPICE simulation.

  • PDF

Design of a New Bit-serial Multiplier/Divier Architecture (새로운 Bit-serial 방식의 곱셈기 및 나눗셈기 아키텍쳐 설계)

  • 옹수환;선우명훈
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.36C no.3
    • /
    • pp.17-25
    • /
    • 1999
  • This paper proposes a new bit-serial multiplier/divider architecture to reduce the hardware complexity significantly and to maintain the same number of cycles compared with existing architectures. Since the proposed bit-serial multiplier/divider architecture does not extend the number of bits in registers and an adde $r_tractor to calculate a partial product or a partial remainder, the hardware overhead can be greatly reduced. In addition, the proposed architecture can perform an additio $n_traction and a shift operation in parallel and the number of cycles for $\textit{N}$-bit multiplication and division for the proposed circuits is $\textit{N}$ and $\textit{N}$ + 2, repectively. Thus, the number of cycles for multiplication and division is the same compared with existing architectures. The SliM Image Processor employs the proposed multiplier/divider architecture and proves the performance of the proposed architecture.cture.

  • PDF

A Low-Error Truncated Booth Multiplier (작은 오차를 갖는 절사형 Booth 승산기)

  • 정해현;박종화;신경욱
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2001.10a
    • /
    • pp.617-620
    • /
    • 2001
  • This paper describes an efficient error-compensation technique for designing a low-error truncated Booth multiplier that receives two N-bit numbers and produces an N-bit product by eliminating the N least-significant bits. Applying the proposed method, a truncated Booth multiplier for area-efficient and low-power applications has been designed, and its performance (truncation error, area) was analyzed. Since the truncated Booth multiplier omits about half the partial product generators and adders, it has an area reduction by about 35%~40%, compared with non-truncated parallel multipliers. Error analysis shows that the proposed approach reduces the average truncation error by approximately 30%~40%, compared with conventional methods.

  • PDF

EFFICIENT PARALLEL GAUSSIAN NORMAL BASES MULTIPLIERS OVER FINITE FIELDS

  • Kim, Young-Tae
    • Honam Mathematical Journal
    • /
    • v.29 no.3
    • /
    • pp.415-425
    • /
    • 2007
  • The normal basis has the advantage that the result of squaring an element is simply the right cyclic shift of its coordinates in hardware implementation over finite fields. In particular, the optimal normal basis is the most efficient to hardware implementation over finite fields. In this paper, we propose an efficient parallel architecture which transforms the Gaussian normal basis multiplication in GF($2^m$) into the type-I optimal normal basis multiplication in GF($2^{mk}$), which is based on the palindromic representation of polynomials.

On Parallel Implementation of Lagrangean Approximation Procedure (Lagrangean 근사과정의 병렬계산)

  • 이호창
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.18 no.3
    • /
    • pp.13-34
    • /
    • 1993
  • By operating on many part of a software system concurrently, the parallel processing computers may provide several orders of magnitude more computing power than traditional serial computers. If the Lagrangean approximation procedure is applied to a large scale manufacturing problem which is decomposable into many subproblems, the procedure is a perfect candidate for parallel processing. By distributing Lagrangean subproblems for given multiplier to multiple processors, concurrently running processors and modifying Lagrangean multipliers at the end of each iteration of a subgradient method,a parallel processing of a Lagrangean approximation procedure may provide a significant speedup. This purpose of this research is to investigate the potential of the parallelized Lagrangean approximation procedure (PLAP) for certain combinational optimization problems in manufacturing systems. The framework of a Plap is proposed for some combinatorial manufacturing problems which are decomposable into well-structured subproblems. The synchronous PLAP for the multistage dynamic lot-sizing problem is implemented on a parallel computer Alliant FX/4 and its computational experience is reported as a promising application of vector-concurrent computing.

  • PDF