• Title/Summary/Keyword: Booth Multiplier


A Hardware Reduced Multiplier for Low Power Design (저전력 설계를 위한 면적 절약형 곱셈기 구조에 관한 연구)

  • 이광현;임종석
    • Proceedings of the IEEK Conference / 1998.10a / pp.1085-1088 / 1998
  • In this paper, we propose a hardware-reduced multiplier for DSP applications. In many DSP applications, not all bits of the multiplier product are used; only the upper bits of the product are. Kidambi proposed a truncated unsigned multiplier based on this idea. In this paper, we adapt this scheme to a Booth multiplier, which can be used in real DSP systems. In addition, a zero input is guaranteed to produce a zero output, which the previous work did not provide.


Design and Implementation of the Tree-like Multiplier

  • Song, Gi-Yong;Lee, Jae-jin;Lee, Ho-Jun;Song, Ho-Jeong
    • Proceedings of the IEEK Conference / 2000.07a / pp.371-374 / 2000
  • This paper proposes a 16-bit ${\times}$ 16-bit multiplier with a tree-like structure for two two's-complement binary numbers and implements it on an FPGA. The space and time complexity analysis shows that the 16-bit tree-like multiplier has lower circuit complexity and computes more quickly than both the Booth array multiplier and the modified array multiplier.


A 32${\times}$32-b Multiplier Using a New Method to Reduce a Compression Level of Partial Products (부분곱 압축단을 줄인 32${\times}$32 비트 곱셈기)

  • 홍상민;김병민;정인호;조태원
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.6 / pp.447-458 / 2003
  • A high-speed multiplier is an essential building block of today's digital signal processors. Signal processing applications typically implement iterative algorithms that require a large number of multiply, add, and accumulate operations. This paper describes a macro block of a parallel-structured multiplier that adopts a 32$\times$32-b regularly structured tree (RST). To improve the speed of the tree part, a modified partial product generation method has been devised at the architecture level. It reduces the compression stage from 4 levels to 3 and, together with 4-2 compressors, also reduces the propagation delay in the Wallace tree structure. Furthermore, it allows the tree part to be built from four modular blocks that form a carry-save adder (CSA) tree. The multiplier architecture can therefore be laid out regularly from identical modules composed of Booth selectors, compressors, and Modified Partial Product Generators (MPPG). At the circuit level, a new Booth selector with fewer transistors and a new encoder are proposed. Reducing the number of transistors in the Booth selector has a greater impact on the total transistor count. The designed selector uses 9 transistors in pass-transistor logic (PTL), 50% fewer than the conventional one. The multiplier, designed in a 0.25-$\mu\textrm{m}$, 2.5-V, 1-poly, 5-metal CMOS process, is simulated with Hspice and Epic. The delay is 4.2 ns and the average power consumption is 1.81 mW/MHz. This result is far better than that of conventional multipliers and equal to or better than the best published result.

Array Structure for Asynchronous Low Power Multiplier (저전력 비동기 곱셈기를 위한 배열 구조)

  • 박찬호;최병수;이동익
    • Proceedings of the IEEK Conference / 2000.06b / pp.141-144 / 2000
  • In this paper, a new parallel array structure for the asynchronous array multiplier is introduced. This structure is designed for a data-dependent asynchronous multiplier to reduce the power wasted in the conventional array structure. Simulation shows that this structure saves 30% of the power and 55% of the computation time compared with a conventional Booth-encoded array multiplier.


A Study on Multiplier Architectures Optimized for 32-bit RISC Processor with 3-Stage Pipeline (32비트 3단 파이프라인을 가진 RISC 프로세서에 최적화된 Multiplier 구조에 관한 연구)

  • 정근영;박주성;김석찬
    • Journal of the Institute of Electronics Engineers of Korea SD / v.41 no.11 / pp.123-130 / 2004
  • This paper describes a multiplier architecture optimized for a 32-bit RISC processor with a 3-stage pipeline. The multiplier of the ARM7, the target processor, runs in the execution stage of the pipeline and takes a variable number of cycles, up to 7. The included multiplier employs a modified Booth's algorithm to produce 64-bit multiplication and accumulation results, and it supports 6 separate instructions. We analyzed several multiplication algorithms, such as radix4-32${\times}$8, radix4-32${\times}$16 and radix8-32${\times}$32, to decide which multiplication architecture best fits a typical ARM7 architecture. VLSI area, cycle delay time, and execution cycle count are the indices of an efficient design, and the final multiplier was designed according to these indices. To verify the operation of the embedded multiplier, it was simulated with various audio algorithms.

Maximum Error Reduction for Fixed-width Modified Booth Multipliers Based on Error Bound Analysis (오차범위 분석을 통한 고정길이 modified Booth 곱셈기의 최대오차 감소)

  • Cho, Kyung-Ju;Chung, Jin-Gyun
    • Journal of the Institute of Electronics Engineers of Korea SD / v.42 no.10 s.340 / pp.29-34 / 2005
  • The maximum quantization error has a serious effect on the performance of fixed-width multipliers, which receive W-bit inputs and produce W-bit products. In this paper, we analyze the error bound of fixed-width modified Booth multipliers. We then propose a method for estimating the number of additional columns that a fixed-width multiplier needs in order to keep the maximum quantization error within a desired bound. In addition, we show that the methodology can be extended to reduced-width multipliers. Simulations show that the proposed error analysis method is useful for the practical design of fixed-width modified Booth multipliers.
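
The trade-off between additional columns and maximum error described in this abstract can be illustrated with a small sketch. It is not the paper's method: it assumes an unsigned AND-array multiplier rather than a modified Booth one, applies plain truncation with no compensation bias, and finds the maximum error by exhaustive search instead of an analytical bound. The names `fixed_width_product`, `max_error`, and `extra` are hypothetical.

```python
def fixed_width_product(a: int, b: int, w: int, extra: int) -> int:
    """W-bit product of two unsigned w-bit inputs (assumed AND-array model).

    Only partial-product bits in columns >= w - extra are summed, i.e. the
    w upper columns plus `extra` additional columns; the low w bits of the
    sum are then dropped.
    """
    cutoff = w - extra
    total = 0
    for i in range(w):
        for j in range(w):
            if i + j >= cutoff and (a >> i) & 1 and (b >> j) & 1:
                total += 1 << (i + j)
    return total >> w


def max_error(w: int, extra: int) -> float:
    """Largest |exact - fixed-width| over all input pairs, in output LSBs."""
    return max(
        abs(a * b / 2**w - fixed_width_product(a, b, w, extra))
        for a in range(2**w)
        for b in range(2**w)
    )


if __name__ == "__main__":
    w = 6
    for extra in range(w + 1):
        print(extra, max_error(w, extra))
```

With `extra = w` every column is kept, so the result equals the fully computed product with its low half dropped; smaller values of `extra` save hardware at the cost of a larger worst-case error.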

Design of QR Decomposition Processor for GDFE (GDFE를 위한 QR분해 프로세서 설계)

  • Cho, Kyung-Ju
    • The Journal of the Korea institute of electronic communication sciences / v.6 no.2 / pp.199-205 / 2011
  • This paper presents a QR decomposition processor that uses Givens rotations for the GDFE (Generalized Decision Feedback Equalizer). A Givens rotation consists of phase extraction, sine/cosine generation, and angle rotation parts. By combining a two-stage method (coarse and fine stages) with a fixed-width modified Booth multiplier, we design an efficient QR decomposition processor. Simulations show that the proposed QR decomposition processor can be a feasible solution for the GDFE.
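
As a rough software illustration of the three parts named in this abstract (phase extraction, sine/cosine generation, angle rotation), the following sketch performs QR decomposition by Givens rotations on a real matrix. It uses floating-point arithmetic from the standard library and is only an assumed reference model: the two-stage coarse/fine method and the fixed-width modified Booth multipliers of the paper are not modeled, and the function name `givens_qr` is hypothetical.

```python
import math


def givens_qr(a):
    """Return (q, r) such that a = q * r, with r upper triangular.

    `a` is a list of m rows, each a list of n floats.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        for i in range(m - 1, j, -1):
            # Phase extraction: angle of the pair (pivot, element to zero).
            theta = math.atan2(r[i][j], r[j][j])
            # Sine/cosine generation.
            c, s = math.cos(theta), math.sin(theta)
            # Angle rotation applied to rows j and i of r.
            for k in range(n):
                rj, ri = r[j][k], r[i][k]
                r[j][k] = c * rj + s * ri
                r[i][k] = -s * rj + c * ri
            # Accumulate the transposed rotation into columns j and i of q.
            for k in range(m):
                qj, qi = q[k][j], q[k][i]
                q[k][j] = c * qj + s * qi
                q[k][i] = -s * qj + c * qi
    return q, r


if __name__ == "__main__":
    q, r = givens_qr([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    for row in r:
        print([round(v, 6) for v in row])
```

Each rotation zeroes one element below the diagonal, so the work splits naturally into independent rotation units, which is one reason this formulation suits hardware.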

Parameterized IP Core of Complex-Number Multiplier (파라미터화된 복소수 승산기 IP 코어)

  • 양대성;이승기;신경욱
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.05a / pp.307-310 / 2001
  • A parameterized complex-number multiplier (PCMUL) core IP (Intellectual Property), which can be used as an essential arithmetic unit in baseband signal processing of digital communication systems, is described. The bit-width of the multiplier is parameterized in the range of 8-b to 24-b and is user-selectable in 2-b steps. PCMUL_GEN, a core generator with a GUI, generates the VHDL code of a CMUL core for a specified bit-width. The IP is based on redundant binary (RB) arithmetic and a new radix-4 Booth encoding/decoding scheme proposed in this paper. This results in a simplified internal structure, as well as a high-speed, low-power, and area-efficient implementation. The designed IP was verified on a Xilinx FPGA board.


A 200-MHZ@2.5-V Dual-Mode Multiplier for Single / Double -Precision Multiplications (단정도/배정도 승산을 위한 200-MHZ@2.5-V 이중 모드 승산기)

  • 이종남;박종화;신경욱
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.5 / pp.1143-1150 / 2000
  • A dual-mode multiplier (DMM) that performs single- and double-precision multiplications has been designed using a 0.25-$\mu\textrm{m}$ 5-metal CMOS technology. An algorithm for efficiently implementing double-precision multiplication with a single-precision multiplier is proposed; it partitions a double-precision multiplication into four single-precision sub-multiplications and computes them with sequential accumulation. Compared with conventional double-precision multipliers, our approach reduces the hardware complexity by about one third, resulting in a small silicon area and low power dissipation at the expense of increased latency and throughput cycles. The DMM consists of a 28-b $\times$ 28-b single-precision multiplier designed using radix-4 Booth recoding and redundant binary (RB) arithmetic, an accumulator, and simple control logic for mode selection. It contains about 25,000 transistors in an area of about 0.77 $\times$ 0.40 $\textrm{mm}^2$. HSPICE simulation results show that the DMM core can safely operate with a 200-MHz clock at 2.5 V, and its estimated power dissipation is about 130 mW in double-precision mode.
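
The partitioning described in this abstract can be sketched as follows. The sketch assumes unsigned operands and uses the 28-b single-precision width mentioned in the abstract as the default; the function names and the order in which the four sub-products are accumulated are hypothetical and are not taken from the paper.

```python
def mul_single(a: int, b: int, bits: int) -> int:
    """Stand-in for the single-precision multiplier (unsigned, `bits` wide)."""
    assert 0 <= a < (1 << bits) and 0 <= b < (1 << bits)
    return a * b


def mul_double(a: int, b: int, bits: int = 28) -> int:
    """Multiply two unsigned 2*bits-wide operands with four sub-multiplications.

    Each operand is split into a high and a low half; the four sub-products
    are shifted to their weights and accumulated one after another.
    """
    mask = (1 << bits) - 1
    a_lo, a_hi = a & mask, a >> bits
    b_lo, b_hi = b & mask, b >> bits
    schedule = [
        (a_lo, b_lo, 0),
        (a_hi, b_lo, bits),
        (a_lo, b_hi, bits),
        (a_hi, b_hi, 2 * bits),
    ]
    acc = 0
    for x, y, shift in schedule:  # one sub-multiplication per step
        acc += mul_single(x, y, bits) << shift
    return acc


if __name__ == "__main__":
    a, b = (1 << 56) - 1, 0x123456789ABCDE
    print(mul_double(a, b) == a * b)
```

Reusing one narrow multiplier four times is what trades extra latency for the reduced hardware the abstract reports.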


A New Complex-Number Multiplication Algorithm using Radix-4 Booth Recoding and RB Arithmetic, and a 10-bit CMAC Core Design (Radix-4 Booth Recoding과 RB 연산을 이용한 새로운 복소수 승산 알고리듬 및 10-bit CMAC코어 설계)

  • 김호하;신경욱
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.9 / pp.11-20 / 1998
  • High-speed complex-number arithmetic units are essential to baseband signal processing in modern digital communication systems, for tasks such as channel equalization, timing recovery, modulation, and demodulation. In this paper, a new complex-number multiplication algorithm is proposed, which is based on redundant binary (RB) arithmetic combined with a radix-4 Booth recoding scheme. The proposed algorithm reduces the number of partial products by one-half compared with the conventional direct method using real-number multipliers and adders. It also leads to a highly parallel architecture and a simplified circuit, resulting in high-speed operation and low power dissipation. To demonstrate the proposed algorithm, a prototype complex-number multiplier-accumulator (CMAC) core with 10-bit operands has been designed using 0.8-$\mu\textrm{m}$ N-well CMOS technology. The designed CMAC core contains about 18,000 transistors in an area of about 1.60 ${\times}$ 1.93 $\textrm{mm}^2$. Functional and speed test results show that it can operate with a 120-MHz clock at $V_{DD}$ = 3.3 V, and its power consumption is about 63 mW.
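
For readers unfamiliar with the radix-4 Booth recoding that this and several other entries rely on, here is a minimal sketch of the generic textbook recoding for a real two's-complement multiplier. It is not the complex-number RB algorithm proposed in the paper, and the function names are hypothetical. Each digit lies in {-2, -1, 0, 1, 2} and covers two multiplier bits, so an n-bit multiplier yields about n/2 partial products.

```python
def booth_radix4_digits(y: int, bits: int) -> list[int]:
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    `y` must fit in `bits` bits; digits are returned least significant first.
    """
    assert -(1 << (bits - 1)) <= y < (1 << (bits - 1))
    if bits % 2:
        bits += 1  # sign-extend to an even width
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1        # Python's >> sign-extends negative ints
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, bits: int) -> int:
    """Sum one partial product (digit * x, weighted by 4**k) per Booth digit."""
    return sum((d * x) << (2 * k)
               for k, d in enumerate(booth_radix4_digits(y, bits)))


if __name__ == "__main__":
    print(booth_radix4_digits(-3, 4))
    print(booth_multiply(7, -3, 4))
```

In hardware, the digits 0, ±1, and ±2 need only shifting and negation of the multiplicand, which is why the recoding reduces adder count without requiring hard multiples.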
