• Title/Summary/Keyword: Booth Multiplier

Search Result 59, Processing Time 0.023 seconds

An Area Efficient High Speed FIR Filter Design and Its Applications (면적 절약형 고속 FIR 필터의 설계 및 응용)

  • Lee, Kwang-Hyun;Rim, Chong-Suck
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.37 no.11
    • /
    • pp.85-95
    • /
    • 2000
  • FIR digital filter is one of important blocks in DSP application. For more effective operation, lots of architecture are proposed. In our paper, we proposed a high speed FIR filter with area efficiency. To fast operation, we used transposed form filter as basic architecute. And, we used dual path registers line to wupport variation of filter operation, and filter cascade is also considered. To reduce area, we adopted truncated Booth multiplier to our filter design. As a result, we showed that filter area is reduced when filter optimization using of dual path registers line and truncated multiplier with same constraints againt previous method.

  • PDF

Design of a $54{\times}54$-bit Multiplier Based on a Improved Conditional Sum Adder (개선된 조건 합 가산기를 이용한 $54{\times}54$-bit 곱셈기의 설계)

  • Lee, Young-Chul;Song, Min-Kyu
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.37 no.1
    • /
    • pp.67-74
    • /
    • 2000
  • In this paper, a $54{\times}54$-bit multiplier based on a improved conditional sum adder is proposed. To reduce the multiplication time, high compression-rate compressors without Booth's Encoding, and a 108-bit conditional sum adder with separated carry generation block, are developed. Furthermore, a design technique based on pass-transistor logic is utilized for optimize the multiplication time and the power consumption by about 5% compared to that of conventional one. With $0.65{\mu}m$, single-poly, triple-metal CMOS process, its chip size is $6.60{\times}6.69\;mm^2$ and the multiplication time is 135.ns at a 3.3V power supply.

  • PDF

New VLSI Architecture of Parallel Multiplier-Accumulator Based on Radix-2 Modified Booth Algorithm (Radix-2 MBA 기반 병렬 MAC의 VLSI 구조)

  • Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.4
    • /
    • pp.94-104
    • /
    • 2008
  • In this paper, we propose a new architecture of multiplier-and-accumulator (MAC) for high speed multiplication and accumulation arithmetic. By combining multiplication with accumulation and devising a hybrid type of carry save adder (CSA), the performance was improved. Since the accumulator which has the largest delay in MAC was removed and its function was included into CSA, the overall performance becomes to be elevated. The proposed CSA tree uses 1's complement-based radix-2 modified booth algorithm (MBA) and has the modified array for the sign extension in order to increase the bit density of operands. The CSA propagates the carries by the least significant bits of the partial products and generates the least significant bits in advance for decreasing the number of the input bits of the final adder. Also, the proposed MAC accumulates the intermediate results in the type of sum and carry bits not the output of the final adder for improving the performance by optimizing the efficiency of pipeline scheme. The proposed architecture was synthesized with $250{\mu}m,\;180{\mu}m,\;130{\mu}m$ and 90nm standard CMOS library after designing it. We analyzed the results such as hardware resource, delay, and pipeline which are based on the theoretical and experimental estimation. We used Sakurai's alpha power low for the delay modeling. The proposed MAC has the superior properties to the standard design in many ways and its performance is twice as much than the previous research in the similar clock frequency.

Design of a Floating Point Multiplier for IEEE 754 Single-Precision Operations (IEEE 754 단정도 부동 소수점 연산용 곱셈기 설계)

  • Lee, Ju-Hun;Chung, Tae-Sang
    • Proceedings of the KIEE Conference
    • /
    • 1999.11c
    • /
    • pp.778-780
    • /
    • 1999
  • Arithmetic unit speed depends strongly on the algorithms employed to realize the basic arithmetic operations.(add, subtract multiply, and divide) and on the logic design. Recent advances in VLSI have increased the feasibility of hardware implementation of floating point arithmetic units and microprocessors require a powerful floating-point processing unit as a standard option. This paper describes the design of floating-point multiplier for IEEE 754-1985 Single-Precision operation. Booth encoding algorithm method to reduce partial products and a Wallace tree of 4-2 CSA is adopted in fraction multiplication part to generate the $32{\times}32$ single-precision product. New scheme of rounding and sticky-bit generation is adopted to reduce area and timing. Also there is a true sign generator in this design. This multiplier have been implemented in a ALTERA FLEX EPF10K70RC240-4.

  • PDF

Design of High Performance 16bit Multiplier for Asynchronous Systems (비동기 시스템용 고성능 16비트 승산기 설계)

  • 김학윤;이유진;장미숙;최호용
    • Proceedings of the IEEK Conference
    • /
    • 1999.11a
    • /
    • pp.356-359
    • /
    • 1999
  • A high performance 16bit multiplier for asynchronous systems has been designed using asynchronous design methodology. The 4-radix modified Booth algorithm, TSPC (true single phase clocking) registers, and modified 4-2 counters using DPTL (differential pass transistor logic) have been used in our multiplier. It is implemented in 0.65${\mu}{\textrm}{m}$ double-poly/double-metal CMOS technology by using 6616 transistors with core size of 1.4$\times$1.1$\textrm{mm}^2$. And our design results in a computation rate exceeding 60MHz at a supply voltage of 3.3V.

  • PDF

Radix-2 Booth-based Variable Precision Multiplier for Lightweight CNN Accelerators (경량 CNN 가속기를 위한 Radix-2 Booth 기반 가변 정밀도 곱셈기)

  • Guem, Duck-Hyun;Jeon, Seung-Jin;Choi, Jae-Young;Kim, Ji-Hyeok;Kim, Sunhee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.494-496
    • /
    • 2022
  • 엣지 디바이스에서 딥러닝을 활용하기 위하여 CNN 경량화 연구들이 진행되고 있다. 경량 CNN 은 대부분 고정 소수점을 사용하며, 계층에 따라 정밀도는 달라진다. 본 논문에서는 경량 CNN 을 지원하기 위하여, 사용 계층에 따라 정밀도를 선택할 수 있는 가변 정밀도 곱셈기를 제안한다. 제안하는 가변 정밀도 곱셈기는 낮은 정밀도 곱셈기를 병합하는 구조로, 정밀도가 낮을 때는 병렬 처리를 통해 효율을 높인다. 제안하는 곱셈기를 Verilog HDL로 설계하고 ModelSim 에서 동작을 확인하였다. 설계된 곱셈기는 계층별로 정밀도가 다른 CNN 가속기에서 효율적으로 적용될 것으로 기대된다.

An Optimized Hybrid Radix MAC Design (최적화된 4진18진 혼합 MAC 설계)

  • 정진우;김승철;이용주;이용석
    • Proceedings of the IEEK Conference
    • /
    • 2002.06b
    • /
    • pp.173-176
    • /
    • 2002
  • This paper is about a high-speed MAC (multiplier and accumulator) design applying radix-4 and radix-8 Booth's algorithm at the same time. The optimized hybrid radix design for high speed MAC has taken advantage of both a radix-4 and a radix-8 architectures. A radix-4 architecture meets high-speed, but it takes much more power and chip area than a radix-8 architecture. A radix-8 architecture needs less power and chip area than the other, but it has a bottleneck of generating three times the multiplicand problem. An optimized hybrid architecture performs the radix-4 multiplication partially in parallel with the generation of three times the multiplicand for use of the radix-8 multiplication. It reduces the concerned bit width of multiplier in radix-8 multiplication.

  • PDF

17$\times$17-b Multiplier for 32-bit RISC/DSP Processors (32 비트 RISC/DSP 프로세서를 위한 17 비트 $\times$ 17 비트 곱셈기의 설계)

  • 박종환;문상국;홍종욱;문병인;이용석
    • Proceedings of the IEEK Conference
    • /
    • 1999.06a
    • /
    • pp.914-917
    • /
    • 1999
  • The paper describes a 17 $\times$ 17-b multiplier using the Radix-4 Booth’s algorithm. which is suitable for 32-bit RISC/DSP microprocessors. To minimize design area and achieve improved speed, a 2-stage pipeline structure is adopted to achieve high clock frequency. Each part of circuit is modeled and optimized at the transistor level, verification of functionality and timing is performed using HSPICE simulations. After modeling and validating the circuit at transistor level, we lay it out in a 0.35 ${\mu}{\textrm}{m}$ 1-poly 4-metal CMOS technology and perform LVS test to compare the layout with the schematic. The simulation results show that maximum frequency is 330MHz under worst operating conditions at 55$^{\circ}C$ , 3V, The post simulation after layout results shows 187MHz under worst case conditions. It contains 9, 115 transistors and the area of layout is 0.72mm by 0.97mm.

  • PDF

An Optimized Hybrid Radix MAC Design (최적화된 4진/8진 혼합 MAC 설계)

  • 정진우;김승철;이용주;이용석
    • Proceedings of the IEEK Conference
    • /
    • 2002.06a
    • /
    • pp.125-128
    • /
    • 2002
  • This paper is about a high-speed MAC (multiplier and accumulator) design applying radix-4 and radix-8 Booth's algorithm at the same time. The optimized hybrid radix design for high speed MAC has taken advantage of both a radix-4 and a radix-8 architectures. A radix-4 architecture meets high-speed, but it takes much more power and chip area than a radix-8 architecture. A radix-8 architecture needs less power and chip area than the other, but it has a bottleneck of generating three times the multiplicand problem. An optimized hybrid architecture performs tile radix-4 multiplication partially in parallel with the generation of three times the multiplicand for use of tile radix-8 multiplication. It reduces the concerned bit width of multiplier in radix-8 multiplication.

  • PDF

An Efficient Test Method for a Full-Custom Design of a High-Speed Binary Multiplier (풀커스텀 (full-custom) 고속 곱셈기 회로의 효율적인 테스트 방안)

  • Moon, San-Gook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2007.10a
    • /
    • pp.830-833
    • /
    • 2007
  • In this paper, we implemented a $17{\times}17b$ binary digital multiplier using radix-4 Booth;s algorithmand proposed an efficient testing methodology for the full-custom design. A two-stage pipeline architecture was applied to achieve higher throughput and 4:2 adders were used for regular layout structure in the Wallace tree partition. Several chips were fabricated using LG Semicon 0.6-um 3-Metal N-well CMOS technology. We did fault simulations efficiently using the proposed test method resulting in the reduction of the number of faulty nodes by 88%. The chip contains 9115 transistors and the core area occupies $1135^*1545$ mm2. The functional tests using ATS-2 tester showed that it can operate with 24 MHz clock at 5.0 V at room temperature.

  • PDF