• Title/Summary/Keyword: parallel multiplier

Search Result 158, Processing Time 0.026 seconds

New Transistor Sizing Algorithms For CMOS Digital Designs (CMOS 디지틀 설계를 위한 트랜지스터 크기의 최적화기법)

  • 이상헌;김경호;박송배
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.31A no.3
    • /
    • pp.68-76
    • /
    • 1994
  • In the automatic transistor sizing with computer for optimizing delay and the chip area of CMOS digital circuits, conventionally either a mathematical method or a heuristic method has been used. In this paper, we present a new method of transistor sizing, a sort of combination of the above two methods, in which the mathematical method is used for sizing of critical paths and the heuristic method is used for desizing of non-critical paths. In order to reduce the overall problem dimension, a basic block called an extended stage is introduced which includes a basic stage, parallel transistors and complementary part. Optimization for multiple critical paths is formulated as a problem of area minimization subject to delay constraints and is solved by the augmented Lagrange multiplier method. The transistor sizes along non-critical paths are decreased successively without affecting the critical path delay times. The proposed scheme was successfully applied to several test circuits.

  • PDF

A Low-Complexity 128-Point Mixed-Radix FFT Processor for MB-OFDM UWB Systems

  • Cho, Sang-In;Kang, Kyu-Min
    • ETRI Journal
    • /
    • v.32 no.1
    • /
    • pp.1-10
    • /
    • 2010
  • In this paper, we present a fast Fourier transform (FFT) processor with four parallel data paths for multiband orthogonal frequency-division multiplexing ultra-wideband systems. The proposed 128-point FFT processor employs both a modified radix-$2^4$ algorithm and a radix-$2^3$ algorithm to significantly reduce the numbers of complex constant multipliers and complex booth multipliers. It also employs substructure-sharing multiplication units instead of constant multipliers to efficiently conduct multiplication operations with only addition and shift operations. The proposed FFT processor is implemented and tested using 0.18 ${\mu}m$ CMOS technology with a supply voltage of 1.8 V. The hardware- efficient 128-point FFT processor with four data streams can support a data processing rate of up to 1 Gsample/s while consuming 112 mW. The implementation results show that the proposed 128-point mixed-radix FFT architecture significantly reduces the hardware cost and power consumption in comparison to existing 128-point FFT architectures.

DESIGN AND IMPLEMENTATION OF THE ALL DIGITAL QPSK TRANSMITTER FOR MPEG-2 PACKETS SUPPORTING THE DAVIC STANDARD

  • Park, Sungsoo;Lee, Youngkou;Kim, Kiseon
    • Proceedings of the IEEK Conference
    • /
    • 2000.07b
    • /
    • pp.914-918
    • /
    • 2000
  • In this paper, a next generation high speed QPSK transmitter is designed based on 1.8${\mu}$m design rule. The designed transmitter supports the MPEG2-TS coded packed data for the DAVIC standard. Transmitter is composed of the convolutional coder, the shortened Reed-Solomon coder, and QPSK modulator. The coded packets are modulated in APSK with an RC filter. Especially, Galois Field multiplier with a standard basis is designed with the pipelined parallel architecture. Also, in the QPSK modulator, the RC filter and mixer are simplified into the ROM table, which can improve the performance of the transmitter. The total number of gates for the implemented baseband transmitter is 26,875.

  • PDF

A Design of Superscalar Digital Signal Processor (다중 명령어 처리 DSP 설계)

  • Park, Sung-Wook
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.18 no.3
    • /
    • pp.323-328
    • /
    • 2008
  • This paper presents a Digital Signal Processor achieving high through-put for both decision intensive and computation intensive tasks. The proposed processor employees a multiplier, two ALU and load/store. Unit as operational units. Those four units are controlled and works parallel by superscalar control scheme, which is different from prior DSP architecture. The performance evaluation was done by implementing AC-3 decoding algorithm and 37.8% improvement was achieved. This study is valuable especially for the consumer electronics applications, which require very low cost.

Design and Implementation of Digital Signal Processor and Development System (Digital Signal Processor와 개발시스템의 설계 및 구현)

  • Lim, Kwang Il;Lee, Woo Sun;Shin, In Chul;Rhee, Tae Won
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.23 no.6
    • /
    • pp.902-907
    • /
    • 1986
  • A real-time microprogrammable digital signal processor is designed and implemented using the bit-slice logic, a parallel multiplier, 74 series TTLs and MOS memories. A microinstruction set for the processor is defined and an application program development system is constructed. For its performance evalution, a digital filter and FFT are implemented with this digital signal processor. It is proved that this processor is faster than commrcially available single chip digital signal processors such as \ulcornerD 7720, AMI 2811, enabling very high speed digital signal processing.

  • PDF

Fully Distributed Economic Dispatching Methods Based on Alternating Direction Multiplier Method

  • Yang, Linfeng;Zhang, Tingting;Chen, Guo;Zhang, Zhenrong;Luo, Jiangyao;Pan, Shanshan
    • Journal of Electrical Engineering and Technology
    • /
    • v.13 no.5
    • /
    • pp.1778-1790
    • /
    • 2018
  • Based on the requirements and characteristics of multi-zone autonomous decision-making in modern power system, fully distributed computing methods are needed to optimize the economic dispatch (ED) problem coordination of multi-regional power system on the basis of constructing decomposition and interaction mechanism. In this paper, four fully distributed methods based on alternating direction method of multipliers (ADMM) are used for solving the ED problem in distributed manner. By duplicating variables, the 2-block classical ADMM can be directly used to solve ED problem fully distributed. The second method is employing ADMM to solve the dual problem of ED in fully distributed manner. N-block methods based on ADMM including Alternating Direction Method with Gaussian back substitution (ADM_G) and Exchange ADMM (E_ADMM) are employed also. These two methods all can solve ED problem in distributed manner. However, the former one cannot be carried out in parallel. In this paper, four fully distributed methods solve the ED problem in distributed collaborative manner. And we also discussed the difference of four algorithms from the aspects of algorithm convergence, calculation speed and parameter change. Some simulation results are reported to test the performance of these distributed algorithms in serial and parallel.

Efficient Polynomial Multiplication in Extension Field GF($p^n$) (확장체 GF($p^n$)에서 효율적인 다항식 곱셈 방법)

  • Chang Namsu;Kim Chang Han
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.5 s.335
    • /
    • pp.23-30
    • /
    • 2005
  • In the construction of an extension field, there is a connection between the polynomial multiplication method and the degree of polynomial. The existing methods, KO and MSK methods, efficiently reduce the complexity of coefficient-multiplication. However, when we construct the multiplication of an extension field using KO and MSK methods, the polynomials are padded with necessary number of zero coefficients in general. In this paper, we propose basic properties of KO and MSK methods and algorithm that can reduce coefficient-multiplications. The proposed algorithm is more reducible than the original KO and MSK methods. This characteristic makes the employment of this multiplier particularly suitable for applications characterized by specific space constrains, such as those based on smart cards, token hardware, mobile phone or other devices.

Design of Low Complexity and High Throughput Encoder for Structured LDPC Codes (구조적 LDPC 부호의 저복잡도 및 고속 부호화기 설계)

  • Jung, Yong-Min;Jung, Yun-Ho;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.10
    • /
    • pp.61-69
    • /
    • 2009
  • This paper presents the design results of a low complexity and high throughput LDPC encoder structure. In order to solve the high complexity problem of the LDPC encoder, a simplified matrix-vector multiplier is proposed instead of the conventional complex matrix-vector multiplier. The proposed encoder also adopts a partially parallel structure and performs column-wise operations in matrix-vector multiplication to achieve high throughput. Implementation results show that the proposed architecture reduces the number of logic gates and memory elements by 37.4% and 56.7%, compared with existing five-stage pipelined architecture. The proposed encoder also supports 800Mbps throughput at 40MHz clock frequency which is improved about three times more than the existing architecture.

A New Complex-Number Multiplication Algorithm using Radix-4 Booth Recoding and RB Arithmetic, and a 10-bit CMAC Core Design (Radix-4 Booth Recoding과 RB 연산을 이용한 새로운 복소수 승산 알고리듬 및 10-bit CMAC코어 설계)

  • 김호하;신경욱
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.35C no.9
    • /
    • pp.11-20
    • /
    • 1998
  • High-speed complex-number arithmetic units are essential to baseband signal processing of modern digital communication systems such as channel equalization, timing recovery, modulation and demodulation. In this paper, a new complex-number multiplication algorithm is proposed, which is based on redundant binary (RB) arithmetic combined with radix-4 Booth recoding scheme. The proposed algorithm reduces the number of partial product by one-half as compared with the conventional direct method using real-number multipliers and adders. It also leads to a highly parallel architecture and simplified circuit, resulting in high-speed operation and low power dissipation. To demonstrate the proposed algorithm, a prototype complex-number multiplier-accumulator (CMAC) core with 10-bit operands has been designed using 0.8-$\mu\textrm{m}$ N-Well CMOS technology. The designed CMAC core contains about 18,000 transistors on the area of about 1.60 ${\times}$ 1.93 $\textrm{mm}^2$. The functional and speed test results show that it can operate with 120-MHz clock at V$\sub$DD/=3.3-V, and its power consumption is given to about 63-mW.

  • PDF

Implementation of High-radix Modular Exponentiator for RSA using CRT (CRT를 이용한 하이래딕스 RSA 모듈로 멱승 처리기의 구현)

  • 이석용;김성두;정용진
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.10 no.4
    • /
    • pp.81-93
    • /
    • 2000
  • In a methodological approach to improve the processing performance of modulo exponentiation which is the primary arithmetic in RSA crypto algorithm, we present a new RSA hardware architecture based on high-radix modulo multiplication and CRT(Chinese Remainder Theorem). By implementing the modulo multiplier using radix-16 arithmetic, we reduced the number of PE(Processing Element)s by quarter comparing to the binary arithmetic scheme. This leads to having the number of clock cycles and the delay of pipelining flip-flops be reduced by quarter respectively. Because the receiver knows p and q, factors of N, it is possible to apply the CRT to the decryption process. To use CRT, we made two s/2-bit multipliers operating in parallel at decryption, which accomplished 4 times faster performance than when not using the CRT. In encryption phase, the two s/2-bit multipliers can be connected to make a s-bit linear multiplier for the s-bit arithmetic operation. We limited the encryption exponent size up to 17-bit to maintain high speed, We implemented a linear array modulo multiplier by projecting horizontally the DG of Montgomery algorithm. The H/W proposed here performs encryption with 15Mbps bit-rate and decryption with 1.22Mbps, when estimated with reference to Samsung 0.5um CMOS Standard Cell Library, which is the fastest among the publications at present.