• Title/Summary/Keyword: 덧셈기

Search Result 164, Processing Time 0.031 seconds

New VLSI Architecture of Parallel Multiplier-Accumulator Based on Radix-2 Modified Booth Algorithm (Radix-2 MBA 기반 병렬 MAC의 VLSI 구조)

  • Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.4
    • /
    • pp.94-104
    • /
    • 2008
  • In this paper, we propose a new architecture of multiplier-and-accumulator (MAC) for high speed multiplication and accumulation arithmetic. By combining multiplication with accumulation and devising a hybrid type of carry save adder (CSA), the performance was improved. Since the accumulator which has the largest delay in MAC was removed and its function was included into CSA, the overall performance becomes to be elevated. The proposed CSA tree uses 1's complement-based radix-2 modified booth algorithm (MBA) and has the modified array for the sign extension in order to increase the bit density of operands. The CSA propagates the carries by the least significant bits of the partial products and generates the least significant bits in advance for decreasing the number of the input bits of the final adder. Also, the proposed MAC accumulates the intermediate results in the type of sum and carry bits not the output of the final adder for improving the performance by optimizing the efficiency of pipeline scheme. The proposed architecture was synthesized with $250{\mu}m,\;180{\mu}m,\;130{\mu}m$ and 90nm standard CMOS library after designing it. We analyzed the results such as hardware resource, delay, and pipeline which are based on the theoretical and experimental estimation. We used Sakurai's alpha power low for the delay modeling. The proposed MAC has the superior properties to the standard design in many ways and its performance is twice as much than the previous research in the similar clock frequency.

JPEG2000 IP Design and Implementation for SoC Design (SoC를 위한 JPEG2000 IP 설계 및 구현)

  • 정재형;한상균;홍성훈;김영철
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2002.11a
    • /
    • pp.63-68
    • /
    • 2002
  • JPEG2000은 기존의 정지영상압축부호화 방식에 비해 우수한 비트율-왜곡(Rate-Distortion)특성과 향상된 주관적 화질을 제공하며 인터넷, 디지털 영상카메라, 이동단말기, 의학영상 등 다양한 분야에서 적용될 수 있는 새로운 정지영상압축 표준이다. 본 논문에서는 SoC(System on a Chip)설계를 고려한 JPEG2000 인코더의 구조를 제안하고 IP(Intellectual Property)를 설계 및 검증하였다. 구현된 JPEG2000 IP는 DWT(Discrete Wavelet Transform)블록, 스칼라양자화블록, EBCOT(Embedded Block Coding with Optimized Truncation)블록으로 구성되어 있다. IP는 모의실험을 통해 구현 구조에 대한 타당성을 검증하였고, 반도체설계자산연구센터에서 제시한 'RTL Coding Guideline'에 따라 HDL을 설계하였다. 특히, DWT블록은 구현시 많은 연산과 메모리 용량이 필요하므로 영상을 저장할 외부 메모리를 사용하였고, 빠른 곱셈과 덧셈연산을 위한 3단 파이프라인 부스곱셈기(3-state pipeline booth multiplier)와 캐리예측 덧셈기(carry lookahead adder)를 사용하였다. 설계된 JPEG2000 IP들은 삼성 0.35$\mu\textrm{m}$ 라이브러리를 이용하여 Synopsys사 Design Analyzer 틀을 통해 논리 합성하였으며, Xillinx 100만 게이트 FPGA칩에 구현하여 그 동작을 검증하였다. 또한, Hard IP 설계를 위해 Avanti사의 Apollo툴을 이용하여 Layout을 수행하였다.

  • PDF

An ACS for a Viterbi Decoder Using a High-Speed Low-Power Comparator (고속 저전력 비교기를 사용한 비터비 검출기용 ACS)

  • Hong You-Pyo;Lee Jae-Jin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.1A
    • /
    • pp.1-8
    • /
    • 2004
  • Viterbi decoders are widely used for communication and high-density storage devices. An add-compare-select(ACS) unit has been an active research area for a long time because it is the most critical component in determining the operation speed and power-consumption of the Viterbi decoder. We propose a new comparator which is faster and consumes less power than existing ones. We also used the new comparator for a Viterbi decoder and our simulations results show the Viterbi decoder outperforms existing ones at least $20\%$ in its operating speed.

Design of Low-Area HEVC Core Transform Architecture (저면적 HEVC 코어 변환기 아키텍쳐 설계)

  • Han, Seung-Mok;Nam, Woo-Jin;Lee, Seongsoo
    • Journal of IKEEE
    • /
    • v.17 no.2
    • /
    • pp.119-128
    • /
    • 2013
  • This paper proposes and implements an core transform architecture, which is one of the major processes in HEVC video compression standard. The proposed core transform architecture is implemented with only adders and shifters instead of area-consuming multipliers. Shifters in the proposed core transform architecture are implemented in wires and multiplexers, which significantly reduces chip area. Also, it can process from $4{\times}4$ to $16{\times}16$ blocks with common hardware by reusing processing elements. Designed core transform architecture in 0.13um technology can process a $16{\times}16$ block with 2-D transform in 130 cycles, and its gate count is 101,015 gates.

A Study on Video Encoder Implementation having Pipe-line Structure (Pipe-line 구조를 갖는 Video Encoder 구현에 관한 연구)

  • 이인섭;이완범;김환용
    • Journal of the Korea Computer Industry Society
    • /
    • v.2 no.9
    • /
    • pp.1183-1190
    • /
    • 2001
  • In this paper, it used a different pipeline method from conventional method which is encoding the video signal of analog with digital. It designed with pipeline structure of 4 phases as the pixel clock ratio of the whole operation of the encoder, and secured the stable operational timing of the each sub-blocks, it was visible the effect which reduces a gate possibility as designing by the ROM table or the shift and adder method which is not used a multiplication flag method of case existing of multiplication of the fixed coefficient. The designed encoder shared with the each sub-block and it designed the FPGA using MAX+PLUS2 with VHDL.

  • PDF

A High-Speed Matched Filter for Searching Synchronization in DSSS Receiver (DSSS 수신기에서 동기탐색을 위한 고속 정합필터)

  • 송명렬
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.10C
    • /
    • pp.999-1007
    • /
    • 2002
  • In this paper, the operation of matched filter for searching initial synchronization in direct sequence spread spectrum receiver is studied. The implementation model of the matched filter by HDL (Hardware Description Language) is proposed. The model has an architecture based on parallelism and pipeline for fast processing, which includes circular buffer, multiplier, adder, and code look-up table. The performance of the model is analyzed and compared with the implementation by a conventional digital signal processor. It is implemented on a FPGA (Field Programmable Gate Array) and its operation is validated in a timing simulation result.

Liner Cubic Convolution Interpolation Algorithm with Low Computational Complexity (연산량을 감소시킨 선형 Cubic Convolution 보간 알고리즘)

  • Jun Young-Hyun;Yun Jong-Ho;Choi Myung-Ryul
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.06b
    • /
    • pp.385-387
    • /
    • 2006
  • 본 논문에서는 Cubic Convolution 보간 알고리즘을 변형하여 연산량을 감소시키고 에지를 강조하는 보간 알고리즘을 제안한다. 제안된 알고리즘은 디지털 영상의 확대 또는 축소에 필요한 연산량을 줄이기 위해 두가지 방법을 사용하였다. 기존의 Cubic Convolution 알고리즘의 고차항의 가중치 연산을 일차원으로 변환하였다. 인접한 픽셀의 차이값을 사용하여 Bilinear 알고리즘을 제한적으로 적용하였다. 제안된 알고리즘의 화질 평가를 위해 원영상의 확대-후-축소와 축소-후-확대를 하여 RMSE를 사용하였고, 연산량을 평가하기 위해 픽셀별 곱셈기와 덧셈기를 기존의 알고리즘과 비교하였다. 시뮬레이션 결과 기존 Cubic Convolution 알고리즘보다 연산량이 감소하였다.

  • PDF

Signal Detection for Pattern Dependent Noise Channel (신호패턴 종속잡음 채널을 위한 신호검출)

  • Jeon, Tae-Hyun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.14 no.5
    • /
    • pp.583-586
    • /
    • 2004
  • Transition jitter noise is one of major sources of detection errors in high density recording channels. Implementation complexity of the optimal detector for such channels is high due to the data dependency and correlated nature of the jitter noise. In this paper, two types of hardware efficient sub-optimal detectors are derived by modifying branch metric of Viterbi algorithm and applied to partial response (PR) channels combined with run length limited modulation coding. The additional complexity over the conventional Viterbi algorithm to incorporate the modified branch metric is either a multiplication or an addition for each branch metric in the Viterbi trellis.

A Study on Frequency Modulation Method to Reduce Time Interval Error (주파수 변조 기법에 의한 시간격 오차 개선에 대한 연구)

  • Ahn, Tae-Won;Lee, Won-Seok
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.53 no.2
    • /
    • pp.141-146
    • /
    • 2016
  • This paper presents a method to improve time interval error for asynchronous communication systems. The proposed method is designed and simulated with multi-phase VCO, interpolator, phase selector, up-down counter, comparator and adder. The simulation results for CAN communication system show that the maximum time interval error can be tightly managed for satisfying the required specification. The proposed frequency modulation method can be properly used for asynchronous communication systems requiring high reliability.

Low Power ADC Design for Mixed Signal Convolutional Neural Network Accelerator (혼성신호 컨볼루션 뉴럴 네트워크 가속기를 위한 저전력 ADC설계)

  • Lee, Jung Yeon;Asghar, Malik Summair;Arslan, Saad;Kim, HyungWon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.11
    • /
    • pp.1627-1634
    • /
    • 2021
  • This paper introduces a low-power compact ADC circuit for analog Convolutional filter for low-power neural network accelerator SOC. While convolutional neural network accelerators can speed up the learning and inference process, they have drawback of consuming excessive power and occupying large chip area due to large number of multiply-and-accumulate operators when implemented in complex digital circuits. To overcome these drawbacks, we implemented an analog convolutional filter that consists of an analog multiply-and-accumulate arithmetic circuit along with an ADC. This paper is focused on the design optimization of a low-power 8bit SAR ADC for the analog convolutional filter accelerator We demonstrate how to minimize the capacitor-array DAC, an important component of SAR ADC, which is three times smaller than the conventional circuit. The proposed ADC has been fabricated in CMOS 65nm process. It achieves an overall size of 1355.7㎛2, power consumption of 2.6㎼ at a frequency of 100MHz, SNDR of 44.19 dB, and ENOB of 7.04bit.