• 제목/요약/키워드: VLSI design

Search Result 488, Processing Time 0.024 seconds

A 200-MHz@2.5V 0.25-$\mu\textrm{m}$ CMOS Pipelined Adaptive Decision-Feedback Equalizer (200-MHz@2.5-V 0.25-$\mu\textrm{m}$ CMOS 파이프라인 적응 결정귀환 등화기)

  • 안병규;이종남;신경욱
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2000.05a
    • /
    • pp.465-469
    • /
    • 2000
  • This paper describes a single-chip full-custom implementation of pipelined adaptive decision-feedback equalizer (PADFE) using a 0.25-${\mu}{\textrm}{m}$ CMOS technology for wide-band wireless digital communication systems. To enhance the throughput rate of ADFE, two pipeline stage are inserted into the critical path of the ADFE by using delayed least-mean-square (DLMS) algorithm Redundant binary (RB) arithmetic is applied to all the data processing of the PADFE including filter taps and coefficient update blocks. When compared with conventional methods based on two's complement arithmetic, the proposed approach reduces arithmetic complexity, as well as results in a very simple complex-valued filter structure, thus suitable for VLSI implementation. The design parameters including pipeline stage, filter tap, coefficient and internal bit-width and equalization performance such as bit error rate (BER) and convergence speed are analyzed by algorithm-level simulation using COSSAP. The singl-chip PADFE contains about 205,000 transistors on an area of about 1.96$\times$1.35-$\textrm{mm}^2$. Simulation results show that it can safely operate with 200-MHz clock frequency at 2.5-V supply, and its estimated power dissipation is about 890-mW.

  • PDF

Variable Sampling Window Flip-Flops for High-Speed Low-Power VLSI (고속 저전력 VLSI를 위한 가변 샘플링 윈도우 플립-플롭의 설계)

  • Shin Sang-Dae;Kong Bai-Sun
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.8 s.338
    • /
    • pp.35-42
    • /
    • 2005
  • This paper describes novel flip-flops with improved robustness and reduced power consumption. Variable sampling window flip-flop (VSWFF) adjusts the width of the sampling window according to input data, providing robust data latching as well as shorter hold time. The flip-flop also reduces power consumption for higher input switching activities as compared to the conventional low-power flip-flop. Clock swing-reduced variable sampling window flip-flop (CSR-VSWFF) reduces clock power consumption by allowing the use of a small swing clock. Unlike conventional reduced clock swing flip-flops, it requires no additional voltage higher than the supply voltage, eliminating design overhead related to the generation and distribution of this voltage. Simulation results indicate that the proposed flip-flops provide uniform latency for narrower sampling window and improved power-delay product as compared to conventional flip-flops. To evaluate the performance of the proposed flip-flops, test structures were designed and implemented in a $0.3\mu m$ CMOS process technology. Experimental result indicates that VSWFF yields power reduction for the maximum input switching activity, and a synchronous counter designed with CSR-VSWFF improves performance in terms of power consumption with no use of extra voltage higher than the supply voltage.

Hardware optimized high quality image signal processor for single-chip CMOS Image Sensor (Single-chip CMOS Image Sensor를 위한 하드웨어 최적화된 고화질 Image Signal Processor 설계)

  • Lee, Won-Jae;Jung, Yun-Ho;Lee, Seong-Joo;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.5
    • /
    • pp.103-111
    • /
    • 2007
  • In this paper, we propose a VLSI architecture of hardware optimized high quality image signal processor for a Single-chip CMOS Image Sensor(CIS). The Single-chip CIS is usually used for mobile applications, so it has to be implemented as small as possible while maintaining the image quality. Several image processing algorithms are used in ISP to improve captured image quality. Among the several image processing blocks, demosaicing and image filter are the core blocks in ISP. These blocks need line memories, but the number of line memories is limited in a low cost Single-chip CIS. In our design, high quality edge-adaptive and cross channel correlation considered demosaicing algorithm is adopted. To minimize the number of required line memories for image filter, we share the line memories using the characteristics of demosaicing algorithm which consider the cross correlation. Based on the proposed method, we can achieve both high quality and low hardware complexity with a small number of line memories. The proposed method was implemented and verified successfully using verilog HDL and FPGA. It was synthesized to gate-level circuits using 0.25um CMOS standard cell library. The total logic gate count is 37K, and seven and half line memories are used.

A Stable Multilevel Partitioning Algorithm for VLSI Circuit Designs Using Adaptive Connectivity Threshold (가변적인 연결도 임계치 설정에 의한 대규모 집적회로 설계에서의 안정적인 다단 분할 방법)

  • 임창경;정정화
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.35C no.10
    • /
    • pp.69-77
    • /
    • 1998
  • This paper presents a new efficient and stable multilevel partitioning algorithm for VLSI circuit design. The performance of multilevel partitioning algorithms that are proposed to enhance the performance of previous iterative-improvement partitioning algorithms for large scale circuits, depend on choice of construction methods for partition hierarchy. As the most of previous multilevel partitioning algorithms forces experimental constraints on the process of hierarchy construction, the stability of their performances goes down. The lack of stability causes the large variation of partition results during multiple runs. In this paper, we minimize the use of experimental constraints and propose a new method for constructing partition hierarchy. The proposed method clusters the cells with the connection status of the circuit. After constructing the partition hierarchy, a partition improvement algorithm, HYIP$^{[11]}$ using hybrid bucket structure, unclusters the hierachy to get partition results. The experimental results on ACM/SIGDA benchmark circuits show improvement up to 10-40% in minimum outsize over the previous algorithm $^{[3] [4] [5] [8] [10]}$. Also our technique outperforms ML$^{[10]}$ represented multilevel partition method by about 5% and 20% for minimum and average custsize, respectively. In addition, the results of our algorithm with 10 runs are better than ML algorithm with 100 runs.

  • PDF

Design and Analysis of a Digit-Serial $AB^{2}$ Systolic Arrays in $GF(2^{m})$ ($GF(2^{m})$ 상에서 새로운 디지트 시리얼 $AB^{2}$ 시스톨릭 어레이 설계 및 분석)

  • Kim Nam-Yeun;Yoo Kee-Young
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.32 no.4
    • /
    • pp.160-167
    • /
    • 2005
  • Among finite filed arithmetic operations, division/inverse is known as a basic operation for public-key cryptosystems over $GF(2^{m})$ and it is computed by performing the repetitive $AB^{2}$ multiplication. This paper presents a digit-serial-in-serial-out systolic architecture for performing the $AB^2$ operation in GF$(2^{m})$. To obtain L×L digit-serial-in-serial-out architecture, new $AB^{2}$ algorithm is proposed and partitioning, index transformation and merging the cell of the architecture, which is derived from the algorithm, are proposed. Based on the area-time product, when the digit-size of digit-serial architecture, L, is selected to be less than about m, the proposed digit-serial architecture is efficient than bit-parallel architecture, and L is selected to be less than about $(1/5)log_{2}(m+1)$, the proposed is efficient than bit-serial. In addition, the area-time product complexity of pipelined digit-serial $AB^{2}$ systolic architecture is approximately $10.9\%$ lower than that of nonpipelined one, when it is assumed that m=160 and L=8. Additionally, since the proposed architecture can be utilized for the basic architecture of crypto-processor and it is well suited to VLSI implementation because of its simplicity, regularity and pipelinability.

A New Algorithm and High-Performance Hardware Design for 2-Dimensional Parallel Generation of Digital Hologram (디지털 홀로그램의 2차원적인 병렬 생성을 위한 알고리즘 및 고성능 하드웨어 설계)

  • Yang, Wol-Sung;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.1
    • /
    • pp.133-142
    • /
    • 2012
  • In this paper, we propose and implement a high-speed algorithm for CGH that is to calculate digital hologram by modeling the interference phenomenon for tow lights. This algorithm changes the computation equations into a parallel-computable ones and implements it with a structure consisting of two kinds of cells (initial calculation cell, and update calculation cell). The parallel computation algorithm is to get the rest hologram pixels concurrently after calculation the first hologram column. Here, the initial calculation cells compute the first column of the hologram and the update calculation cells compute the rest of the hologram. The two kinds of cells performs a pipeline operation to complete the operations of the two cells at the same time. A CGH calculator to compute the hole hologram for a light source is structured by arranging the two kinds of cells. Results from simulation showed that the maximum operation frequency is about 215MHz. So, experiments are performed by setting this frequency and the same environments as the method showing the best performance. As the results, the proposed one could complete the computation of 81.75 CGH frames per second, while the previous method computes 62.9 CGH frames per second.

Efficient programmable power-of-two scaler for the three-moduli set {2n+p, 2n - 1, 2n+1 - 1}

  • Taheri, MohammadReza;Navi, Keivan;Molahosseini, Amir Sabbagh
    • ETRI Journal
    • /
    • v.42 no.4
    • /
    • pp.596-607
    • /
    • 2020
  • Scaling is an important operation because of the iterative nature of arithmetic processes in digital signal processors (DSPs). In residue number system (RNS)-based DSPs, scaling represents a performance bottleneck based on the complexity of intermodulo operations. To design an efficient RNS scaler for special moduli sets, a body of literature has been dedicated to the study of the well-known moduli sets {2n - 1, 2n, 2n + 1} and {2n, 2n - 1, 2n+1 - 1}, and their extension in vertical or horizontal forms. In this study, we propose an efficient programmable RNS scaler for the arithmetic-friendly moduli set {2n+p, 2n - 1, 2n+1 - 1}. The proposed algorithm yields high speed and energy-efficient realization of an RNS programmable scaler based on the effective exploitation of the mixed-radix representation, parallelism, and a hardware sharing technique. Experimental results obtained for a 130 nm CMOS ASIC technology demonstrate the superiority of the proposed programmable scaler compared to the only available and highly effective hybrid programmable scaler for an identical moduli set. The proposed scaler provides 43.28% less power consumption, 33.27% faster execution, and 28.55% more area saving on average compared to the hybrid programmable scaler.

A Built-In Self-Test Architecture using Self-Scan Chains (자체 스캔 체인을 이용한 Built-In Self-Test 구조에 관한 연구)

  • Han, Jin-Uk;Min, Hyeong-Bok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.3
    • /
    • pp.85-97
    • /
    • 2002
  • STUMPS has been widely used for built-in self-test of scan design with multiple scan chains. In the STUMPS architecture, there is very high correlation between the bit sequences in the adjacent scan chains. This correlation causes circuits lower the fault coverage. In order to solve this problem, an extra combinational circuit block(phase shifter) is placed between the LFSR and the inputs of STUMPS architecture despite the hardware overhead increase. This paper introduces an efficient test pattern generation technique and built-in self-test architecture for sequential circuits with multiple scan chains. The proposed test pattern generator is not used the input of LFSR and phase shifter, hence hardware overhead can be reduced and sufficiently high fault coverage is obtained. Only several XOR gates in each scan chain are required to modify the circuit for the scan BIST, so that the design is very simple.

Design of Low Error Fixed-Width Group CSD Multiplier (저오차 고정길이 그룹 CSD 곱셈기 설계)

  • Kim, Yong-Eun;Cho, Kyung-Ju;Chung, Jin-Gyun
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.9
    • /
    • pp.33-38
    • /
    • 2009
  • The group CSD (GCSD) multiplier was recently proposed based on the variation of canonic signed digit (CSD) encoding and partial product sharing. This multiplier provides an efficient design when the multiplications are performed only with a few predetermined coefficients (e.g., FFT). In many DSP applications such as FFT, the (2W-1)-bit product obtained from W-bit multiplicand and W-bit multiplier is quantized to W-bits by eliminating the (W-1) least-significant bits. This paper presents an error compensation method for a fixed-width GCSD multiplier that receives a W-bit input and produces a W-bit product. To efficiently compensate for the quantization error, the encoded signals from the GCSD multiplier are used for the generation of error compensation bias. By Synopsys simulations, it is shown that the proposed method leads to up to 84% reduction in power consumption and up to 79% reduction in area compared with the fixed-width modified Booth multiplier.

Educational System Design of RFID/USN (RFID/USN 교육용 시스템의 설계)

  • Kim, Dae-Hee;Oh, Do-Bong;Jung, Joong-soo;Jung, Kwang-wook
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2009.05a
    • /
    • pp.687-692
    • /
    • 2009
  • This paper presents the development of RFID educational system based on 900MHz air interface between the reader and the active tag. The software of reader and the active tag is developed on embedded environment, and the software of PC controlling the reader is based on window OS operated as the server. The ATmega128 VLSI chip is used for the processor of the reader and the active tag. As the development environment, AVR compiler is used for the reader and the active tag of which the programming language is C. The visual C++language of the visual studio on the PC activated as the server is used for development language. Main functions of this system are to control tag containing EPC global Data by PC through the reader, to obtain information of tag through the internet and to read/write data on tag memory. Software design of 900MHz RFID/USN educational system is done on the basis of these functions.

  • PDF