• Title/Summary/Keyword: low-complexity hardware architecture

Search Result 86, Processing Time 0.023 seconds

A High-Speed 2-Parallel Radix-$2^4$ FFT Processor for MB-OFDM UWB Systems (MB-OFDM UWB 통신 시스템을 위한 고속 2-Parallel Radix-$2^4$ FFT 프로세서의 설계)

  • Lee, Jee-Sung;Lee, Han-Ho
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.533-534
    • /
    • 2006
  • This paper presents the architecture design of a high-speed, low-complexity 128-point radix-$2^4$ FFT processor for ultra-wideband (UWB) systems. The proposed high-speed, low-complexity FFT architecture can provide a higher throughput rate and low hardware complexity by using 2-parallel data-path scheme and single-path delay-feedback (SDF) structure. This paper presents the key ideas applied to the design of high-speed, low-complexity FFT processor, especially that for achieving high throughput rate and reducing hardware complexity. The proposed FFT processor has been designed and implemented with the 0.18-m CMOS technology in a supply voltage of 1.8 V. The throughput rate of proposed FFT processor is up to 1 Gsample/s while it requires much smaller hardware complexity.

  • PDF

High-Performance and Low-Complexity Image Pre-Processing Method Based on Gradient-Vector Characteristics and Hardware-Block Sharing

  • Kim, Woo Suk;Lee, Juseong;An, Ho-Myoung;Kim, Jooyeon
    • Transactions on Electrical and Electronic Materials
    • /
    • v.18 no.6
    • /
    • pp.320-322
    • /
    • 2017
  • In this paper, a high-performance, low-area gradient-magnitude calculator architecture is proposed, based on approximate image processing. To reduce the computational complexity of the gradient-magnitude calculation, vector properties, the symmetry axis, and common terms were applied in a hardware-resource-shared architec-ture. The proposed gradient-magnitude calculator was implemented using an Altera Cyclone IV FPGA (EP4CE115F29) and the Quartus II v.16 device software. It satisfied the output-data quality while reducing the logic elements by 23% and the embedded multipliers by 76%, compared with previous work.

Low Complexity Gradient Magnitude Calculator Hardware Architecture Using Characteristic Analysis of Projection Vector and Hardware Resource Sharing (정사영 벡터의 특징 분석 및 하드웨어 자원 공유기법을 이용한 저면적 Gradient Magnitude 연산 하드웨어 구현)

  • Kim, WooSuk;Lee, Juseong;An, Ho-Myoung
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.9 no.4
    • /
    • pp.414-418
    • /
    • 2016
  • In this paper, a hardware architecture of low area gradient magnitude calculator is proposed. For the hardware complexity reduction, the characteristic of orthogonal projection vector and hardware resource sharing technique are applied. The proposed hardware architecture can be implemented without degradation of the gradient magnitude data quality since the proposed hardware is implemented with original algorithm. The FPGA implementation result shows the 15% of logic elements and 38% embedded multiplier savings compared with previous work using Altera Cyclone VI (EP4CE115F29C7N) FPGA and Quartus II v15.0 environment.

High-Throughput Low-Complexity Successive-Cancellation Polar Decoder Architecture using One's Complement Scheme

  • Kim, Cheolho;Yun, Haram;Ajaz, Sabooh;Lee, Hanho
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.15 no.3
    • /
    • pp.427-435
    • /
    • 2015
  • This paper presents a high-throughput low-complexity decoder architecture and design technique to implement successive-cancellation (SC) polar decoding. A novel merged processing element with a one's complement scheme, a main frame with optimal internal word length, and optimized feedback part architecture are proposed. Generally, a polar decoder uses a two's complement scheme in merged processing elements, in which a conversion between two's complement and sign-magnitude requires an adder. However, the novel merged processing elements do not require an adder. Moreover, in order to reduce hardware complexity, optimized main frame and feedback part approaches are also presented. A (1024, 512) SC polar decoder was designed and implemented using 40-nm CMOS standard cell technology. Synthesis results show that the proposed SC polar decoder can lead to a 13% reduction in hardware complexity and a higher clock speed compared to conventional decoders.

Image Filter Optimization Method based on common sub-expression elimination for Low Power Image Feature Extraction Hardware Design (저전력 영상 특징 추출 하드웨어 설계를 위한 공통 부분식 제거 기법 기반 이미지 필터 하드웨어 최적화)

  • Kim, WooSuk;Lee, Juseong;An, Ho-Myoung;Kim, Byungcheul
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.10 no.2
    • /
    • pp.192-197
    • /
    • 2017
  • In this paper, image filter optimization method based on common sub-expression elimination is proposed for low-power image feature extraction hardware design. Low power and high performance object recognition hardware is essential for industrial robot which is used for factory automation. However, low area Gaussian gradient filter hardware design is required for object recognition hardware. For the hardware complexity reduction, we adopt the symmetric characteristic of the filter coefficients using the transposed form FIR filter hardware architecture. The proposed hardware architecture can be implemented without degradation of the edge detection data quality since the proposed hardware is implemented with original Gaussian gradient filtering algorithm. The expremental result shows the 50% of multiplier savings compared with previous work.

Gradient Magnitude Hardware Architecture based on Hardware Folding Design Method for Low Power Image Feature Extraction Hardware Design (저전력 영상 특징 추출 하드웨어 설계를 위한 하드웨어 폴딩 기법 기반 그라디언트 매그니튜드 연산기 구조)

  • Kim, WooSuk;Lee, Juseong;An, Ho-Myoung
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.10 no.2
    • /
    • pp.141-146
    • /
    • 2017
  • In this paper, a gradient magnitude hardware architecture based on hardware folding design method is proposed for low power image feature extraction. For the hardware complexity reduction, the projection vector chracteristic of gradient magnitude is applied. The proposed hardware architecture can be implemented with the small degradation of the gradient magnitude data quality. The FPGA implementation result shows the 41% of logic elements and 62% embedded multiplier savings compared with previous work using Altera Cyclone VI (EP4CE115F29C7N) FPGA and Quartus II v16.0 environment.

Low Complexity Architecture for Fast-Serial Multiplier in $GF(2^m)$ ($GF(2^m)$ 상의 저복잡도 고속-직렬 곱셈기 구조)

  • Cho, Yong-Suk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.17 no.4
    • /
    • pp.97-102
    • /
    • 2007
  • In this paper, a new architecture for fast-serial $GF(2^m)$ multiplier with low hardware complexity is proposed. The fast-serial multiplier operates standard basis of $GF(2^m)$ and is faster than bit serial ones but with lower area complexity than bit parallel ones. The most significant feature of the fast-serial architecture is that a trade-off between hardware complexity and delay time can be achieved. But The traditional fast-serial architecture needs extra (t-1)m registers for achieving the t times speed. In this paper a new fast-serial multiplier without increasing the number of registers is presented.

A Low-complexity Mixed QR Decomposition Architecture for MIMO Detector (MIMO 검출기에 적용 가능한 저 복잡도 복합 QR 분해 구조)

  • Shin, Dongyeob;Kim, Chulwoo;Park, Jongsun
    • Journal of IKEEE
    • /
    • v.18 no.1
    • /
    • pp.165-171
    • /
    • 2014
  • This paper presents a low complexity QR decomposition (QRD) architecture for MIMO detector. In the proposed approach, various CORDIC-based QRD algorithms are efficiently combined together to reduce the computational complexity of the QRD hardware. Based on the computational complexity analysis on various QRD algorithms, a low complexity approach is selected at each stage of QRD process. The proposed QRD architecture can be applied to any arbitrary dimension of channel matrix, and the complexity reduction grows with the increasing matrix dimension. Our QR decomposition hardware was implemented using Samsung $0.13{\mu}m$ technology. The numerical results show that the proposed architecture achieves 47% increase in the QAR (QRD Rate/Gate count) with 28.1% power savings over the conventional Householder CORDIC-based architecture for the $4{\times}4$ matrix decomposition.

Low Complexity Digit-Parallel/Bit-Serial Polynomial Basis Multiplier (저복잡도 디지트병렬/비트직렬 다항식기저 곱셈기)

  • Cho, Yong-Suk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.35 no.4C
    • /
    • pp.337-342
    • /
    • 2010
  • In this paper, a new architecture for digit-parallel/bit-serial GF($2^m$) multiplier with low complexity is proposed. The proposed multiplier operates in polynomial basis of GF($2^m$) and produces multiplication results at a rate of one per D clock cycles, where D is the selected digit size. The digit-parallel/bit-serial multiplier is faster than bit-serial ones but with lower area complexity than bit-parallel ones. The most significant feature of the digit-parallel/bit-serial architecture is that a trade-off between hardware complexity and delay time can be achieved. But the traditional digit-parallel/bit-serial multiplier needs extra hardware for high speed. In this paper a new low complexity efficient digit-parallel/bit-serial multiplier is presented.

Low-Complexity Triple-Error-Correcting Parallel BCH Decoder

  • Yeon, Jaewoong;Yang, Seung-Jun;Kim, Cheolho;Lee, Hanho
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.13 no.5
    • /
    • pp.465-472
    • /
    • 2013
  • This paper presents a low-complexity triple-error-correcting parallel Bose-Chaudhuri-Hocquenghem (BCH) decoder architecture and its efficient design techniques. A novel modified step-by-step (m-SBS) decoding algorithm, which significantly reduces computational complexity, is proposed for the parallel BCH decoder. In addition, a determinant calculator and a error locator are proposed to reduce hardware complexity. Specifically, a sharing syndrome factor calculator and a self-error detection scheme are proposed. The multi-channel multi-parallel BCH decoder using the proposed m-SBS algorithm and design techniques have considerably less hardware complexity and latency than those using a conventional algorithms. For a 16-channel 4-parallel (1020, 990) BCH decoder over GF($2^{12}$), the proposed design can lead to a reduction in complexity of at least 23 % compared to conventional architecttures.