• Title/Summary/Keyword: bit-serial

Design of a scalable general-purpose parallel associative processor using content-addressable memory (Content-Addressable Memory를 이용한 확장 가능한 범용 병렬 Associative Processor 설계)

  • Park, Tae-Geun
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.43 no.2 s.344
    • /
    • pp.51-59
    • /
    • 2006
  • The Von Neumann architecture suffers from the interface between the central processing unit and the memory, a limitation known as the 'Von Neumann bottleneck'. In this paper, we propose a scalable general-purpose associative processor (AP) based on content-addressable memory (CAM) which addresses this problem and is suitable for search-oriented applications. We propose an efficient instruction set and a structural scalability that extends to larger applications. We define twelve instructions and provide some reduced instructions that speed up execution by performing two instructions in a single instruction cycle. The proposed AP operates in a bit-serial, word-parallel fashion and can be considered a 32-bit general-purpose parallel processor with a massively parallel SIMD structure. We design and simulate a maximum/minimum search, a greater-than/less-than search, and parallel addition to verify the proposed architecture. The algorithms execute in constant time O(k) regardless of the number of input data.
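
The bit-serial, word-parallel maximum search mentioned in this abstract can be sketched in software as follows. This is a minimal illustration, not the authors' instruction sequence: it assumes k denotes the word length in bits, and the function name `bit_serial_max` and the tag-bit list are hypothetical. Each loop iteration stands for one search step that a CAM would apply to all words at once; the Python list comprehension only models that parallelism and is itself O(n).

```python
def bit_serial_max(words, k=32):
    """Return the indices of the words holding the maximum value.

    The scan runs MSB-first over k bit positions, so the number of
    search steps depends on k only, not on len(words).
    """
    # One tag bit per stored word: True means "still a candidate".
    candidates = [True] * len(words)
    for bit in range(k - 1, -1, -1):
        # Search step: every candidate word tests this bit position.
        hits = [c and ((w >> bit) & 1) == 1 for c, w in zip(candidates, words)]
        # If any candidate has a 1 here, words with a 0 cannot be the maximum.
        if any(hits):
            candidates = hits
    return [i for i, c in enumerate(candidates) if c]


if __name__ == "__main__":
    print(bit_serial_max([5, 12, 7, 12, 3], k=4))
```

A minimum search would follow the same pattern with the bit test inverted.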

Performance Of Iterative Decoding Schemes As Various Channel Bit-Densities On The Perpendicular Magnetic Recording Channel (수직자기기록 채널에서 기록 밀도에 따른 반복복호 기법의 성능)

  • Park, Dong-Hyuk;Lee, Jae-Jin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.35 no.7C
    • /
    • pp.611-617
    • /
    • 2010
  • In this paper, we investigate the performance of serial concatenated convolutional codes (SCCC) and low-density parity-check (LDPC) codes on perpendicular magnetic recording (PMR) channels. We discuss the performance of the two systems at user bit-densities of 1.7, 2.0, 2.4 and 2.8. The SCCC system is less complex than the LDPC system. The SCCC system consists of a recursive systematic convolutional (RSC) code encoder/decoder, a precoder and a random interleaver. The decoding algorithm of the SCCC system is the soft message-passing algorithm, and the decoding algorithm of the LDPC system is the log-domain sum-product algorithm (SPA). When we apply iterative decoding between the channel detector and the error control code (ECC) decoder, the SCCC system is comparable to the LDPC system even at high user bit density.

Fault Detection Architecture of the Field Multiplication Using Gaussian Normal Bases in $GF(2^n)$ (가우시안 정규기저를 갖는 $GF(2^n)$의 곱셈에 대한 오류 탐지)

  • Kim, Chang Han;Chang, Nam Su;Park, Young Ho
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.24 no.1
    • /
    • pp.41-50
    • /
    • 2014
  • In this paper, we propose an error detection scheme for Gaussian normal basis multipliers over $GF(2^n)$. It is shown that, by using parity prediction, error detection can be constructed very simply in hardware. The hardware overhead is only one AND gate, n+1 XOR gates, and one 1-bit register in serial multipliers, and n AND gates and 2n-1 XOR gates in parallel multipliers. The method detects any odd number of bit faults in C = AB.
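
The claim that a single parity check catches any odd number of bit faults can be illustrated with a short sketch. It does not reproduce the paper's parity prediction logic for Gaussian normal basis multipliers; as a stand-in assumption, the predicted parity is simply taken from a fault-free product, and the names `parity` and `fault_detected` are hypothetical.

```python
def parity(x):
    """Return the XOR of all bits of the non-negative integer x."""
    return bin(x).count("1") & 1


def fault_detected(c_observed, predicted_parity):
    """Flag a fault when the observed parity differs from the prediction."""
    return parity(c_observed) != predicted_parity


if __name__ == "__main__":
    c = 0b1011001                    # stand-in for a fault-free product C = AB
    p = parity(c)                    # assumed to come from the predictor
    print(fault_detected(c ^ 0b0000100, p))   # one flipped bit
    print(fault_detected(c ^ 0b0010101, p))   # three flipped bits
    print(fault_detected(c ^ 0b0000011, p))   # two flipped bits
```

Flipping an even number of bits leaves the parity unchanged, which is why the abstract limits the guarantee to odd fault counts.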

Implementation of sigma-delta A/D converter IP for digital audio

  • Park SangBong;Lee YoungDae
    • Proceedings of the IEEK Conference
    • /
    • summer
    • /
    • pp.199-203
    • /
    • 2004
  • In this paper, we describe only the digital block of a two-channel 18-bit analog-to-digital (A/D) converter employing the sigma-delta method and x128 decimation. The device contains two fourth-order comb filters with 1-bit input from the sigma-delta modulator, each followed by digital half-band FIR (Finite Impulse Response) filters. The external analog sigma-delta modulators are sampled at 6.144MHz and the digital words are output at 48kHz. The fourth-order comb filter is designed in three different ways to optimize power consumption and signal-to-noise ratio. The three digital filters that follow are designed with 12, 22 and 116 taps to meet the specification. These filters eliminate images of the baseband audio signal that exist at multiples of the input sample rate. We also design these filters with 8-bit and 16-bit filter coefficients to analyze the signal-to-noise ratio and hardware complexity. The design also includes a digital output interface block for the I2S serial data protocol, a test circuit and an internal input vector generator. It is fabricated with the 0.35um HYNIX standard CMOS cell library at a 3.3V supply voltage, and the chip size is 2000um by 2000um. The function and performance have been verified using the Verilog-XL logic simulator and Matlab.
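
A fourth-order comb (CIC) decimator of the kind described here can be sketched as below. This is an illustrative model under stated assumptions, not the fabricated design: the abstract does not say how the x128 decimation is split between the comb filter and the half-band stages, so the comb ratio of 16 is hypothetical (chosen so that three rate-halving stages would give 16 x 2 x 2 x 2 = 128), and the function name `cic_decimate` is invented for the example.

```python
def cic_decimate(samples, order=4, ratio=16):
    """Cascaded integrator-comb decimator with differential delay 1.

    Integrators run at the input rate; combs run at the output rate.
    Python integers do not overflow, so no register width is modelled.
    """
    integrators = [0] * order
    comb_delays = [0] * order
    out = []
    for n, x in enumerate(samples):
        acc = x
        for i in range(order):          # integrator cascade at the high rate
            integrators[i] += acc
            acc = integrators[i]
        if (n + 1) % ratio == 0:        # keep every ratio-th sample
            y = acc
            for i in range(order):      # comb cascade at the low rate
                y, comb_delays[i] = y - comb_delays[i], y
            out.append(y)
    return out


if __name__ == "__main__":
    # Rate check taken from the abstract: 6.144 MHz in, 48 kHz out.
    assert 6_144_000 // 128 == 48_000
    # A constant input settles to the DC gain ratio ** order.
    print(cic_decimate([1] * (16 * 8))[-1])
```

The DC gain of ratio ** order is what determines how many bits the internal registers need in a hardware version.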

A 1.7 Gbps DLL-Based Clock Data Recovery for a Serial Display Interface in 0.35-${\mu}m$ CMOS

  • Moon, Yong-Hwan;Kim, Sang-Ho;Kim, Tae-Ho;Park, Hyung-Min;Kang, Jin-Ku
    • ETRI Journal
    • /
    • v.34 no.1
    • /
    • pp.35-43
    • /
    • 2012
  • This paper presents a delay-locked-loop-based clock and data recovery (CDR) circuit design with an nB(n+2)B data formatting scheme for a high-speed serial display interface. The nB(n+2)B data is formatted by inserting a '01' clock information pattern into every n-bit piece of data. The proposed CDR recovers clock and data in 1:10 demultiplexed form without an external reference clock. To validate the feasibility of the scheme, a 1.7-Gbps CDR based on the proposed scheme is designed, simulated, and fabricated. Input data patterns were formatted as 10B12B for a high-performance display interface. The proposed CDR consumes approximately 8 mA under a 3.3-V power supply using a 0.35-${\mu}m$ CMOS process, and the measured peak-to-peak jitter of the recovered clock is 44 ps.
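
The nB(n+2)B formatting can be sketched as a pair of helper functions. This is a minimal illustration with hypothetical names (`format_nb_n2b`, `strip_nb_n2b`); the abstract does not state where the '01' pattern sits relative to the data bits, so appending it after each n-bit word is an assumption.

```python
def format_nb_n2b(bits, n=10, pattern="01"):
    """Append the clock pattern to every n-bit word of a bit string."""
    if len(bits) % n:
        raise ValueError("input length must be a multiple of n")
    return "".join(bits[i:i + n] + pattern for i in range(0, len(bits), n))


def strip_nb_n2b(stream, n=10, pattern="01"):
    """Remove the clock pattern again, checking that it is present."""
    step = n + len(pattern)
    if len(stream) % step:
        raise ValueError("stream length must be a multiple of n + 2")
    words = []
    for i in range(0, len(stream), step):
        frame = stream[i:i + step]
        if frame[n:] != pattern:
            raise ValueError("missing clock pattern")
        words.append(frame[:n])
    return "".join(words)


if __name__ == "__main__":
    data = "1111111111" + "0000000000"
    framed = format_nb_n2b(data)        # 10B12B: 20 data bits become 24
    print(framed)
    print(strip_nb_n2b(framed) == data)
```

Because every frame contains the '01' pair, the stream has a guaranteed transition at a fixed spacing even when the data bits are constant, which is what a receiver without a reference clock can lock onto.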

Parallel Implementation of the Recursive Least Square for Hyperspectral Image Compression on GPUs

  • Li, Changguo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.7
    • /
    • pp.3543-3557
    • /
    • 2017
  • Compression is a very important technique for remotely sensed hyperspectral images. Lossless compression based on the recursive least square (RLS), which eliminates the redundancy of hyperspectral images using both spatial and spectral correlations, is an extremely powerful tool for this purpose, but its relatively high computational complexity limits its application in time-critical scenarios. In order to improve the computational efficiency of the algorithm, we optimize its serial version and develop a new parallel implementation on graphics processing units (GPUs). Namely, an optimized recursive least square based on the optimal number of prediction bands is first introduced. Then we use this approach as a case study to illustrate the advantages and potential challenges of applying GPU parallel optimization principles to the considered problem. The proposed parallel method properly exploits the low-level architecture of GPUs and has been implemented using the compute unified device architecture (CUDA). The GPU parallel implementation is compared with the serial implementation on a CPU. Experimental results indicate remarkable acceleration factors and real-time performance, while retaining exactly the same bit rate as the serial version of the compressor.

Combined Horizontal-Vertical Serial BP Decoding of GLDPC Codes with Binary Cyclic Codes (이진 순환 부호를 쓰는 GLDPC 부호의 수평-수직 결합 직렬 복호)

  • Chung, Kyuhyuk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.39A no.10
    • /
    • pp.585-592
    • /
    • 2014
  • It is well known that serial belief propagation (BP) decoding for low-density parity-check (LDPC) codes converges faster than standard parallel BP (PBP) decoding, without any increase in decoding complexity per iteration or loss in bit error rate (BER) performance. Serial BP (SBP) decoding, such as horizontal SBP (H-SBP) decoding or vertical SBP (V-SBP) decoding, updates check nodes or variable nodes faster than standard PBP decoding within a single iteration. In this paper, we propose combined horizontal-vertical SBP (CHV-SBP) decoding. By the same reasoning, CHV-SBP decoding updates check nodes or variable nodes faster than SBP decoding within a serialized step in an iteration. CHV-SBP decoding achieves faster convergence than H-SBP or V-SBP decoding. We compare these decoding schemes in detail. We also show in simulations that the convergence rate, in iterations, for CHV-SBP decoding is about $\frac{1}{6}$ of that for standard PBP decoding, while the convergence rate for SBP decoding is about $\frac{1}{2}$ of that for standard PBP decoding. In simulations, we use recently proposed generalized LDPC (GLDPC) codes with binary cyclic codes (BCC).

Design of a systolic radix-4 finite-field multiplier for the elliptic curve cryptosystem (타원곡선 암호를 위한 시스톨릭 Radix-4 유한체 곱셈기의 설계)

  • Kim, Ju-Young;Park, Tae-Geun
    • Proceedings of the IEEK Conference
    • /
    • 2005.11a
    • /
    • pp.695-698
    • /
    • 2005
  • Finite-field multiplication is used in a wide range of applications, such as signal processing in communications and cryptography. However, an efficient algorithm and hardware design are required, since finite-field multiplication is time-consuming to compute. In this paper, we propose a radix-4 systolic multiplier on $GF(2^m)$ with competitive area and performance. The algorithm of the proposed standard-basis multiplier is mathematically developed to map onto a low-cost systolic cell, so that the proposed systolic architecture is suitable for VLSI design. Compared to bit-serial and digit-serial multipliers, the proposed multiplier shows relatively better performance at low cost. We design and synthesize a $GF(2^{193})$ finite-field multiplier using the Hynix $0.35{\mu}m$ standard cell library, and the maximum clock frequency is 400MHz.
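
The difference between a bit-serial and a radix-4 standard-basis multiplier can be sketched in software. This is not the authors' systolic architecture, only an illustration of consuming one bit versus two bits of the multiplier per step. The function names are hypothetical, and the test field $GF(2^8)$ with the AES polynomial 0x11B is an assumption made so the result can be checked against a well-known value; the abstract does not give the polynomial used for $GF(2^{193})$.

```python
def gf2m_mul_bit_serial(a, b, m, poly):
    """Multiply a and b in GF(2^m), MSB-first, one bit of b per step.

    poly is the irreducible polynomial including its x^m term.
    """
    acc = 0
    for i in range(m - 1, -1, -1):
        acc <<= 1                        # multiply the accumulator by x
        if (acc >> m) & 1:
            acc ^= poly                  # reduce modulo the field polynomial
        if (b >> i) & 1:
            acc ^= a
    return acc


def gf2m_mul_radix4(a, b, m, poly):
    """Same product, consuming two bits of b per step (about m/2 steps)."""
    a_x = a << 1                         # a * x, reduced
    if (a_x >> m) & 1:
        a_x ^= poly
    table = [0, a, a_x, a ^ a_x]         # a times each 2-bit digit
    acc = 0
    for j in range((m + 1) // 2 - 1, -1, -1):
        for _ in range(2):               # multiply the accumulator by x^2
            acc <<= 1
            if (acc >> m) & 1:
                acc ^= poly
        acc ^= table[(b >> (2 * j)) & 3]
    return acc


if __name__ == "__main__":
    # Worked example from FIPS-197: {57} * {83} = {c1} in the AES field.
    print(hex(gf2m_mul_bit_serial(0x57, 0x83, 8, 0x11B)))
    print(hex(gf2m_mul_radix4(0x57, 0x83, 8, 0x11B)))
```

Halving the number of steps costs a small lookup of precomputed multiples, which is the usual trade-off in digit-serial designs.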

A portable multichannel FES system for control of paralyzed extremities (마비된 말단근육의 제어를 위한 휴대용 다중 채널의 기능적 전기자극(FES) 장치)

  • 류영재;박봉기;김영민;임영철;김하경
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집)
    • /
    • 1992.10a
    • /
    • pp.90-94
    • /
    • 1992
  • A portable multichannel functional electrical stimulation (FES) system for the fine control of paralyzed extremities in spinal cord injury patients is described. The system is composed of a stimulation data creating system, a serial communication device, a 16-bit microprocessor, a 32-channel D/A converter and a display device. Stimulation patterns are created from analytical results of integrated EMGs recorded during motion in normal subjects and are stored in the stimulation data creating system as data files. The stimulation patterns are then sent to the memory in the portable multichannel FES system through the serial communication interface device. Sophisticated fine control of paralyzed extremities was realized by transmitting multichannel stimulation patterns to percutaneous intramuscular electrodes, which stimulate the motor function of paralyzed muscles simultaneously. The advantages of this system are as follows: 1) It is possible to modify stimulation patterns in accordance with the patient's situation. 2) The system is small and light.

Pair-Wise Serial ROIC for Uncooled Microbolometer Array

  • Haider, Syed Irtaza;Majzoub, Sohaib;Alturaigi, Mohammed;Abdel-Rahman, Mohamed
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.4 no.4
    • /
    • pp.251-257
    • /
    • 2015
  • This work presents the modeling and simulation of a readout integrated circuit (ROIC) design that uses a pair-wise serial configuration, along with thermal modeling of an uncooled microbolometer array. A fully differential approach is used at the input stage in order to reduce fixed pattern noise due to process variation and self-heating-related issues. Each pair of microbolometers is pulse-biased such that both fall at the same self-heating point along the self-heating trend line. A ${\pm}10\%$ process variation is considered. The proposed design is simulated with a reference input image consisting of an array of $127{\times}92$ pixels. This configuration uses only one unity-gain differential amplifier along with a single 14-bit analog-to-digital converter in order to minimize the dynamic range requirement of the ROIC.