• Title/Summary/Keyword: bit-serial implementation

Search Result 52, Processing Time 0.025 seconds

Implementation of Data Protocol Conversion System for High-end CMOS Image Sensors Equipped with SMIA CCP2 Serial Interface (SMIA CCP2 직렬 인터페이스를 가지는 고기능 이미지 센서를 위한 데이터 프로토콜 변환 시스템의 구현)

  • Kim, Nam-Ho;Park, Hyun-Sang
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.10 no.4
    • /
    • pp.753-758
    • /
    • 2009
  • Recently the high-end CMOS image sensors are developed, conforming to the SMIA CCP2 specification, which is a high-speed low-power serial interface based on LVDS technology. But this kind of technology trend makes the existing equipments are no longer useful, although their capability is still good enough to handle the recent image sensors if there was no interfacing problem. In this paper, we propose and realize a data protocol conversion system that translates the SMIA CCP2 serial signals into the existing 10-bit parallel signals. The proposed system is composed of a de-serializer and a FPCA chip, and thus can be constructed on a small PCB which enables easy integration between the existing equipments and the new high-end image sensors. Besides, the maximum transfer rate by the SMIA specification is also achieved on the implemented system. So it is expected that the implemented system can be used as a general-purpose protocol converter in a variety of sensor-related application fields.

Low-Cost Elliptic Curve Cryptography Processor Based On Multi-Segment Multiplication (멀티 세그먼트 곱셈 기반 저비용 타원곡선 암호 프로세서)

  • LEE Dong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.8 s.338
    • /
    • pp.15-26
    • /
    • 2005
  • In this paper, we propose an efficient $GF(2^m)$ multi-segment multiplier architecture and study its application to elliptic curve cryptography processors. The multi-segment based ECC datapath has a very small combinational multiplier to compute partial products, most of its internal data buses are word-sized, and it has only a single m bit multiplexer and a single m bit register. Hence, the resource requirements of the proposed ECC datapath can be minimized as the segment number increases and word-size is decreased. Hence, as compared to the ECC processor based on digit-serial multiplication, the proposed ECC datapath is more efficient in resource usage. The resource requirement of ECC Processor implementation depends not only on the number of basic hardware components but also on the complexity of interconnection among them. To show the realistic area efficiency of proposed ECC processors, we implemented both the ECC processors based on the proposed multi-segment multiplication and digit serial multiplication and compared their FPGA resource usages. The experimental results show that the Proposed multi-segment multiplication method allows to implement ECC coprocessors, requiring about half of FPGA resources as compared to digit serial multiplication.

The New Architecture of Low Power Inner Product Processor for Reconfigurable Neural Networks (재구성 가능한 뉴럴 네트워크 구현을 위한 새로운 저전력 내적연산 프로세서 구조)

  • 임국찬;이현수
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.5
    • /
    • pp.61-70
    • /
    • 2004
  • The operation mode of neural network is divided into learning and recognition process. Learning is updating process of weight until neural network archives target result from input pattern. Recognition is arithmetic process of input pattern and weight. Traditional inner product process is focused to improve processing speed and hardware complexity. There is no hardware architecture to distinguish between loaming and recognition mode of neural network. In this paper we propose the new architecture of low power inner product processor for reconfigurable neural network. The proposed architecture is similar with bit-serial inner product processor on learning mode. It have several advantages which are fast processing base on bit-level, suitability of hardware implementation and pipeline architecture to compute data. And proposed architecture minimizes active units and reduces consumption power on recognition mode. Result of simulation shows that active units is depend on bit representation of weight, but we can reduce active units about 50 precent.

Design of a Bit-Serial Divider in GF(2$^{m}$ ) for Elliptic Curve Cryptosystem (타원곡선 암호시스템을 위한 GF(2$^{m}$ )상의 비트-시리얼 나눗셈기 설계)

  • 김창훈;홍춘표;김남식;권순학
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.12C
    • /
    • pp.1288-1298
    • /
    • 2002
  • To implement elliptic curve cryptosystem in GF(2$\^$m/) at high speed, a fast divider is required. Although bit-parallel architecture is well suited for high speed division operations, elliptic curve cryptosystem requires large m(at least 163) to support a sufficient security. In other words, since the bit-parallel architecture has an area complexity of 0(m$\^$m/), it is not suited for this application. In this paper, we propose a new serial-in serial-out systolic array for computing division operations in GF(2$\^$m/) using the standard basis representation. Based on a modified version of tile binary extended greatest common divisor algorithm, we obtain a new data dependence graph and design an efficient bit-serial systolic divider. The proposed divider has 0(m) time complexity and 0(m) area complexity. If input data come in continuously, the proposed divider can produce division results at a rate of one per m clock cycles, after an initial delay of 5m-2 cycles. Analysis shows that the proposed divider provides a significant reduction in both chip area and computational delay time compared to previously proposed systolic dividers with the same I/O format. Since the proposed divider can perform division operations at high speed with the reduced chip area, it is well suited for division circuit of elliptic curve cryptosystem. Furthermore, since the proposed architecture does not restrict the choice of irreducible polynomial, and has a unidirectional data flow and regularity, it provides a high flexibility and scalability with respect to the field size m.

Performance Analysis of Implementation on Image Processing Algorithm for Multi-Access Memory System Including 16 Processing Elements (16개의 처리기를 가진 다중접근기억장치를 위한 영상처리 알고리즘의 구현에 대한 성능평가)

  • Lee, You-Jin;Kim, Jea-Hee;Park, Jong-Won
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.49 no.3
    • /
    • pp.8-14
    • /
    • 2012
  • Improving the speed of image processing is in great demand according to spread of high quality visual media or massive image applications such as 3D TV or movies, AR(Augmented reality). SIMD computer attached to a host computer can accelerate various image processing and massive data operations. MAMS is a multi-access memory system which is, along with multiple processing elements(PEs), adequate for establishing a high performance pipelined SIMD machine. MAMS supports simultaneous access to pq data elements within a horizontal, a vertical, or a block subarray with a constant interval in an arbitrary position in an $M{\times}N$ array of data elements, where the number of memory modules(MMs), m, is a prime number greater than pq. MAMS-PP4 is the first realization of the MAMS architecture, which consists of four PEs in a single chip and five MMs. This paper presents implementation of image processing algorithms and performance analysis for MAMS-PP16 which consists of 16 PEs with 17 MMs in an extension or the prior work, MAMS-PP4. The newly designed MAMS-PP16 has a 64 bit instruction format and application specific instruction set. The author develops a simulator of the MAMS-PP16 system, which implemented algorithms can be executed on. Performance analysis has done with this simulator executing implemented algorithms of processing images. The result of performance analysis verifies consistent response of MAMS-PP16 through the pyramid operation in image processing algorithms comparing with a Pentium-based serial processor. Executing the pyramid operation in MAMS-PP16 results in consistent response of processing time while randomly response time in a serial processor.

Design of FIR Filters With Sparse Signed Digit Coefficients (희소한 부호 자리수 계수를 갖는 FIR 필터 설계)

  • Kim, Seehyun
    • Journal of IKEEE
    • /
    • v.19 no.3
    • /
    • pp.342-348
    • /
    • 2015
  • High speed implementation of digital filters is required in high data rate applications such as hard-wired wide band modem and high resolution video codec. Since the critical path of the digital filter is the MAC (multiplication and accumulation) circuit, the filter coefficient with sparse non-zero bits enables high speed implementation with adders of low hardware cost. Compressive sensing has been reported to be very successful in sparse representation and sparse signal recovery. In this paper a filter design method for digital FIR filters with CSD (canonic signed digit) coefficients using compressive sensing technique is proposed. The sparse non-zero signed bits are selected in the greedy fashion while pruning the mistakenly selected digits. A few design examples show that the proposed method can be utilized for designing sparse CSD coefficient digital FIR filters approximating the desired frequency response.

The Implementation of Integrated Information Network for JANG-MOK Oceanographic Research Ship (시험조사선 장목호의 종합정보통신망 구현)

  • Park Jong-Won;Kim Dug-Jin;Baek Hyuk;Park Dong-Won
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2006.05a
    • /
    • pp.91-94
    • /
    • 2006
  • KORDI(Korea Ocean Research & Development Institute) built a research vessel JANG-MOK with 40 G/T for a survey and observation of oceanographic environmental characteristics at coastal region in September 2005. This paper introduced the implementation for hardware and software of an integrated information network that loaded in JANG-MOK, and depicted the function of an integrated information network, such as an installation of ga-bit based network, RS232C serial & UDP network interface of instruments, a data logging software of measured data, Hawkeye II software for supporting the efficient survey works, and a real-time navigation viewer. In addition, we presents the another implementation method for an integrated information network of oceanographic research vessels.

  • PDF

Digital Logic Extraction from QCA Designs (QCA 설계에서 디지털 논리 자동 추출)

  • Oh, Youn-Bo;Kim, Kyo-Sun
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.1
    • /
    • pp.107-116
    • /
    • 2009
  • Quantum-dot Cellular Automata (QCA) is one of the most promising next generation nanoelectronic devices which will inherit the throne of CMOS which is the domineering implementation technology for large scale low power digital systems. In late 1990s, the basic operations of the QCA cell were already demonstrated on a hardware implementation. Also, design tools and simulators were developed. Nevertheless, its design technology is not quite ready for ultra large scale designs. This paper proposes a new approach which enables the QCA designs to inherit the verification methodologies and tools of CMOS designs, as well. First, a set of disciplinary rules strictly restrict the cell arrangement not to deviate from the predefined structures but to guarantee the deterministic digital behaviors is proposed. After the gate and interconnect structures of. the QCA design are identified, the signal integrity requirements including the input path balancing of majority gates, and the prevention of the noise amplification are checked. And then the digital logic is extracted and stored in the OpenAccess common engineering database which provides a connection to a large pool of CMOS design verification tools. Towards validating the proposed approach, we designed a 2-bit adder, a bit-serial adder, and an ALU bit-slice. For each design, the digital logic is extracted, translated into the Verilog net list, and then simulated using a commercial software.

Assistant Professor, Department of Computer Engineering Pukyong Universisty (한국형 방송 프로그램 시스템 디코더 ASSP의 개발)

  • Jo, Gyeong-Yeon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.3 no.5
    • /
    • pp.1229-1239
    • /
    • 1996
  • The increase of additional information broadcasting of TV demands a graphic overlay processor. This paper is about the design, implementation and testing of a graphic overlay processor called by KBPS decoder ASSP (Applicatio n Specific Standard Product) which is compliance with Korea Broadcast Programming System. KBPS decoder ASSP consists of embedded 8 bit microprocessor Z80, graphic overlay controller, KBPS schedule decoder, memory controller, priority interrupt controller, MIDI controller, infrared raccoon receiver, async scrial communication controller, timer, bus controller, universal parallel input-output port and serial-parallel interface. The 0.8 micron CMOS Sea of Gate is used to implement the ASSP in amount of about 31,500 gates, and it is running at 14.318MHz.

  • PDF

Multi-mode Layered LDPC Decoder for IEEE 802.11n (IEEE 802.11n용 다중모드 layered LDPC 복호기)

  • Na, Young-Heon;Shin, Kyung-Wook
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.11
    • /
    • pp.18-26
    • /
    • 2011
  • This paper describes a multi-mode LDPC decoder which supports three block lengths(648, 1296, 1944) and four code rates(1/2, 2/3, 3/4, 5/6) of IEEE 802.11n wireless LAN standard. To minimize hardware complexity, it adopts a block-serial (partially parallel) architecture based on the layered decoding scheme. A novel memory reduction technique devised using the min-sum decoding algorithm reduces the size of check-node memory by 47% as compared to conventional method. From fixed-point modeling and Matlab simulations for various bit-widths, decoding performance and optimal hardware parameters such as fixed-point bit-width are analyzed. The designed LDPC decoder is verified by FPGA implementation, and synthesized with a 0.18-${\mu}m$ CMOS cell library. It has 219,100 gates and 45,036 bits RAM, and the estimated throughput is about 164~212 Mbps at 50 MHz@2.5v.