• Title/Summary/Keyword: FPGA 합성

Search Result 262, Processing Time 0.025 seconds

Hardware optimized high quality image signal processor for single-chip CMOS Image Sensor (Single-chip CMOS Image Sensor를 위한 하드웨어 최적화된 고화질 Image Signal Processor 설계)

  • Lee, Won-Jae;Jung, Yun-Ho;Lee, Seong-Joo;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.5
    • /
    • pp.103-111
    • /
    • 2007
  • In this paper, we propose a VLSI architecture of hardware optimized high quality image signal processor for a Single-chip CMOS Image Sensor(CIS). The Single-chip CIS is usually used for mobile applications, so it has to be implemented as small as possible while maintaining the image quality. Several image processing algorithms are used in ISP to improve captured image quality. Among the several image processing blocks, demosaicing and image filter are the core blocks in ISP. These blocks need line memories, but the number of line memories is limited in a low cost Single-chip CIS. In our design, high quality edge-adaptive and cross channel correlation considered demosaicing algorithm is adopted. To minimize the number of required line memories for image filter, we share the line memories using the characteristics of demosaicing algorithm which consider the cross correlation. Based on the proposed method, we can achieve both high quality and low hardware complexity with a small number of line memories. The proposed method was implemented and verified successfully using verilog HDL and FPGA. It was synthesized to gate-level circuits using 0.25um CMOS standard cell library. The total logic gate count is 37K, and seven and half line memories are used.

Scalable RSA public-key cryptography processor based on CIOS Montgomery modular multiplication Algorithm (CIOS 몽고메리 모듈러 곱셈 알고리즘 기반 Scalable RSA 공개키 암호 프로세서)

  • Cho, Wook-Lae;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.1
    • /
    • pp.100-108
    • /
    • 2018
  • This paper describes a design of scalable RSA public-key cryptography processor supporting four key lengths of 512/1,024/2,048/3,072 bits. The modular multiplier that is a core arithmetic block for RSA crypto-system was designed with 32-bit datapath, which is based on the CIOS (Coarsely Integrated Operand Scanning) Montgomery modular multiplication algorithm. The modular exponentiation was implemented by using L-R binary exponentiation algorithm. The scalable RSA crypto-processor was verified by FPGA implementation using Virtex-5 device, and it takes 456,051/3,496347/26,011,947/88,112,770 clock cycles for RSA computation for the key lengths of 512/1,024/2,048/3,072 bits. The RSA crypto-processor synthesized with a $0.18{\mu}m$ CMOS cell library occupies 10,672 gate equivalent (GE) and a memory bank of $6{\times}3,072$ bits. The estimated maximum clock frequency is 147 MHz, and the RSA decryption takes 3.1/23.8/177/599.4 msec for key lengths of 512/1,024/2,048/3,072 bits.

Design of an Adaptive Reed-Solomon Decoder with Varying Block Length (가변 블록길이를 갖는 적응형 리드솔로몬 복호기의 설계)

  • Song, Moon-Kyou;Kong, Min-Han
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.4C
    • /
    • pp.365-373
    • /
    • 2003
  • In this paper, we design a versatle RS decoder which can decode RS codes of any block length n as well as any message length k, based on a modified Euclid's algorithm (MEA). This unique feature is favorable for a shortened RS code of any block length it eliminates the need to insert zeros before decoding a shortened RS code. Furthermore, the value of error correcting capability t can be changed in real time at every codeword block. Thus, when a return channel is available, the error correcting capability can be adaptiverly altered according to channel state. The decoder permits 4-step pipelined processing : (1) syndrome calculation (2) MEA block (3) error magnitude calculation (4) decoder failure check. Each step is designed to form a structure suitable for decoding a RS code with varying block length. A new architecture is proposed for a MEA block in step (2) and an architecture of outputting in reversed order is employed for a polynomial evaluation in step (3). To maintain to throughput rate with less circuitry, the MEA block uses not only a multiplexing and recursive technique but also an overclocking technique. The adaptive RS decoder over GF($2^8$) with the maximal error correcting capability of 10 has been designed in VHDL, and successfully synthesized in a FPGA.

Novel Radix-26 DF IFFT Processor with Low Computational Complexity (연산복잡도가 적은 radix-26 FFT 프로세서)

  • Cho, Kyung-Ju
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.13 no.1
    • /
    • pp.35-41
    • /
    • 2020
  • Fast Fourier transform (FFT) processors have been widely used in various application such as communications, image, and biomedical signal processing. Especially, high-performance and low-power FFT processing is indispensable in OFDM-based communication systems. This paper presents a novel radix-26 FFT algorithm with low computational complexity and high hardware efficiency. Applying a 7-dimensional index mapping, the twiddle factor is decomposed and then radix-26 FFT algorithm is derived. The proposed algorithm has a simple twiddle factor sequence and a small number of complex multiplications, which can reduce the memory size for storing the twiddle factor. When the coefficient of twiddle factor is small, complex constant multipliers can be used efficiently instead of complex multipliers. Complex constant multipliers can be designed more efficiently using canonic signed digit (CSD) and common subexpression elimination (CSE) algorithm. An efficient complex constant multiplier design method for the twiddle factor multiplication used in the proposed radix-26 algorithm is proposed applying CSD and CSE algorithm. To evaluate performance of the previous and the proposed methods, 256-point single-path delay feedback (SDF) FFT is designed and synthesized into FPGA. The proposed algorithm uses about 10% less hardware than the previous algorithm.

A Design of Memory-efficient 2k/8k FFT/IFFT Processor using R4SDF/R4SDC Hybrid Structure (R4SDF/R4SDC Hybrid 구조를 이용한 메모리 효율적인 2k/8k FFT/IFFT 프로세서 설계)

  • 신경욱
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.8 no.2
    • /
    • pp.430-439
    • /
    • 2004
  • This paper describes a design of 8192/2048-point FFT/IFFT processor (CFFT8k2k), which performs multi-carrier modulation/demodulation in OFDM-based DVB-T receiver. Since a large size FFT requires a large buffer memory, two design techniques are considered to achieve memory-efficient implementation of 8192-point FFT/IFFT. A hybrid structure, which is composed of radix-4 single-path delay feedback (R4SDF) and radix-4 single-path delay commutator (R4SDC), reduces its memory by 20% compared to R4SDC structure. In addition, a memory reduction of about 24% is achieved by a novel two-step convergent block floating-point scaling. As a result, it requires only 57% of memory used in conventional design, reducing chip area and power consumption. The CFFT8k2k core is designed in Verilog-HDL, and has about 102,000 Bates, RAM of 292k bits, and ROM of 39k bits. Using gate-level netlist with SDF which is synthesized using a $0.25-{\um}m$ CMOS library, timing simulation show that it can safely operate with 50-MHz clock at 2.5-V supply, resulting that a 8192-point FFT/IFFT can be computed every 164-${\mu}\textrm{s}$. The functionality of the core is fully verified by FPGA implementation, and the average SQNR of 60-㏈ is achieved.

A Crypto-processor Supporting Multiple Block Cipher Algorithms (다중 블록 암호 알고리듬을 지원하는 암호 프로세서)

  • Cho, Wook-Lae;Kim, Ki-Bbeum;Bae, Gi-Chur;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.11
    • /
    • pp.2093-2099
    • /
    • 2016
  • This paper describes a design of crypto-processor that supports multiple block cipher algorithms of PRESENT, ARIA, and AES. The crypto-processor integrates three cores that are PRmo (PRESENT with mode of operation), AR_AS (ARIA_AES), and AES-16b. The PRmo core implementing 64-bit block cipher PRESENT supports key length 80-bit and 128-bit, and four modes of operation including ECB, CBC, OFB, and CTR. The AR_AS core supporting key length 128-bit and 256-bit integrates two 128-bit block ciphers ARIA and AES into a single data-path by utilizing resource sharing technique. The AES-16b core supporting key length 128-bit implements AES with a reduced data-path of 16-bit for minimizing hardware. Each crypto-core contains its own on-the-fly key scheduler, and consecutive blocks of plaintext/ciphertext can be processed without reloading key. The crypto-processor was verified by FPGA implementation. The crypto-processor implemented with a $0.18{\mu}m$ CMOS cell library occupies 54,500 gate equivalents (GEs), and it can operate with 55 MHz clock frequency.

Hardware Implementation of Elliptic Curve Scalar Multiplier over GF(2n) with Simple Power Analysis Countermeasure (SPA 대응 기법을 적용한 이진체 위의 타원곡선 스칼라곱셈기의 하드웨어 구현)

  • 김현익;정석원;윤중철
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.9
    • /
    • pp.73-84
    • /
    • 2004
  • This paper suggests a new scalar multiplication algerian to resist SPA which threatens the security of cryptographic primitive on the hardware recently, and discusses how to apply this algerian Our algorithm is better than other SPA countermeasure algorithms aspect to computational efficiency. Since known SPA countermeasure algorithms have dependency of computation. these are difficult to construct parallel architecture efficiently. To solve this problem our algorithm removes dependency and computes a multiplication and a squaring during inversion with parallel architecture in order to minimize loss of performance. We implement hardware logic with VHDL(VHSIC Hardware Description Language) to verify performance. Synthesis tool is Synplify Pro 7.0 and target chip is Xillinx VirtexE XCV2000EFGl156. Total equivalent gate is 60,508 and maximum frequency is 30Mhz. Our scalar multiplier can be applied to digital signature, encryption and decryption, key exchange, etc. It is applied to a embedded-micom it protects SPA and provides efficient computation.

Design and Implementation of Time Synchronizer for Advanced ZigBee Systems (개선된 지그비 시스템을 위한 시간 동기부 설계 및 구현)

  • Hwang, Hyunsu;Jung, Yongcheol;Jung, Yunho
    • Journal of Advanced Navigation Technology
    • /
    • v.20 no.5
    • /
    • pp.453-461
    • /
    • 2016
  • Recently, with the growth of various sensor applications, the need of wireless communication systems which can support variable data rate is increasing. Therefore, advanced ZigBee (AZB) systems that support the various data rate under 250 kbps are proposed. However, the preamble structure for AZB systems causes the complexity increase of time synchronization circuits. In this paper, we propose preamble structure and time synchronization algorithm which can solve the problem of the complexity increase of time synchronization circuits. Implementation results show that the proposed time synchronizer for AZB systems include the logic slices of 6.92 k and, which are reduced at the rate of 62.3% compared with existing architecture.

A Pseudo-Random Number Generator based on Segmentation Technique (세그먼테이션 기법을 이용한 의사 난수 발생기)

  • Jeon, Min-Jung;Kim, Sang-Choon;Lee, Je-Hoon
    • Convergence Security Journal
    • /
    • v.12 no.4
    • /
    • pp.17-23
    • /
    • 2012
  • Recently, the research for cryptographic algorithm, in particular, a stream cipher has been actively conducted for wireless devices as growing use of wireless devices such as smartphone and tablet. LFSR based random number generator is widely used in stream cipher since it has simple architecture and it operates very fast. However, the conventional multi-LFSR RNG (random number generator) suffers from its hardware complexity as well as very closed correlation between the numbers generated. A leap-ahead LFSR was presented to solve these problems. However, it has another disadvantage that the maximum period of the generated random numbers are significantly decreased according to the relationship between the number of the stages of the LFSR and the number of the output bits of the RNG. This paper presents new leap-ahead LFSR architecture to prevent this decrease in the maximum period by applying segmentation technique to the conventional leap-ahead LFSR. The proposed architecture is implemented using VHDL and it is simulated in FPGA using Xilinx ISE 10.1, with a device Virtex 4, XC4VLX15. From the simulation results, the proposed architecture has only 20% hardware complexity but it can increases the maximum period of the generated random numbers by 40% compared to the conventional Leap-ahead archtecture.

A Design of All-Digital QPSK Demodulator for High-Speed Wireless Transmission Systems (고속 무선 전송시스템을 위한 All-Digital QPSK 복조기의 설계)

  • 고성찬;정지원
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.8 no.1
    • /
    • pp.83-91
    • /
    • 2003
  • High-speed QPSK demodulator has been in important design objective of any wireless communication systems, especially those offering broadband multimedia service. This paper describes all-digital QPSK demodulator for high-speed wireless communications, and its hardware structures are discussed. All-digital QPSK demodulator is mainly composed of symbol time circuit and carrier recovery circuit to estimate timing and phase-offsets. There are various schemes. Among them, we use Gardner algorithm and Decision-Directed carrier recovery algorithm which is most efficient scheme to warrant the fast acquisition and tacking to fabricate FPGA chip. The testing results of the implemented onto CPLD-EPF10K100GC 503-4 chip show demodulation speed is reached up to 2.6[Mbps]. If it is implemented a CPLD chip with speed grade 1, the demodulation speed can be faster by about 5 times. Actually in case of designing by ASIC, its speed my be faster than CPLD by 5 times. Therefore, it is possible to fabricate the all-digital QPSK demodulator chipset with speed of 50[Mbps].

  • PDF