• Title/Summary/Keyword: FPGA 합성

Search Result 262, Processing Time 0.025 seconds

JPEG2000 IP Design and Implementation for SoC Design (SoC를 위한 JPEG2000 IP 설계 및 구현)

  • 정재형;한상균;홍성훈;김영철
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2002.11a
    • /
    • pp.63-68
    • /
    • 2002
  • JPEG2000은 기존의 정지영상압축부호화 방식에 비해 우수한 비트율-왜곡(Rate-Distortion)특성과 향상된 주관적 화질을 제공하며 인터넷, 디지털 영상카메라, 이동단말기, 의학영상 등 다양한 분야에서 적용될 수 있는 새로운 정지영상압축 표준이다. 본 논문에서는 SoC(System on a Chip)설계를 고려한 JPEG2000 인코더의 구조를 제안하고 IP(Intellectual Property)를 설계 및 검증하였다. 구현된 JPEG2000 IP는 DWT(Discrete Wavelet Transform)블록, 스칼라양자화블록, EBCOT(Embedded Block Coding with Optimized Truncation)블록으로 구성되어 있다. IP는 모의실험을 통해 구현 구조에 대한 타당성을 검증하였고, 반도체설계자산연구센터에서 제시한 'RTL Coding Guideline'에 따라 HDL을 설계하였다. 특히, DWT블록은 구현시 많은 연산과 메모리 용량이 필요하므로 영상을 저장할 외부 메모리를 사용하였고, 빠른 곱셈과 덧셈연산을 위한 3단 파이프라인 부스곱셈기(3-state pipeline booth multiplier)와 캐리예측 덧셈기(carry lookahead adder)를 사용하였다. 설계된 JPEG2000 IP들은 삼성 0.35$\mu\textrm{m}$ 라이브러리를 이용하여 Synopsys사 Design Analyzer 틀을 통해 논리 합성하였으며, Xillinx 100만 게이트 FPGA칩에 구현하여 그 동작을 검증하였다. 또한, Hard IP 설계를 위해 Avanti사의 Apollo툴을 이용하여 Layout을 수행하였다.

  • PDF

A New Approach to the Synthesis of Two-Dimensional Cellular Arrays Using Internal Don't Cares (내부 Dont't care를 이용한 이차원 셀 배열의 새로운 합성 방법)

  • Lee, Dong-Geon;Jeong, Mi-Gyeong;Lee, Gwi-Sang
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.49 no.2
    • /
    • pp.81-87
    • /
    • 2000
  • This paper presents a new approach to the synthesis of two-dimensional arrays such as Atmel 6000 series FPGAs using internal don't cares. Basically complex terms which fits to the linear array of cells without further routing wires are generated and they are collected by OR/XOR operations. In previous methods, complex terms are collected only by XOR operations, which may not be effective for nearly unate functions. In this paper, we allow complex terms to be collected by OR operations in addition to XOR operations. First, complex terms that lies in the ON-set of the function are generated and collected by OR operations. The sub-function realized by the first stage becomes an internal don't cares and they are exploited in the second stage which generates complex terms collectable by XOR operation. Experimental results shows the efficacy of the proposed method compared to the previous methods.

  • PDF

An Area-efficient Design of ECC Processor Supporting Multiple Elliptic Curves over GF(p) and GF(2m) (GF(p)와 GF(2m) 상의 다중 타원곡선을 지원하는 면적 효율적인 ECC 프로세서 설계)

  • Lee, Sang-Hyun;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2019.05a
    • /
    • pp.254-256
    • /
    • 2019
  • 소수체 GF(p)와 이진체 $GF(2^m)$ 상의 다중 타원곡선을 지원하는 듀얼 필드 ECC (DF-ECC) 프로세서를 설계하였다. DF-ECC 프로세서의 저면적 설와 다양한 타원곡선의 지원이 가능하도록 워드 기반 몽고메리 곱셈 알고리듬을 적용한 유한체 곱셈기를 저면적으로 설계하였으며, 페르마의 소정리(Fermat's little theorem)를 유한체 곱셈기에 적용하여 유한체 나눗셈을 구현하였다. 설계된 DF-ECC 프로세서는 스칼라 곱셈과 점 연산, 그리고 모듈러 연산 기능을 가져 다양한 공개키 암호 프로토콜에 응용이 가능하며, 유한체 및 모듈러 연산에 적용되는 파라미터를 내부 연산으로 생성하여 다양한 표준의 타원곡선을 지원하도록 하였다. 설계된 DF-ECC는 FPGA 구현을 하드웨어 동작을 검증하였으며, 0.18-um CMOS 셀 라이브러리로 합성한 결과 22,262 GEs (gate equivalences)와 11 kbit RAM으로 구현되었으며, 최대 100 MHz의 동작 주파수를 갖는다. 설계된 DF-ECC 프로세서의 연산성능은 B-163 Koblitz 타원곡선의 경우 스칼라 곱셈 연산에 885,044 클록 사이클이 소요되며, B-571 슈도랜덤 타원곡선의 스칼라 곱셈에는 25,040,625 사이클이 소요된다.

  • PDF

The Hardware Design of Real-time Image Processing System-on-chip for Visual Auxiliary Equipment (시각보조기기를 위한 실시간 영상처리 SoC 하드웨어 설계)

  • Jo, Heungsun;Kim, Jiho;Shin, Hyuntaek;Im, Junseong;Ryoo, Kwangki
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2013.11a
    • /
    • pp.1525-1527
    • /
    • 2013
  • 본 논문에서는 저시력자의 개선된 독서 환경을 제공하는 시각보조기기를 위한 실시간 영상처리 SoC(System on Chip) 하드웨어 구조 설계에 대해서 기술한다. 기존의 시각보조기기는 화면 영상이 실제 움직임보다 늦게 출력되는 잔상 현상이 발생하며, 색 변환 기능도 제한적이다. 따라서 본 논문에서 제안하는 실시간 영상처리 SoC 하드웨어 구조는 데이터 연산을 최소화함으로써 잔상 현상이 감소되며, 저시력자를 위한 다양한 색상 모드를 지원한다. 제안하는 영상처리 SoC 하드웨어 구조는 Core-A 모듈, Memory Controller 모듈, AMBA AHB bus 모듈, ISP(Image Signal Processing) 모듈, TFT-LCD Controller 모듈, VGA Controller 모듈, CIS Controller 모듈, UART 모듈, Block Memory 모듈로 구성된다. 시각보조기기를 위한 실시간 영상처리 SoC 하드웨어 구조는 Virtex4 XC4VLX80 FPGA 디바이스를 이용하여 검증하였으며, TSMC 180nm 셀 라이브러리로 합성한 결과 동작주파수는 54MHz, 게이트 수 197k이다.

A Design Method for Pre-Distortion Compensation of SAR Chirp Signal based on Envelop Sampling and Interpolation Filter (위성 탑재 영상레이다 첩 신호의 전치왜곡 보상을 위한 포락선 샘플링 및 보간 필터 기반의 설계 기법)

  • Lee, Young-Bok
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.25 no.4
    • /
    • pp.347-354
    • /
    • 2022
  • The synthetic aperture radar(SAR) is an equipment that can acquire images in all weathers day and night based on radar signals. The on-board processor of satellite SAR generates transmission signal by digital signal processing, converts it into an analog signal and transmits to antenna. Until the transmission signal generated by on-board processor is output, the signal passes the transmission cables and analog devices. At this time, these hardware distort the signal and makes SAR performance worse. To improve the performance, pre-distortion technique is used. But, general pre-distortion using taylor series is not sufficient to compensate for the distortion. This paper suggests transmit signal design method with improved pre-distortion. This paper uses envelop sampling method and interpolation filter for frequency domain compensation. The proposed method accurately compensates the hardware distortion and reduces resource usage of FPGA. To analyze proposed method's performance, IRF characteristics are compared when the proposed method applies to signal with errors.

Implementation of WLAN Baseband Processor Based on Space-Frequency OFDM Transmit Diversity Scheme (공간-주파수 OFDM 전송 다이버시티 기법 기반 무선 LAN 기저대역 프로세서의 구현)

  • Jung Yunho;Noh Seungpyo;Yoon Hongil;Kim Jaeseok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.5 s.335
    • /
    • pp.55-62
    • /
    • 2005
  • In this paper, we propose an efficient symbol detection algorithm for space-frequency OFDM (SF-OFDM) transmit diversity scheme and present the implementation results of the SF-OFDM WLAN baseband processor with the proposed algorithm. When the number of sub-carriers in SF-OFDM scheme is small, the interference between adjacent sub-carriers may be generated. The proposed algorithm eliminates this interference in a parallel manner and obtains a considerable performance improvement over the conventional detection algorithm. The bit error rate (BER) performance of the proposed detection algorithm is evaluated by the simulation. In the case of 2 transmit and 2 receive antennas, at $BER=10^{-4}$ the proposed algorithm obtains about 3 dB gain over the conventional detection algorithm. The packet error rate (PER), link throughput, and coverage performance of the SF-OFDM WLAN with the proposed detection algorithm are also estimated. For the target throughput at $80\%$ of the peak data rate, the SF-OFDM WLAN achieves the average SNR gain of about 5.95 dB and the average coverage gain of 3.98 meter. The SF-OFDM WLAN baseband processor with the proposed algorithm was designed in a hardware description language and synthesized to gate-level circuits using 0.18um 1.8V CMOS standard cell library. With the division-free architecture, the total logic gate count for the processor is 945K. The real-time operation is verified and evaluated using a FPGA test system.

A Design of FFT/IFFT Core with R2SDF/R2SDC Hybrid Structure For Terrestrial DMB Modem (지상파 DMB 모뎀용 R2SDF/R2SDC 하이브리드 구조의 FFT/IFFT 코어 설계)

  • Lee Jin-Woo;Shin Kyung-Wook
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.11
    • /
    • pp.33-40
    • /
    • 2005
  • This paper describes a design of FFT/IFFT Core(FFT256/2k), which is an essential block in terrestrial DMB modem. It has four operation modes including 256/512/1024/2048-point FFT/IFFT in order to support the Eureka-147 transmission modes. The hybrid architecture, which is composed of R2SDF and R2SDC structure, reduces memory by $62\%$ compared to R2SDC structure, and the SQNR performance is improved by TS_CBFP(Two Step Convergent Block Floating Point). Timing simulation results show that it can operate up to 50MHz(a)2.5-V, resulting that a 2048-point FFT/IFFT can be computed in 41-us. The FFT256/2k core designed in Verilog-HDL has about 68,400 gates and 58,130 RAM. The average power consumption estimated using switching activity is about 113-mW, and the total average SQNR of over 50-dB is achieved. The functionality of the core was fully verified by FPGA implementation.

Design of Hardwired Variable Length Decoder for H.264/AVC (하드웨어 구조의 H.264/AVC 가변길이 복호기 설계)

  • Yu, Yong-Hoon;Lee, Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.11
    • /
    • pp.71-76
    • /
    • 2008
  • H.264(or MPEG-4/AVC pt.10) is a high performance video coding standard, and is widely used. Variable length code (VLC) of the H.264 standard compresses data using the statistical distribution of values. A decoder parses the compressed bit stream and searches decoded values in lookup tables, and the decoding process is not easy to implement by hardware. We propose an architecture of variable length decoder(VLD) for the H.264 baseline profile(BP) L4. The CAVLD decodes syntax elements using the combination of arithmetic units and lookup tables for the optimized hardware architecture. A barral shifter and a first 1's detector parse NAL bit stream, and are shared by Exp-Golomb decoder and CAVLD. A FIFO memory between CAVLD and the reorder unit and a buffer at the output of the reorder unit eliminate the bottleneck of data stream. The proposed VLD is designed using Verilog-HDL and is implemented using an FPGA. The synthesis result using a 0.18um standard CMOS technology shows that the gate count is 22,604 and the decoder can process HD($1920{\times}1080$) video at 120MHz.

A Design of PRESENT Crypto-Processor Supporting ECB/CBC/OFB/CTR Modes of Operation and Key Lengths of 80/128-bit (ECB/CBC/OFB/CTR 운영모드와 80/128-비트 키 길이를 지원하는 PRESENT 암호 프로세서 설계)

  • Kim, Ki-Bbeum;Cho, Wook-Lae;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.6
    • /
    • pp.1163-1170
    • /
    • 2016
  • A hardware implementation of ultra-lightweight block cipher algorithm PRESENT which was specified as a standard for lightweight cryptography ISO/IEC 29192-2 is described. The PRESENT crypto-processor supports two key lengths of 80 and 128 bits, as well as four modes of operation including ECB, CBC, OFB, and CTR. The PRESENT crypto-processor has on-the-fly key scheduler with master key register, and it can process consecutive blocks of plaintext/ciphertext without reloading master key. In order to achieve a lightweight implementation, the key scheduler was optimized to share circuits for key lengths of 80 bits and 128 bits. The round block was designed with a data-path of 64 bits, so that one round transformation for encryption/decryption is processed in a clock cycle. The PRESENT crypto-processor was verified using Virtex5 FPGA device. The crypto-processor that was synthesized using a $0.18{\mu}m$ CMOS cell library has 8,100 gate equivalents(GE), and the estimated throughput is about 908 Mbps with a maximum operating clock frequency of 454 MHz.

A small-area implementation of public-key cryptographic processor for 224-bit elliptic curves over prime field (224-비트 소수체 타원곡선을 지원하는 공개키 암호 프로세서의 저면적 구현)

  • Park, Byung-Gwan;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.6
    • /
    • pp.1083-1091
    • /
    • 2017
  • This paper describes a design of cryptographic processor supporting 224-bit elliptic curves over prime field defined by NIST. Scalar point multiplication that is a core arithmetic function in elliptic curve cryptography(ECC) was implemented by adopting the modified Montgomery ladder algorithm. In order to eliminate division operations that have high computational complexity, projective coordinate was used to implement point addition and point doubling operations, which uses addition, subtraction, multiplication and squaring operations over GF(p). The final result of the scalar point multiplication is converted to affine coordinate and the inverse operation is implemented using Fermat's little theorem. The ECC processor was verified by FPGA implementation using Virtex5 device. The ECC processor synthesized using a 0.18 um CMOS cell library occupies 2.7-Kbit RAM and 27,739 gate equivalents (GEs), and the estimated maximum clock frequency is 71 MHz. One scalar point multiplication takes 1,326,985 clock cycles resulting in the computation time of 18.7 msec at the maximum clock frequency.