Search | Korea Science

Implementation of FPGA Verification System with Slave FIFO Interface and FX3 USB 3 Bridge Chip (FX3 USB 3 브릿지 칩과 slave FIFO 인터페이스를 사용하는 FPGA 검증 시스템 구현)

Choi, Byeong-Yoon
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.25 no.2
- /
- pp.259-266
- /
- 2021
USB bus not only works with convenience but also transmits data fast and becomes a standard peripheral interface between FPGA development board and personal computer. In this paper FPGA verification system with slave FIFO interface for Cypress FX3 USB 3 bridge chip was implemented. The designed slave FIFO interface consists of host interface module based on FIFO structure, master bus controller and command decoder and supports streaming communication interface for FX3 bridge chip and memory-mapped input and output interface for user design circuit. The ZestSC3 board with Cypress FX3 USB 3 bridge chip and Xilinx Artix FPGA(XC7A35T-1C5G3241) was used to implement FPGA verification system. It was verified that the FPGA verification system for user design circuit operated correctly under various clock frequencies using GUI software developed by visual C# and C++ DLL. The designed slave FIFO interface for FPGA verification system has modular structure and can be applicable to the different user designs with memory-mapped I/O interface.
https://doi.org/10.6109/jkiice.2021.25.2.259 인용 PDF KSCI

A Scalable ECC Processor for Elliptic Curve based Public-Key Cryptosystem (타원곡선 기반 공개키 암호 시스템 구현을 위한 Scalable ECC 프로세서)

Choi, Jun-Baek;Shin, Kyung-Wook
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.25 no.8
- /
- pp.1095-1102
- /
- 2021
A scalable ECC architecture with high scalability and flexibility between performance and hardware complexity is proposed. For architectural scalability, a modular arithmetic unit based on a one-dimensional array of processing element (PE) that performs finite field operations on 32-bit words in parallel was implemented, and the number of PEs used can be determined in the range of 1 to 8 for circuit synthesis. A scalable algorithms for word-based Montgomery multiplication and Montgomery inversion were adopted. As a result of implementing scalable ECC processor (sECCP) using 180-nm CMOS technology, it was implemented with 100 kGEs and 8.8 kbits of RAM when N_PE=1, and with 203 kGEs and 12.8 kbits of RAM when N_PE=8. The performance of sECCP with N_PE=1 and N_PE=8 was analyzed to be 110 PSMs/sec and 610 PSMs/sec, respectively, on P256R elliptic curve when operating at 100 MHz clock.
https://doi.org/10.6109/jkiice.2021.25.8.1095 인용 PDF KSCI

Chip Implementation of 830-Mb/s/pin Transceiver for LPDDR2 Memory Controller (LPDDR2 메모리 컨트롤러를 위한 830-Mb/s/pin 송수신기 칩 구현)

Jong-Hyeok, Lee;Chang-Min, Song;Young-Chan, Jang
- Journal of IKEEE
- /
- v.26 no.4
- /
- pp.659-670
- /
- 2022
An 830-Mb/s/pin transceiver for a controller supporting ×32 LPDDR2 memory is designed. The transmitter consists of eight unit circuits has an impedance in the range of 34Ω ∽ 240Ω, and its impedance is controlled by an impedance correction circuit. The transmitted DQS signal has a phase shifted by 90° compared to the DQ signals. In the receive operation, the read time calibration is performed by per-pin skew calibration and clock-domain crossing within a byte. The implemented transceiver for the LPDDR2 memory controller is designed by using a 55-nm process using a 1.2V supply voltage and has a maximum signal transmission rate of 830 Mb/s/pin. The area and power consumption of each lane are 0.664 mm² and 22.3 mW, respectively.
https://doi.org/10.7471/ikeee.2022.26.4.659 인용 PDF KSCI

A Non-Calibrated 2x Interleaved 10b 120MS/s Pipeline SAR ADC with Minimized Channel Offset Mismatch (보정기법 없이 채널 간 오프셋 부정합을 최소화한 2x Interleaved 10비트 120MS/s 파이프라인 SAR ADC)

Cho, Young-Sae;Shim, Hyun-Sun;Lee, Seung-Hoon
- Journal of the Institute of Electronics and Information Engineers
- /
- v.52 no.9
- /
- pp.63-73
- /
- 2015
This work proposes a 2-channel time-interleaved (T-I) 10b 120MS/s pipeline SAR ADC minimizing offset mismatch between channels without any calibration scheme. The proposed ADC employs a 2-channel SAR and T-I topology based on a 2-step pipeline ADC with 4b and 7b in the first and second stage for high conversion rate and low power consumption. Analog circuits such as comparator and residue amplifier are shared between channels to minimize power consumption, chip area, and offset mismatch which limits the ADC linearity in the conventional T-I architecture, without any calibration scheme. The TSPC D flip-flop with a short propagation delay and a small number of transistors is used in the SAR logic instead of the conventional static D flip-flop to achieve high-speed SAR operation as well as low power consumption and chip area. Three separate reference voltage drivers for 4b SAR, 7b SAR circuits and a single residue amplifier prevent undesirable disturbance among the reference voltages due to each different switching operation and minimize gain mismatch between channels. High-frequency clocks with a controllable duty cycle are generated on chip to eliminate the need of external complicated high-frequency clocks for SAR operation. The prototype ADC in a 45nm CMOS technology demonstrates a measured DNL and INL within 0.69LSB and 0.77LSB, with a maximum SNDR and SFDR of 50.9dB and 59.7dB at 120MS/s, respectively. The proposed ADC occupies an active die area of 0.36mm2 and consumes 8.8mW at a 1.1V supply voltage.
https://doi.org/10.5573/ieie.2015.52.9.063 인용 PDF KSCI

Design and Hardware Implementation of High-Speed Variable-Length RSA Cryptosystem (가변길이 고속 RSA 암호시스템의 설계 및 하드웨어 구현)

박진영;서영호;김동욱
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.27 no.9C
- /
- pp.861-870
- /
- 2002
In this paper, with targeting on the drawback of RSA of operation speed, a new 1024-bit RSA cryptosystem has been proposed and implemented in hardware to increase the operational speed and perform the variable-length encryption. The proposed cryptosystem mainly consists of the modular exponentiation part and the modular multiplication part. For the modular exponentiation, the RL-binary method, which performs squaring and modular multiplying in parallel, was improved, and then applied. And 4-stage CSA structure and radix-4 booth algorithm were applied to enhance the variable-length operation and reduce the number of partial product in modular multiplication arithmetic. The proposed RSA cryptosystem which can calculate at most 1024 bits at a tittle was mapped into the integrated circuit using the Hynix Phantom Cell Library for Hynix 0.35㎛ 2-Poly 4-Metal CMOS process. Also, the result of software implementation, which had been programmed prior to the hardware research, has been used to verify the operation of the hardware system. The size of the result from the hardware implementation was about 190k gate count and the operational clock frequency was 150㎒. By considering a variable-length of modulus number, the baud rate of the proposed scheme is one and half times faster than the previous works. Therefore, the proposed high speed variable-length RSA cryptosystem should be able to be used in various information security system which requires high speed operation.
PDF KSCI

FPGA-based One-Chip Architecture and Design of Real-time Video CODEC with Embedded Blind Watermarking (블라인드 워터마킹을 내장한 실시간 비디오 코덱의 FPGA기반 단일 칩 구조 및 설계)

서영호;김대경;유지상;김동욱
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.29 no.8C
- /
- pp.1113-1124
- /
- 2004
In this paper, we proposed a hardware(H/W) structure which can compress and recontruct the input image in real time operation and implemented it into a FPGA platform using VHDL(VHSIC Hardware Description Language). All the image processing element to process both compression and reconstruction in a FPGA were considered each of them was mapped into H/W with the efficient structure for FPGA. We used the DWT(discrete wavelet transform) which transforms the data from spatial domain to the frequency domain, because use considered the motion JPEG2000 as the application. The implemented H/W is separated to both the data path part and the control part. The data path part consisted of the image processing blocks and the data processing blocks. The image processing blocks consisted of the DWT Kernel fur the filtering by DWT, Quantizer/Huffman Encoder, Inverse Adder/Buffer for adding the low frequency coefficient to the high frequency one in the inverse DWT operation, and Huffman Decoder. Also there existed the interface blocks for communicating with the external application environments and the timing blocks for buffering between the internal blocks The global operations of the designed H/W are the image compression and the reconstruction, and it is operated by the unit of a field synchronized with the A/D converter. The implemented H/W used the 69%(16980) LAB(Logic Array Block) and 9%(28352) ESB(Embedded System Block) in the APEX20KC EP20K600CB652-7 FPGA chip of ALTERA, and stably operated in the 70MHz clock frequency. So we verified the real time operation of 60 fields/sec(30 frames/sec).
PDF KSCI

Elliptic Curve Cryptography Coprocessors Using Variable Length Finite Field Arithmetic Unit (크기 가변 유한체 연산기를 이용한 타원곡선 암호 프로세서)

Lee Dong-Ho
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.42 no.1
- /
- pp.57-67
- /
- 2005
Fast scalar multiplication of points on elliptic curve is important for elliptic curve cryptography applications. In order to vary field sizes depending on security situations, the cryptography coprocessors should support variable length finite field arithmetic units. To determine the effective variable length finite field arithmetic architecture, two well-known curve scalar multiplication algorithms were implemented on FPGA. The affine coordinates algorithm must use a hardware division unit, but the projective coordinates algorithm only uses a fast multiplication unit. The former algorithm needs the division hardware. The latter only requires a multiplication hardware, but it need more space to store intermediate results. To make the division unit versatile, we need to add a feedback signal line at every bit position. We proposed a method to mitigate this problem. For multiplication in projective coordinates implementation, we use a widely used digit serial multiplication hardware, which is simpler to be made versatile. We experimented with our implemented ECC coprocessors using variable length finite field arithmetic unit which has the maximum field size 256. On the clock speed 40 MHz, the scalar multiplication time is 6.0 msec for affine implementation while it is 1.15 msec for projective implementation. As a result of the study, we found that the projective coordinates algorithm which does not use the division hardware was faster than the affine coordinate algorithm. In addition, the memory implementation effectiveness relative to logic implementation will have a large influence on the implementation space requirements of the two algorithms.
PDF KSCI

A 14b 200KS/s $0.87mm^2$ 1.2mW 0.18um CMOS Algorithmic A/D Converter (14b 200KS/s $0.87mm^2$ 1.2mW 0.18um CMOS 알고리즈믹 A/D 변환기)

Park, Yong-Hyun;Lee, Kyung-Hoon;Choi, Hee-Cheol;Lee, Seung-Hoon
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.43 no.12 s.354
- /
- pp.65-73
- /
- 2006
This work presents a 14b 200KS/s $0.87mm^2$ 1.2mW 0.18um CMOS algorithmic A/D converter (ADC) for intelligent sensors control systems, battery-powered system applications simultaneously requiring high resolution, low power, and small area. The proposed algorithmic ADC not using a conventional sample-and-hold amplifier employs efficient switched-bias power-reduction techniques in analog circuits, a clock selective sampling-capacitor switching in the multiplying D/A converter, and ultra low-power on-chip current and voltage references to optimize sampling rate, resolution, power consumption, and chip area. The prototype ADC implemented in a 0.18um 1P6M CMOS process shows a measured DNL and INL of maximum 0.98LSB and 15.72LSB, respectively. The ADC demonstrates a maximum SNDR and SFDR of 54dB and 69dB, respectively, and a power consumption of 1.2mW at 200KS/s and 1.8V. The occupied active die area is $0.87mm^2$.
PDF KSCI

A small-area implementation of public-key cryptographic processor for 224-bit elliptic curves over prime field (224-비트 소수체 타원곡선을 지원하는 공개키 암호 프로세서의 저면적 구현)

Park, Byung-Gwan;Shin, Kyung-Wook
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.21 no.6
- /
- pp.1083-1091
- /
- 2017
This paper describes a design of cryptographic processor supporting 224-bit elliptic curves over prime field defined by NIST. Scalar point multiplication that is a core arithmetic function in elliptic curve cryptography(ECC) was implemented by adopting the modified Montgomery ladder algorithm. In order to eliminate division operations that have high computational complexity, projective coordinate was used to implement point addition and point doubling operations, which uses addition, subtraction, multiplication and squaring operations over GF(p). The final result of the scalar point multiplication is converted to affine coordinate and the inverse operation is implemented using Fermat's little theorem. The ECC processor was verified by FPGA implementation using Virtex5 device. The ECC processor synthesized using a 0.18 um CMOS cell library occupies 2.7-Kbit RAM and 27,739 gate equivalents (GEs), and the estimated maximum clock frequency is 71 MHz. One scalar point multiplication takes 1,326,985 clock cycles resulting in the computation time of 18.7 msec at the maximum clock frequency.
https://doi.org/10.6109/jkiice.2017.21.6.1083 인용 PDF KSCI

Hardware-Based High Performance XML Parsing Technique Using an FPGA (FPGA를 이용한 하드웨어 기반 고성능 XML 파싱 기법)

Lee, Kyu-hee;Seo, Byeong-seok
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.40 no.12
- /
- pp.2469-2475
- /
- 2015
A structured XML has been widely used to present services on various Web-services. The XML is also used for digital documents and digital signatures and for the representation of multimedia files in email systems. The XML document should be firstly parsed to access elements in the XML. The parsing is the most compute-instensive task in the use of XML documents. Most of the previous work has focused on hardware based XML parsers in order to improve parsing performance, while a little work has studied parsing techniques. We present the high performance parsing technique which can be used all of XML parsers and design hardware based XML parser using an FPGA. The proposed parsing technique uses element analyzers instead of the state machine and performs multibyte-based element matching. As a result, our parsing technique can reduce the number of clock cycles per byte(CPB) and does not need to require any preprocessing, such as loading XML data into memory. Compared to other parsers, our parser acheives 1.33~1.82 times improvement in the system performance. Therefore, the proposed parsing technique can process XML documents in real time and is suitable for applying to all of XML parsers.
https://doi.org/10.7840/kics.2015.40.12.2469 인용 PDF KSCI

Search Result 277, Processing Time 0.023 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)