Search | Korea Science

Optimized and Portable FPGA-Based Systolic Cell Architecture for Smith-Waterman-Based DNA Sequence Alignment

Shah, Hurmat Ali;Hasan, Laiq;Koo, Insoo
- Journal of information and communication convergence engineering
- /
- v.14 no.1
- /
- pp.26-34
- /
- 2016
The alignment of DNA sequences is one of the important processes in the field of bioinformatics. The Smith-Waterman algorithm (SWA) performs optimally for aligning sequences but is computationally expensive. Field programmable gate array (FPGA) performs the best on parameters such as cost, speed-up, and ease of re-configurability to implement SWA. The performance of FPGA-based SWA is dependent on efficient cell-basic implementation-unit design. In this paper, we present an optimized systolic cell design while avoiding oversimplification, very large-scale integration (VLSI)-level design, and direct mapping of iterative equations such as previous cell designs. The proposed design makes efficient use of hardware resources and provides portability as the proposed design is not based on gate-level details. Our cell design implementing a linear gap penalty resulted in a performance improvement of 32× over a GPP platform and surpassed the hardware utilization of another implementation by a factor of 4.23.
https://doi.org/10.6109/jicce.2016.14.1.026 인용 PDF KSCI KPUBS HTML

A Study on Hardware Implementation of 128-bit LEA Encryption Block (128비트 LEA 암호화 블록 하드웨어 구현 연구)

Yoon, Gi Ha;Park, Seong Mo
- Smart Media Journal
- /
- v.4 no.4
- /
- pp.39-46
- /
- 2015
This paper describes hardware implementation of the encryption block of the '128 bit block cipher LEA' among various lightweight encryption algorithms for IoT (Internet of Things) security. Round function blocks and key-schedule blocks are designed by parallel circuits for high throughput. The encryption blocks support secret-key of 128 bits, and are designed by FSM method and 24/n stage(n=1, 2, 3, 4, 8, 12) pipeline methods. The LEA-128 encryption blocks are modeled using Verilog-HDL and implemented on FPGA, and according to the synthesis results, minimum area and maximum throughput are provided.
PDF KSCI

FPGA Implementation of an FDTrS/DF Signal Detector for High-density DVD System (고밀도 DVD 시스템을 위한 FDTrS/DF 신호 검출기의 FPGA 구현)

정조훈
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.25 no.10B
- /
- pp.1732-1743
- /
- 2000
In this paper a fixed-delay trellis search with decision feedback (FDTrS/DF) for high-density DVD systems (4.7-15GB) is proposed and implemented with FPGA. The proposed FDTrS/DF is derived by transforming the binary tree search structure into trellis search structure implying that FDTrS/DF performs better than the singnal detection techniques based on tree search structure such as FDTS/DF and SSD/DF. Advantages of FDTrS/DF are significant reductions in hardware complexity due to the unique structure of FDTrS composed of only one trellis stage requiring no traceback procedure usually implemented in the Viterbi detector. Also in this paper the PDFS/DF and SSD/DF orginally proposed for high-density magnetic recording systems are modified for the DVD system and compared with the proposed FDTrS/DF. In order to increase speed in the FPGA implementation the pipelining technique and absolute branch metric (instead of square branch metric) are applied. The proposed FDTrS/DF is shown to provide the best performance among various signal detection techniques such as PRML, DFE, FDTS/DF and SSD/DF even with a small hardware complexity.
PDF

Implementation of the Wireless Transducer Interface Module and NCAP architecture (무선 센서 인터페이스 모듈과 NCAP 구조의 구현)

Oh, Se-Moon;Keum, Min-Ha;Kim, Dong-Hyeok;Kim, Jin-Sang;Cho, Won-Kyung
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.33 no.12A
- /
- pp.1261-1269
- /
- 2008
This paper presents an implementation of the Network Capable Application Processor (NCAP) and the Wireless Transducer Interface Module (WTIM) architectures based on the new IEEE P1451.5 standard. Proposed architecture is implemented using a computer for NCAP, an FPGA board, a sensor board and two radio modules, which communicate through the ZigBee wireless communication technology between the NCAP and the WTIM based on the IEEE 1451.0 and the IEEE 1451.5 interfaces. In this paper, two experiments has been done to verify operations of proposed architecture. From the experimental results, we verify that the proposed architecture performs the wireless sensor communication functions efficiently.
PDF KSCI

An Area-efficient Implementation of Layered LDPC Decoder for IEEE 802.11n WLAN (IEEE 802.11n WLAN 표준용 Layered LDPC 복호기의 저면적 구현)

Jeong, Sang-Hyeok;Na, Young-Heon;Shin, Kyung-Wook
- Proceedings of the Korean Institute of Information and Commucation Sciences Conference
- /
- 2010.05a
- /
- pp.486-489
- /
- 2010
This paper describes a layered LDPC decoder which supports block length of 1,944 bits and code rate 1/2 for IEEE 802.11n WLAN standard. To reduce the hardware complexity, the min-sum algorithm and layered architecture is adopted. A novel memory reduction technique suitable for min-sum algorithm reduces memory size by 75% compared with conventional method. The designed processor has 200,400 gates and 19,400 bits memory, and it is verified by FPGA implementation. The estimated throughput is about 200 Mbps at 120 MHz clock by using Xilinx Virtex-4 FPGA device.
PDF

SoC Implementation of Deblocking Filter for Block-based Compressed Images and Videos (블록 기반 압축 이미지 및 비디오를 위한 디블로킹 필터의 SoC 구현)

Seo, Gwang-Seok;Lee, Joo-Heung
- Journal of IKEEE
- /
- v.23 no.3
- /
- pp.925-933
- /
- 2019
In this paper, we implement ZYNQ SoC-based post-processing system that utilizes partial reconfiguration to remove blocking artifacts generated by compression algorithm. Hardware implementation of the deblocking filter in a Field Programmable Gate Array (FPGA) provides high computational capability and can be partially reconfigured to process 1080p images in real time. Partially reconfigurable areas in FPGA can be utilized to use hardware more efficiently in highly resource-constrained embedded systems. Experimental results of the proposed system show improvement of visual quality both objectively and subjectively with 0.6dB higher PSNR after deblocking filtering process. The measured power consumption of the deblocking filter during run-time is 68.33mW.
https://doi.org/10.7471/ikeee.2019.23.3.925 인용 PDF KSCI

Availability Analysis of Xilinx 7-Series FPGA against Soft Error (Xilinx 7-Series FPGA의 소프트 에러에 대한 가용성 분석)

Ryu, Sang-Moon
- Proceedings of the Korean Institute of Information and Commucation Sciences Conference
- /
- 2016.10a
- /
- pp.655-658
- /
- 2016
Xilinx 7-Series FPGA(Field Programmable Gate Array)s mainly used for the implementation of high-performance digital circuit have SRAM-type configuration memory and can malfunction when soft errors occur in their configuration memory. SEM(Soft Error Mitigation Controller) offered by Xilinx helps users mitigate the influence of soft errors in configuration memory. When soft errors occur, SEM Controller can recover the state of FPGA through partial reconfiguration if the soft errors are correctable by ECC(Error Correction Code) and CRC(Cyclic Redundancy Code). This paper presents the availability analysis of Xilinx 7-Series FPGAs against soft errors under the protection of the SEM Controller. Availability functions are derived and compared according to the correction capability of the SEM Controller. The result may help to estimate the reliability of SRAM-based FPGA running in an environment where soft errors may occur.
PDF

An Efficient Hardware Implementation of Block Cipher CLEFIA-128 (블록암호 CLEFIA-128의 효율적인 하드웨어 구현)

Bae, Gi-Chur;Shin, Kyung-Wook
- Proceedings of the Korean Institute of Information and Commucation Sciences Conference
- /
- 2015.05a
- /
- pp.404-406
- /
- 2015
This paper describes a small-area hardware implementation of the block cipher algorithm CLEFIA-128 which supports for 128-bit master key. A compact structure using single data processing block is adopted, which shares hardware resources for round transformation and the generation of intermediate values for round key scheduling. In addition, data processing and key scheduling blocks are simplified by utilizing a modified GFN(generalized Feistel network) and key scheduling scheme. The CLEFIA-128 crypto-processor is verified by FPGA implementation. It consumes 823 slices of Virtex5 XC5VSX50T device and the estimated throughput is about 105 Mbps with 145 MHz clock frequency.
PDF

FPGA Implementation of Unitary MUSIC Algorithm for DoA Estimation (도래방향 추정을 위한 유니터리 MUSIC 알고리즘의 FPGA 구현)

Ju, Woo-Yong;Lee, Kyoung-Sun;Jeong, Bong-Sik
- Journal of the Institute of Convergence Signal Processing
- /
- v.11 no.1
- /
- pp.41-46
- /
- 2010
In this paper, the DoA(Direction of Arrival) estimator using unitary MUSIC algorithm is studied. The complex-valued correlation matrix of MUSIC algorithm is transformed to the real-valued one using unitary transform for easy implementation. The eigenvalue and eigenvector are obtained by the combined Jacobi-CORDIC algorithm. CORDIC algorithm can be implemented by only ADD and SHIFT operations and MUSIC spectrum computed by 256 point DFT algorithm. Results of unitary MUSIC algorithm designed by System Generator for FPGA implementation is entirely consistent with Matlab results. Its performance is evaluated through hardware co-simulation and resource estimation.
PDF KSCI

Energy Efficient and Low-Cost Server Architecture for Hadoop Storage Appliance

Choi, Do Young;Oh, Jung Hwan;Kim, Ji Kwang;Lee, Seung Eun
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.14 no.12
- /
- pp.4648-4663
- /
- 2020
This paper proposes the Lempel-Ziv 4(LZ4) compression accelerator optimized for scale-out servers in data centers. In order to reduce CPU loads caused by compression, we propose an accelerator solution and implement the accelerator on an Field Programmable Gate Array(FPGA) as heterogeneous computing. The LZ4 compression hardware accelerator is a fully pipelined architecture and applies 16 dictionaries to enhance the parallelism for high throughput compressor. Our hardware accelerator is based on the 20-stage pipeline and dictionary architecture, highly customized to LZ4 compression algorithm and parallel hardware implementation. Proposing dictionary architecture allows achieving high throughput by comparing input sequences in multiple dictionaries simultaneously compared to a single dictionary. The experimental results provide the high throughput with intensively optimized in the FPGA. Additionally, we compare our implementation to CPU implementation results of LZ4 to provide insights on FPGA-based data centers. The proposed accelerator achieves the compression throughput of 639MB/s with fine parallelism to be deployed into scale-out servers. This approach enables the low power Intel Atom processor to realize the Hadoop storage along with the compression accelerator.
https://doi.org/10.3837/tiis.2020.12.002 인용 PDF KSCI HTML

Search Result 960, Processing Time 0.027 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)