• 제목/요약/키워드: Hardware Implementation

검색결과 2,125건 처리시간 0.033초

Energy Efficient and Low-Cost Server Architecture for Hadoop Storage Appliance

  • Choi, Do Young;Oh, Jung Hwan;Kim, Ji Kwang;Lee, Seung Eun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제14권12호
    • /
    • pp.4648-4663
    • /
    • 2020
  • This paper proposes the Lempel-Ziv 4(LZ4) compression accelerator optimized for scale-out servers in data centers. In order to reduce CPU loads caused by compression, we propose an accelerator solution and implement the accelerator on an Field Programmable Gate Array(FPGA) as heterogeneous computing. The LZ4 compression hardware accelerator is a fully pipelined architecture and applies 16 dictionaries to enhance the parallelism for high throughput compressor. Our hardware accelerator is based on the 20-stage pipeline and dictionary architecture, highly customized to LZ4 compression algorithm and parallel hardware implementation. Proposing dictionary architecture allows achieving high throughput by comparing input sequences in multiple dictionaries simultaneously compared to a single dictionary. The experimental results provide the high throughput with intensively optimized in the FPGA. Additionally, we compare our implementation to CPU implementation results of LZ4 to provide insights on FPGA-based data centers. The proposed accelerator achieves the compression throughput of 639MB/s with fine parallelism to be deployed into scale-out servers. This approach enables the low power Intel Atom processor to realize the Hadoop storage along with the compression accelerator.

타원곡선 암호연산 IP의 FPGA구현 (FPGA Implementation of Elliptic Curve Cryptography Processor as Intellectual Property)

  • 문상국
    • 한국정보통신학회:학술대회논문집
    • /
    • 한국해양정보통신학회 2008년도 춘계종합학술대회 A
    • /
    • pp.670-673
    • /
    • 2008
  • C 프로그램을 사용하여 증명된 최적화된 알고리즘과 수식은 검증을 위해 Verilog와 같은 hardware description language를 통하여 다시 한번 분석하여 하드웨어 구현에 적합하도록 수정하여 최적화하여야 한다. 그 이유는 C 언어의 sequential한 특성이 하드웨어를 직접 구현 하는 데에 본질적으로 틀리기 때문이다. 알고리즘적인 접근과 더불어 하드웨어적으로 2중적으로 검증된 하드웨어 IP는 Altera 임베디드 시스템을 활용하여, ARM9이 내장되어 있는 Altera Excalibur FPGA에 매핑되어 실제 칩 프로토타입 IP로 구현한다. 구현된 유한체 연산 IP들은 실제적인 암호 시스템으로 구현되기 위하여, 193 비트 이상의 타원 곡선 암호 연산 IP를 구성하는 라이브러리 모듈로 사용될 수 있다.

  • PDF

Behavior Evolution of Autonomous Mobile Robot(AMR) using Genetic Programming Based on Evolvable Hardware

  • Sim, Kwee-Bo;Lee, Dong-Wook;Zhang, Byoung-Tak
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • 제2권1호
    • /
    • pp.20-25
    • /
    • 2002
  • This paper presents a genetic programming based evolutionary strategy for on-line adaptive learnable evolvable hardware. Genetic programming can be useful control method for evolvable hardware for its unique tree structured chromosome. However it is difficult to represent tree structured chromosome on hardware, and it is difficult to use crossover operator on hardware. Therefore, genetic programming is not so popular as genetic algorithms in evolvable hardware community in spite of its possible strength. We propose a chromosome representation methods and a hardware implementation method that can be helpful to this situation. Our method uses context switchable identical block structure to implement genetic tree on evolvable hardware. We composed an evolutionary strategy for evolvable hardware by combining proposed method with other's striking research results. Proposed method is applied to the autonomous mobile robots cooperation problem to verify its usefulness.

자율이동로봇의 행동진화를 위한 진화하드웨어 설계 (Design of Evolvable Hardware for Behavior Evolution of Autonomous Mobile Robots)

  • 이동욱;반창봉;전호병;심귀보
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 제어로봇시스템학회 2000년도 제15차 학술회의논문집
    • /
    • pp.254-254
    • /
    • 2000
  • This paper presents a genetic programming based evolutionary strategy for on-line adaptive learnable evolvable hardware. genetic programming can be useful control method for evolvable hardware for its unique tree structured chromosome. However it is difficult to represent tree structured chromosome on hardware, and it is difficult to use crossover operator on hardware. Therefore, genetic programming is not so popular as genetic algorithms in evolvable hardware community in spite of its possible strength. We propose a chromosome representation methods and a hardware implementation method that can be helpful to this situation. Our method uses context switchable identical block structure to implement genetic tree on evolvable hardware. We composed an evolutionary strategy (or evolvable hardware by combining proposed method with other's striking research results. Proposed method is applied to the autonomous mobile robots cooperation problem to verify its usefulness.

  • PDF

Virtual Prototyping of Area-Based Fast Image Stitching Algorithm

  • Mudragada, Lakshmi Kalyani;Lee, Kye-Shin;Kim, Byung-Gyu
    • Journal of Multimedia Information System
    • /
    • 제6권1호
    • /
    • pp.7-14
    • /
    • 2019
  • This work presents a virtual prototyping design approach for an area-based image stitching hardware. The virtual hardware obtained from virtual prototyping is equivalent to the conceptual algorithm, yet the conceptual blocks are linked to the actual circuit components including the memory, logic gates, and arithmetic units. Through the proposed method, the overall structure, size, and computation speed of the actual hardware can be estimated in the early design stage. As a result, the optimized virtual hardware facilitates the hardware implementation by eliminating trail design and redundant simulation steps to optimize the hardware performance. In order to verify the feasibility of the proposed method, the virtual hardware of an image stitching platform has been realized, where it required 10,522,368 clock cycles to stitch two $1280{\times}1024$ sized images. Furthermore, with a clock frequency of 250MHz, the estimated computation time of the proposed virtual hardware is 0.877sec, which is 10x faster than the software-based image stitch platform using MATLAB.

블록암호 알고리듬 ARIA의 효율적인 하드웨어 구현 (An Efficient Hardware Implementation of ARIA Block Cipher Algorithm)

  • 김동현;신경욱
    • 한국정보통신학회:학술대회논문집
    • /
    • 한국정보통신학회 2012년도 춘계학술대회
    • /
    • pp.91-94
    • /
    • 2012
  • 본 논문에서는 국내 표준(KS)으로 제정된 블록암호 알고리듬 ARIA의 효율적인 하드웨어 구현을 제안한다. 제안된 ARIA 암 복호 프로세서는 표준에 제시된 세 가지 마스터 키 길이 128/192/256-비트를 모두 지원하도록 설계되었으며, 회로의 크기를 줄이기 위해 키 확장 초기화 과정과 암 복호 과정에 사용되는 라운드 함수가 공유되도록 설계를 최적화 하였으며, 이를 통해 게이트 수를 약 20% 감소시켰다. 설계된 ARIA 암 복호 프로세서를 FPGA로 구현하여 하드웨어 동작을 검증하였으며, 0.13-${\mu}m$ CMOS 셀 라이브러리로 합성한 결과 33,218 게이트로 구현되어 640 Mbps@100 MHz의 성능을 갖는 것으로 평가되었다.

  • PDF

FPGA-Based Hardware Accelerator for Feature Extraction in Automatic Speech Recognition

  • Choo, Chang;Chang, Young-Uk;Moon, Il-Young
    • Journal of information and communication convergence engineering
    • /
    • 제13권3호
    • /
    • pp.145-151
    • /
    • 2015
  • We describe in this paper a hardware-based improvement scheme of a real-time automatic speech recognition (ASR) system with respect to speed by designing a parallel feature extraction algorithm on a Field-Programmable Gate Array (FPGA). A computationally intensive block in the algorithm is identified implemented in hardware logic on the FPGA. One such block is mel-frequency cepstrum coefficient (MFCC) algorithm used for feature extraction process. We demonstrate that the FPGA platform may perform efficient feature extraction computation in the speech recognition system as compared to the generalpurpose CPU including the ARM processor. The Xilinx Zynq-7000 System on Chip (SoC) platform is used for the MFCC implementation. From this implementation described in this paper, we confirmed that the FPGA platform is approximately 500× faster than a sequential CPU implementation and 60× faster than a sequential ARM implementation. We thus verified that a parallelized and optimized MFCC architecture on the FPGA platform may significantly improve the execution time of an ASR system, compared to the CPU and ARM platforms.

멀티코어 DSP를 이용한 다중 안테나를 지원하는 SDR 기반 LTE-A PDSCH 디코더 구현 (Implementation of SDR-based LTE-A PDSCH Decoder for Supporting Multi-Antenna Using Multi-Core DSP)

  • 나용;안흥섭;최승원
    • 디지털산업정보학회논문지
    • /
    • 제15권4호
    • /
    • pp.85-92
    • /
    • 2019
  • This paper presents a SDR-based Long Term Evolution Advanced (LTE-A) Physical Downlink Shared Channel (PDSCH) decoder using a multicore Digital Signal Processor (DSP). For decoder implementation, multicore DSP TMS320C6670 is used, which provides various hardware accelerators such as turbo decoder, fast Fourier transformer and Bit Rate Coprocessors. The TMS320C6670 is a DSP specialized in implementing base station platforms and is not an optimized platform for implementing mobile terminal platform. Accordingly, in this paper, the hardware accelerator was changed to the terminal implementation to implement the LTE-A PDSCH decoder supporting the multi-antenna and the functions not provided by the hardware accelerator were implemented through core programming. Also pipeline using multicore was implemented to meet the transmission time interval. To confirm the feasibility of the proposed implementation, we verified the real-time decoding capability of the PDSCH decoder implemented using the LTE-A Reference Measurement Channel (RMC) waveform about transmission mode 2 and 3.

행렬구조 메모리 참조표를 사용한 페트리네트 제어기의 하드웨어 구현 (Hardware implementation of Petri net-based controller with matrix-based look-up tables)

  • 장래혁;정승권;권욱현
    • 제어로봇시스템학회논문지
    • /
    • 제4권2호
    • /
    • pp.194-202
    • /
    • 1998
  • This paper describes a hardware implementation method of a Petri Net-based controller. A flexible and systematic implementation method, based on look-up tables, is suggested, which enables to build high speed Petri net-based controllers. The suggested method overcomes the inherent speed limit that arises from the microprocessors by using of matrix-based look-up tables. Based on the matrix framework, this paper suggests various specific data path structures as well as a basic data path structure, accompanied by evolution algorithms, for sub-class Petri nets. A new sub-class Petri net, named Biarced Petri Net, resolves memory explosion problem that usually comes with matrix-based look-up tables. The suggested matrix-based method based on the Biarced Petri net has as good efficiency and expendability as the list-based methods. This paper shows the usefulness of the suggested method, evaluating the size of the look-up tables and introducing an architecture of the signal processing unit of a programmable controller. The suggested implementation method is supported by an automatic design support program.

  • PDF

Implementation of Intelligent Electronic Acupuncture Needles Based on Bluetooth

  • Han, Chang Pyoung;Hong, You Sik
    • International journal of advanced smart convergence
    • /
    • 제9권4호
    • /
    • pp.62-73
    • /
    • 2020
  • In this paper, we present electronic acupuncture needles we have developed using intelligence technology based on Bluetooth in order to allow anyone to simply receive customized remote diagnosis and treatment by clicking on the menu of the smartphone regardless of time and place. In order to determine the health condition and disease of patients, we have developed a software and a hardware of electronic acupuncture needles, operating on Bluetooth which transmits biometric data to oriental medical doctors using the functions of automatically determining pulse diagnosis, tongue diagnosis, and oxygen saturation; the functions are most commonly used in herbal treatment. In addition, using fuzzy logic and reasoning based on smartphones, we present in this paper an algorithm and the results of completion of hardware implementation for electronic acupuncture needles, appropriate for the body conditions of patients; the algorithm and the hardware implementation are for a treatment time duration by electronic acupuncture needles, an automatic determinations of pulse diagnosis, tongue diagnosis, and oxygen saturation, a function implementation for automatic display of acupuncture point, and a strength adjustment of electronic acupuncture needles. As a result of our simulation, we have shown that the treatment of patients, performed using an Electronic Acupuncture Needles based on intelligence, is more efficient compared to the treatment that was performed before.