• Title/Abstract/Keywords: FPGA accelerator (FPGA 가속기)

Search results: 60 items (processing time: 0.027 s)

Design and Implementation of BNN based Human Identification and Motion Classification System Using CW Radar (연속파 레이다를 활용한 이진 신경망 기반 사람 식별 및 동작 분류 시스템 설계 및 구현)

  • Kim, Kyeong-min;Kim, Seong-jin;NamKoong, Ho-jung;Jung, Yun-ho
    • Journal of Advanced Navigation Technology / Vol. 26, No. 4 / pp. 211-218 / 2022
  • Continuous wave (CW) radar offers better reliability and accuracy than other sensors such as cameras and lidar. In addition, a binarized neural network (BNN) dramatically reduces memory usage and complexity compared to other deep learning networks. Therefore, this paper proposes a BNN-based human identification and motion classification system using CW radar. After a signal is received from the CW radar, a spectrogram is generated through a short-time Fourier transform (STFT). Based on this spectrogram, we propose an algorithm that detects whether a person is approaching the radar. We also designed an optimized BNN model that achieves an accuracy of 90.0% for human identification and 98.3% for motion classification. To accelerate BNN operation, we designed a BNN hardware accelerator on a field programmable gate array (FPGA). The accelerator was implemented with 1,030 logics, 836 registers, and 334.904 Kbit of block memory, and real-time operation was confirmed with a total calculation time of 6 ms from inference to transferring the result.
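
The abstract above does not describe the accelerator's internal arithmetic. As a rough illustration of why BNN inference maps well to FPGA logic, the sketch below shows the commonly used XNOR-popcount form of a binarized dot product. It is a minimal sketch, not the authors' design: the function names, the bit encoding (1 for +1, 0 for -1), and the sample values are all assumptions made for illustration.

```python
def binarize(values):
    """Map real values to bits: 1 encodes +1 (value >= 0), 0 encodes -1."""
    return [1 if v >= 0 else 0 for v in values]


def pack(bits):
    """Pack a list of 0/1 bits into a single Python integer."""
    word = 0
    for b in bits:
        word = (word << 1) | b
    return word


def xnor_popcount_dot(a_word, w_word, n):
    """Dot product of two length-n {-1, +1} vectors stored as bit words."""
    mask = (1 << n) - 1
    agree = ~(a_word ^ w_word) & mask   # XNOR: 1 wherever the signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # (+1 per match) + (-1 per mismatch)


# Hypothetical sample data, chosen only to exercise the functions.
activations = [0.7, -1.2, 0.1, -0.4]
weights = [1.0, 1.0, -1.0, -1.0]
a = pack(binarize(activations))
w = pack(binarize(weights))
result = xnor_popcount_dot(a, w, len(activations))
```

In hardware, the multiply-accumulate of a conventional network is replaced here by bitwise XNOR gates and a population counter, which is one reason BNN accelerators can fit in small logic and memory budgets.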

Design and Implementation of CW Radar-based Human Activity Recognition System (CW 레이다 기반 사람 행동 인식 시스템 설계 및 구현)

  • Nam, Jeonghee;Kang, Chaeyoung;Kook, Jeongyeon;Jung, Yunho
    • Journal of Advanced Navigation Technology / Vol. 25, No. 5 / pp. 426-432 / 2021
  • Continuous wave (CW) Doppler radar, unlike a camera, avoids privacy problems and obtains signals in a non-contact manner. Therefore, this paper proposes a human activity recognition (HAR) system using CW Doppler radar, and presents the hardware design and implementation results for acceleration. CW Doppler radar measures signals from continuous human motion. To obtain a single-motion spectrogram from the continuous signal, an algorithm for counting the number of movements is proposed. In addition, to minimize computational complexity and memory usage, a binarized neural network (BNN) was used to classify human motions, achieving an accuracy of 94%. To accelerate the complex operations of the BNN, an FPGA-based BNN accelerator was designed and implemented. The proposed HAR system was implemented using 7,673 logics, 12,105 registers, 10,211 combinational ALUTs, and 18.7 Kb of block memory. In the performance evaluation, the operation speed was improved by 99.97% compared to the software implementation.
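
The abstract above does not define how the 99.97% improvement was measured. If it is read as a 99.97% reduction in execution time (an assumption, not something the abstract states), it can be converted into a speedup factor as in the sketch below. The function name is hypothetical.

```python
def speedup_from_time_reduction(reduction_percent):
    """Convert a percentage reduction in run time into a speedup factor."""
    remaining_fraction = 1.0 - reduction_percent / 100.0
    if remaining_fraction <= 0.0:
        raise ValueError("reduction must be below 100%")
    return 1.0 / remaining_fraction


# Assumed interpretation: hardware run time is 0.03% of the software run time.
speedup = speedup_from_time_reduction(99.97)
```

Under that reading, the hardware would be on the order of 3,300 times faster than the software implementation.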

Design and Implementation of a Hardware-based Transmission/Reception Accelerator for a Hybrid TCP/IP Offload Engine (하이브리드 TCP/IP Offload Engine을 위한 하드웨어 기반 송수신 가속기의 설계 및 구현)

  • Jang, Han-Kook;Chung, Sang-Hwa;Yoo, Dae-Hyun
    • Journal of KIISE:Computer Systems and Theory / Vol. 34, No. 9 / pp. 459-466 / 2007
  • TCP/IP processing imposes a heavy load on the host CPU on very high-speed networks. Recently, the TCP/IP Offload Engine (TOE), which processes TCP/IP on a network adapter instead of the host CPU, has become an attractive solution for reducing the load on the host CPU. There have been two approaches to implementing a TOE. One is the software TOE, in which TCP/IP is processed by an embedded processor, and the other is the hardware TOE, in which TCP/IP is processed by a dedicated ASIC. The software TOE has poor performance, and the hardware TOE is neither flexible nor expandable enough to add new features. In this paper, we designed and implemented a hybrid TOE architecture, in which TCP/IP is processed by cooperating hardware and software, based on an FPGA that has two embedded processor cores. The hybrid TOE achieves high performance by processing time-critical operations, such as making and processing data packets, in hardware. The software, based on embedded Linux, performs operations that are not time-critical, such as connection establishment, flow control, and congestion control, so the hybrid TOE retains sufficient flexibility and expandability. To improve the performance of the hybrid TOE, we developed a hardware-based transmission/reception accelerator that processes important operations such as creating data packets. In the experiments, the hybrid TOE shows a minimum latency of about 19 μs. The CPU utilization of the hybrid TOE is below 6%, and its maximum bandwidth is about 675 Mbps.

An Efficient Hardware Implementation of CABAC Using H/W-S/W Co-design (H/W-S/W 병행설계를 이용한 CABAC의 효율적인 하드웨어 구현)

  • Cho, Young-Ju;Ko, Hyung-Hwa
    • Journal of Advanced Navigation Technology / Vol. 18, No. 6 / pp. 600-608 / 2014
  • In this paper, a CABAC H/W module is developed using a co-design method. After the entire H.264/AVC encoder was developed in C using the reference S/W (JM), the CABAC H/W IP was developed as a block in the H.264/AVC encoder. The context modeller of CABAC is included in the hardware to update the changed value during binary encoding, which enables efficient memory usage and an efficient I/O stream design. The hardware IP operates together with the H.264/AVC reference software JM and is executed on a Virtex-4 FX60 FPGA on an ML410 board. Functional simulation is done using ModelSim. Compared with existing CABAC H/W modules designed at the register level, the development time is greatly reduced and a software engineer can design the H/W module more easily. As a result, the number of slices used by the CABAC module is less than 1/3 of that used by the CAVLC module. The proposed co-design method is useful for providing hardware accelerators where a highly efficient video encoder in an embedded system needs to be sped up.

Design of a Compact GPS/MEMS IMU Integrated Navigation Receiver Module for High Dynamic Environment (고기동 환경에 적용 가능한 소형 GPS/MEMS IMU 통합항법 수신모듈 설계)

  • Jeong, Koo-yong;Park, Dae-young;Kim, Seong-min;Lee, Jong-hyuk
    • Journal of Advanced Navigation Technology / Vol. 25, No. 1 / pp. 68-77 / 2021
  • In this paper, a GPS/MEMS IMU integrated navigation receiver module capable of operating in a high dynamic environment is designed and fabricated, and the results are confirmed. The designed module is composed of an RF receiver unit, an inertial measurement unit, a signal processing unit, a correlator, and navigation S/W. The RF receiver performs low noise amplification, frequency conversion, filtering, and automatic gain control. The inertial measurement unit collects measurement data from a MEMS-class IMU with a 3-axis gyroscope, accelerometer, and geomagnetic sensor. In addition, it provides an interface for transmitting the data to the navigation S/W. The signal processing unit and the correlator are implemented with FPGA logic to perform filtering and correlation value calculation. The navigation S/W is implemented using the internal CPU of the FPGA. The size of the manufactured module is 95.0 × 85.0 × 12.5 mm, the weight is 110 g, and navigation accuracy within the specification is confirmed in an environment with a velocity of 1,200 m/s and an acceleration of 10 g.

A DSP-based Controller for a Small Humanoid Robot (DSP를 사용한 소형 인간형 로봇의 제어기)

  • Cho Jeong-San;Sung Young-Whee
    • Journal of the Institute of Convergence Signal Processing / Vol. 6, No. 4 / pp. 191-197 / 2005
  • Biped walking is the main feature of a humanoid robot. In a biped walking robot, there are many actuators to be controlled and many sensors to be interfaced. In this paper, we propose a DSP-based controller for a miniature biped walking robot with 21 RC servo motors. The proposed controller has a hierarchical structure: a host PC, a DSP-based main controller, and an auxiliary controller with an FPGA chip. The host PC generates and transmits the robot walking data for given walking parameters such as stride and walking period. The main controller, implemented with a TMS320LF2407A, controls the 21 RC servo motors via the auxiliary controller. We also perform experiments on balancing motion and walking on sloped terrain by interfacing a 2-axis acceleration sensor with the TMS320LF2407A.

Power-Efficient DCNN Accelerator Mapping Convolutional Operation with 1-D PE Array (1-D PE 어레이로 컨볼루션 연산을 수행하는 저전력 DCNN 가속기)

  • Lee, Jeonghyeok;Han, Sangwook;Choi, Seungwon
    • Journal of Korea Society of Digital Industry and Information Management / Vol. 18, No. 2 / pp. 17-26 / 2022
  • In this paper, we propose a novel method of performing convolutional operations on a 2-D Processing Element (PE) array. The conventional method [1] of mapping the convolutional operation using the 2-D PE array lacks flexibility and provides low utilization of PEs. However, by mapping a convolutional operation from a 2-D PE array to a 1-D PE array, the proposed method can increase the number and utilization of active PEs. Consequently, the throughput of the proposed Deep Convolutional Neural Network (DCNN) accelerator can be increased significantly. Furthermore, the power consumed in transmitting weights between PEs can be saved. Based on the simulation results, the proposed method provides approximately 4.55%, 13.7%, and 2.27% throughput gains for the convolutional layers of AlexNet, VGG16, and ResNet50, respectively, using the DCNN accelerator with a (weights size) x (output data size) 2-D PE array, compared to the conventional method. Additionally, the proposed method provides approximately 63.21%, 52.46%, and 39.23% power savings.

LC output filter for high accuracy and stability digital controlled MPS at PLS (포항가속기연구소 디지탈 전자석 전원장치의 LC 출력필터)

  • Kim, S.C.;Ha, K.M.;Huang, J.Y.;Choi, J.H.
    • Proceedings of the KIPE Conference / KIPE 2005 Power Electronics Conference Proceedings / pp. 106-108 / 2005
  • A high-accuracy, high-stability digitally controlled power supply for magnets has been developed at PLS. This power supply has three sections. The first section is a digital controller including a DSP, an FPGA, and a precision ADC; the second section consists of an IGBT driver and a four quad IGBT switch; and the third section is an LC output filter. The AC input voltage of the power supply is 3-phase 21 V, and the output current is 0 to 150 A DC. The switching frequency of the four quad IGBT switch is 25 kHz. The output current of the power supply has a very high accuracy of 100 A step resolution at full range and a stability of +/- 1.5 ppm for the short term and +/- 5 ppm for the long term. This paper describes the characteristics of the filter and the improvement in output current performance after adding the LC output filter to four quad digital power supplies.

Convolutional Neural Network Based on Accelerator-Aware Pruning for Object Detection in Single-Shot Multibox Detector (싱글숏 멀티박스 검출기에서 객체 검출을 위한 가속 회로 인지형 가지치기 기반 합성곱 신경망 기법)

  • Kang, Hyeong-Ju
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 24, No. 1 / pp. 141-144 / 2020
  • Convolutional neural networks (CNNs) show high performance in computer vision tasks including object detection, but they require a large amount of weight storage and computation. In this paper, a pruning scheme is applied to CNNs for object detection, which can remove a large number of weights with negligible performance degradation. Unlike previous schemes, the pruning scheme applied in this paper takes the underlying accelerator architecture into account. As a result, the pruned CNNs can be executed efficiently on an ASIC or FPGA accelerator. Even with the constrained pruning, the resulting CNN shows negligible degradation of detection performance: less than 1 percentage point of mAP degradation on the VOC0712 test set. With the proposed scheme, CNNs can be applied to object detection efficiently.

Semiconductor Characteristics and Design Methodology in Digital Front-End Design (Digital Front-End Design에서의 반도체 특성 연구 및 방법론의 고찰)

  • Jeong, Taik-Kyeong;Lee, Jang-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 10, No. 10 / pp. 1804-1809 / 2006
  • The aim of this paper is to describe the implementation of a low-power digital Front-End Design (FED) that will act as the core of a stand-alone power dissipation methodology. The design of digital integrated circuits is a large and diverse area, and we have chosen to focus on low-power FED. Designs are made from synthesized logic, and we need to consider the low-power digital FED including the input clock, buffer, latches, voltage regulator, and capacitance-to-voltage counter, which have been integrated onto high-bandwidth communication chips and systems. These single-chip micro instruments, implemented in a 0.12 μm CMOS technology, operate with a single 0.9 V supply voltage and can be used to monitor dynamic and static power dissipation, Vesture, acceleration junction temperature (Tj), etc.