• Title/Summary/Keyword: Hardware accelerator

Search Result 112, Processing Time 0.037 seconds

Design study of the Vacuum system for RAON accelerator using MonteCarlo method

  • Kim, Jae-Hong;Jeon, Dong-O
    • Proceedings of the Korean Vacuum Society Conference
    • /
    • 2015.08a
    • /
    • pp.70.1-70.1
    • /
    • 2015
  • The facility for RAON superconducting heavy-ion accelerator at a beam power of up to 400 kW will be produced rare isotopes with two electron cyclotron resonance (ECR) ion sources. Highly charged ions generated by the ECR ion source will be injected to a superconducting LINAC to accelerate them up to 200 MeV/u. During the acceleration of the heavy ions, a good vacuum system is required to avoid beam loss due to interaction with residual gases. Therefore ultra-high vacuum (UHV) is required to (i) limit beam losses, (ii) keep the radiation induced within safe levels, and (iii) prevent contamination of superconducting cavities by residual gas. In this work, a RAON vacuum design for all the accelerator system will be presented along with Monte Carlo simulation of vacuum levels in order to validate the vacuum hardware configuration, which is needed to meet the baseline requirements.

  • PDF

Comparative Performance Analysis of Network Security Accelerator based on Queuing System

  • Yun Yeonsang;Lee Seonyoung;Han Seonkyoung;Kim Youngdae;You Younggap
    • Proceedings of the IEEK Conference
    • /
    • summer
    • /
    • pp.269-273
    • /
    • 2004
  • This paper presents a comparative performance analysis of a network accelerator model based on M/M/l queuing system. It assumes the Poisson distribution as its input traffic load. The decoding delay is employed as a performance analysis measure. Simulation results based on the proposed model show only $15\%$ differences with respect to actual measurements on field traffic for BCM5820 accelerator device. The performance analysis model provides with reasonable hardware structure of network servers, and can be used to span design spaces statistically.

  • PDF

IPSec Accelerator Performance Analysis Model for Gbps VPN (기가급 VPN을 위한 IPSec 가속기 성능분석 모델)

  • 윤연상;류광현;박진섭;김용대;한선경;유영갑
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.14 no.4
    • /
    • pp.141-148
    • /
    • 2004
  • This paper proposes an IPSec accelerator performance analysis model based a queue model. It assumes Poison distribution as its input traffic load. The decoding delay is employed as a performance analysis measure. Simulation results based on the proposed model show around 15% differences with respect to actual measurements on field traffic for the BCM5820 accelerator device. The performance analysis model provides with reasonable hardware structure of network servers, and can be used to span design spaces statistically.

Implementation of FPGA for Efficient Ray Tracing Hardware Supporting Dynamic Scenes (동적 장면을 지원하는 효율적인 광선 추적 하드웨어에 대한 FPGA상에서의 구현)

  • Lee, Jin Young;Kim, Cheong Ghil;Park, Woo-Chan
    • Journal of the Semiconductor & Display Technology
    • /
    • v.21 no.4
    • /
    • pp.23-26
    • /
    • 2022
  • In this paper, our ray tracing hardware is implemented on the latest high-capacity FPGA board. The system included ray tracing hardware for rendering and tree building hardware for handling dynamic scenes. The FPGA board used in the implementation is a Xilinx Alveo U250 accelerator card for data centers. This included 12 ray tracing hardware cores and 1 tree-building hardware core. As a result of testing in various scenes in Full HD resolution, the FPS performance of the proposed ray tracing system was measured from 8 to 28. The overall average is about 17.7 FPS.

EPICS Based RF Control System for PAL Storage Ring (EPICS를 이용한 가속기 RF 제어시스템 개발)

  • Yoon, J.C.;Park, H.J.;Lee, J.Y.;Choi, J.Y.;Nam, S.Y.
    • Proceedings of the KIEE Conference
    • /
    • 2003.07d
    • /
    • pp.2239-2241
    • /
    • 2003
  • A new RF control system of Pohang Accelerator Laboratory (PAL) storage ring is a subsystem upgraded PAL control system, which is based upon Experimental Physics and Industrial Control System (EPICS). There are 5 control components, Low Level RF System (LRS), Klystron System, Circulator System, Cavity System, Local Cooling Water System (LCW) at the storage ring of PAL. The new RF control system for the storage ring has been under development for one years, first versions of individual VME (Versa Module Europa) Input/output modules under construction and system integration begun. In this system, VMEbus-based hardware is widely used for front-end controllers (FDS), Input/output controller (IOC). A number of Programmable Logic Controller (PLC) and SUN workstations are also used for Operator Interfaces (OPI) in the control system. This paper describes the development VME I/O module to the new control system and how the design of this new system.

  • PDF

Trends of Low-Precision Processing for AI Processor (NPU 반도체를 위한 저정밀도 데이터 타입 개발 동향)

  • Kim, H.J.;Han, J.H.;Kwon, Y.S.
    • Electronics and Telecommunications Trends
    • /
    • v.37 no.1
    • /
    • pp.53-62
    • /
    • 2022
  • With increasing size of transformer-based neural networks, a light-weight algorithm and efficient AI accelerator has been developed to train these huge networks in practical design time. In this article, we present a survey of state-of-the-art research on the low-precision computational algorithms especially for floating-point formats and their hardware accelerator. We describe the trends by focusing on the work of two leading research groups-IBM and Seoul National University-which have deep knowledge in both AI algorithm and hardware architecture. For the low-precision algorithm, we summarize two efficient floating-point formats (hybrid FP8 and radix-4 FP4) with accuracy-preserving algorithms for training on the main research stream. Moreover, we describe the AI processor architecture supporting the low-bit mixed precision computing unit including the integer engine.

Host Interface Design for TCP/IP Hardware Accelerator (TCP/IP Hardware Accelerator를 위한 Host Interface의 설계)

  • Jung, Yeo-Jin;Lim, Hye-Sook
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.2B
    • /
    • pp.1-10
    • /
    • 2005
  • TCP/IP protocols have been implemented in software program running on CPU in end systems. As the increased demand of fast protocol processing, it is required to implement the protocols in hardware, and Host Interface is responsible for communication between external CPU and the hardware blocks of TCP/IP implementation. The Host Interface follows AMBA AHB specification for the communication with external world. For control flow, the Host Interface behaves as a slave of AMBA AHB. Using internal Command/status Registers, the Host Interface receives commands from CPU and transfers hardware status and header information to CPU. On the other hand, the Host Interface behaves as a master for data flow. Data flow has two directions, Receive Flow and Transmit Flow. In Receive Flow, using internal RxFIFO, the Host Interface reads data from UDP FIFO or TCP buffer and transfers data to external RAM for CPU to read. For Transmit Flow, the Host Interface reads data from external RAM and transfers data to UDP buffer or TCP buffer through internal TxFIFO. TCP/IP hardware blocks generate packets using the data and transmit. Buffer Descriptor is one of the Command/Status Registers, and the information stored in Buffer Descriptor is used for external RAM access. Several testcases are designed to verify TCP/IP functions. The Host Interface is synthesized using the 0.18 micron technology, and it results in 173 K gates including the Command/status Registers and internal FIFOs.

Implementation of SDR-based LTE-A PDSCH Decoder for Supporting Multi-Antenna Using Multi-Core DSP (멀티코어 DSP를 이용한 다중 안테나를 지원하는 SDR 기반 LTE-A PDSCH 디코더 구현)

  • Na, Yong;Ahn, Heungseop;Choi, Seungwon
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.15 no.4
    • /
    • pp.85-92
    • /
    • 2019
  • This paper presents a SDR-based Long Term Evolution Advanced (LTE-A) Physical Downlink Shared Channel (PDSCH) decoder using a multicore Digital Signal Processor (DSP). For decoder implementation, multicore DSP TMS320C6670 is used, which provides various hardware accelerators such as turbo decoder, fast Fourier transformer and Bit Rate Coprocessors. The TMS320C6670 is a DSP specialized in implementing base station platforms and is not an optimized platform for implementing mobile terminal platform. Accordingly, in this paper, the hardware accelerator was changed to the terminal implementation to implement the LTE-A PDSCH decoder supporting the multi-antenna and the functions not provided by the hardware accelerator were implemented through core programming. Also pipeline using multicore was implemented to meet the transmission time interval. To confirm the feasibility of the proposed implementation, we verified the real-time decoding capability of the PDSCH decoder implemented using the LTE-A Reference Measurement Channel (RMC) waveform about transmission mode 2 and 3.

Design of Stand-alone AI Processor for Embedded System (독립운용이 가능한 임베디드 인공지능 프로세서 설계)

  • Cho, Kwon Neung;Choi, Do Young;Jeong, Young Woo;Lee, Seung Eun
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.600-602
    • /
    • 2021
  • With the development of the mobile industry and growing interest in artificial intelligence (AI) technology, a lot of research for AI processors which applicable to embedded systems is under study. When implementing AI to embedded systems, the design should be considered the restriction of resource and power consumption. Moreover, it is efficient to include a dedicated hardware accelerator in order to complement the low computational performance of the embedded system. In this paper, we propose an stand-alone embedded AI processor. The proposed AI processor includes a hardware accelerator that is dedicated to the distance-based AI algorithm and a general-purpose MCU that supports flexible programmability for application to various embedded systems. The AI processor was designed with Verilog HDL and verified by implementing on Field Programmable Gate Array (FPGA).

  • PDF

An Embedded FAST Hardware Accelerator for Image Feature Detection (영상 특징 추출을 위한 내장형 FAST 하드웨어 가속기)

  • Kim, Taek-Kyu
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.2
    • /
    • pp.28-34
    • /
    • 2012
  • Various feature extraction algorithms are widely applied to real-time image processing applications for extracting significant features from images. Feature extraction algorithms are mostly combined with image processing algorithms mostly for image tracking and recognition. Feature extraction function is used to supply feature information to the other image processing algorithms and it is mainly implemented in a preprocessing stage. Nowadays, image processing applications are faced with embedded system implementation for a real-time processing. In order to satisfy this requirement, it is necessary to reduce execution time so as to improve the performance. Reducing the time for executing a feature extraction function dose not only extend the execution time for the other image processing algorithms, but it also helps satisfy a real-time requirement. This paper explains FAST (Feature from Accelerated Segment Test algorithm) of E. Rosten and presents FPGA-based embedded hardware accelerator architecture. The proposed acceleration scheme can be implemented by using approximately 2,217 Flip Flops, 5,034 LUTs, 2,833 Slices, and 18 Block RAMs in the Xilinx Vertex IV FPGA. In the Modelsim - based simulation result, the proposed hardware accelerator takes 3.06 ms to extract 954 features from a image with $640{\times}480$ pixels and this result shows the cost effectiveness of the propose scheme.