• 제목/요약/키워드: Memory access

Search Result 1,138, Processing Time 0.029 seconds

Analysis of System Performance of Change the Ring Architecture on Dual Ring CC-NUMA System (이중 링 CC-NUMA 시스템에서 링 구조 변화에 따른 시스템 성능 분석)

  • Yun, Joo-Beom;Jhang, Seong-Tae;Jhon, Shik-Jhon
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.29 no.2
    • /
    • pp.105-115
    • /
    • 2002
  • Since NUMA architecture has to access remote memory an interconnection network determines the performance of CC-NUMA system Bus which has been used as a popular interconnection network has many limits to build a large-scale system because of the limited physical scalabilty and bandwidth Dual ring interconnection network composed of high speed point-to-point links is made up for resolving the defects of the bus for large-scale system But it also has a problem that the response latency is rapidly increased when many node are attached to snooping based CC-NUMA system with dual ring In this paper we propose a chordal ring architecture in order to overcome the problem of the dual ring on snooping based CC-NUMA system and design and efficient link controller adopted to this architecture. We also analyze the effects of chordal ring architecture on the system performance and the response latency by using probability driven simulator.

A Study on Architecture of Motion Compensator for H.264/AVC Encoder (H.264/AVC부호화기용 움직임 보상기의 아키텍처 연구)

  • Kim, Won-Sam;Sonh, Seung-Il;Kang, Min-Goo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.3
    • /
    • pp.527-533
    • /
    • 2008
  • Motion compensation always produces the principal bottleneck in the real-time high quality video applications. Therefore, a fast dedicated hardware is needed to perform motion compensation in the real-time video applications. In many video encoding methods, the frames are partitioned into blocks of Pixels. In general, motion compensation predicts present block by estimating the motion from previous frame. In motion compensation, the higher pixel accuracy shows the better performance but the computing complexity is increased. In this paper, we studied an architecture of motion compensator suitable for H.264/AVC encoder that supports quarter-pixel accuracy. The designed motion compensator increases the throughput using transpose array and 3 6-tap Luma filters and efficiently reduces the memory access. The motion compensator is described in VHDL and synthesized in Xilinx ISE and verified using Modelsim_6.1i. Our motion compensator uses 36-tap filters only and performs in 640 clock-cycle per macro block. The motion compensator proposed in this paper is suitable to the areas that require the real-time video processing.

A Low Cost IBM PC/AT Based Image Processing System for Satellite Image Analysis: A New Analytical Tool for the Resource Managers

  • Yang, Young-Kyu;Cho, Seong-Ik;Lee, Hyun-Woo;Miller, Lee-D.
    • Korean Journal of Remote Sensing
    • /
    • v.4 no.1
    • /
    • pp.31-40
    • /
    • 1988
  • Low-cost microcomputer systems can be assembled which possess computing power, color display, memory, and storage capacity approximately equal to graphic workstactions. A low-cost, flexible, and user-friendly IBM/PC/XT/AT based image processing system has been developed and named as KMIPS(KAIST (Korea Advanced Institute of Science & Technology) Map and Image Processing Station). It can be easily utilized by the resource managers who are not computer specialists. This system can: * directly access Landsat MSS and TM, SPOT, NOAA AVHRR, MOS-1 satellite imagery and other imagery from different sources via magnetic tape drive connected with IBM/PC; * extract image up to 1024 line by 1024 column and display it up to 480 line by 672 column with 512 colors simultaneously available; * digitize photographs using a frame grabber subsystem(512 by 512 picture elements); * perform a variety of image analyses, GIS and terrain analyses, and display functions; and * generate map and hard copies to the various scales. All raster data input to the microcomputer system is geographically referenced to the topographic map series in any rater cell size selected by the user. This map oriented, georeferenced approach of this system enables user to create a very accurately registered(.+-.1 picture element), multivariable, multitemporal data sets which can be subsequently subsequently subjected to various analyses and display functions.

Forecasting of the COVID-19 pandemic situation of Korea

  • Goo, Taewan;Apio, Catherine;Heo, Gyujin;Lee, Doeun;Lee, Jong Hyeok;Lim, Jisun;Han, Kyulhee;Park, Taesung
    • Genomics & Informatics
    • /
    • v.19 no.1
    • /
    • pp.11.1-11.8
    • /
    • 2021
  • For the novel coronavirus disease 2019 (COVID-19), predictive modeling, in the literature, uses broadly susceptible exposed infected recoverd (SEIR)/SIR, agent-based, curve-fitting models. Governments and legislative bodies rely on insights from prediction models to suggest new policies and to assess the effectiveness of enforced policies. Therefore, access to accurate outbreak prediction models is essential to obtain insights into the likely spread and consequences of infectious diseases. The objective of this study is to predict the future COVID-19 situation of Korea. Here, we employed 5 models for this analysis; SEIR, local linear regression (LLR), negative binomial (NB) regression, segment Poisson, deep-learning based long short-term memory models (LSTM) and tree based gradient boosting machine (GBM). After prediction, model performance comparison was evelauated using relative mean squared errors (RMSE) for two sets of train (January 20, 2020-December 31, 2020 and January 20, 2020-January 31, 2021) and testing data (January 1, 2021-February 28, 2021 and February 1, 2021-February 28, 2021) . Except for segmented Poisson model, the other models predicted a decline in the daily confirmed cases in the country for the coming future. RMSE values' comparison showed that LLR, GBM, SEIR, NB, and LSTM respectively, performed well in the forecasting of the pandemic situation of the country. A good understanding of the epidemic dynamics would greatly enhance the control and prevention of COVID-19 and other infectious diseases. Therefore, with increasing daily confirmed cases since this year, these results could help in the pandemic response by informing decisions about planning, resource allocation, and decision concerning social distancing policies.

Efficient Intra Predictor Design for H.264/AVC Decoder (H.264/AVC 복호기를 위한 효율적인 인트라 예측기 설계)

  • Kim, Ok;Ryoo, Kwangki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2009.10a
    • /
    • pp.175-178
    • /
    • 2009
  • H.264/AVC is a video coding standard of ITU-T and ISO/IEC, and widely spreads its application due to its high compression ratio more than twice that of MPEG-2 and high image quality. In this paper, we explained Intra Prediction in H.264/AVC, which is able to achieve higher compressing efficiency from correlation removal of adjacent samples in spatial domain, and proposed efficient Intra Predictor architecture design for H.264/AVC decoder. The proposed system reduced computation cycle using processing element and precomputation processing element and also reduced the number of access to external memory using efficient register. We designed the proposed system with Verilog-HDL and verified with suitable test vector. The proposed Intra Predictor achieved about 60% cycle reduction comparing with existing Intra Predictors.

  • PDF

Design of AES-Based Encryption Chip for IoT Security (IoT 보안을 위한 AES 기반의 암호화칩 설계)

  • Kang, Min-Sup
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.21 no.1
    • /
    • pp.1-6
    • /
    • 2021
  • The paper proposes the design of AES-based encryption chip for IoT security. ROM based S-Box implementation occurs a number of memory space and some delay problems for its access. In this approach, S-Box is designed by pipeline structure on composite field GF((22)2) to get faster calculation results. In addition, in order to achieve both higher throughput and less delay, shared S-Box are used in each round transformation and the key scheduling process. The proposed AES crypto-processor is described in Veilog-HDL, and Xilinx ISE 14.7 tool is used for logic synthesis by using Xilinx XC6VLX75T FPGA. In order to perform the verification of the crypto-processor, the timing simulator(ModelSim 10.3) is also used.

Authentication Protocol Using Hamming Distance for Mobile Ad-hoc Network (모바일 Ad-hoc 네트워크에서 Hamming Distance를 이용한 인증프로토콜)

  • Lee, Seok-Lae;Song, Joo-Seok
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.16 no.5
    • /
    • pp.47-57
    • /
    • 2006
  • Mobile Ad-hoc networks have various implementation constraints such as infrastructure-free, no trusted authority, node mobility, and the limited power and small memory of mobile device. And just like wired networks, various security issues such as authentication, confidentiality, integrity, non-repudiation, access control, availability and so on have been arisen in mobile Ad-hoc networks. But we focus on authentication of these security issues because it is quitely affected by the characteristics of networks. In this paper, we propose the authentication protocol that can limit the size of certificate repository as $log_2N$ and assures to make a trusted certificate path from one node to another, adopting the concept of Hamming distance. Particularly, our protocol can construct a trusted certificate path in spite of decreasing or increasing the number of nodes in mobile Ad-hoc network.

Development of Commercial Game Engine-based Low Cost Driving Simulator for Researches on Autonomous Driving Artificial Intelligent Algorithms (자율주행 인공지능 알고리즘 연구를 위한 상용 게임 엔진 기반 초저가 드라이빙 시뮬레이터 개발)

  • Im, Ji Ung;Kang, Min Su;Park, Dong Hyuk;Won, Jong hoon
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.20 no.6
    • /
    • pp.242-263
    • /
    • 2021
  • This paper presents a method to implement a low-cost driving simulator for developing autonomous driving algorithms. This is implemented by using GTA V, a physical engine-based commercial game software, containing a function to emulate output and data of various sensors for autonomous driving. For this, NF of Script Hook V is incorporated to acquire GT data by accessing internal data of the software engine, and then, various sensor data for autonomous driving are generated. We present an overall function of the developed driving simulator and perform a verification of individual functions. We explain the process of acquiring GT data via direct access to the internal memory of the game engine to build up an autonomous driving algorithm development environment. And, finally, an example applicable to artificial neural network training and performance evaluation by processing the emulated sensor output is included.

Implementation of FPGA-based Accelerator for GRU Inference with Structured Compression (구조적 압축을 통한 FPGA 기반 GRU 추론 가속기 설계)

  • Chae, Byeong-Cheol
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.6
    • /
    • pp.850-858
    • /
    • 2022
  • To deploy Gate Recurrent Units (GRU) on resource-constrained embedded devices, this paper presents a reconfigurable FPGA-based GRU accelerator that enables structured compression. Firstly, a dense GRU model is significantly reduced in size by hybrid quantization and structured top-k pruning. Secondly, the energy consumption on external memory access is greatly reduced by the proposed reuse computing pattern. Finally, the accelerator can handle a structured sparse model that benefits from the algorithm-hardware co-design workflows. Moreover, inference tasks can be flexibly performed using all functional dimensions, sequence length, and number of layers. Implemented on the Intel DE1-SoC FPGA, the proposed accelerator achieves 45.01 GOPs in a structured sparse GRU network without batching. Compared to the implementation of CPU and GPU, low-cost FPGA accelerator achieves 57 and 30x improvements in latency, 300 and 23.44x improvements in energy efficiency, respectively. Thus, the proposed accelerator is utilized as an early study of real-time embedded applications, demonstrating the potential for further development in the future.

Microcode based Controller for Compact CNN Accelerators Aimed at Mobile Devices (모바일 디바이스를 위한 소형 CNN 가속기의 마이크로코드 기반 컨트롤러)

  • Na, Yong-Seok;Son, Hyun-Wook;Kim, Hyung-Won
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.3
    • /
    • pp.355-366
    • /
    • 2022
  • This paper proposes a microcode-based neural network accelerator controller for artificial intelligence accelerators that can be reconstructed using a programmable architecture and provide the advantages of low-power and ultra-small chip size. In order for the target accelerator to support various neural network models, the neural network model can be converted into microcode through microcode compiler and mounted on accelerator to control the operators of the accelerator such as datapath and memory access. While the proposed controller and accelerator can run various CNN models, in this paper, we tested them using the YOLOv2-Tiny CNN model. Using a system clock of 200 MHz, the Controller and accelerator achieved an inference time of 137.9 ms/image for VOC 2012 dataset to detect object, 99.5ms/image for mask detection dataset to detect wearing mask. When implementing an accelerator equipped with the proposed controller as a silicon chip, the gate count is 618,388, which corresponds to 65.5% reduction in chip area compared with an accelerator employing a CPU-based controller (RISC-V).