• Title/Summary/Keyword: hardware efficiency (하드웨어 효율)

Low-Power Metamorphic MCU using Partial Firmware Update Method for Irregular Target Systems Control (불규칙한 대상 시스템 제어를 위하여 부분 펌웨어 업데이트 기법을 이용한 저전력 변성적 MCU)

  • Baek, Jongheon;Jung, Jiwoong;Kim, Minsung;Kwon, Jisu;Park, Daejin
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.2 / pp.301-307 / 2021
  • With the revival of the Internet of Things, embedded systems, which are at its core, require intelligent control that adapts as conditions change. Embedded systems, however, are heavily constrained by resources such as hardware, memory, time, and power. When the firmware of an embedded system needs to change, the flash memory must be initialized and the entire firmware uploaded again. This is inefficient in both time and energy, because areas that do not need modification are also initialized and rewritten. In this paper, we propose a method that uploads firmware in installments to each sector of flash memory, so that only the parts that need to be modified are replaced. The proposed method was evaluated on a real target board, and the update time was reduced by about half.
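
The sector-wise idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 4096-byte sector size and the helper name `changed_sectors` are assumptions chosen for illustration. The function compares the old and new firmware images sector by sector and reports which sectors would need to be erased and rewritten.

```python
SECTOR_SIZE = 4096  # assumed sector size in bytes; real flash parts differ


def changed_sectors(old_image, new_image, sector_size=SECTOR_SIZE):
    """Return the indices of sectors whose contents differ between two images.

    Hypothetical helper: only the returned sectors would be erased and
    rewritten, instead of reflashing the whole image.
    """
    length = max(len(old_image), len(new_image))
    changed = []
    for start in range(0, length, sector_size):
        old_chunk = old_image[start:start + sector_size]
        new_chunk = new_image[start:start + sector_size]
        if old_chunk != new_chunk:
            changed.append(start // sector_size)
    return changed


if __name__ == "__main__":
    old = bytes(4 * SECTOR_SIZE)          # four blank sectors
    new = bytearray(old)
    new[2 * SECTOR_SIZE + 10] = 0xFF      # modify one byte in sector 2
    print(changed_sectors(old, bytes(new)))
```

If only a small part of the image changes, the list is short, which is where the time and energy savings described in the abstract would come from.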

Implementation of FPGA-based Accelerator for GRU Inference with Structured Compression (구조적 압축을 통한 FPGA 기반 GRU 추론 가속기 설계)

  • Chae, Byeong-Cheol
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.6 / pp.850-858 / 2022
  • To deploy Gated Recurrent Units (GRU) on resource-constrained embedded devices, this paper presents a reconfigurable FPGA-based GRU accelerator that supports structured compression. First, a dense GRU model is significantly reduced in size by hybrid quantization and structured top-k pruning. Second, the energy spent on external memory access is greatly reduced by the proposed reuse computing pattern. Finally, the accelerator can handle a structured sparse model that benefits from the algorithm-hardware co-design workflow. Moreover, inference tasks can be performed flexibly across feature dimensions, sequence lengths, and numbers of layers. Implemented on the Intel DE1-SoC FPGA, the proposed accelerator achieves 45.01 GOPS on a structured sparse GRU network without batching. Compared to CPU and GPU implementations, the low-cost FPGA accelerator achieves 57x and 30x improvements in latency and 300x and 23.44x improvements in energy efficiency, respectively. The proposed accelerator thus serves as an early study of real-time embedded applications and demonstrates potential for further development.
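
As a rough illustration of what top-k pruning can look like, the sketch below keeps the k largest-magnitude weights in each row of a matrix and zeroes the rest, so every row ends up with the same number of nonzeros. The abstract does not specify the paper's pruning granularity, so the per-row rule and the name `topk_prune_rows` are assumptions.

```python
def topk_prune_rows(matrix, k):
    """Keep the k largest-magnitude entries in each row and zero the others.

    Assumed granularity: one row at a time. Fixing the nonzero count per row
    gives a regular sparsity pattern, which is the kind of structure that
    hardware can exploit.
    """
    pruned = []
    for row in matrix:
        # Indices of the k entries with the largest absolute value.
        order = sorted(range(len(row)), key=lambda i: abs(row[i]), reverse=True)
        keep = set(order[:k])
        pruned.append([v if i in keep else 0.0 for i, v in enumerate(row)])
    return pruned


if __name__ == "__main__":
    weights = [[0.1, -0.9, 0.5, 0.2],
               [0.7, 0.0, -0.3, 0.4]]
    print(topk_prune_rows(weights, 2))
```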

Server State-Based Weighted Load Balancing Techniques in SDN Environments (SDN 환경에서 서버 상태 기반 가중치 부하분산 기법)

  • Kyoung-Han, Lee;Tea-Wook, Kwon
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.6 / pp.1039-1046 / 2022
  • After the COVID-19 pandemic, the spread of contactless ("untact") culture and the Fourth Industrial Revolution, which generates many types of data, produced an unprecedented volume of data. This raised data throughput and gradually exposed the limitations of existing network systems centered on vendors and hardware. Recently, SDN technology, which is centered on users and software and can overcome these limitations, has been attracting attention. In addition, SDN-based load balancing techniques are expected to increase the efficiency of load balancing for server clusters in data centers, which generate and process vast and diverse data. Unlike existing SDN load distribution studies, this paper proposes a technique in which the controller checks the state of a server when an event occurs, rather than through periodic monitoring, and allocates user requests with weights based on the load ratio. In the experiments, the proposed technique balanced load more evenly than the comparison technique, so it is expected to be more effective for server clusters in large data centers with heavy packet flow.
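
The weighting step described in this abstract can be sketched as follows. The abstract does not give the paper's weight formula, so this example assumes a simple rule: each server's weight is proportional to its spare capacity (1 minus its load ratio). The names `load_weights` and `pick_server` are hypothetical.

```python
import random


def load_weights(loads):
    """Turn per-server load ratios (0.0 to 1.0) into normalized weights.

    Assumed rule: weight is proportional to headroom, so lightly loaded
    servers receive a larger share of new requests.
    """
    headroom = {name: max(0.0, 1.0 - load) for name, load in loads.items()}
    total = sum(headroom.values())
    if total == 0:
        # Every server is saturated: fall back to equal weights.
        return {name: 1.0 / len(loads) for name in loads}
    return {name: h / total for name, h in headroom.items()}


def pick_server(loads, rng=random):
    """Choose a server for the next request by weighted random selection."""
    weights = load_weights(loads)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    # Load ratios as they might be reported when a server event fires.
    print(load_weights({"s1": 0.2, "s2": 0.6}))
    print(pick_server({"s1": 0.2, "s2": 0.6}))
```

In an event-driven design, `load_weights` would be recomputed only when a server reports a state change, rather than on a fixed monitoring period.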

Energy-aware Dynamic Frequency Scaling Algorithm for Polling based Communication Systems (폴링기반 통신 시스템을 위한 에너지 인지적인 동적 주파수 조절 알고리즘)

  • Cho, Mingi;Park, Daejin
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.9 / pp.1405-1411 / 2022
  • Power management remains an important issue in embedded environments even as hardware such as high-performance processors advances. Power management methods such as DVFS adaptively control CPU frequency to manage power efficiently in polling-based I/O programs such as network communication. This paper identifies the problems of the existing power management method and proposes a new one. The proposed method reduces power consumption by lengthening the polling cycle when data arrives infrequently, and operates at the maximum frequency without performance degradation when data arrives frequently. The method was implemented as a code layer on an embedded board and measured with Atmel's Power Debugger; it reduced energy consumption by up to 30% compared with the existing power management method.
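
One way to picture the adaptive polling behavior is a simple backoff rule, sketched below. This is an illustrative assumption and not the algorithm from the paper: the interval doubles after an empty poll, up to a maximum, and resets to the minimum as soon as data arrives. The name `next_interval` and the millisecond bounds are hypothetical.

```python
def next_interval(current_ms, data_received, min_ms=5, max_ms=40):
    """Return the next polling interval in milliseconds.

    Assumed rule: reset to the shortest interval when data was received
    (traffic is frequent), otherwise double the interval up to max_ms
    (traffic is sparse, so the CPU can idle longer between polls).
    """
    if data_received:
        return min_ms
    return min(current_ms * 2, max_ms)


if __name__ == "__main__":
    interval = 5
    # Simulated poll results: three empty polls, then data arrives.
    for received in [False, False, False, True]:
        interval = next_interval(interval, received)
        print(interval)
```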

Efficient Memory Update Module for Video Object Segmentation (동영상 물체 분할을 위한 효율적인 메모리 업데이트 모듈)

  • Jo, Junho;Cho, Nam Ik
    • Journal of Broadcast Engineering / v.27 no.4 / pp.561-568 / 2022
  • Most deep learning-based video object segmentation methods perform segmentation using past prediction information stored in external memory. In general, the more past information is stored in the memory, the better the results, because evidence accumulates for various changes in the objects of interest. However, not all information can be stored in the memory due to hardware limitations, which degrades performance. In this paper, we propose a method of storing new information in the external memory without additional memory allocation. Specifically, after calculating the attention score between the existing memory and the information to be newly stored, new information is added to the corresponding memory according to each score. In this way, the method works robustly because the attention mechanism reflects object changes well without using additional memory. In addition, the update rate is determined adaptively from the accumulated number of matches in the memory, so that frequently updated samples store more information and the memory remains reliable.

Analysis and Management Policies for Memory Thrashing of Swap-Enabled Smartphones (스왑 지원 스마트폰의 메모리 쓰레싱 분석 및 관리 방안)

  • Hyokyung Bahn;Jisun Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.2 / pp.61-66 / 2023
  • As the use of smartphones expands to various areas and the level of multitasking increases, the support of swap is becoming increasingly important. However, swap support in smartphones is known to cause excessive storage traffic, resulting in memory thrashing. In this paper, we analyze how the thrashing of swaps that occurred in early smartphones has changed with the advancement of smartphone hardware. As a result of this analysis, we show that the swap thrashing problem can be resolved to some extent when the memory size increases. However, we also show that thrashing still occurs when the number of running apps continues to increase. Based on further analysis, we observe that this thrashing is caused by some hot data and suggest a way to solve this through an NVM-based architecture. Specifically, we show that a small size NVM with judicious management can resolve the performance degradation caused by smartphone swap.

Hierarchical IoT Edge Resource Allocation and Management Techniques based on Synthetic Neural Networks in Distributed AIoT Environments (분산 AIoT 환경에서 합성곱신경망 기반 계층적 IoT Edge 자원 할당 및 관리 기법)

  • Yoon-Su Jeong
    • Advanced Industrial SCIence / v.2 no.3 / pp.8-14 / 2023
  • The majority of IoT devices already employ AIoT; however, numerous issues must still be resolved before AI applications can be deployed. To distribute IoT edge resources more effectively, this paper proposes a machine learning-based approach to managing them. The suggested method continually improves the allocation of IoT resources by using machine learning to identify IoT edge resource trends. The optimized IoT resources use machine learning convolution to reliably sustain IoT edge resources that are always changing. By storing each machine learning-based IoT edge resource as a hash value alongside the resource of the previous pattern, the suggested approach effectively verifies resources against attack patterns in a distributed AIoT context. The experiments evaluate energy efficiency in three test scenarios and verify the integrity of IoT edge resources, checking whether the approach works well in complex environments with heterogeneous computational hardware.

Analysis of Latency and Computation Cost for AES-based Whitebox Cryptography Technique (AES 기반 화이트박스 암호 기법의 지연 시간과 연산량 분석)

  • Lee, Jin-min;Kim, So-yeon;Lee, Il-Gu
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.115-117 / 2022
  • The whitebox encryption technique prevents exposure of encryption keys by mixing key information into a software-based encryption algorithm. It is attracting attention as a replacement for conventional hardware-based security encryption techniques, because it makes it difficult to infer confidential data and keys by accessing memory through unauthorized reverse engineering. However, the encryption and decryption process uses large lookup tables to hide computational results and encryption keys, which slows encryption and increases memory size. In particular, it is difficult to apply whitebox cryptography to low-cost, low-power, lightweight Internet of Things products because of their limited memory space and battery capacity. In addition, in network environments that require real-time service support, the response delay increases because of the encryption and decryption speed of whitebox encryption, which degrades communication efficiency. Therefore, in this paper, we analyze, based on experimental results, whether the AES-based whitebox (WBC-AES) proposed by S. Chow can satisfy the speed and memory requirements.

Design of Reconfigurable Processor for Information Security System (정보보호 시스템을 위한 재구성형 프로세서 설계)

  • Cha, Jeong-Woo;Kim, Il-Hyu;Kim, Chang-Hoon;Kim, Dong-Hwi
    • Annual Conference of KIPS / 2011.04a / pp.113-116 / 2011
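  • With the recent rapid advances in IT, diverse information such as personal and environmental data is continuously collected and managed, and information services are delivered immediately whenever users request them. However, data transmitted over wired and wireless links is easily exposed to external attacks such as eavesdropping, message forgery, tampering and replay, and DoS (Denial of Service). Because such attacks can cause critical losses across the entire information service system, including personal privacy, the need for information security systems is becoming increasingly important. To date, information security systems have been implemented using software (S/W), hardware (ASIC), and FPGA (Field Programmable Gate Array) devices; each implementation approach has several problems, and corresponding solutions have been proposed. This paper proposes a reconfigurable SoC architecture for providing information security services in diverse environments. The proposed SoC can run a secret-key cipher (AES), a cryptographic hash (SHA-256), and a public-key cipher (ECC), and is controlled by a master controller. To satisfy the various constraints that information security systems impose (speed, area, security, and flexibility), the design makes the most of the advantages of S/W, ASIC, and FPGA devices, and an I/O interface is proposed for efficient communication with the MCU. As a result, the proposed system supports a wider range of security algorithms than existing systems and improves the trade-off between speed and area, so it is applicable to low-cost applications as well as high-speed communication equipment.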

Design of Stand-alone AI Processor for Embedded System (독립운용이 가능한 임베디드 인공지능 프로세서 설계)

  • Cho, Kwon Neung;Choi, Do Young;Jeong, Young Woo;Lee, Seung Eun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.600-602 / 2021
  • With the development of the mobile industry and growing interest in artificial intelligence (AI) technology, much research is under way on AI processors applicable to embedded systems. When implementing AI on embedded systems, the design must account for constraints on resources and power consumption. Moreover, it is efficient to include a dedicated hardware accelerator to complement the low computational performance of the embedded system. In this paper, we propose a stand-alone embedded AI processor. The proposed AI processor includes a hardware accelerator dedicated to a distance-based AI algorithm and a general-purpose MCU that supports flexible programmability for application to various embedded systems. The AI processor was designed in Verilog HDL and verified by implementing it on a Field Programmable Gate Array (FPGA).
