• Title/Summary/Keyword: 엣지 디바이스 (edge device)

Search Result 45, Processing Time 0.025 seconds

Radix-2 Booth-based Variable Precision Multiplier for Lightweight CNN Accelerators (경량 CNN 가속기를 위한 Radix-2 Booth 기반 가변 정밀도 곱셈기)

  • Guem, Duck-Hyun;Jeon, Seung-Jin;Choi, Jae-Young;Kim, Ji-Hyeok;Kim, Sunhee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.494-496
    • /
    • 2022
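  • Research on lightweight CNNs is under way to enable deep learning on edge devices. Lightweight CNNs mostly use fixed-point arithmetic, and the precision varies from layer to layer. To support lightweight CNNs, this paper proposes a variable-precision multiplier whose precision can be selected according to the layer in use. The proposed multiplier is built by merging lower-precision multipliers, and at low precision it raises efficiency through parallel processing. The multiplier was designed in Verilog HDL and its operation was verified in ModelSim. The design is expected to be applied efficiently in CNN accelerators whose precision differs by layer.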
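The idea in the abstract above can be illustrated with a minimal software sketch. It assumes signed two's-complement operands, an 8-bit word, and two 4-bit lanes in low-precision mode; these widths and the function names `booth_radix2_multiply`, `to_signed`, and `variable_precision_multiply` are hypothetical and do not come from the paper. It models only the arithmetic behavior, not the merged hardware structure or the Verilog design.

```python
def booth_radix2_multiply(a: int, b: int, bits: int) -> int:
    """Multiply two signed `bits`-wide integers using radix-2 Booth recoding."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError("operand does not fit in the requested width")
    b_pattern = b & ((1 << bits) - 1)  # two's-complement bit pattern of b
    product = 0
    prev_bit = 0  # implicit bit to the right of the LSB
    for i in range(bits):
        cur_bit = (b_pattern >> i) & 1
        # Booth digit is +1, 0, or -1 depending on the bit pair (cur, prev).
        digit = prev_bit - cur_bit
        product += digit * (a << i)
        prev_bit = cur_bit
    return product


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def variable_precision_multiply(a_word: int, b_word: int, precision: int) -> list:
    """Treat 8-bit words as one 8-bit operand or as two packed 4-bit lanes.

    Assumed behavior for illustration: in 4-bit mode the two lane products
    are independent, which is what would allow them to run in parallel.
    """
    if precision == 8:
        return [booth_radix2_multiply(to_signed(a_word, 8), to_signed(b_word, 8), 8)]
    if precision == 4:
        return [
            booth_radix2_multiply(to_signed(a_word >> s, 4), to_signed(b_word >> s, 4), 4)
            for s in (0, 4)  # low lane first, then high lane
        ]
    raise ValueError("precision must be 4 or 8 in this sketch")
```

In this sketch, the same pair of 8-bit words yields one product in 8-bit mode and two lane products in 4-bit mode, which is the kind of throughput gain at low precision that the abstract describes.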

SystemC-based CNN Simulator (SystemC기반 CNN 시뮬레이터 구현)

  • Kim, Jinyoung;Lee, Seungsu;Kim, Yejun;Lim, Seung-Ho;Cho, Sang-Young
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.11a
    • /
    • pp.30-33
    • /
    • 2020
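  • Recently, hardware design and implementation have been actively pursued to run deep learning modules such as CNNs on embedded devices, for example in edge computing. Designing hardware for the CNN modules that such embedded systems need first requires simulation through modeling. In this paper, we built a deep learning simulator using RISC-V, which is available under an open license. A RISC-V implemented in SystemC was used as a virtual platform to build the simulator and run simulations, and the system was organized with attention to modularity and inter-module communication, which are characteristic of SystemC. Following the CNN algorithm, we configured a system that performs the Convolution, Activation, and Pooling operations.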
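The three operations named in the abstract above can be sketched functionally in a few lines. This is a plain-Python reference model, not the SystemC modules from the paper; ReLU as the activation, a stride of 1 with no padding for the convolution, and 2x2 max pooling are all assumptions made for illustration.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation, as used in most CNNs."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise ReLU activation (assumed activation function)."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a square window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


image = [[1, 0, 2], [3, 1, 0], [0, 2, 4]]
kernel = [[1, -1], [0, 1]]
pooled = max_pool2d(relu(conv2d(image, kernel)))  # [[5]]
```

A reference model like this is typically useful for checking the outputs of a hardware-level simulation stage by stage.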

Compression of Super-Resolution model Using Contrastive Learning (대조 학습 기반 초해상도 모델 경량화 기법)

  • Moon, HyeonCheol;Kwon, Yong-Hoon;Jeong, JinWoo;Kim, SungJei
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.06a
    • /
    • pp.1322-1324
    • /
    • 2022
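  • With recent advances in deep learning, strong results have been achieved in single-image super-resolution. However, network depth and parameter counts have grown substantially in pursuit of higher performance, so the need for deep learning model compression is emerging so that models can run smoothly on mobile and edge devices. This paper therefore proposes a compression technique that applies contrastive-learning-based knowledge transfer to EDSR (Enhanced Deep Residual Network), one of the super-resolution models. Experiments confirmed that the proposed knowledge transfer technique outperforms other existing knowledge distillation techniques.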

Cloud-based smart maritime logistics warehouse management system with IP cameras (IP 카메라와 클라우드 기반 스마트 해상물류 창고 관리 시스템)

  • Kang-Hyeon Ryu;Dae-Hoon Kang;Dong-Min Kim;Min-Ho Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.1082-1083
    • /
    • 2023
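  • Most of Korea's imports and exports move by sea, yet port logistics warehouses lack organic, data-network-based management of cargo entry, exit, and status. This is largely due to insufficient data network infrastructure and the limits of existing systems that rely on analog video from CCTV. We therefore build a foundation for digital status analysis of individual cargo warehouses through video analysis with IP cameras and edge devices, and build a centralized data analysis system in the cloud for the data from the distributed warehouses, providing flexible management of individual warehouses and a basis for continuous monitoring. The user interface is web-based, giving port cargo personnel convenience and location-independent service. In the process, an in-port Internet data network is built over a private IoT network at minimal construction cost, laying the groundwork for a variety of future data services within ports.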

Performance Comparison of Task Partitioning Methods in MEC System (MEC 시스템에서 태스크 파티셔닝 기법의 성능 비교)

  • Moon, Sungwon;Lim, Yujin
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.5
    • /
    • pp.139-146
    • /
    • 2022
  • With the recent development of the Internet of Things (IoT) and the convergence of vehicles and IT technologies, high-performance applications such as autonomous driving are emerging, and multi-access edge computing (MEC) has attracted considerable attention as a next-generation technology. To serve these computation-intensive tasks with low latency, many methods have been proposed to partition tasks so that they can be performed through the cooperation of multiple MEC servers (MECSs). Conventional task partitioning approaches either partition tasks on vehicles, which act as mobile devices, and offload them to multiple MECSs, or offload them from vehicles to MECSs and then partition and migrate them to other MECSs. In this paper, the performance of task partitioning methods using offloading and migration is compared and analyzed in terms of service delay, blocking rate, and energy consumption, according to the method of selecting partitioning targets and the number of partitions. As the number of partitions increases, the service delay improves, but the blocking rate and energy consumption get worse.

Design and Implementation of Optimal Smart Home Control System (최적의 스마트 홈 제어 시스템 설계 및 구현)

  • Lee, Hyoung-Ro;Lin, Chi-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.18 no.1
    • /
    • pp.135-141
    • /
    • 2018
  • In this paper, we describe the design and implementation of an optimal smart home control system. Recent developments in technologies such as sensors and communication have enabled the Internet of Things (IoT) to control a wide range of objects, such as light bulbs, socket outlets, and clothing. Many businesses rely on launching collaborative services with one another. However, traditional IoT systems often support only a single protocol, although end-to-end devices transmit data across multiple protocols. In addition, each IoT manufacturer provides its own dedicated application, which makes registering and controlling different IoT devices highly complex. In the ARIoT system, special marking points and edge extraction techniques are used to detect objects, but there are relatively low deviations depending on the sampling data. The proposed system implements an IoT gateway for objects based on OneM2M to compensate for these problems. It supports the diverse protocols of end-to-end devices through a single application. In addition, devices were learned using deep learning, and the object recognition of existing systems was improved through inference and detection, reducing the deviation in recognition rates.

Development and evaluation of edge devices for injection molding monitoring (사출성형공정 모니터링용 엣지 디바이스 개발 및 평가)

  • Kim, Jong-Sun;Lee, Jun-Han
    • Design & Manufacturing
    • /
    • v.14 no.4
    • /
    • pp.25-39
    • /
    • 2020
  • In this study, an edge device that monitors the injection molding process by measuring the mold vibration (acceleration) signal and the mold surface temperature was developed and its performance was evaluated. During injection molding, signals for the injection start, V/P switchover, and packing end sections were obtained by measuring the mold vibration, and the injection time and packing time were calculated from the differences between the times of these sections. The mold closed and mold open signals were then obtained using a magnetic Hall sensor, and the cycle time was calculated from the time difference between the mold closed times of each process. Performance was evaluated by comparing the process data monitored by the edge device with the shot data recorded on the injection molding machine: the cycle time, injection time, and packing time showed very small errors of 0.70±0.38%, 1.40±1.17%, and 0.69±0.82%, respectively, so the monitored values were close to the actual values, confirming the accuracy and reliability of the edge device. In addition, it was confirmed that the mold surface temperature measured by the edge device was similar to the actual mold surface temperature.
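The timing calculations described in the abstract above can be sketched as simple timestamp differences. The sketch assumes that injection time runs from injection start to V/P switchover and packing time from V/P switchover to packing end, which is one reading of "the difference between the times of the sections". The `ShotEvents` type, the function names, and the sample timestamps are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ShotEvents:
    """Event timestamps in seconds for one molding cycle (hypothetical layout)."""
    mold_closed: float      # from the magnetic Hall sensor
    injection_start: float  # from the mold vibration signal
    vp_switchover: float    # from the mold vibration signal
    packing_end: float      # from the mold vibration signal


def injection_time(shot: ShotEvents) -> float:
    # Assumed definition: injection start to V/P switchover.
    return shot.vp_switchover - shot.injection_start


def packing_time(shot: ShotEvents) -> float:
    # Assumed definition: V/P switchover to packing end.
    return shot.packing_end - shot.vp_switchover


def cycle_times(shots: list) -> list:
    # Cycle time as the gap between consecutive mold-closed events.
    return [b.mold_closed - a.mold_closed for a, b in zip(shots, shots[1:])]


def percent_error(measured: float, reference: float) -> float:
    # Relative error against the value recorded by the molding machine.
    return abs(measured - reference) / reference * 100.0


# Made-up timestamps for two consecutive shots, for illustration only.
shots = [
    ShotEvents(mold_closed=0.0, injection_start=1.0, vp_switchover=2.5, packing_end=6.5),
    ShotEvents(mold_closed=20.0, injection_start=21.0, vp_switchover=22.5, packing_end=26.5),
]
```

Comparing each computed value with the machine's shot data through `percent_error` mirrors the evaluation described in the abstract, although the paper's exact error definition is not given here.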

TPMP: A Privacy-Preserving Technique for DNN Prediction Using ARM TrustZone (TPMP : ARM TrustZone을 활용한 DNN 추론 과정의 기밀성 보장 기술)

  • Song, Suhyeon;Park, Seonghwan;Kwon, Donghyun
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.3
    • /
    • pp.487-499
    • /
    • 2022
  • Machine learning techniques such as deep learning have been widely used in recent years. Recently, deep learning has been performed in a trusted execution environment (TEE) such as ARM TrustZone to improve security on edge and embedded devices with limited computing resources. To mitigate this problem, we propose TPMP, which efficiently uses the limited memory of the TEE through DNN model partitioning. TPMP achieves high confidentiality for DNNs by using optimized memory scheduling to run, inside the TEE, DNN models that could not be run with existing memory scheduling methods. TPMP requires a similar amount of computational resources to previous methodologies.
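As a rough illustration of model partitioning under a memory limit, the sketch below groups consecutive layers greedily so that each group fits a fixed budget. This is a generic greedy scheme chosen for illustration and is not TPMP's actual memory scheduling, which the abstract does not detail; the function name `partition_layers`, the per-layer sizes, and the budget are all hypothetical.

```python
def partition_layers(layer_sizes, budget):
    """Group consecutive layer indices so each group's total size fits the budget.

    layer_sizes: memory needed per layer, in arbitrary units (hypothetical values).
    budget: memory available inside the secure world, in the same units.
    """
    partitions, current, used = [], [], 0
    for index, size in enumerate(layer_sizes):
        if size > budget:
            raise ValueError(f"layer {index} alone exceeds the memory budget")
        if used + size > budget:
            # Close the current partition and start a new one.
            partitions.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        partitions.append(current)
    return partitions


# Five layers with made-up sizes and a budget of 8 units.
plan = partition_layers([4, 3, 5, 2, 6], budget=8)  # [[0, 1], [2, 3], [4]]
```

Each partition could then be loaded, executed, and released in turn, so that a model larger than the secure memory can still run inside it.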

A Performance Comparison of Parallel Programming Models on Edge Devices (엣지 디바이스에서의 병렬 프로그래밍 모델 성능 비교 연구)

  • Dukyun Nam
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.18 no.4
    • /
    • pp.165-172
    • /
    • 2023
  • Heterogeneous computing is a technology that utilizes different types of processors to perform parallel processing. It maximizes task processing and energy efficiency by leveraging various computing resources such as CPUs, GPUs, and FPGAs. Edge computing, meanwhile, has developed alongside IoT and 5G technologies. It is a form of distributed computing that utilizes computing resources close to clients, thereby offloading the central server. It has evolved into intelligent edge computing combined with artificial intelligence. Intelligent edge computing enables total data processing, such as context awareness, prediction, control, and simple processing, for the data collected at the edge. If heterogeneous computing can be successfully applied at the edge, it is expected to maximize job processing efficiency while minimizing dependence on the central server. In this paper, experiments were conducted to verify the feasibility of various parallel programming models on high-end and low-end edge devices by using benchmark applications. We analyzed the performance of five parallel programming models on the Raspberry Pi 4 and Jetson Orin Nano as low-end and high-end devices, respectively. In the experiment, OpenACC showed the best performance on the low-end edge device and OpenSYCL on the high-end device due to the stability and optimization of system libraries.

A group-wise attention based decoder for lightweight salient object detection on edge-devices (엣지 디바이스에서 객체 탐지를 위한 그룹별 어탠션 기반 경량 디코더 연구)

  • Thien-Thu Ngo;Md Delowar Hossain;Eui-Nam Huh
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.30-33
    • /
    • 2023
  • Recent research has focused on the fast and accurate detection of salient objects, a task that poses considerable challenges for resource-limited edge devices due to the high computational demands of existing models. To mitigate this issue, some contemporary research has favored inference speed at the expense of accuracy. In an effort to reconcile the intrinsic trade-off between accuracy and computational efficiency, we present a novel model for salient object detection. Our model incorporates a group-wise attentive module within the decoder of the encoder-decoder framework, with the aim of minimizing computational overhead while preserving detection accuracy. Additionally, the proposed architecture employs attention mechanisms to generate boundary information and semantic features pertinent to the salient objects. Through experiments across five distinct datasets, we show empirically that our proposed models achieve performance metrics comparable to those of computationally intensive state-of-the-art models, yet with a marked reduction in computational complexity.