• Title/Summary/Keyword: deep neural network compression

Search Result 35, Processing Time 0.024 seconds

Deep Neural Network compression based on clustering of per layer in frequency domain (주파수 영역에서의 군집화 기반 계층별 딥 뉴럴 네트워크 압축)

  • Hong, Minsoo;Kim, Sungjei;Jeong, Jinwoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.11a
    • /
    • pp.64-67
    • /
    • 2020
  • 최근 다양한 분야에서 딥 러닝 기반의 많은 연구가 진행되고 있으며 이에 따라 딥 러닝 모델의 경량화를 통해 제한된 메모리를 가진 하드웨어에 올릴 수 있는 경량화 된 딥 뉴럴 네트워크(DNN)를 개발하는 연구도 활발해졌다. 이에 본 논문은 주파수 영역에서의 군집화 기반 계층별 딥 뉴럴 네트워크 압축을 제안한다. 이산 코사인 변환, 양자화, 군집화, 적응적 엔트로피 코딩 과정을 각 모델의 계층에 순차적으로 적용하여 DNN이 차지하는 메모리를 줄인다. 제안한 알고리즘을 통해 VGG16을 손실률은 1% 미만의 손실에서 전체 가중치를 3.98%까지 압축, 약 25배가량 경량화 할 수 있었다.

  • PDF

Dynamic Filter Pruning for Compression of Deep Neural Network. (동적 필터 프루닝 기법을 이용한 심층 신경망 압축)

  • Cho, InCheon;Bae, SungHo
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.07a
    • /
    • pp.675-679
    • /
    • 2020
  • 최근 이미지 분류의 성능 향상을 위해 깊은 레이어와 넓은 채널을 가지는 모델들이 제안되어져 왔다. 높은 분류 정확도를 보이는 모델을 제안하는 것은 과한 컴퓨팅 파워와 계산시간을 요구한다. 본 논문에서는 이미지 분류 기법에서 사용되는 딥 뉴럴 네트워크 모델에 있어, 프루닝 방법을 통해 상대적으로 불필요한 가중치를 제거함과 동시에 분류 정확도 하락을 최소로 하는 동적 필터 프루닝 방법을 제시한다. 원샷 프루닝 기법, 정적 필터 프루닝 기법과 다르게 제거된 가중치에 대해서 소생 기회를 제공함으로써 더 좋은 성능을 보인다. 또한, 재학습이 필요하지 않기 때문에 빠른 계산 속도와 적은 컴퓨팅 파워를 보장한다. ResNet20 에서 CIFAR10 데이터셋에 대하여 실험한 결과 약 50%의 압축률에도 88.74%의 분류 정확도를 보였다.

  • PDF

Performance Evaluation of Efficient Vision Transformers on Embedded Edge Platforms (임베디드 엣지 플랫폼에서의 경량 비전 트랜스포머 성능 평가)

  • Minha Lee;Seongjae Lee;Taehyoun Kim
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.18 no.3
    • /
    • pp.89-100
    • /
    • 2023
  • Recently, on-device artificial intelligence (AI) solutions using mobile devices and embedded edge devices have emerged in various fields, such as computer vision, to address network traffic burdens, low-energy operations, and security problems. Although vision transformer deep learning models have outperformed conventional convolutional neural network (CNN) models in computer vision, they require more computations and parameters than CNN models. Thus, they are not directly applicable to embedded edge devices with limited hardware resources. Many researchers have proposed various model compression methods or lightweight architectures for vision transformers; however, there are only a few studies evaluating the effects of model compression techniques of vision transformers on performance. Regarding this problem, this paper presents a performance evaluation of vision transformers on embedded platforms. We investigated the behaviors of three vision transformers: DeiT, LeViT, and MobileViT. Each model performance was evaluated by accuracy and inference time on edge devices using the ImageNet dataset. We assessed the effects of the quantization method applied to the models on latency enhancement and accuracy degradation by profiling the proportion of response time occupied by major operations. In addition, we evaluated the performance of each model on GPU and EdgeTPU-based edge devices. In our experimental results, LeViT showed the best performance in CPU-based edge devices, and DeiT-small showed the highest performance improvement in GPU-based edge devices. In addition, only MobileViT models showed performance improvement on EdgeTPU. Summarizing the analysis results through profiling, the degree of performance improvement of each vision transformer model was highly dependent on the proportion of parts that could be optimized in the target edge device. In summary, to apply vision transformers to on-device AI solutions, either proper operation composition and optimizations specific to target edge devices must be considered.

Lightweight of ONNX using Quantization-based Model Compression (양자화 기반의 모델 압축을 이용한 ONNX 경량화)

  • Chang, Duhyeuk;Lee, Jungsoo;Heo, Junyoung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.21 no.1
    • /
    • pp.93-98
    • /
    • 2021
  • Due to the development of deep learning and AI, the scale of the model has grown, and it has been integrated into other fields to blend into our lives. However, in environments with limited resources such as embedded devices, it is exist difficult to apply the model and problems such as power shortages. To solve this, lightweight methods such as clouding or offloading technologies, reducing the number of parameters in the model, or optimising calculations are proposed. In this paper, quantization of learned models is applied to ONNX models used in various framework interchange formats, neural network structure and inference performance are compared with existing models, and various module methods for quantization are analyzed. Experiments show that the size of weight parameter is compressed and the inference time is more optimized than before compared to the original model.

Performance Analysis of Object Detection Neural Network According to Compression Ratio of RGB and IR Images (RGB와 IR 영상의 압축률에 따른 객체 탐지 신경망 성능 분석)

  • Lee, Yegi;Kim, Shin;Lim, Hanshin;Lee, Hee Kyung;Choo, Hyon-Gon;Seo, Jeongil;Yoon, Kyoungro
    • Journal of Broadcast Engineering
    • /
    • v.26 no.2
    • /
    • pp.155-166
    • /
    • 2021
  • Most object detection algorithms are studied based on RGB images. Because the RGB cameras are capturing images based on light, however, the object detection performance is poor when the light condition is not good, e.g., at night or foggy days. On the other hand, high-quality infrared(IR) images regardless of weather condition and light can be acquired because IR images are captured by an IR sensor that makes images with heat information. In this paper, we performed the object detection algorithm based on the compression ratio in RGB and IR images to show the detection capabilities. We selected RGB and IR images that were taken at night from the Free FLIR Thermal dataset for the ADAS(Advanced Driver Assistance Systems) research. We used the pre-trained object detection network for RGB images and a fine-tuned network that is tuned based on night RGB and IR images. Experimental results show that higher object detection performance can be acquired using IR images than using RGB images in both networks.