• Title/Summary/Keyword: VGG-16

Deep Neural Network compression based on clustering of per layer in frequency domain (주파수 영역에서의 군집화 기반 계층별 딥 뉴럴 네트워크 압축)

  • Hong, Minsoo; Kim, Sungjei; Jeong, Jinwoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.64-67 / 2020
  • Deep learning research has recently been carried out in a wide range of fields, which has spurred active work on lightweight deep neural networks (DNNs) that can be deployed on hardware with limited memory. This paper proposes a clustering-based, per-layer DNN compression method in the frequency domain. Discrete cosine transform, quantization, clustering, and adaptive entropy coding are applied sequentially to each layer of the model to reduce the memory occupied by the DNN. With the proposed algorithm, the weights of VGG16 were compressed to 3.98% of their original size, about a 25-fold reduction, at an accuracy loss of less than 1%. (A code sketch illustrating this pipeline follows this entry.)

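The per-layer pipeline described in this abstract (discrete cosine transform, quantization, clustering, adaptive entropy coding) can be illustrated with a minimal sketch. This is not the authors' implementation: the weight tensor, quantization step, and cluster count below are arbitrary stand-ins, and the final entropy-coding stage is omitted.

```python
# Minimal sketch of a frequency-domain, clustering-based compression of one layer.
import numpy as np
from scipy.fft import dct, idct
from sklearn.cluster import KMeans

def compress_layer(weights: np.ndarray, step: float = 0.02, n_clusters: int = 64):
    """DCT -> uniform quantization -> k-means codebook (entropy coding omitted)."""
    flat = weights.reshape(weights.shape[0], -1)       # one row per filter
    coeffs = dct(flat, axis=1, norm="ortho")           # frequency-domain weights
    q = np.round(coeffs / step)                        # uniform quantization
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(q.reshape(-1, 1))
    codebook = km.cluster_centers_.ravel()
    indices = km.labels_.reshape(q.shape)              # these would be entropy-coded
    return codebook, indices

def decompress_layer(codebook, indices, step, shape):
    coeffs = codebook[indices] * step                  # de-quantize via the codebook
    return idct(coeffs, axis=1, norm="ortho").reshape(shape)

w = np.random.randn(64, 3, 3, 3).astype(np.float32)    # stand-in for a conv layer
cb, idx = compress_layer(w)
w_hat = decompress_layer(cb, idx, 0.02, w.shape)
print("reconstruction MSE:", float(np.mean((w - w_hat) ** 2)))
```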

Intra-Class Random Erasing (ICRE) augmentation for audio classification

  • Kumar, Teerath; Park, Jinbae; Bae, Sung-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.244-247 / 2020
  • Data augmentation helps improve deep learning performance when data are limited, and random erasing is one augmentation that has shown impressive results across multiple domains. Its main issue is that useful features are sometimes lost when the randomly selected region is overwritten with random values, so performance does not improve as much as it should. We target this problem so that useful features are preserved while random erasing is still applied. For this purpose, we introduce a new augmentation technique named Intra-Class Random Erasing (ICRE), which encourages the model to learn robust features by randomly exchanging a randomly selected region between samples of the same class. We perform multiple experiments with different models, including resnet18 and VGG16, over a variety of datasets, including ESC10 and UrbanSound8K. Our approach is more effective than other methods, including standard random erasing. (A minimal sketch of the region-exchange idea follows this entry.)

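A minimal sketch of one reading of the ICRE idea: instead of erasing a region with random values, the region is swapped with the same region from another randomly chosen sample of the same class. The batch shape, patch size, and helper name intra_class_random_exchange are illustrative assumptions, not the authors' code.

```python
# Sketch: swap a random patch between two samples that share the same label.
import numpy as np

def intra_class_random_exchange(batch: np.ndarray, labels: np.ndarray,
                                patch_frac: float = 0.3, rng=None):
    """batch: (N, C, H, W) spectrograms or images; labels: (N,)."""
    rng = np.random.default_rng() if rng is None else rng
    out = batch.copy()
    _, _, H, W = batch.shape
    ph, pw = int(H * patch_frac), int(W * patch_frac)
    for i in range(len(batch)):
        same = np.flatnonzero(labels == labels[i])
        same = same[same != i]
        if len(same) == 0:
            continue                                   # no intra-class partner available
        j = rng.choice(same)
        top = rng.integers(0, H - ph + 1)
        left = rng.integers(0, W - pw + 1)
        out[i, :, top:top+ph, left:left+pw] = batch[j, :, top:top+ph, left:left+pw]
    return out

x = np.random.randn(8, 1, 128, 128).astype(np.float32)  # e.g. mel-spectrogram batch
y = np.array([0, 0, 1, 1, 2, 2, 3, 3])
x_aug = intra_class_random_exchange(x, y)
print(x_aug.shape)
```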

Accident Detection System in Tunnel using CCTV (CCTV를 이용한 터널내 사고감지 시스템)

  • Lee, Se-Hoon; Lee, Seung-Yeob; Noh, Yeong-Hun
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.3-4 / 2021
  • When an accident occurs inside a closed tunnel, the situation cannot be observed from outside, so even a minor accident is likely to lead to a large secondary accident. To reduce false detections of accident situations in video-based detection, this study transfer-learns the VGG16 model, which showed the best performance among existing CNN models when selecting the model best suited to the available data, and applies Dropout to some of the fully connected layers to partially prevent overfitting. The proposed system then recognizes objects in the video with YOLO, extracts object ROIs from video frames with OpenCV, and compares them against the CNN model, detecting accidents while reducing false detections. (A sketch of the transfer-learning classifier follows this entry.)

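A hedged sketch of the classifier stage described above: a pre-trained VGG16 backbone is frozen and its fully connected head is rebuilt with Dropout. The class count and layer widths are assumptions, and the YOLO detection and OpenCV ROI-extraction stages that would feed crops into this classifier are omitted.

```python
# Sketch: VGG16 transfer learning with Dropout in the fully connected head.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 2                            # assumption: accident vs. normal

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for p in model.features.parameters():      # freeze the convolutional backbone
    p.requires_grad = False

model.classifier = nn.Sequential(          # FC head with Dropout to curb overfitting
    nn.Linear(512 * 7 * 7, 4096), nn.ReLU(inplace=True), nn.Dropout(0.5),
    nn.Linear(4096, 1024), nn.ReLU(inplace=True), nn.Dropout(0.5),
    nn.Linear(1024, NUM_CLASSES),
)

rois = torch.randn(4, 3, 224, 224)         # stand-in for ROIs cropped from CCTV frames
print(model(rois).shape)                   # torch.Size([4, 2])
```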

A Lightweight Deep Learning Model for Line-Art Colorization Using Two Stage Generator Model (이중 생성자를 사용한 저용량 선화 자동채색 모델)

  • Lee, Yeongseop; Lee, Seongjin
    • Proceedings of the Korean Society of Computer Information Conference / 2020.01a / pp.19-20 / 2020
  • With the growth of the media industry, automatic colorization of line-art images such as storyboards is being studied both in Korea and abroad. However, no research has yet focused on the size of automatic colorization models. Existing colorization models are at least 567 MB, which is a significant drawback. This paper uses a two-stage generator structure that splits colorization into two steps, together with a generator that improves on the conventional U-Net, producing a 106 MB model that is 30% smaller than the conventional U-Net and up to 85% smaller than methods using VGG16/19. Evaluated with FID (Fréchet Inception Distance), it achieves a colorization score of 153.69 at 512x512 px. (A structural sketch of the two-stage generator follows this entry.)

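A structural sketch, under our own assumptions, of a two-stage generator in the spirit of this abstract: stage one produces a coarse color draft from the line art, and stage two refines the draft concatenated with the original sketch. The layer widths and module names are illustrative; this is not the paper's 106 MB network.

```python
# Sketch: two-stage generator (coarse draft, then refinement) for line-art colorization.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.InstanceNorm2d(cout), nn.ReLU(inplace=True))

class TwoStageColorizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.draft = nn.Sequential(conv_block(1, 32), conv_block(32, 64),
                                   nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())
        self.refine = nn.Sequential(conv_block(4, 64), conv_block(64, 64),
                                    nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())

    def forward(self, sketch):
        coarse = self.draft(sketch)                              # stage 1: rough colors
        fine = self.refine(torch.cat([sketch, coarse], dim=1))   # stage 2: refinement
        return coarse, fine

line_art = torch.randn(1, 1, 512, 512)
coarse, fine = TwoStageColorizer()(line_art)
print(coarse.shape, fine.shape)
```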

An Enhanced Deep Learning for Animal Image Based on Small Datasets (적은 데이터 세트를 기반으로 한 동물 이미지의 향상된 딥 러닝)

  • Shin, Seong-Yoon; Shin, Kwang-Seong; Lee, Hyun-Chang
    • Proceedings of the Korean Society of Computer Information Conference / 2020.01a / pp.247-248 / 2020
  • This paper proposes an improved deep learning method for animal image classification based on a small dataset. First, a CNN is used to build a training model for the small dataset, while data augmentation is used to expand the data samples of the training set. Second, a network pre-trained on a large dataset, such as VGG16, is used to extract bottleneck features from the small dataset, which are saved in two NumPy files as new training and test datasets. Finally, a fully connected network is trained on the new datasets. (A sketch of this bottleneck-feature workflow follows this entry.)

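A hedged sketch of the bottleneck-feature workflow described above: a VGG16 pre-trained on a large dataset extracts features that are cached as NumPy files, and only a small fully connected head is trained on them. The data loader, output paths, and two-class head are placeholders.

```python
# Sketch: cache VGG16 bottleneck features to .npz, then train only a small FC head.
import numpy as np
import torch
import torch.nn as nn
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"
backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.to(device).eval()

@torch.no_grad()
def extract_bottleneck(loader, out_path):
    """loader yields (image, label); features for 224x224 inputs are (N, 512*7*7)."""
    feats, labels = [], []
    for x, y in loader:
        f = backbone(x.to(device)).flatten(1).cpu()
        feats.append(f.numpy())
        labels.append(y.numpy())
    np.savez(out_path, features=np.concatenate(feats), labels=np.concatenate(labels))

# Small fully connected classifier trained on the cached bottleneck features.
head = nn.Sequential(nn.Linear(512 * 7 * 7, 256), nn.ReLU(), nn.Dropout(0.5),
                     nn.Linear(256, 2))                 # assumption: two animal classes
# extract_bottleneck(train_loader, "train_bottleneck.npz")   # placeholder loaders/paths
# extract_bottleneck(test_loader, "test_bottleneck.npz")
```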

Compression and Performance Evaluation of CNN Models on Embedded Board (임베디드 보드에서의 CNN 모델 압축 및 성능 검증)

  • Moon, Hyeon-Cheol; Lee, Ho-Young; Kim, Jae-Gon
    • Journal of Broadcast Engineering / v.25 no.2 / pp.200-207 / 2020
  • Recently, deep neural networks such as CNNs have shown excellent performance in various fields such as image classification, object recognition, and visual quality enhancement. However, as the model size and computational complexity of deep learning models increase for most applications, it is hard to apply neural networks to IoT and mobile environments. Therefore, neural network compression algorithms that reduce model size while preserving performance have been studied. In this paper, we apply several compression methods to CNN models and evaluate their performance in an embedded environment. The classification performance and inference time of the original and compressed CNN models are measured on camera-captured images using an embedded board equipped with the QCS605, a customized AI chip. The CNN models MobileNetV2, ResNet50, and VGG-16 are compressed by pruning and matrix decomposition. The experimental results show that, compared to the original models, the compressed models reduce the model size by 1.3~11.2 times at a classification performance loss of less than 2%, reduce the inference time by 1.2~2.21 times, and reduce memory usage by 1.2~3.8 times on the embedded board.
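
A minimal sketch of one of the compression steps mentioned in the abstract, magnitude-based pruning, applied here to VGG-16 convolution layers with torch.nn.utils.prune. The 60% pruning ratio is an arbitrary assumption, and the matrix decomposition step and the embedded-board deployment are not shown.

```python
# Sketch: global L1 (magnitude) pruning of VGG-16 conv weights, then a sparsity count.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import models

model = models.vgg16(weights=None)                 # architecture only, for the sketch
to_prune = [(m, "weight") for m in model.modules() if isinstance(m, nn.Conv2d)]
prune.global_unstructured(to_prune, pruning_method=prune.L1Unstructured, amount=0.6)

nonzero = sum(int((m.weight != 0).sum()) for m, _ in to_prune)
total = sum(m.weight.numel() for m, _ in to_prune)
print(f"conv weights kept: {nonzero}/{total} ({100 * nonzero / total:.1f}%)")
```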

Compression of CNN Using Low-Rank Approximation and CP Decomposition Methods (저계수 행렬 근사 및 CP 분해 기법을 이용한 CNN 압축)

  • Moon, HyeonCheol; Moon, Gihwa; Kim, Jae-Gon
    • Journal of Broadcast Engineering / v.26 no.2 / pp.125-131 / 2021
  • In recent years, Convolutional Neural Networks (CNNs) have achieved outstanding performance in computer vision tasks such as image classification, object detection, and visual quality enhancement. However, because CNN models require huge amounts of computation and memory, their application to low-power environments such as mobile or IoT devices is limited. Therefore, the need for neural network compression that reduces model size while preserving task performance as much as possible has been growing. In this paper, we propose a method to compress CNN models by combining the matrix decomposition methods of LR (Low-Rank) approximation and CP (Canonical Polyadic) decomposition. Unlike conventional methods that apply a single decomposition method to a CNN model, we selectively apply the two decomposition methods depending on the layer type to enhance compression performance. To evaluate the proposed method, we use image classification models such as VGG-16, ResNet50, and MobileNetV2. The experimental results show that, over the same range of 1.5 to 12.1 times compression, the proposed method gives better classification performance than the existing method that applies only the LR approximation.
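
A hedged sketch of only the LR (low-rank) half of the scheme above: a fully connected layer is approximated with a truncated SVD and replaced by two smaller linear layers. The rank is an arbitrary assumption, and the CP decomposition that the paper applies to convolutional layers is not shown.

```python
# Sketch: replace one FC layer with a rank-r factorization obtained from truncated SVD.
import torch
import torch.nn as nn

def low_rank_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    U, S, Vh = torch.linalg.svd(layer.weight.data, full_matrices=False)
    A = Vh[:rank, :] * S[:rank].sqrt().unsqueeze(1)    # (rank, in_features)
    B = U[:, :rank] * S[:rank].sqrt()                  # (out_features, rank)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data.copy_(A)
    second.weight.data.copy_(B)
    if layer.bias is not None:
        second.bias.data.copy_(layer.bias.data)
    return nn.Sequential(first, second)

fc = nn.Linear(4096, 4096)                             # e.g. a VGG-16 FC layer
fc_lr = low_rank_linear(fc, rank=256)
x = torch.randn(2, 4096)
print(fc_lr(x).shape, (fc(x) - fc_lr(x)).abs().mean().item())   # shape and approx. error
```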

Development of Fender Segmentation System for Port Structures using Vision Sensor and Deep Learning (비전센서 및 딥러닝을 이용한 항만구조물 방충설비 세분화 시스템 개발)

  • Min, Jiyoung; Yu, Byeongjun; Kim, Jonghyeok; Jeon, Haemin
    • Journal of the Korea institute for structural maintenance and inspection / v.26 no.2 / pp.28-36 / 2022
  • As port structures are exposed to various extreme external loads such as wind (typhoons), sea waves, and collisions with ships, it is important to evaluate their structural safety periodically. To monitor port structures, especially rubber fenders, this study proposes a fender segmentation system using a vision sensor and deep learning. For fender segmentation, a new deep learning network is proposed that improves the encoder-decoder framework by incorporating a receptive field block convolution module, inspired by the eccentricity function of the human visual system, into a DenseNet format. To train the network, various fender images, including BP, V, cell, cylindrical, and tire types, were collected, and the images were augmented with four methods: elastic distortion, horizontal flip, color jitter, and affine transforms. The proposed algorithm was trained and verified with the collected fender images, and the results showed that the system segments precisely in real time, with a higher IoU (84%) and F1 score (90%) than the conventional segmentation model, VGG16 with U-Net. The trained network was applied to real images taken at a port in the Republic of Korea, and the fenders were segmented with high accuracy even with a small dataset.
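
A hedged sketch of the four augmentation methods listed in the abstract (elastic distortion, horizontal flip, color jitter, affine transform) using torchvision. The parameter values are placeholders, since the paper's settings are not given here, and in a real segmentation pipeline the geometric transforms would have to be applied jointly to the image and its mask.

```python
# Sketch: the four augmentations named above, chained with torchvision transforms.
import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.ElasticTransform(alpha=50.0),                               # elastic distortion
    transforms.RandomHorizontalFlip(p=0.5),                                # horizontal flip
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),  # color jitter
    transforms.RandomAffine(degrees=10, translate=(0.05, 0.05), scale=(0.9, 1.1)),  # affine
])

fender_image = torch.rand(3, 256, 256)     # stand-in for a collected fender photo
print(augment(fender_image).shape)
```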

Study on Image Use for Plant Disease Classification (작물의 병충해 분류를 위한 이미지 활용 방법 연구)

  • Jeong, Seong-Ho; Han, Jeong-Eun; Jeong, Seong-Kyun; Bong, Jae-Hwan
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.2 / pp.343-350 / 2022
  • It is worth verifying the effectiveness of integrating data with different characteristics. This study investigated whether data integration affects the accuracy of a deep neural network (DNN), and which integration method yields the best improvement. Two different public datasets were used: one taken on an actual farm in India and another taken in a laboratory environment in Korea. Leaf images were selected from the two datasets to form five classes, covering normal leaves and four types of plant disease. The DNN used a pre-trained VGG16 as a feature extractor and a multi-layer perceptron as a classifier. The data were integrated in three different ways for the training process, and the DNN was trained in a supervised manner on the integrated data. The trained DNN was evaluated with a test dataset taken on an actual farm. The DNN showed the best accuracy on the test dataset when it was first trained on the laboratory images and then trained on the farm images. The results show that integrating plant images taken in different environments helps improve DNN performance, and they also confirm that using the images from the different environments in separate training stages is more effective in improving DNN performance.
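
A hedged sketch of the best-performing strategy described above: a frozen, pre-trained VGG16 feature extractor with a multi-layer perceptron classifier, trained first on the laboratory images and then on the farm images. The loaders, epochs, and learning rate are placeholders, not the paper's settings.

```python
# Sketch: frozen VGG16 features + MLP head, trained in two sequential stages.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5                                        # normal + four disease classes

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False                            # VGG16 used only as a feature extractor
model.classifier = nn.Sequential(nn.Linear(512 * 7 * 7, 512), nn.ReLU(),
                                 nn.Dropout(0.5), nn.Linear(512, NUM_CLASSES))

def train_one_stage(net, loader, epochs=5, lr=1e-4):
    opt = torch.optim.Adam([p for p in net.parameters() if p.requires_grad], lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:                            # loader yields (image, label)
            opt.zero_grad()
            loss_fn(net(x), y).backward()
            opt.step()

# Stage 1: laboratory-environment images; stage 2: actual-farm images (placeholder loaders).
# train_one_stage(model, lab_loader); train_one_stage(model, farm_loader)
```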

A Study on Biometric Model for Information Security (정보보안을 위한 생체 인식 모델에 관한 연구)

  • Jun-Yeong Kim; Se-Hoon Jung; Chun-Bo Sim
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.1 / pp.317-326 / 2024
  • Biometric recognition is a technology that verifies a person's identity by extracting information on their physiological and behavioral characteristics with a specific device. Cyber threats such as forgery, duplication, and hacking of biometric characteristics are increasing in the field of biometrics. In response, security systems are being strengthened and made more complex, which makes them harder for individuals to use. To address this, multimodal biometric models are being studied. Existing studies have proposed feature fusion methods, but comparisons between these methods are insufficient. Therefore, in this paper, we compared and evaluated fusion methods for multimodal biometric models using fingerprint, face, and iris images. VGG-16, ResNet-50, EfficientNet-B1, EfficientNet-B4, EfficientNet-B7, and Inception-v3 were used for feature extraction, and the 'Sensor-Level', 'Feature-Level', 'Score-Level', and 'Rank-Level' fusion methods were compared and evaluated. In the comparative evaluation, the EfficientNet-B7 model showed 98.51% accuracy and high stability with 'Feature-Level' fusion. However, because the EfficientNet-B7 model is large, studies on lightweight models are needed for biometric feature fusion.
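
A hedged sketch of 'Feature-Level' fusion as read from the abstract: one EfficientNet-B7 backbone per modality (fingerprint, face, iris), with the three feature vectors concatenated before a shared classifier. The identity count, input sizes, and the use of a separate backbone per modality are assumptions.

```python
# Sketch: feature-level fusion of fingerprint, face, and iris embeddings.
import torch
import torch.nn as nn
from torchvision import models

def feature_extractor():
    m = models.efficientnet_b7(weights=None)           # pretrained weights omitted here
    return nn.Sequential(m.features, m.avgpool, nn.Flatten())   # -> (N, 2560)

class FeatureLevelFusion(nn.Module):
    def __init__(self, num_identities: int = 100):
        super().__init__()
        self.finger, self.face, self.iris = (feature_extractor() for _ in range(3))
        self.classifier = nn.Linear(2560 * 3, num_identities)

    def forward(self, finger_img, face_img, iris_img):
        fused = torch.cat([self.finger(finger_img), self.face(face_img),
                           self.iris(iris_img)], dim=1)          # feature-level fusion
        return self.classifier(fused)

model = FeatureLevelFusion()
imgs = [torch.randn(1, 3, 224, 224) for _ in range(3)]
print(model(*imgs).shape)                              # torch.Size([1, 100])
```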