• Title/Summary/Keyword: VGG Net

Knowledge Distillation based-on Internal/External Correlation Learning

  • Hun-Beom Bak;Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.31-39 / 2023
  • In this paper, we propose Internal/External Knowledge Distillation (IEKD), which uses both the external correlations between feature maps of heterogeneous models and the internal correlations between feature maps of the same model to transfer knowledge from a teacher model to a student model. To achieve this, we transform feature maps into a sequence format and use a transformer to extract new feature maps suited to knowledge distillation, taking both internal and external correlations into account. Distilling the extracted feature maps lets the student learn both kinds of correlation, and combining them with feature matching improves the student's accuracy. With the "ResNet-32×4/VGG-8" teacher and student combination, the proposed method achieves 76.23% Top-1 image classification accuracy on the CIFAR-100 dataset and outperforms state-of-the-art KD methods.
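
For orientation, the teacher-to-student transfer that knowledge distillation builds on can be sketched with the classic logit-based loss. This is not the IEKD method from the abstract, which distills transformer-extracted feature maps; it is the generic softened-output formulation, and the function names and the temperature value of 4.0 are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
loss = kd_loss([2.0, 1.0, 0.0], [0.5, 1.5, 0.0])
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.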

Deep learning-based de-fogging method using fog features to solve the domain shift problem (Domain Shift 문제를 해결하기 위해 안개 특징을 이용한 딥러닝 기반 안개 제거 방법)

  • Sim, Hwi Bo;Kang, Bong Soon
    • Journal of Korea Multimedia Society / v.24 no.10 / pp.1319-1325 / 2021
  • Images taken in fog suffer from poor quality because light is scattered and absorbed, which degrades the performance of vision-based applications, so removing fog during preprocessing is important for accurate object recognition and detection. This paper proposes an end-to-end deep learning-based single-image de-fogging method using the U-Net architecture. The loss function is based on the Mahalanobis distance with fog features, which addresses the domain shift problem, and qualitative and quantitative comparisons with conventional methods demonstrate superior performance. The method is also designed to generate fog through the VGG19 loss function and to use the generated images as the next training dataset.

Analysis of Cultural Context of Image Search with Deep Transfer Learning (심층 전이 학습을 이용한 이미지 검색의 문화적 특성 분석)

  • Kim, Hyeon-sik;Jeong, Jin-Woo
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.5 / pp.674-677 / 2020
  • The cultural background of users of image search engines has a significant impact on their satisfaction with the search results. Therefore, it is important to analyze and understand the cultural context of images for more accurate image search. In this paper, we investigate how the cultural context of images can affect the performance of image classification. To this end, we first collected various types of images (e.g., food, temple, etc.) with various cultural contexts (e.g., Korea, Japan, etc.) from web search engines. Afterwards, a deep transfer learning approach using VGG19 and MobileNetV2 pre-trained on ImageNet was adopted to learn the cultural features of the collected images. Through various experiments, we show that image classification performance varies with the cultural context of the images.

Compression of CNN Using Low-Rank Approximation and CP Decomposition Methods (저계수행렬 근사 및 CP 분해 기법을 이용한 CNN 압축)

  • Moon, Hyeon-Cheol;Moon, Gi-Hwa;Kim, Jae-Gon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.133-135 / 2020
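  • Recently, CNNs (Convolutional Neural Networks) have shown excellent performance in vision tasks such as image classification and object recognition, but because the computation and memory requirements of CNN models have grown very large, applying them in low-power environments such as mobile or IoT (Internet of Things) devices is limited. Therefore, techniques that compress the network model while maintaining the task performance of the CNN model are being studied. In this paper, we propose a technique that compresses CNN models by combining low-rank approximation, a matrix decomposition technique, with CP (Canonical Polyadic) decomposition. Unlike existing techniques, which apply a single matrix decomposition technique regardless of layer type, the proposed technique selectively applies the two decomposition techniques according to the type of CNN layer in order to improve compression performance. To verify its performance, we applied the proposed technique to compressing the image classification CNN models VGG-16, ResNet50, and MobileNetV2, and confirmed that selectively applying the two decomposition techniques according to layer type improves classification performance, at the same compression ratios of 1.5 to 12.1 times, compared with applying low-rank approximation alone.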

A Lightweight Deep Learning Model for Line-Art Colorization Using Two Stage Generator Model (이중 생성자를 사용한 저용량 선화 자동채색 모델)

  • Lee, Yeongseop;Lee, Seongjin
    • Proceedings of the Korean Society of Computer Information Conference / 2020.01a / pp.19-20 / 2020
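  • With the growth of the media industry, research on automatic colorization of line-art images such as storyboards is under way in Korea and abroad. However, no research has yet focused on the size of the colorization model. Existing automatic colorization work has the drawback of large models, at least 567MB in size. In this paper, using a dual-generator structure that splits colorization into two stages together with a generator that improves on the existing U-Net, we produced a 106MB model that is 30% smaller than the existing U-Net and up to 85% smaller than techniques using VGG16/19, and image evaluation with FID (Fréchet Inception Distance) gave a colorization performance of 153.69 at 512x512px.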

Compression and Performance Evaluation of CNN Models on Embedded Board (임베디드 보드에서의 CNN 모델 압축 및 성능 검증)

  • Moon, Hyeon-Cheol;Lee, Ho-Young;Kim, Jae-Gon
    • Journal of Broadcast Engineering / v.25 no.2 / pp.200-207 / 2020
  • Recently, deep neural networks such as CNNs have shown excellent performance in fields such as image classification, object recognition, and visual quality enhancement. However, as the model size and computational complexity of deep learning models for most applications increase, it is hard to apply neural networks in IoT and mobile environments. Therefore, neural network compression algorithms that reduce model size while preserving performance have been studied. In this paper, we apply a few compression methods to CNN models and evaluate their performance in an embedded environment. To evaluate performance, we measure the classification performance and inference time of the original and compressed CNN models on images captured by a camera, on an embedded board equipped with the QCS605, a customized AI chip. The CNN models MobileNetV2, ResNet50, and VGG-16 are compressed by pruning and matrix decomposition. The experimental results show that, compared with the original models, the compressed models reduce model size by 1.3 to 11.2 times at a classification performance loss of less than 2%, and also reduce inference time by 1.2 to 2.21 times and memory by 1.2 to 3.8 times on the embedded board.

Compression of CNN Using Low-Rank Approximation and CP Decomposition Methods (저계수 행렬 근사 및 CP 분해 기법을 이용한 CNN 압축)

  • Moon, HyeonCheol;Moon, Gihwa;Kim, Jae-Gon
    • Journal of Broadcast Engineering / v.26 no.2 / pp.125-131 / 2021
  • In recent years, Convolutional Neural Networks (CNNs) have achieved outstanding performance in computer vision tasks such as image classification, object detection, and visual quality enhancement. However, because CNN models require a huge amount of computation and memory, their application to low-power environments such as mobile or IoT devices is limited. Therefore, the need for neural network compression, which reduces the model size while keeping as much of the task performance as possible, has been emerging. In this paper, we propose a method to compress CNN models by combining two matrix decomposition methods: LR (Low-Rank) approximation and CP (Canonical Polyadic) decomposition. Unlike conventional methods that apply one matrix decomposition method to CNN models, we selectively apply the two decomposition methods depending on the layer types of the CNN to enhance the compression performance. To evaluate the performance of the proposed method, we use image classification models such as VGG-16, ResNet50, and MobileNetV2. The experimental results show that, at the same compression ratios of 1.5 to 12.1 times, the proposed method gives better classification performance than the existing method that applies only the LR approximation.
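
To make the compression ratios concrete, the parameter arithmetic behind low-rank approximation can be sketched as follows. Factoring a dense `rows × cols` weight matrix into two factors of rank `rank` stores `rank * (rows + cols)` values instead of `rows * cols`. The helper names are hypothetical, and the 25088 × 4096 shape (the first fully connected layer of VGG-16) is used only as an illustration; the abstract does not say which layers or ranks the authors chose.

```python
def dense_params(rows, cols):
    """Number of weights in an uncompressed rows x cols layer."""
    return rows * cols


def low_rank_params(rows, cols, rank):
    """Weights stored after factoring into (rows x rank) and (rank x cols)."""
    return rank * (rows + cols)


def compression_ratio(rows, cols, rank):
    """How many times smaller the factored layer is than the dense one."""
    return dense_params(rows, cols) / low_rank_params(rows, cols, rank)


def max_rank_for_ratio(rows, cols, target_ratio):
    """Largest rank that still compresses the layer by at least target_ratio."""
    return int(rows * cols / (target_ratio * (rows + cols)))


# Illustrative layer shape: first fully connected layer of VGG-16.
rows, cols = 25088, 4096
ratio_at_rank_256 = compression_ratio(rows, cols, 256)
rank_for_12x = max_rank_for_ratio(rows, cols, 12.1)
```

The same counting applies per layer, so the overall model ratio depends on which layers are factored and on the rank chosen for each.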

Food Detection by Fine-Tuning Pre-trained Convolutional Neural Network Using Noisy Labels

  • Alshomrani, Shroog;Aljoudi, Lina;Aljabri, Banan;Al-Shareef, Sarah
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.182-190 / 2021
  • Deep learning is an advanced technology for large-scale data analysis, with numerous promising applications such as image processing, object detection, and many more. It has become customary to use transfer learning and fine-tune a pre-trained CNN model for most image recognition tasks. People taking photos and tagging them provide a valuable resource of in-data. However, these tags and labels might be noisy, as the people who annotate the images might not be experts. This paper aims to explore the impact of noisy labels on fine-tuning pre-trained CNN models. The effect is measured on a food recognition task using Food101 as a benchmark. Four pre-trained CNN models are included in this study: InceptionV3, VGG19, MobileNetV2, and DenseNet121. Symmetric label noise is added at different ratios. In all cases, models based on DenseNet121 outperformed the other models. When noisy labels were introduced into the data, the performance of all models degraded almost linearly with the amount of added noise.
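
Symmetric label noise of the kind described here can be sketched as below. The function name, the fixed seed, and the choice to always flip to a different class are assumptions; some definitions of symmetric noise also allow the replacement to land on the true class, and the abstract does not specify which variant was used.

```python
import random


def add_symmetric_noise(labels, num_classes, noise_ratio, seed=0):
    """Flip a fraction of labels to a different class chosen uniformly."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_ratio * len(noisy))
    # Pick which positions to corrupt, without replacement.
    for i in rng.sample(range(len(noisy)), n_flip):
        choices = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(choices)
    return noisy


# Food101 has 101 classes; these labels are synthetic stand-ins.
clean = [i % 101 for i in range(1010)]
noisy = add_symmetric_noise(clean, num_classes=101, noise_ratio=0.2)
```

Sweeping `noise_ratio` over several values and fine-tuning on each noisy copy would give the kind of accuracy-versus-noise curve the abstract summarizes.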

Evaluation of Deep Learning Model for Scoliosis Pre-Screening Using Preprocessed Chest X-ray Images

  • Min Gu Jang;Jin Woong Yi;Hyun Ju Lee;Ki Sik Tae
    • Journal of Biomedical Engineering Research / v.44 no.4 / pp.293-301 / 2023
  • Scoliosis is a three-dimensional deformity of the spine, induced by physical or disease-related causes, in which the spine is rotated abnormally. Early detection has a significant influence on the possibility of nonsurgical treatment. This study trains a deep learning model with preprocessed images and evaluates the results with and without data augmentation, with the aim of enabling the diagnosis of scoliosis from a chest X-ray image alone. The preprocessed images, in which only the spine, rib contours, and some hard tissues were left from the original chest image, were used for learning along with the original images, and three CNN (Convolutional Neural Network) models (VGG16, ResNet152, and EfficientNet) were selected for training. Training with the preprocessed images gave higher accuracy than training with the original images. When scoliosis images were added through data augmentation, the accuracy improved further, ultimately reaching a classification accuracy of 93.56% with the ResNet152 model on test data. With further research, the proposed method is expected to allow the early diagnosis of scoliosis and to reduce cost by lessening the burden of additional radiographic imaging for disease detection.

Face Frontalization Model with A.I. Based on U-Net using Convolutional Neural Network (합성곱 신경망(CNN)을 이용한 U-Net 기반의 인공지능 안면 정면화 모델)

  • Lee, Sangmin;Son, Wonho;Jin, ChangGyun;Kim, Ji-Hyun;Kim, JiYun;Park, Naeun;Kim, Gaeun;Kwon, Jin young;Lee, Hye Yi;Kim, Jongwan;Oh, Dukshin
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.685-688 / 2020
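  • Face recognition is being adopted in areas such as Face ID, finding missing children, and tracking criminals. Its recognition rate has recently improved through deep learning, but the rate for profile views is comparatively low because feature extraction is harder than for frontal views. This becomes a drawback when only a profile image of a person exists and no frontal image is available, since identity verification through face recognition is then difficult. In this paper, we implement an AI-based face frontalization model that extends the situations in which face recognition can be applied by generating a frontal face from a profile image. The model uses VGG-Face to extract facial features and a U-Net structure to prevent the information loss that can occur during feature extraction.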