• Title/Summary/Keyword: Fine Tuning

Compression of DNN Integer Weight using Video Encoder (비디오 인코더를 통한 딥러닝 모델의 정수 가중치 압축)

  • Kim, Seunghwan;Ryu, Eun-Seok
    • Journal of Broadcast Engineering / v.26 no.6 / pp.778-789 / 2021
  • Various lightweight methods for running Convolutional Neural Network (CNN) models on mobile devices have recently emerged. Weight quantization, which lowers the bit precision of weights, lets a model run with integer arithmetic in mobile environments where GPU acceleration is unavailable. It is already used in many models to reduce computational complexity and model size at a small loss of accuracy. Taking into account memory size and computing speed as well as device storage and limited network conditions, this paper proposes compressing the quantized integer weights with a video codec. To verify the proposed method, experiments were conducted on VGG16, ResNet50, and ResNet18 models trained on the ImageNet and Places365 datasets. The results show an accuracy loss of less than 2% together with high compression efficiency across the models. Compared with similar compression methods, the compression efficiency was more than doubled.
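The quantization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes symmetric uniform quantization with a single shared scale, which the abstract does not specify; the function names and the sample weights are hypothetical. The last helper only shifts the signed integers into the 0..255 range so they could be laid out like 8-bit samples; the video-encoding step itself is not shown.

```python
def quantize_symmetric(weights, num_bits=8):
    """Map float weights to signed integers using one shared scale (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clip to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer levels."""
    return [v * scale for v in q]


def to_unsigned_bytes(q, num_bits=8):
    """Shift signed levels into 0..255 so they resemble 8-bit pixel samples."""
    offset = 2 ** (num_bits - 1)
    return bytes(v + offset for v in q)


weights = [0.5, -1.0, 0.25, 0.0]              # hypothetical weights
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
pixels = to_unsigned_bytes(q)
```

With this scheme the round-trip error of each weight is bounded by half the scale, which is one way to see why accuracy degrades only slightly at 8 bits.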

Why Should I Ban You! : X-FDS (Explainable FDS) Model Based on Online Game Payment Log (X-FDS : 게임 결제 로그 기반 XAI적용 이상 거래탐지 모델 연구)

  • Lee, Young Hun;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.1 / pp.25-38 / 2022
  • With the diversification of payment methods and games, related financial accidents are causing serious problems for users and game companies. Recently, game companies have introduced a Fraud Detection System (FDS) into their game payment systems to prevent financial incidents. However, an FDS requires constant changes to its detection patterns, which limits its effectiveness, and it cannot provide solid evidence for its judgment results. In this paper, we analyze abnormal transactions in the payment log data of real game companies to generate related features. An Autoencoder, an unsupervised learning model, was used to build a model that detects abnormal transactions with over 85% accuracy. Using X-FDS (Explainable FDS) with XAI-SHAP, we found that the variables contributing most to the explanation of anomaly detection were the transaction amount, the transaction medium, and the age of users. Based on X-FDS, an improved detection model with an accuracy of 94% was finally derived by fine-tuning the importance of features that adversely affect the proposed model.
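A common way to turn an autoencoder into an anomaly detector is to threshold its reconstruction error. The sketch below assumes that approach; the abstract does not state the decision rule, so the mean-plus-k-standard-deviations threshold, the function names, and all numbers are hypothetical. The reconstructions are given by hand here in place of a trained model.

```python
from statistics import mean, pstdev


def reconstruction_error(x, x_hat):
    """Mean squared error between a feature vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * std of errors on transactions assumed normal."""
    return mean(normal_errors) + k * pstdev(normal_errors)


def is_anomalous(x, x_hat, threshold):
    """Flag a transaction whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, x_hat) > threshold


# Hypothetical errors a trained autoencoder might produce on normal payment logs.
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.013]
threshold = fit_threshold(normal_errors)

# Hypothetical (input, reconstruction) pairs standing in for model output.
typical = ([0.2, 0.5, 0.1], [0.21, 0.48, 0.12])
suspicious = ([0.9, 0.1, 0.8], [0.3, 0.4, 0.2])
```

A transaction unlike the normal training data is reconstructed poorly, so its error lands well above the threshold.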

Malaria Cell Image Recognition Based On VGG19 Using Transfer Learning (전이 학습을 이용한 VGG19 기반 말라리아셀 이미지 인식)

  • Peng, Xiangshen;Kim, Kangchul
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.17 no.3 / pp.483-490 / 2022
  • Malaria is a disease caused by a parasite and is prevalent all over the world. Malaria cells are usually recognized by examining thick and thin blood smears, but this method requires a lot of manual calculation, so its efficiency and accuracy are very low, and the lack of pathologists in impoverished countries has led to high malaria mortality rates. In this paper, a malaria cell image recognition model using transfer learning is proposed, which consists of a feature extractor, a residual structure, and fully connected layers. When the pre-trained parameters of the VGG-19 model are imported into the proposed model, the parameters of some convolutional layers are frozen and fine-tuning is used to fit the model to the data. We also implement another malaria cell recognition model without the residual structure for comparison. The simulation results show that the model with the residual structure performs better than the one without it, and that the proposed model achieves the best accuracy, 97.33%, compared with other recent papers.

A Lightweight Pedestrian Intrusion Detection and Warning Method for Intelligent Traffic Security

  • Yan, Xinyun;He, Zhengran;Huang, Youxiang;Xu, Xiaohu;Wang, Jie;Zhou, Xiaofeng;Wang, Chishe;Lu, Zhiyi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.12 / pp.3904-3922 / 2022
  • Pedestrian detection is a research hotspot that has found a wide range of applications in computer vision in recent years. However, current pedestrian detection methods suffer from insufficient detection accuracy and from large models that are unsuitable for large-scale deployment. To address these problems, this paper proposes a lightweight pedestrian detection and early warning method based on You Only Look Once (YOLOv5), which uses the advantages of the YOLOv5s model to achieve accurate and fast pedestrian recognition. In addition, this paper optimizes the loss function of the batch normalization (BN) layer. After sparsification, pruning, and fine-tuning, the model is substantially optimized and becomes small enough to be deployed on edge devices with low computing power. Finally, according to the experimental data presented in this paper, when trained on a road pedestrian dataset that we collected and processed independently, the YOLOv5s model has certain advantages in precision and other indicators over the traditional single shot multibox detector (SSD) and fast region-based convolutional neural network (Fast R-CNN) models. After pruning and lightweighting, the size of the trained model is greatly reduced without a significant reduction in accuracy: the final precision reaches 87%, while the model size is reduced to 7,723 KB.
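The sparsify, prune, fine-tune sequence can be sketched for the pruning step alone. The abstract does not state the pruning criterion, so this assumes the widely used rule of removing channels whose BN scale factor has the smallest magnitude, with one global threshold; the layer names and scale values are hypothetical.

```python
def select_channels(bn_gammas, prune_ratio):
    """Return per-layer keep masks after pruning the smallest |gamma| values globally.

    bn_gammas: dict mapping a layer name to its list of BN scale factors.
    prune_ratio: fraction of all channels to remove (0 <= ratio < 1).
    """
    all_abs = sorted(abs(g) for gs in bn_gammas.values() for g in gs)
    num_prune = int(len(all_abs) * prune_ratio)
    # Channels with |gamma| at or below this value are removed.
    threshold = all_abs[num_prune - 1] if num_prune > 0 else -1.0
    masks = {}
    for name, gs in bn_gammas.items():
        mask = [abs(g) > threshold for g in gs]
        if not any(mask):
            # Keep the strongest channel so a layer is never removed entirely.
            best = max(range(len(gs)), key=lambda i: abs(gs[i]))
            mask[best] = True
        masks[name] = mask
    return masks


# Hypothetical scale factors after sparsity training.
gammas = {
    "conv1.bn": [0.90, 0.02, 0.75, 0.01],
    "conv2.bn": [0.03, 0.60, 0.004, 0.80],
}
masks = select_channels(gammas, prune_ratio=0.5)
```

After the masks are applied, the remaining weights would be fine-tuned to recover accuracy, which is the last step the abstract names.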

Bandgap Engineering in CZTSSe Thin Films via Controlling S/(S+Se) Ratio

  • Vijay C. Karade;Jun Sung Jang;Kuldeep Singh Gour;Yeonwoo Park;Hyeonwook Park;Jin Hyeok Kim;Jae Ho Yun
    • Current Photovoltaic Research / v.11 no.3 / pp.67-74 / 2023
  • Earth-abundant element-based Cu2ZnSn(S,Se)4 (CZTSSe) thin film solar cells (TFSCs) have attracted growing attention in the photovoltaic (PV) community owing to the rapid rise of their device power conversion efficiency (PCE) to above 13%. In the present work, we demonstrate fine-tuning of the bandgap in CZTSSe TFSCs by altering the sulfur (S) to selenium (Se) chalcogenide ratio. To achieve this, CZTSSe absorber layers are fabricated with S/(S+Se) ratios from 0.02 to 0.08 of their weight percentage. Their compositional, morphological, and optoelectronic properties are then studied using various characterization techniques. The change in the S/(S+Se) ratio is observed to have minimal impact on the overall Cu/(Zn+Sn) composition ratio. In contrast, the S and Se content within the CZTSSe absorber layer changes with the S/(S+Se) ratio. The ratio also influences the overall absorber quality, which degrades at higher S/(S+Se). Furthermore, the device performance evaluated for similar CZTSSe TFSCs shows a linear increase in the open circuit voltage (Voc) and a linear decrease in the short circuit current density (Jsc) with an increasing S/(S+Se) ratio. The measured external quantum efficiency (EQE) also exhibits a linear blue shift of the absorption edge, with the bandgap increasing from 1.056 eV to 1.228 eV.
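The reported blue shift can be made concrete by converting the two bandgap values into absorption-edge wavelengths with the photon-energy relation λ = hc / E. This is a small worked calculation rather than anything from the paper; the only inputs are the two bandgaps quoted in the abstract and the physical constant hc ≈ 1239.84 eV·nm.

```python
HC_EV_NM = 1239.84198  # Planck constant times speed of light, in eV*nm


def absorption_edge_nm(bandgap_ev):
    """Wavelength (nm) of a photon whose energy equals the bandgap."""
    return HC_EV_NM / bandgap_ev


low_s = absorption_edge_nm(1.056)    # lowest bandgap reported in the abstract
high_s = absorption_edge_nm(1.228)   # highest bandgap reported in the abstract
shift_nm = low_s - high_s            # positive: edge moves to shorter wavelengths
```

The edge moves from roughly 1174 nm to about 1010 nm. A shorter cutoff wavelength means fewer absorbed photons, which is consistent with Jsc falling while Voc rises.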

Study on the Application of Artificial Intelligence Model for CT Quality Control (CT 정도관리를 위한 인공지능 모델 적용에 관한 연구)

  • Ho Seong Hwang;Dong Hyun Kim;Ho Chul Kim
    • Journal of Biomedical Engineering Research / v.44 no.3 / pp.182-189 / 2023
  • CT is a medical device that acquires medical images based on the X-ray attenuation coefficients of human organs. Using this principle, it can also acquire sagittal and coronal planes and 3D images of the human body, which makes CT an essential device for general diagnostic testing. However, the radiation exposure of a CT scan is so high that CT is regulated and managed as special medical equipment, and as such it must undergo quality control. Within quality control, the spatial resolution and contrast resolution tests of existing phantom imaging and the clinical image evaluation are qualitative tests. Because these tests are not objective, they undermine trust in the reliability of CT. We therefore applied artificial intelligence classification models to examine whether the qualitative part of the phantom test can be evaluated quantitatively. We used the following classification models: VGG19, DenseNet201, EfficientNet B2, inception_resnet_v2, ResNet50V2, and Xception. A fine-tuning step was additionally performed during training. As a result, across all classification models, the accuracy for spatial resolution was 0.9562 or higher, the precision was 0.9535, the recall was 1, the loss value was 0.1774, and the training time ranged from 8 minutes 10 seconds to 14 minutes. From these experimental results, we conclude that artificial intelligence models can be applied to CT quality control for spatial resolution and contrast resolution.
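For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, and recall follow from binary confusion-matrix counts. The counts are hypothetical and are not taken from the paper; they are chosen only to show that a recall of 1 means no positive case was missed.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, not taken from the paper.
accuracy, precision, recall = classification_metrics(tp=41, fp=2, fn=0, tn=37)
```

With zero false negatives the recall is exactly 1, while the two false positives pull precision and accuracy slightly below 1.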

Optimizing CNN Structure to Improve Accuracy of Artwork Artist Classification

  • Ji-Seon Park;So-Yeon Kim;Yeo-Chan Yoon;Soo Kyun Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.9 / pp.9-15 / 2023
  • The Metaverse is a new technology that is advancing quickly. This study investigates it from the perspective of computer vision as well as from a general perspective, with a thorough analysis of computer-vision-related Metaverse topics. Its history, method, architecture, benefits, and drawbacks are all covered. The Metaverse's future and the steps that must be taken to adapt to this technology are described. The concepts of Mixed Reality (MR), Augmented Reality (AR), Extended Reality (XR), and Virtual Reality (VR) are briefly discussed, along with the role of computer vision, its applications, advantages and disadvantages, and future research areas.

Evaluating Korean Machine Reading Comprehension Generalization Performance using Cross and Blind Dataset Assessment (기계독해 데이터셋의 교차 평가 및 블라인드 평가를 통한 한국어 기계독해의 일반화 성능 평가)

  • Lim, Joon-Ho;Kim, Hyunki
    • Annual Conference on Human and Language Technology / 2019.10a / pp.213-218 / 2019
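  • Machine reading comprehension (MRC) is the task of finding the answer expressed within a paragraph, given a question and a paragraph written in natural language. Recently, as in other natural language processing tasks, methods that take a pre-trained language model such as BERT, XLNet, or RoBERTa and fine-tune it to predict the answer boundaries for an input question and paragraph have shown excellent performance in MRC; in particular, they achieve an F1 score above 94% when trained and evaluated on the KorQuAD v1.0 dataset. This paper evaluates the generalization ability of current state-of-the-art MRC technology on general question-paragraph pairs rather than on evaluation sets that resemble the training set. To this end, we first performed cross-dataset evaluation using the publicly available Korean KorQuAD v1.0 and NIA v2017 datasets and the Exobrain v2018 dataset built in the Exobrain project. The cross evaluation showed that generalization performance is related to dataset statistics such as answer length and the overlap ratio between question and paragraph. Next, we evaluated on blind evaluation sets an MRC model trained with the KorBERT pre-trained language model and all 210,000 available MRC training examples. The blind evaluation measured the generalization performance of the general-domain model on a legal-domain evaluation set, and its MRC performance on an evaluation set built by writing the questions first and then retrieving the answer paragraphs, rather than writing questions after reading the answer paragraph. In the blind evaluation, the model using a pre-trained language model generalized far better than an MRC model without one, but performance was still below 80% on evaluation sets with long answers and a low lexical overlap ratio between question and paragraph. The experiments confirm that, by the nature of the task, difficulty and generalization performance in MRC vary with the lexical overlap between question and answer and with answer length, and that developing MRC models for general questions and paragraphs requires generalization evaluation on diverse types of evaluation sets.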
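Two quantities in this abstract, the F1 score and the lexical overlap ratio, can be sketched in a few lines. This is an illustration under stated assumptions: it uses SQuAD-style overlap F1 over whitespace tokens, and the official KorQuAD scoring and the paper's exact overlap definition may differ. The function names and example strings are hypothetical.

```python
from collections import Counter


def overlap_f1(prediction, reference):
    """SQuAD-style F1 over whitespace tokens shared by prediction and reference."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def lexical_overlap_ratio(question, paragraph):
    """Fraction of question tokens that also appear in the paragraph."""
    q_tokens = question.split()
    p_tokens = set(paragraph.split())
    if not q_tokens:
        return 0.0
    return sum(t in p_tokens for t in q_tokens) / len(q_tokens)


f1 = overlap_f1("in Seoul in 2019", "Seoul in 2019")
ratio = lexical_overlap_ratio("who won the race", "the race was won by Kim")
```

A low overlap ratio means the question shares few words with the paragraph, which is the condition under which the abstract reports weaker generalization.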

A Survey on Deep Learning-based Pre-Trained Language Models (딥러닝 기반 사전학습 언어모델에 대한 이해와 현황)

  • Sangun Park
    • The Journal of Bigdata / v.7 no.2 / pp.11-29 / 2022
  • Pre-trained language models are the most important and widely used tools in natural language processing tasks. Because they have been pre-trained on large corpora, high performance can be expected even when fine-tuning with a small amount of data. Since the elements needed for implementation, such as a pre-trained tokenizer and a deep learning model with pre-trained weights, are distributed together, the cost and time of natural language processing work have been greatly reduced. Transformer variants are the most representative pre-trained language models offering these advantages, and they are also being actively used in other fields such as computer vision and audio applications. To help researchers understand pre-trained language models and apply them to natural language processing tasks, this paper defines the language model and the pre-trained language model, and discusses the development of pre-trained language models, in particular representative Transformer variants.

Study on the Vulnerabilities of Automatic Speech Recognition Models in Military Environments (군사적 환경에서 음성인식 모델의 취약성에 관한 연구)

  • Elim Won;Seongjung Na;Youngjin Ko
    • Convergence Security Journal / v.24 no.2 / pp.201-207 / 2024
  • Voice is a critical element of human communication, and the development of speech recognition models is one of the significant achievements in artificial intelligence, which has recently been applied in various aspects of human life. The application of speech recognition models in the military field is also inevitable. However, before artificial intelligence models can be applied in the military, their vulnerabilities need to be studied. In this study, we evaluate the military applicability of the multilingual speech recognition model "Whisper" by examining its vulnerabilities to battlefield noise, white noise, and adversarial attacks. In experiments involving battlefield noise, Whisper showed significant performance degradation, with an average Character Error Rate (CER) of 72.4%, indicating difficulties in military applications. In experiments with white noise, Whisper was robust to low-intensity noise but showed performance degradation under high-intensity noise. Adversarial attack experiments revealed vulnerabilities at specific epsilon values. Therefore, the Whisper model requires improvement through fine-tuning, adversarial training, and other methods.
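The Character Error Rate quoted above is conventionally the character-level edit distance between the recognized text and the reference, divided by the reference length. The sketch below assumes that standard definition; the paper's exact text normalization is not given in the abstract, and the example strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = character-level edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcript pair: one substituted character out of nine.
cer = character_error_rate("hold fire", "hold file")
```

Because insertions count as errors, CER can exceed 100% when the hypothesis is much longer than the reference.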