• 제목/요약/키워드: Training Datasets

검색결과 330건 처리시간 0.031초

데이터 세트별 Post-Training을 통한 언어 모델 최적화 연구: 금융 감성 분석을 중심으로 (Optimizing Language Models through Dataset-Specific Post-Training: A Focus on Financial Sentiment Analysis)

  • 정희도;김재헌;장백철
    • 인터넷정보학회논문지
    • /
    • 제25권1호
    • /
    • pp.57-67
    • /
    • 2024
  • 본 연구는 금융 분야에서 중요한 증감 정보를 효과적으로 이해하고 감성을 정확하게 분류하기 위한 언어 모델의 학습 방법론을 탐구한다. 연구의 핵심 목표는 언어 모델이 금융과 관련된 증감 표현을 잘 이해할 수 있게 하기 위한 적절한 데이터 세트를 찾는 것이다. 이를 위해, Wall Street Journal에서 수집한 금융 뉴스 문장 중 증감 관련 단어를 포함하는 문장을 선별했고, 이와 함께 적절한 프롬프트를 사용해 GPT-3.5-turbo-1106으로 생성한 문장을 각각 post-training에 사용했다. Post-training에 사용한 데이터 세트가 언어 모델의 학습에 어떠한 영향을 미치는지 금융 감성 분석 벤치마크 데이터 세트인 Financial PhraseBank를 통해 성능을 비교하며 분석했으며, 그 결과 금융 분야에 특화된 언어 모델인 FinBERT를 추가 학습한 결과가 일반적인 도메인에서 사전 학습된 모델인 BERT를 추가 학습한 것보다 더 높은 성능을 보였다. 또 금융 뉴스로 post-training을 진행한 것이 생성한 문장을 post-training을 진행한 것에 비해 전반적으로 성능이 높음을 보였으나, 일반화가 더욱 요구되는 환경에서는 생성된 문장으로 추가 학습한 모델이 더 높은 성능을 보였다. 이러한 결과는 개선하고자 하는 부분의 도메인이 사용하고자 하는 언어 모델과의 도메인과 일치해야 한다는 것과 적절한 데이터 세트의 선택이 언어 모델의 이해도 및 예측 성능 향상에 중요함을 시사한다. 연구 결과는 특히 금융 분야에서 감성 분석과 관련된 과제를 수행할 때 언어 모델의 성능을 최적화하기 위한 방법론을 제시하며, 향후 금융 분야에서의 더욱 정교한 언어 이해 및 감성분석을 위한 연구 방향을 제시한다. 이러한 연구는 금융 분야 뿐만 아니라 다른 도메인에서의 언어 모델 학습에도 의미 있는 통찰을 제공할 수 있다.

딥러닝 기반 연기추출을 위한 구름 데이터셋의 전이학습에 대한 연구 (A Study on Transferring Cloud Dataset for Smoke Extraction Based on Deep Learning)

  • 김지용;곽태홍;김용일
    • 대한원격탐사학회지
    • /
    • 제38권5_2호
    • /
    • pp.695-706
    • /
    • 2022
  • 중, 고해상도 광학위성은 산불발생지역의 탐지에 대해 그 효용성이 입증되었다. 그러나 산불과 함께 발생하는 연기는 지표에 입사하는 가시광선을 산란시키므로 산불발생지역의 모니터링에 방해가 되며 따라서 연기를 사전에 추출하는 기술이 필요하다. 딥러닝 기술은 연기추출의 정확도를 향상시킬 수 있으나, 학습용 데이터셋의 부족으로 인해 적용에 한계가 있다. 반면에 연기와 유사하게 가시광선을 산란시키는 성질을 지닌 구름은 현재까지 다량의 학습용 데이터셋이 축적되었다. 본 연구는 딥러닝을 활용하여 연기추출을 고도화하는 것이 그 목적이며, 그 과정에서 데이터셋의 부족에 따른 연기추출의 한계점을 구름을 활용한 전이학습으로 해결했다. 전이학습의 효율성 확인을 위해 본 연구에서는 Landsat-8 위성영상을 기반으로 연기추출 학습용 데이터셋을 소규모로 제작한 후, 공공 구름 데이터셋을 활용하여 전이학습을 적용하기 전과 후의 연기추출 성능을 비교하였다. 그 결과 가시광선 파장대역 뿐만이 아니라 근적외선(NIR)과 단파장 적외선(SWIR) 영역에도 전이학습시 성능이 뚜렷하게 향상됨을 확인할 수 있었다. 본 연구결과를 통해서 연기추출의 데이터셋의 부족을 해결할 수 있을 것으로 보이며, 더 나아가 연기추출의 고도화를 통해서 산불발생지역의 모니터링에 이점을 제시할 수 있을 것이다.

Small Sample Face Recognition Algorithm Based on Novel Siamese Network

  • Zhang, Jianming;Jin, Xiaokang;Liu, Yukai;Sangaiah, Arun Kumar;Wang, Jin
    • Journal of Information Processing Systems
    • /
    • 제14권6호
    • /
    • pp.1464-1479
    • /
    • 2018
  • In face recognition, sometimes the number of available training samples for single category is insufficient. Therefore, the performances of models trained by convolutional neural network are not ideal. The small sample face recognition algorithm based on novel Siamese network is proposed in this paper, which doesn't need rich samples for training. The algorithm designs and realizes a new Siamese network model, SiameseFacel, which uses pairs of face images as inputs and maps them to target space so that the $L_2$ norm distance in target space can represent the semantic distance in input space. The mapping is represented by the neural network in supervised learning. Moreover, a more lightweight Siamese network model, SiameseFace2, is designed to reduce the network parameters without losing accuracy. We also present a new method to generate training data and expand the number of training samples for single category in AR and labeled faces in the wild (LFW) datasets, which improves the recognition accuracy of the models. Four loss functions are adopted to carry out experiments on AR and LFW datasets. The results show that the contrastive loss function combined with new Siamese network model in this paper can effectively improve the accuracy of face recognition.

소량 및 불균형 능동소나 데이터세트에 대한 딥러닝 기반 표적식별기의 종합적인 분석 (Comprehensive analysis of deep learning-based target classifiers in small and imbalanced active sonar datasets)

  • 김근환;황용상;신성진;김주호;황수복;추영민
    • 한국음향학회지
    • /
    • 제42권4호
    • /
    • pp.329-344
    • /
    • 2023
  • 본 논문에서는 소량 및 불균형 능동소나 데이터세트에 적용된 다양한 딥러닝 기반 표적식별기의 일반화 성능을 종합적으로 분석하였다. 서로 다른 시간과 해역에서 수집된 능동소나 실험 데이터를 이용하여 두 가지 능동소나 데이터세트를 생성하였다. 데이터세트의 각 샘플은 탐지 처리 이후 탐지된 오디오 신호로부터 추출된 시간-주파수 영역 이미지이다. 표적식별기의 신경망 모델은 다양한 구조를 가지는 22개의 Convolutional Neural Networks(CNN) 모델을 사용하였다. 실험에서 두 가지 데이터세트는 학습/검증 데이터세트와 테스트 데이터세트로 번갈아 가며 사용되었으며, 표적식별기 출력의 변동성을 계산하기 위해 학습/검증/테스트를 10번 반복하고 표적식별 성능을 분석하였다. 이때 학습을 위한 초매개변수는 베이지안 최적화를 이용하여 최적화하였다. 실험 결과 본 논문에서 설계한 얕은 층을 가지는 CNN 모델이 대부분의 깊은 층을 가지는 CNN 모델보다 견실하면서 우수한 일반화 성능을 가지는 것을 확인하였다. 본 논문은 향후 딥러닝 기반 능동소나 표적식별 연구에 대한 방향성을 설정할 때 유용하게 사용될 수 있다.

On the Performance of Cuckoo Search and Bat Algorithms Based Instance Selection Techniques for SVM Speed Optimization with Application to e-Fraud Detection

  • AKINYELU, Andronicus Ayobami;ADEWUMI, Aderemi Oluyinka
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제12권3호
    • /
    • pp.1348-1375
    • /
    • 2018
  • Support Vector Machine (SVM) is a well-known machine learning classification algorithm, which has been widely applied to many data mining problems, with good accuracy. However, SVM classification speed decreases with increase in dataset size. Some applications, like video surveillance and intrusion detection, requires a classifier to be trained very quickly, and on large datasets. Hence, this paper introduces two filter-based instance selection techniques for optimizing SVM training speed. Fast classification is often achieved at the expense of classification accuracy, and some applications, such as phishing and spam email classifiers, are very sensitive to slight drop in classification accuracy. Hence, this paper also introduces two wrapper-based instance selection techniques for improving SVM predictive accuracy and training speed. The wrapper and filter based techniques are inspired by Cuckoo Search Algorithm and Bat Algorithm. The proposed techniques are validated on three popular e-fraud types: credit card fraud, spam email and phishing email. In addition, the proposed techniques are validated on 20 other datasets provided by UCI data repository. Moreover, statistical analysis is performed and experimental results reveals that the filter-based and wrapper-based techniques significantly improved SVM classification speed. Also, results reveal that the wrapper-based techniques improved SVM predictive accuracy in most cases.

Robust URL Phishing Detection Based on Deep Learning

  • Al-Alyan, Abdullah;Al-Ahmadi, Saad
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제14권7호
    • /
    • pp.2752-2768
    • /
    • 2020
  • Phishing websites can have devastating effects on governmental, financial, and social services, as well as on individual privacy. Currently, many phishing detection solutions are evaluated using small datasets and, thus, are prone to sampling issues, such as representing legitimate websites by only high-ranking websites, which could make their evaluation less relevant in practice. Phishing detection solutions which depend only on the URL are attractive, as they can be used in limited systems, such as with firewalls. In this paper, we present a URL-only phishing detection solution based on a convolutional neural network (CNN) model. The proposed CNN takes the URL as the input, rather than using predetermined features such as URL length. For training and evaluation, we have collected over two million URLs in a massive URL phishing detection (MUPD) dataset. We split MUPD into training, validation and testing datasets. The proposed CNN achieves approximately 96% accuracy on the testing dataset; this accuracy is achieved with URL schemes (such as HTTP and HTTPS) removed from the URL. Our proposed solution achieved better accuracy compared to an existing state-of-the-art URL-only model on a published dataset. Finally, the results of our experiment suggest keeping the CNN up-to-date for better results in practice.

An overview of deep learning in the field of dentistry

  • Hwang, Jae-Joon;Jung, Yun-Hoa;Cho, Bong-Hae;Heo, Min-Suk
    • Imaging Science in Dentistry
    • /
    • 제49권1호
    • /
    • pp.1-7
    • /
    • 2019
  • Purpose: Artificial intelligence (AI), represented by deep learning, can be used for real-life problems and is applied across all sectors of society including medical and dental field. The purpose of this study is to review articles about deep learning that were applied to the field of oral and maxillofacial radiology. Materials and Methods: A systematic review was performed using Pubmed, Scopus, and IEEE explore databases to identify articles using deep learning in English literature. The variables from 25 articles included network architecture, number of training data, evaluation result, pros and cons, study object and imaging modality. Results: Convolutional Neural network (CNN) was used as a main network component. The number of published paper and training datasets tended to increase, dealing with various field of dentistry. Conclusion: Dental public datasets need to be constructed and data standardization is necessary for clinical application of deep learning in dental field.

Aerial Dataset Integration For Vehicle Detection Based on YOLOv4

  • Omar, Wael;Oh, Youngon;Chung, Jinwoo;Lee, Impyeong
    • 대한원격탐사학회지
    • /
    • 제37권4호
    • /
    • pp.747-761
    • /
    • 2021
  • With the increasing application of UAVs in intelligent transportation systems, vehicle detection for aerial images has become an essential engineering technology and has academic research significance. In this paper, a vehicle detection method for aerial images based on the YOLOv4 deep learning algorithm is presented. At present, the most known datasets are VOC (The PASCAL Visual Object Classes Challenge), ImageNet, and COCO (Microsoft Common Objects in Context), which comply with the vehicle detection from UAV. An integrated dataset not only reflects its quantity and photo quality but also its diversity which affects the detection accuracy. The method integrates three public aerial image datasets VAID, UAVD, DOTA suitable for YOLOv4. The training model presents good test results especially for small objects, rotating objects, as well as compact and dense objects, and meets the real-time detection requirements. For future work, we will integrate one more aerial image dataset acquired by our lab to increase the number and diversity of training samples, at the same time, while meeting the real-time requirements.

Privacy-Preserving Deep Learning using Collaborative Learning of Neural Network Model

  • Hye-Kyeong Ko
    • International journal of advanced smart convergence
    • /
    • 제12권2호
    • /
    • pp.56-66
    • /
    • 2023
  • The goal of deep learning is to extract complex features from multidimensional data use the features to create models that connect input and output. Deep learning is a process of learning nonlinear features and functions from complex data, and the user data that is employed to train deep learning models has become the focus of privacy concerns. Companies that collect user's sensitive personal information, such as users' images and voices, own this data for indefinite period of times. Users cannot delete their personal information, and they cannot limit the purposes for which the data is used. The study has designed a deep learning method that employs privacy protection technology that uses distributed collaborative learning so that multiple participants can use neural network models collaboratively without sharing the input datasets. To prevent direct leaks of personal information, participants are not shown the training datasets during the model training process, unlike traditional deep learning so that the personal information in the data can be protected. The study used a method that can selectively share subsets via an optimization algorithm that is based on modified distributed stochastic gradient descent, and the result showed that it was possible to learn with improved learning accuracy while protecting personal information.

Video augmentation technique for human action recognition using genetic algorithm

  • Nida, Nudrat;Yousaf, Muhammad Haroon;Irtaza, Aun;Velastin, Sergio A.
    • ETRI Journal
    • /
    • 제44권2호
    • /
    • pp.327-338
    • /
    • 2022
  • Classification models for human action recognition require robust features and large training sets for good generalization. However, data augmentation methods are employed for imbalanced training sets to achieve higher accuracy. These samples generated using data augmentation only reflect existing samples within the training set, their feature representations are less diverse and hence, contribute to less precise classification. This paper presents new data augmentation and action representation approaches to grow training sets. The proposed approach is based on two fundamental concepts: virtual video generation for augmentation and representation of the action videos through robust features. Virtual videos are generated from the motion history templates of action videos, which are convolved using a convolutional neural network, to generate deep features. Furthermore, by observing an objective function of the genetic algorithm, the spatiotemporal features of different samples are combined, to generate the representations of the virtual videos and then classified through an extreme learning machine classifier on MuHAVi-Uncut, iXMAS, and IAVID-1 datasets.