• Title/Summary/Keyword: AI Training Data


KB-BERT: Training and Application of Korean Pre-trained Language Model in Financial Domain (KB-BERT: 금융 특화 한국어 사전학습 언어모델과 그 응용)

  • Kim, Donggyu;Lee, Dongwook;Park, Jangwon;Oh, Sungwoo;Kwon, Sungjun;Lee, Inyong;Choi, Dongwon
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.191-206 / 2022
  • Recently, it has become the de facto approach to utilize a pre-trained language model (PLM) to achieve state-of-the-art performance on various natural language tasks (called downstream tasks), such as sentiment analysis and question answering. However, like any other machine learning method, a PLM tends to depend on the data distribution seen during the training phase and shows worse performance on unseen (out-of-distribution) domains. For this reason, there have been many efforts to develop domain-specific PLMs for fields such as the medical and legal industries. In this paper, we discuss the training of a finance-specific PLM for the Korean language and its applications. Our finance-specific PLM, KB-BERT, is trained on a carefully curated financial corpus that includes domain-specific documents such as financial reports. We provide extensive performance evaluation results on three natural language tasks: topic classification, sentiment analysis, and question answering. Compared to state-of-the-art Korean PLMs such as KoELECTRA and KLUE-RoBERTa, KB-BERT shows comparable performance on general datasets based on common corpora like Wikipedia and news articles. Moreover, KB-BERT outperforms the compared models on finance-domain datasets that require finance-specific knowledge to solve the given problems.
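The masked-language-model (MLM) objective behind BERT-style pretraining like KB-BERT's can be sketched in a few lines. This is an illustrative toy, not KB-BERT's actual pipeline; the `mask_for_mlm` helper and the sample sentence are hypothetical:

```python
import random

def mask_for_mlm(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Randomly mask tokens for the masked-language-model (MLM) objective.
    Returns (masked_tokens, labels); labels holds the original token at
    masked positions and None elsewhere."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            masked.append(mask_token)
            labels.append(tok)      # the model must reconstruct this token
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels

# A hypothetical finance-flavoured sentence, not from the KB-BERT corpus.
corpus = "the company reported quarterly earnings above analyst expectations".split()
masked, labels = mask_for_mlm(corpus, mask_rate=0.3)
```

Domain specialization then comes from the corpus fed into this objective, not from a change to the objective itself.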

Application of adaptive neuro-fuzzy system in prediction of nanoscale and grain size effects on formability

  • Nan Yang;Meldi Suhatril;Khidhair Jasim Mohammed;H. Elhosiny Ali
    • Advances in nano research / v.14 no.2 / pp.155-164 / 2023
  • Grain size in sheet metals is one of the main parameters determining formability. Grain size control in industry requires delicate process control and equipment. In the present study, the effects of grain size on the formability of steel sheets are investigated. Experimental investigation of the effect of grain size is a cumbersome method which, due to the existence of many other influential parameters, is not conclusive in some cases. On the other hand, since the average grain size of a crystalline material is a statistical parameter, traditional methods are not sufficient to find the optimum grain size that maximizes formability. Therefore, design of experiments (DoE) and artificial intelligence (AI) methods are coupled in this study to find the optimum conditions for formability in terms of grain size and to predict the forming limits of sheet metals under bi-stretch loading conditions. In this regard, a set of experiments is conducted to provide initial data for training and testing the DoE and AI models. Afterwards, the optimum grain size is calculated using the response surface method (RSM). Moreover, the trained neural network is used to predict formability at the calculated optimum condition, and the results are compared to the experimental results. The findings of the present study show that DoE and AI can be a great aid in the design, determination, and prediction of the optimum grain size for maximizing sheet formability.
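The response-surface step can be illustrated with a one-variable quadratic fit. The grain-size and formability values below are invented for illustration and are not the paper's data:

```python
import numpy as np

# Hypothetical measurements: formability (forming-limit strain) at several
# average grain sizes in micrometres. Illustrative values only.
grain_size = np.array([5.0, 10.0, 15.0, 20.0, 25.0, 30.0])
formability = np.array([0.21, 0.30, 0.34, 0.33, 0.27, 0.18])

# Second-order response surface f(g) = a*g**2 + b*g + c, as RSM would fit.
a, b, c = np.polyfit(grain_size, formability, deg=2)

# For a concave fit (a < 0) the optimum sits at the stationary point.
g_opt = -b / (2.0 * a)
```

In the paper's multi-parameter setting the surface has several input variables, but the optimum is found from the fitted polynomial's stationary point in the same way.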

Improvement of PM Forecasting Performance by Outlier Data Removing (Outlier 데이터 제거를 통한 미세먼지 예보성능의 향상)

  • Jeon, Young Tae;Yu, Suk Hyun;Kwon, Hee Yong
    • Journal of Korea Multimedia Society / v.23 no.6 / pp.747-755 / 2020
  • In this paper, we deal with the outlier data problems that occur when constructing a PM2.5 fine dust forecasting system using a neural network. In general, when training a neural network, some of the data are not helpful for learning but rather disturb it; these are called outlier data. When they are included in the training data, various problems such as overfitting occur. In building a PM2.5 fine dust concentration forecasting system using a neural network, we found several outlier data points in the training data. We therefore remove them and then train the network in three ways. The Over_outlier model removes outlier data for which the target concentration is low but the model forecast is high. The Under_outlier model removes outlier data for which the target concentration is high but the model forecast is low. The All_outlier model removes both Over_outlier and Under_outlier data. We compare the three models with a conventional outlier-removal model and a non-removal model. Our outlier-removal models show better performance than the others.
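The three removal schemes can be sketched as a simple filter on the sign of the forecast error. The `split_outliers` helper and its threshold are hypothetical stand-ins for the paper's criteria:

```python
def split_outliers(samples, threshold):
    """Sort training samples by forecast-error sign, mirroring the three
    removal schemes above. Each sample is a (target, forecast) pair; the
    threshold is a hypothetical cut-off, not the paper's value."""
    over, under, kept = [], [], []
    for target, forecast in samples:
        if forecast - target > threshold:    # forecast far too high
            over.append((target, forecast))
        elif target - forecast > threshold:  # forecast far too low
            under.append((target, forecast))
        else:
            kept.append((target, forecast))
    return over, under, kept

# Toy PM2.5 concentrations as (target, forecast) pairs.
data = [(10, 55), (80, 20), (30, 33), (25, 24)]
over, under, kept = split_outliers(data, threshold=20)
```

The Over_outlier model would retrain on `under + kept`, the Under_outlier model on `over + kept`, and the All_outlier model on `kept` alone.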

Deep Learning Model Validation Method Based on Image Data Feature Coverage (영상 데이터 특징 커버리지 기반 딥러닝 모델 검증 기법)

  • Lim, Chang-Nam;Park, Ye-Seul;Lee, Jung-Won
    • KIPS Transactions on Software and Data Engineering / v.10 no.9 / pp.375-384 / 2021
  • Deep learning techniques have been proven to achieve high performance in image processing and are applied in various fields. The most widely used methods for validating a deep learning model include the holdout method, k-fold cross-validation, and the bootstrap method. These legacy methods consider the balance of the ratio between classes when dividing the data set, but do not consider the ratio of the various features that exist within the same class. If these features are not considered, verification results may be biased toward some features. Therefore, we propose a deep learning model validation method based on data feature coverage for image classification that improves on the legacy methods. The proposed technique defines a data feature coverage measure that numerically quantifies how well the training and evaluation data sets reflect the features of the entire data set. With this method, the data set can be divided while ensuring coverage of all features of the entire data set, and the evaluation results of the model can be analyzed in units of feature clusters. As a result, by providing feature cluster information for the evaluation results of the trained model, feature information of the data that affects the trained model can be provided.
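A simplified stand-in for the coverage-aware split might group samples by feature cluster and draw from every cluster, so that each cluster is represented in both subsets. The `coverage_split` helper and cluster assignments below are illustrative, not the paper's algorithm:

```python
from collections import defaultdict

def coverage_split(samples, cluster_ids, eval_fraction=0.25):
    """Split a data set so every feature cluster appears in both subsets
    (a simplified stand-in for the paper's coverage criterion; assumes
    each cluster has at least two members)."""
    by_cluster = defaultdict(list)
    for sample, cid in zip(samples, cluster_ids):
        by_cluster[cid].append(sample)
    train, evaluation = [], []
    for members in by_cluster.values():
        n_eval = max(1, int(len(members) * eval_fraction))
        evaluation.extend(members[:n_eval])   # cluster represented in eval
        train.extend(members[n_eval:])        # and in train
    return train, evaluation

# Twelve images in three hypothetical feature clusters.
samples = [f"img{i}" for i in range(12)]
clusters = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
train, evaluation = coverage_split(samples, clusters)
```

Contrast this with a plain holdout split, which could by chance place an entire feature cluster in only one of the two subsets.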

Deep Learning Model Parallelism (딥러닝 모델 병렬 처리)

  • Park, Y.M.;Ahn, S.Y.;Lim, E.J.;Choi, Y.S.;Woo, Y.C.;Choi, W.
    • Electronics and Telecommunications Trends / v.33 no.4 / pp.1-13 / 2018
  • Deep learning (DL) models have been widely applied to AI applications such as image recognition and language translation with big data. Recently, DL models have become larger and more complicated, and have been merged together. For the accelerated training of a large-scale deep learning model, model parallelism, which partitions the model parameters for non-shared parallel access and updates across multiple machines, is provided by a few distributed deep learning frameworks. As a training acceleration method, however, model parallelism is not as commonly used as data parallelism owing to the difficulty of implementing it efficiently. This paper provides a comprehensive survey of the state of the art in model parallelism by comparing the implementation technologies in several deep learning frameworks that support it, and suggests future research directions for improving model parallelism technology.
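The core idea of model parallelism, partitioning a model's layers across devices and handing activations between partitions, can be simulated without any GPU framework. This toy uses multiplicative functions as stand-in layers; real frameworks transfer tensors between GPUs at the partition boundaries:

```python
# Toy illustration of model parallelism: the model's layers are split into
# contiguous blocks, each block assigned to its own (simulated) device, and
# the forward pass hands the activation from one block to the next.

def make_layer(weight):
    return lambda x: x * weight          # stand-in for a real layer

layers = [make_layer(w) for w in (2, 3, 5, 7)]

def partition(layer_list, n_devices):
    """Assign contiguous layer blocks to devices (naive even split)."""
    per = (len(layer_list) + n_devices - 1) // n_devices
    return [layer_list[i:i + per] for i in range(0, len(layer_list), per)]

def forward(partitions, x):
    for block in partitions:             # each block runs on its own device
        for layer in block:
            x = layer(x)                 # activation crosses a device boundary
    return x

parts = partition(layers, n_devices=2)
y = forward(parts, 1)                    # 1 * 2 * 3 * 5 * 7 = 210
```

The difficulty the survey describes lies in choosing the partition so that the devices are kept busy; the naive even split shown here leaves each device idle while the others run, which is why pipelining schemes exist.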

Design of e-commerce business model through AI price prediction of agricultural products (농산물 AI 가격 예측을 통한 전자거래 비즈니스 모델 설계)

  • Han, Nam-Gyu;Kim, Bong-Hyun
    • Journal of the Korea Convergence Society / v.12 no.12 / pp.83-91 / 2021
  • For agricultural products, supply is irregular due to changes in meteorological conditions, and prices are highly elastic; for example, if the supply decreases by 10%, the price increases by 50%. Due to these fluctuations in agricultural product prices, the Korean government guarantees price stability to producers through small merchants' auctions. However, when prices plummet due to overproduction, protection measures for producers are insufficient. Therefore, in this paper, we design a business model that can be used in an electronic transaction system by predicting the price of agricultural products with artificial intelligence algorithms. To this end, a predictive model was designed by training on pattern pairs and applying ARIMA, SARIMA, RNN, and CNN. Finally, the agricultural product price forecasts were classified into short-term and medium-term forecasts and verified. As a result of the verification, based on 2018 data, the actual and predicted prices showed an accuracy of 91.08%.
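As a minimal sketch of the time-series side, a least-squares AR(1) fit can stand in for the paper's far richer ARIMA/SARIMA/RNN/CNN models. The prices below are hypothetical:

```python
import numpy as np

# Hypothetical daily prices (KRW). An AR(1) model y_t = a * y_{t-1} + b is
# fitted by least squares; a deliberately simplified stand-in for ARIMA.
prices = np.array([1000., 1050., 1100., 1080., 1150., 1200., 1180., 1250.])

x, y = prices[:-1], prices[1:]
A = np.vstack([x, np.ones_like(x)]).T
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

next_price = a * prices[-1] + b    # one-step-ahead forecast
```

A full ARIMA adds differencing and moving-average terms, and SARIMA adds seasonality, which matters for crops with harvest cycles; the one-step structure of the forecast is the same.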

Model Type Inference Attack Using Output of Black-Box AI Model (블랙 박스 모델의 출력값을 이용한 AI 모델 종류 추론 공격)

  • An, Yoonsoo;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.817-826 / 2022
  • AI technology is being successfully introduced in many fields, and models deployed as a service are operated in a black-box environment that does not expose the model's information, in order to protect intellectual property rights and data. In a black-box environment, attackers try to steal the data or parameters used during training by using the model's output. Based on the fact that no existing attack infers information about the type of a deep learning model, this paper proposes a method of inferring the model type in order to directly find out the composition of the target model's layers. Using ResNet, VGGNet, AlexNet, and a simple convolutional neural network trained on the MNIST dataset, we show that the model type can be inferred from the output values in both the gray-box and black-box environments of each model. In addition, when the large-small relationship feature proposed in this paper is trained together, the model type is inferred with approximately 83% accuracy in the black-box environment; the results show that the model type can be inferred even in situations where attackers are given only partial information rather than raw probability vectors.
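A toy version of the attack idea, inferring a model "type" from statistics of its output probability vectors, can be simulated with synthetic softmax outputs. The assumption that architectures differ in output sharpness is purely illustrative, not the paper's actual feature set:

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_outputs(sharpness, n=50, classes=10):
    """Hypothetical softmax outputs: this sketch assumes different model
    families produce differently peaked probability distributions."""
    logits = rng.normal(size=(n, classes)) * sharpness
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Two simulated "model types" that differ only in output confidence.
train_a, train_b = fake_outputs(5.0), fake_outputs(0.5)
centroid_a = train_a.max(axis=1).mean()   # average top-1 confidence, type A
centroid_b = train_b.max(axis=1).mean()   # average top-1 confidence, type B

def infer_type(prob_vector):
    """Guess the model type whose confidence centroid is nearest."""
    conf = prob_vector.max()
    return "A" if abs(conf - centroid_a) < abs(conf - centroid_b) else "B"

# Query outputs actually drawn from type A; most guesses should be "A".
guesses = [infer_type(q) for q in fake_outputs(5.0, n=20)]
```

The paper's attack trains a classifier over richer output features (including the proposed large-small relationship feature); the point of the sketch is only that output distributions can leak architectural information.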

Development of the Artificial Intelligence Literacy Education Program for Preservice Secondary Teachers (예비 중등교사를 위한 인공지능 리터러시 교육 프로그램 개발)

  • Bong Seok Jang
    • Journal of Practical Engineering Education / v.16 no.1_spc / pp.65-70 / 2024
  • As interest in AI education grows, researchers have made efforts to implement AI education programs. However, research targeting preservice teachers has been limited thus far. Therefore, this study was conducted to develop an AI literacy education program for preservice secondary teachers. The research results revealed that the weekly topics included the definition and applications of AI, analysis of intelligent agents, the importance of data, understanding machine learning, hands-on exercises on prediction and classification, hands-on exercises on clustering and classification, hands-on exercises on unstructured data, understanding deep learning, application of deep learning algorithms, fairness, transparency, accountability, safety, and social integration. Through this research, it is hoped that AI literacy education programs for preservice teachers will be expanded. In the future, follow-up studies are anticipated that implement the relevant education in teacher training institutions and analyze its effectiveness.

An Improvement Of Efficiency For kNN By Using A Heuristic (휴리스틱을 이용한 kNN의 효율성 개선)

  • Lee, Jae-Moon
    • The KIPS Transactions:PartB / v.10B no.6 / pp.719-724 / 2003
  • This paper proposes a heuristic to enhance the speed of kNN without loss of accuracy. The proposed heuristic minimizes the computation of the similarity between two documents, which is the dominant cost in kNN. To do this, the paper proposes a method to calculate an upper limit on the similarity and to sort the training documents accordingly. The proposed heuristic was implemented on an existing text categorization framework, AI::Categorizer, and compared with conventional kNN on the well-known Reuters-21578 data set. The comparisons show that the proposed heuristic outperforms conventional kNN by about 30∼40% in execution time.
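One admissible upper limit on a dot-product similarity is the Cauchy-Schwarz bound, which lets norm-sorted training documents be pruned without changing the result. This is a simplified stand-in for the paper's heuristic, not its exact formulation:

```python
import numpy as np

def knn_with_norm_bound(query, docs, k):
    """kNN by dot-product similarity with a Cauchy-Schwarz upper limit:
    dot(q, d) <= |q| * |d|. Documents are visited in descending-norm order,
    so once the bound drops below the current k-th best score, every
    remaining document can be skipped."""
    q_norm = np.linalg.norm(query)
    norms = np.linalg.norm(docs, axis=1)
    order = np.argsort(norms)[::-1]           # largest norm first
    best = []                                 # (score, index), best first
    for i in order:
        if len(best) == k and q_norm * norms[i] <= best[-1][0]:
            break                             # no remaining doc can beat the top-k
        best.append((float(query @ docs[i]), int(i)))
        best.sort(reverse=True)
        best = best[:k]
    return [i for _, i in best]

rng = np.random.default_rng(1)
docs = rng.random((200, 30))                  # toy document vectors
query = rng.random(30)
pruned = knn_with_norm_bound(query, docs, k=3)
exact = [int(i) for i in np.argsort(docs @ query)[::-1][:3]]
```

Because the bound is never smaller than the true similarity and the documents are visited in bound order, the pruned search returns exactly the brute-force top-k while skipping most full similarity computations.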

Enhanced ACGAN based on Progressive Step Training and Weight Transfer

  • Jinmo Byeon;Inshil Doh;Dana Yang
    • Journal of the Korea Society of Computer and Information / v.29 no.3 / pp.11-20 / 2024
  • Among generative models in artificial intelligence (AI), the Generative Adversarial Network (GAN) in particular has been successful in various applications such as image processing, density estimation, and style transfer. While GAN models, including Conditional GAN (CGAN), CycleGAN, and BigGAN, have been extended and improved, researchers face challenges in real-world applications in specific domains such as disaster simulation, healthcare, and urban planning, due to data scarcity and unstable learning that causes image distortion. This paper proposes a new progressive learning methodology called Progressive Step Training (PST), based on the Auxiliary Classifier GAN (ACGAN), which discriminates class labels, leveraging the progressive learning approach of the Progressive Growing of GANs (PGGAN). Compared to conventional methods, the PST model achieves 70.82% faster stabilization, a 51.3% lower standard deviation, stable convergence of loss values in the later high-resolution stages, and 94.6% faster loss reduction.
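The fade-in transition that progressive growing (and hence PST-style stepwise training) relies on can be shown in a few lines: the newly added high-resolution branch is blended with the upsampled low-resolution output while a coefficient alpha ramps from 0 to 1. A minimal sketch, not the paper's implementation:

```python
import numpy as np

def upsample2x(img):
    """Nearest-neighbour 2x upsampling."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def fade_in(low_res_out, high_res_out, alpha):
    """Progressive-growing transition: blend the upsampled low-resolution
    branch with the new high-resolution branch while alpha ramps 0 -> 1
    over a training-step schedule."""
    return (1.0 - alpha) * upsample2x(low_res_out) + alpha * high_res_out

low = np.zeros((4, 4))    # output of the already-trained 4x4 stage
high = np.ones((8, 8))    # output of the newly added 8x8 stage
mid = fade_in(low, high, alpha=0.25)
```

Ramping alpha gradually is what keeps training stable when a new resolution stage is attached, instead of shocking the discriminator with an untrained branch.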