• Title/Summary/Keyword: Artificial-data-generation

TAGS: Text Augmentation with Generation and Selection (생성-선정을 통한 텍스트 증강 프레임워크)

  • Kim Kyung Min;Dong Hwan Kim;Seongung Jo;Heung-Seon Oh;Myeong-Ha Hwang
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.10
    • /
    • pp.455-460
    • /
    • 2023
  • Text augmentation is a methodology that creates new augmented texts by transforming or generating original texts in order to improve the performance of NLP models. However, existing text augmentation techniques have limitations such as a lack of expressive diversity, semantic distortion, and a limited number of augmented texts. Recently, text augmentation using large language models and few-shot learning can overcome these limitations, but it also carries a risk of noise caused by incorrect generation. In this paper, we propose a text augmentation method called TAGS that generates multiple candidate texts and selects the appropriate ones as the augmented texts. TAGS generates varied expressions using few-shot learning, and it selects suitable data effectively even from a small amount of original text by using contrastive learning and similarity comparison. We applied this method to task-oriented chatbot data and achieved a more than sixty-fold quantitative improvement. We also analyzed the generated texts and confirmed that they were semantically and expressively diverse compared to the original texts. Moreover, we trained and evaluated a classification model using the augmented texts and showed that performance improved by more than 0.1915, confirming that the method helps to improve actual model performance.
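
The generate-then-select idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for the contrastively trained encoder the abstract mentions, and the `low`/`high` thresholds, the function names, and the sample sentences are all hypothetical. A candidate is kept only if it is similar enough to some original text to stay on topic, yet not so similar that it is a near duplicate.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a learned sentence encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_candidates(originals, candidates, low=0.3, high=0.95):
    """Keep candidates whose best similarity to any original lies in [low, high].

    The thresholds are hypothetical: below `low` a candidate is treated as
    off-topic noise, above `high` as a near duplicate of an original text.
    """
    original_vectors = [embed(text) for text in originals]
    kept = []
    for candidate in candidates:
        vector = embed(candidate)
        best = max(cosine(vector, original) for original in original_vectors)
        if low <= best <= high:
            kept.append(candidate)
    return kept


originals = ["book a table for two tonight"]
candidates = [
    "book a table for two tonight",        # exact duplicate of the original
    "please book a table for two people",  # paraphrase
    "what is the weather in Seoul",        # unrelated to the original
]
selected = select_candidates(originals, candidates)
print(selected)
```

With these sample sentences only the paraphrase survives the filter; in a real pipeline the candidates would come from a few-shot prompted language model rather than a hand-written list.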

A Data Mining Approach to Intelligent College Road Map Advice Service (데이터 마이닝을 이용한 지능형 전공지도시스템 연구)

  • Choe, Deok-Won;Jo, Gyeong-Pil;Sin, Jin-Gyu
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2005.05a
    • /
    • pp.266-273
    • /
    • 2005
  • Data mining techniques enable us to generate useful information for decision support from the data that are generated and accumulated in the course of routine organizational management activities. A college administration system is a typical example: it produces a warehouse of student records as each student enters a college and undertakes curricular and extracurricular activities. So far, these data have been used only for very limited student service purposes, such as issuing transcripts, graduation evaluation, and GPA calculation. In this paper, we use Holland career search test results, TOEIC scores, course work lists, and GPA scores as the input for data mining and for generating student advisory information. Factor analysis, AHP (Analytic Hierarchy Process), artificial neural network, and CART (Classification And Regression Tree) techniques are deployed in the data mining process. Since these data mining techniques are very powerful in processing large-scale student databases and discovering useful knowledge and information in them, we can expect highly sophisticated student advisory knowledge and services that may not be obtainable from human student advisors.

Automated Verification of Livestock Manure Transfer Management System Handover Document using Gradient Boosting (Gradient Boosting을 이용한 가축분뇨 인계관리시스템 인계서 자동 검증)

  • Jonghwi Hwang;Hwakyung Kim;Jaehak Ryu;Taeho Kim;Yongtae Shin
    • Journal of Information Technology Services
    • /
    • v.22 no.4
    • /
    • pp.97-110
    • /
    • 2023
  • In this study, we propose a technique to automatically generate transfer documents using sensor data from livestock manure transfer systems. The research involves analyzing sensor data and applying machine learning techniques to derive optimized outcomes for livestock manure transfer documents. By comparing and contrasting with existing documents, we present a method for automatic document generation. Specifically, we propose the utilization of Gradient Boosting, a machine learning algorithm. The objective of this research is to enhance the efficiency of livestock manure and liquid byproduct management. Currently, stakeholders including producers, transporters, and processors manually input data into the livestock manure transfer management system during the disposal of manure and liquid byproducts. This manual process consumes additional labor, leads to data inconsistency, and complicates the management of distribution and treatment. Therefore, the aim of this study is to leverage data to automatically generate transfer documents, thereby increasing the efficiency of livestock manure and liquid byproduct management. By utilizing sensor data from livestock manure and liquid byproduct transport vehicles and employing machine learning algorithms, we establish a system that automates the validation of transfer documents, reducing the burden on producers, transporters, and processors. This efficient management system is anticipated to create a transparent environment for the distribution and treatment of livestock manure and liquid byproducts.

Radiation Prediction Based on Multi Deep Learning Model Using Weather Data and Weather Satellites Image (기상 데이터와 기상 위성 영상을 이용한 다중 딥러닝 모델 기반 일사량 예측)

  • Jae-Jung Kim;Yong-Hun You;Chang-Bok Kim
    • Journal of Advanced Navigation Technology
    • /
    • v.25 no.6
    • /
    • pp.569-575
    • /
    • 2021
  • The prediction performance of deep learning varies with data quality and model. This study uses various input data and multiple deep learning models to build an optimal deep learning model for predicting solar radiation, which has the most influence on power generation prediction. As input data, weather data from the Korea Meteorological Administration and weather satellite images, obtained by segmenting images from the Korea Meteorological Agency, were used. The models were comparatively evaluated, and solar radiation was predicted by constructing multiple deep learning models that connect the models with the best error rate in each model. As an experimental result, the RMSE of model A, which is a multiple deep learning model, was 0.0637, the RMSE of model B was 0.07062, and the RMSE of model C was 0.06052, so the error rates of model A and model C were better than that of a single model. In this study, the models that connected two or more models showed improved prediction rates and stable learning results in the experiments.
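
For readers comparing the RMSE figures above, the metric itself can be sketched in a few lines of Python. This is only the standard root-mean-square-error formula; the observed and predicted values are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical normalized solar radiation values, for illustration only.
observed = [0.20, 0.50, 0.80]
predicted = [0.25, 0.45, 0.90]
print(round(rmse(observed, predicted), 4))  # 0.0707
```

A lower RMSE means the predictions sit closer to the observations on average, which is the sense in which the abstract ranks models A, B, and C.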

A Predictive Model of the Generator Output Based on the Learning of Performance Data in Power Plant (발전플랜트 성능데이터 학습에 의한 발전기 출력 추정 모델)

  • Yang, HacJin;Kim, Seong Kun
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.12
    • /
    • pp.8753-8759
    • /
    • 2015
  • Established analysis procedures and validated performance measurements are required to maintain stable management of generator output in the turbine power generation cycle. We developed a turbine expansion model and a measurement validation model for calculating generator performance from turbine output, based on the ASME (American Society of Mechanical Engineers) PTC (Performance Test Code). We also developed a verification model for uncertain measurement data related to the turbine and generator output. Although the models in previous research were developed using artificial neural networks and kernel regression, the verification model in this paper is based on a Support Vector Machine (SVM) model to overcome the problem of unmeasured data. Procedures for selecting the related variables and the data window for verification learning were also developed. The model proved suitable for the estimation process, as the learning error was in the range of about 1%. The learning model can provide validated estimations for corrective performance analysis of turbine cycle output by predicting lost measurement data.

A Study on NOx Emission Control Methods in the Cement Firing Process Using Data Mining Techniques (데이터 마이닝을 이용한 시멘트 소성공정 질소산화물(NOx)배출 관리 방법에 관한 연구)

  • Park, Chul Hong;Kim, Yong Soo
    • Journal of Korean Society for Quality Management
    • /
    • v.46 no.3
    • /
    • pp.739-752
    • /
    • 2018
  • Purpose: The purpose of this study was to investigate the relationship between kiln processing parameters and NOx emissions that occur in the sintering and calcination steps of the cement manufacturing process and to derive the main factors responsible for producing emissions outside emission limit criteria, as determined by category models and classification rules, using data mining techniques. The results from this study are expected to be useful as guidelines for NOx emission control standards. Methods: Data were collected from Precalciner Kiln No.3 used in one of the domestic cement plants in Korea. Thirty-four independent variables affecting NOx generation and dependent variables that exceeded or were below the NOx emission limit (>1 and <0, respectively) were examined during kiln processing. These data were used to construct a detection model of NOx emission, in which emissions exceeded or were below the set limits. The model was validated using SPSS MODELER 18.0 with artificial neural network, decision tree (C5.0), and logistic regression analysis data mining techniques. Results: The decision tree (C5.0) algorithm best represented NOx emission behavior and was used to identify 10 processing variables that resulted in NOx emissions outside limit criteria. Conclusion: The results of this study indicate that the decision tree (C5.0) can be applied for real-time monitoring and management of NOx emissions during the cement firing process to satisfy NOx emission control standards and to provide for a more eco-friendly cement product.

Orbital Lifetime Analysis of Space Objects (우주물체 궤도수명 분석)

  • Seong, Jae-Dong;Kim, Hae-Dong
    • Aerospace Engineering and Technology
    • /
    • v.13 no.1
    • /
    • pp.184-192
    • /
    • 2014
  • In this paper, the lifetime of artificial space objects in LEO is analysed using TLE data provided by JSpOC. We observed the change in the number of space objects since 1957 and determined the causes of space debris generation. We then analysed the present condition of the space debris environment. The lifetime analysis covers a total of 11,792 artificial space objects and is carried out to the year 2050 by orbit propagation. We analyse the annual reentry frequency of high-RCS objects such as nonoperational satellites and rocket bodies, which may impact the ground, using the STK/Lifetime Tool for accurate and effective calculation. The results show that 9 payloads or rocket bodies will decay annually and that 2 or 3 of them may impact the ground. In addition, 40% of all analysed objects are shown to have a lifetime of over 200 years.

Design and Implementation of RSSI-based Intelligent Location Estimation System (RSSI기반 지능형 위치 추정 시스템 설계 및 구현)

  • Lim, Chang Gyoon;Kang, O Seong Andrew;Lee, Chang Young;Kim, Kang Chul
    • Journal of Internet Computing and Services
    • /
    • v.14 no.6
    • /
    • pp.9-18
    • /
    • 2013
  • In this paper, we design and implement an intelligent system with which a mobile robot can find objects carrying an RFID (Radio Frequency IDentification) tag. The system we developed is an artificial neural network learning system that uses the RSSI (Received Signal Strength Indicator) value as input and the absolute coordinate value as target. Although passive RFID is commonly used for location estimation, we consider active RFID in order to extend the recognition distance. We design the proposed system and construct the environment for indoor location estimation. The designed system is implemented in software, and the learning results are shown on a test bed. We present various experimental results, from learning data generation to real-time location estimation, in an environment similar to a real one. The accuracy of location estimation is verified by simulating the proposed method with an allowable error. We prepare a local test bed for indoor experiments and build a mobile robot that can find the objects the user wants.

Enhanced ACGAN based on Progressive Step Training and Weight Transfer

  • Jinmo Byeon;Inshil Doh;Dana Yang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.3
    • /
    • pp.11-20
    • /
    • 2024
  • Among the generative models in Artificial Intelligence (AI), the Generative Adversarial Network (GAN) in particular has been successful in various applications such as image processing, density estimation, and style transfer. While GAN models, including Conditional GAN (CGAN), CycleGAN, and BigGAN, have been extended and improved, researchers face challenges in real-world applications in specific domains such as disaster simulation, healthcare, and urban planning, due to data scarcity and unstable learning that causes image distortion. This paper proposes a new progressive learning methodology called Progressive Step Training (PST) based on the Auxiliary Classifier GAN (ACGAN) that discriminates class labels, leveraging the progressive learning approach of the Progressive Growing of GAN (PGGAN). The PST model achieves 70.82% faster stabilization, 51.3% lower standard deviation, stable convergence of loss values in the later high-resolution stages, and a 94.6% faster loss reduction compared to conventional methods.

Research on Process Technology of Molded Bridge Die on Substrate (MBoS) for Advanced Package (Advanced Package용 Molded Bridge Die on Substrate(MBoS) 공정 기술 연구)

  • Jaeyoung Jeon;Donggyu Kim;Wonseok Choi;Yonggyu Jang;Sanggyu Jang;Yong-Nam Koh
    • Journal of the Microelectronics and Packaging Society
    • /
    • v.31 no.2
    • /
    • pp.16-22
    • /
    • 2024
  • With advances in artificial intelligence (AI) technology, demand for high-end semiconductors is increasing in various places such as data centers. To improve the performance of semiconductors, the pitch of patterns must be reduced and the density of I/Os increased. For this issue, 2.5-dimensional (2.5D) packaging is gaining attention as a promising solution. The core technologies used in 2.5D packaging include the microbump, interposer, and bridge die. These technologies allow a larger number of I/Os than conventional methods, so that a large amount of information can be transmitted and received simultaneously. This paper proposes the Molded Bridge die on Substrate (MBoS) process technology, which combines molding and Redistribution Layer (RDL) processes. The proposed MBoS technology is expected to contribute to the popularization of next-generation packaging technology because it is easy to adapt and has wide application areas.