• Title/Summary/Keyword: Multi-Model Training

A WWMBERT-based Method for Improving Chinese Text Classification Task (중국어 텍스트 분류 작업의 개선을 위한 WWMBERT 기반 방식)

  • Wang, Xinyuan;Joe, Inwhee
    • Annual Conference of KIPS / 2021.05a / pp.408-410 / 2021
  • In the NLP field, the pre-trained model BERT, released by the Google team in 2018, has shown remarkable results on a variety of tasks. Many variant models have since been derived from the original BERT, such as RoBERTa, ERNIEBERT and so on. In this paper, the WWMBERT (Whole Word Masking BERT) model, which is suited to Chinese text tasks, is used as the baseline model of our experiment. The experiment targets text-level Chinese text classification tasks and improves the model by combining TAPT (Task-Adaptive Pretraining) with the Multi-Sample Dropout method. The experimental data sets and the scoring standard are consistent with those of the official WWMBERT model: accuracy is the metric, and the maximum and average values over multiple runs are reported as the scores. On the text-level Chinese text classification task, the official WWMBERT model scored 97.70% (97.50%) on the development set and 97.70% (97.50%) on the test set. Compared with these results, the experiments in this paper increase the development set score by 0.35% (0.5%) and the test set score by 0.31% (0.48%), a significant improvement over the original baseline model.
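
The abstract names Multi-Sample Dropout without describing it. As the technique is commonly described, the same features are passed through several independent dropout masks, a shared classifier scores each masked copy, and the per-mask losses are averaged. The following is a minimal pure-Python sketch under that reading, not the authors' implementation; the function names, the dropout rate, and the number of samples are assumptions chosen for illustration.

```python
import math
import random


def dropout(features, rate, rng):
    """Inverted dropout: zero each feature with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in features]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights, bias):
    """Shared classifier head: one weight row and one bias per class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def multi_sample_dropout_loss(features, label, weights, bias,
                              num_samples=5, rate=0.3, seed=0):
    """Average the cross-entropy loss over several dropout masks of the same features."""
    rng = random.Random(seed)
    losses = []
    for _ in range(num_samples):
        masked = dropout(features, rate, rng)            # fresh mask for each sample
        probs = softmax(linear(masked, weights, bias))   # same head for every mask
        losses.append(-math.log(probs[label]))
    return sum(losses) / num_samples
```

With `rate=0.0` every mask keeps all features, so the result reduces to the ordinary cross-entropy loss of the shared head.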

ELM based short-term Water Demand Prediction for Effective Operation of Water Treatment Plant (정수장 운영효율 향상을 위한 ELM 기반 단기 물 수요 예측)

  • Choi, Gee-Seon;Lee, Dong-Hoon;Kim, Sung-Hwan;Lee, Kyung-Woo;Chun, Myung-Geun
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.23 no.9 / pp.108-116 / 2009
  • In this paper, we develop an ELM (Extreme Learning Machine) based short-term water demand prediction algorithm that addresses the overfitting problem of the MLP (Multi-Layer Perceptron) and trains quickly. To show the effectiveness of the proposed method, we analyzed time series data collected at water treatment plant A in Chung-Nam province from 2007 to 2008 and used the selected data to verify the developed algorithm. In the experiments, the MLP model achieved a MAPE of 5.82%, while the proposed ELM-based model achieved 5.61%. In addition, the MLP model needed 7.57 s of training time, whereas the ELM-based model needed 0.09 s. Therefore, the proposed ELM-based short-term water demand prediction model can be used to operate the water treatment plant effectively.
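
For reference, MAPE (mean absolute percentage error), the metric reported in this abstract, can be computed as in the minimal sketch below. The function name and the demand values in the usage lines are made up for illustration and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical water demand values, for illustration only.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
error_percent = mape(observed, forecast)  # each forecast is off by 10% of its actual value
```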

Imbalanced sample fault diagnosis method for rotating machinery in nuclear power plants based on deep convolutional conditional generative adversarial network

  • Zhichao Wang;Hong Xia;Jiyu Zhang;Bo Yang;Wenzhe Yin
    • Nuclear Engineering and Technology / v.55 no.6 / pp.2096-2106 / 2023
  • Rotating machinery is widely used in important equipment of nuclear power plants (NPPs), such as pumps and valves. Research on intelligent fault diagnosis of rotating machinery is crucial to ensuring the safe operation of this equipment. In practical applications, however, data-driven fault diagnosis faces the problem of small and imbalanced samples, which results in low model training efficiency and poor generalization performance. Therefore, a deep convolutional conditional generative adversarial network (DCCGAN) is constructed to mitigate the impact of imbalanced samples on fault diagnosis. First, a conditional generative adversarial model is designed based on convolutional neural networks to effectively augment imbalanced samples. The model extracts the original sample features effectively by using a conditional generative adversarial strategy and an appropriate number of filters. In addition, high-quality generated samples are ensured by visualizing the model training process and the sample features. Then, a deep convolutional neural network (DCNN) is designed to extract features from the mixed samples and perform intelligent fault diagnosis. Finally, the performance of the DCCGAN model for data augmentation and intelligent fault diagnosis is verified on multi-fault experimental data of motor and bearing. The proposed method effectively alleviates the problem of imbalanced samples and shows its application value for intelligent fault diagnosis in actual NPPs.

Weakly-supervised Semantic Segmentation using Exclusive Multi-Classifier Deep Learning Model (독점 멀티 분류기의 심층 학습 모델을 사용한 약지도 시맨틱 분할)

  • Choi, Hyeon-Joon;Kang, Dong-Joong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.6 / pp.227-233 / 2019
  • With the recent development of deep learning techniques, neural networks are achieving success in the computer vision field. Convolutional neural networks have shown outstanding performance not only on simple image classification tasks but also on more difficult tasks such as object segmentation and detection. However, many such deep learning models are based on supervised learning, which requires annotations more detailed than image-level labels. In particular, image semantic segmentation models require pixel-level annotations for training, which are very costly to produce. To solve these problems, this paper proposes a weakly-supervised semantic segmentation method that requires only image-level labels to train the network. Existing weakly-supervised learning methods are limited in that they detect only a specific area of an object. In this paper, in contrast, we use a multi-classifier deep learning architecture so that our model recognizes more of the different parts of objects. The proposed method is evaluated on the VOC 2012 validation dataset.

Intrusion Detection System Based on Multi-Class SVM (다중 클래스 SVM기반의 침입탐지 시스템)

  • Lee Hansung;Song Jiyoung;Kim Eunyoung;Lee Chulho;Park Daihee
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.3 / pp.282-288 / 2005
  • In this paper, we propose a new intrusion detection model that keeps the advantages of the existing misuse detection and anomaly detection models while resolving their problems. The new intrusion detection system, named MMIDS, was designed to satisfy all of the following requirements: 1) fast detection of new types of attack unknown to the system; 2) provision of detailed information about the detected types of attack; 3) cost-effective maintenance through fast and efficient learning and updating; 4) incrementality and scalability of the system. The fast and efficient training and updating of the proposed novel multi-class SVM, a core component of MMIDS, make maintenance of the intrusion detection system cost-effective. According to the experimental results, our method provides superior performance in separating similar patterns, and the detailed separation capability of MMIDS is relatively good.

Comparison of Retaining Wall Displacement Prediction Performance Using Sensor Data (센서 데이터를 활용한 옹벽 변위 예측 성능 비교)

  • Sheilla Wesonga;Jang-Sik Park
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.5 / pp.1035-1040 / 2024
  • The main objective of inspecting structures is to ensure the safety of everyone who uses them, since cracks that are left unattended can lead to serious accidents. With that objective in mind, artificial intelligence (AI) based technologies that assist human inspectors are needed, especially for retaining walls. In this paper, we predict the crack displacement of retaining walls using a Polynomial Regression (PR) analysis model as well as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) deep learning models, and compare their performance. For the comparison, we apply multi-variable feature inputs that add temperature and rainfall data, which may affect the crack displacement of the retaining wall. The training and inference data were collected with measuring sensors such as inclinometers, thermometers, and rain gauges. In the evaluation, the multi-variable feature model had an MAE of 0.00186, 0.00450, and 0.00842 for the polynomial regression, LSTM, and GRU models respectively, outperforming the single-variable feature model at 0.00393, 0.00556, and 0.00929.
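
The abstract does not say how the multi-variable inputs are assembled. One plausible arrangement, assumed here purely for illustration, is a sliding window in which each time step carries displacement, temperature, and rainfall, and the target is the displacement at the next step. The function name, the window length, and the one-step-ahead target are all hypothetical.

```python
def make_windows(displacement, temperature, rainfall, window):
    """Build (inputs, targets) for one-step-ahead displacement prediction.

    Each input is `window` consecutive time steps of
    [displacement, temperature, rainfall]; the target is the displacement
    at the step right after the window.
    """
    n = len(displacement)
    if not (len(temperature) == n and len(rainfall) == n):
        raise ValueError("all series must have the same length")
    if window < 1 or window >= n:
        raise ValueError("window must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for start in range(n - window):
        steps = [[displacement[t], temperature[t], rainfall[t]]
                 for t in range(start, start + window)]
        inputs.append(steps)
        targets.append(displacement[start + window])
    return inputs, targets
```

A single-variable baseline would keep only the displacement column of each step, which is the comparison the abstract reports.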

Factors Affecting Patient Moving for Medical Service Using Multi-level Analysis (환자이동에 영향을 미치는 개인 및 병원요인 분석)

  • Kim, Sun Hee;Lee, Hae Jong;Lee, Kwang Soo;Shin, Hyun Woung
    • Korea Journal of Hospital Management / v.19 no.4 / pp.9-20 / 2014
  • The purpose of this study is to identify the factors that affect patient movement to receive medical service. The data were analyzed with a multi-level model at the patient and hospital levels using SAS 9.3. The sample comprises 600,000 inpatients and 550,000 outpatients. The relative contribution of personal and hospital factors can be assessed with the Intra-Class Correlation (ICC). For the out-bound moving case, the group (hospital) level accounts for 30.6% of the total variance for inpatients and 28.3% for outpatients. For moving distance, the hospital-level share of the total variance is 26.7% and 32.5%, respectively. In conclusion, although the main factor in moving lies at the patient level, the hospital is also a very important factor in the decision to go out-bound, contributing about one third to hospital choice. When making that decision, patients consider hospital-level characteristics such as hospital type, number of beds, and training-institute status. Further research should examine which hospital factors most influence hospital choice for each disease type.
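
In a two-level model, the ICC is the share of total variance that lies between groups (here, hospitals). The sketch below shows that ratio; the function name and the variance components in the usage lines are hypothetical, chosen only so that the result matches the 30.6% figure quoted above, and are not values from the paper.

```python
def intraclass_correlation(between_var, within_var):
    """ICC = between-group variance / (between-group + within-group variance)."""
    if between_var < 0 or within_var < 0:
        raise ValueError("variance components must be non-negative")
    total = between_var + within_var
    if total == 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Hypothetical variance components, for illustration only.
hospital_level_var = 0.306
patient_level_var = 0.694
icc = intraclass_correlation(hospital_level_var, patient_level_var)  # about 0.306
```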

Prediction of Remaining Useful Life of Lithium-ion Battery based on Multi-kernel Support Vector Machine with Particle Swarm Optimization

  • Gao, Dong;Huang, Miaohua
    • Journal of Power Electronics / v.17 no.5 / pp.1288-1297 / 2017
  • The estimation of the remaining useful life (RUL) of lithium-ion (Li-ion) batteries is important for intelligent battery management systems (BMS). Data mining technology is becoming increasingly mature, and RUL estimation of Li-ion batteries based on data-driven prognostics has become more accurate with the arrival of the era of big data. However, the support vector machine (SVM) applied to predict the RUL of Li-ion batteries uses the traditional single radial basis kernel function. This type of classifier has weak generalization ability and is prone to the problem of data migration, which results in inaccurate prediction of the RUL of Li-ion batteries. In this study, a novel multi-kernel SVM (MSVM) based on a polynomial kernel and a radial basis kernel function is proposed. Moreover, the particle swarm optimization algorithm is used to search for the kernel parameters, penalty factor, and weight coefficient of the MSVM model. Finally, this paper uses the NASA battery dataset to form the observed data sequence for regression prediction. Results show that the improved algorithm not only has better prediction accuracy and stronger generalization ability but also decreases training time and computational complexity.
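
The abstract does not give the exact form of the combined kernel. A common construction, assumed here, is a convex combination of the polynomial and radial basis kernels controlled by a single weight coefficient; such a combination of two valid kernels is itself a valid kernel. The parameter names and default values are hypothetical, and the particle swarm search over them is not shown.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def multi_kernel(x, y, weight=0.5, degree=2, coef0=1.0, gamma=0.5):
    """Assumed form: weight * K_poly + (1 - weight) * K_rbf, with 0 <= weight <= 1."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * polynomial_kernel(x, y, degree, coef0)
            + (1.0 - weight) * rbf_kernel(x, y, gamma))
```

Setting `weight` to 0 recovers the single radial basis kernel that the abstract identifies as the traditional choice.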

Improvement of Initial Weight Dependency of the Neural Network Model for Determination of Preconsolidation Pressure from Piezocone Test Result (피에조콘을 이용한 선행압밀하중 결정 신경망 모델의 초기 연결강도 의존성 개선)

  • Park, Sol-Ji;Joo, No-Ah;Park, Hyun-Il;Kim, Young-Sang
    • Proceedings of the Korean Geotechical Society Conference / 2009.03a / pp.456-463 / 2009
  • The preconsolidation pressure has commonly been determined by the oedometer test. However, it can also be determined by in-situ tests, such as the piezocone test, using theoretical and/or empirical correlations. Recently, Neural Network (NN) theory has been applied, and some models have been proposed to estimate the preconsolidation pressure or OCR. However, because the optimization of the synaptic weights of an NN model depends on their initial values, NN models trained from different initial weights give variable predictions on a new database even though they have the same structure and use the same transfer function. In this study, a Committee Neural Network (CNN) model is proposed to reduce the initial weight dependency of the multi-layered neural network model in predicting the preconsolidation pressure of soft clay from piezocone test results. It was found that even when the NN model has the optimized structure for a given training data set, it still depends on the initial weights, while the proposed CNN model reduces this dependency and provides more consistent and precise inference results than existing NN models.
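
The abstract does not detail how the committee combines its members. A simple reading, assumed here, is that several networks with the same structure are trained from different initial weights and their predictions are averaged. In the sketch below the members are hypothetical stand-ins (a fixed function plus a seed-dependent offset), not trained networks, and all names are illustrative.

```python
import random


def committee_predict(members, x):
    """Average the predictions of all committee members for input x."""
    if not members:
        raise ValueError("committee needs at least one member")
    return sum(member(x) for member in members) / len(members)


def make_member(seed):
    """Hypothetical stand-in for a network trained from one random initialization.

    Every member approximates the same target (2 * x) but carries its own
    seed-dependent offset, mimicking initial weight dependency.
    """
    offset = random.Random(seed).uniform(-0.5, 0.5)
    return lambda x: 2.0 * x + offset


members = [make_member(seed) for seed in range(10)]
single_errors = [abs(member(3.0) - 6.0) for member in members]
committee_error = abs(committee_predict(members, 3.0) - 6.0)
```

Because the committee output is a mean, its error can never exceed the worst single member's error, which is one way to see why averaging dampens the spread caused by different initial weights.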

Korean Ironic Expression Detector (한국어 반어 표현 탐지기)

  • Seung Ju Bang;Yo-Han Park;Jee Eun Kim;Kong Joo Lee
    • The Transactions of the Korea Information Processing Society / v.13 no.3 / pp.148-155 / 2024
  • Despite the increasing importance of irony and sarcasm detection in the field of natural language processing, research on the Korean language is relatively scarce compared to other languages. This study aims to experiment with various models for irony detection in Korean text. The study conducted irony detection experiments using KoBERT, a BERT-based model, and ChatGPT. For KoBERT, two methods of additional training on sentiment data were applied (Transfer Learning and MultiTask Learning). Additionally, for ChatGPT, the Few-Shot Learning technique was applied by increasing the number of example sentences entered as prompts. The results of the experiments showed that the Transfer Learning and MultiTask Learning models, which were trained with additional sentiment data, outperformed the baseline model without additional sentiment data. On the other hand, ChatGPT exhibited significantly lower performance compared to KoBERT, and increasing the number of example sentences did not lead to a noticeable improvement in performance. In conclusion, this study suggests that a model based on KoBERT is more suitable for irony detection than ChatGPT, and it highlights the potential contribution of additional training on sentiment data to improve irony detection performance.
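
The few-shot setup described above, in which the number of example sentences in the prompt is increased, can be sketched as a prompt builder that prepends k labeled examples to the query. The prompt wording, the label strings, and the function name are assumptions for illustration; the abstract does not give the authors' prompts, and no model API is called here.

```python
def build_few_shot_prompt(examples, query, k):
    """Build a prompt with the first k (sentence, label) pairs as demonstrations."""
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and len(examples)")
    lines = ["Decide whether each sentence is ironic. "
             "Reply with 'ironic' or 'not ironic'.", ""]
    for sentence, label in examples[:k]:
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Answer: {label}")
        lines.append("")
    lines.append(f"Sentence: {query}")
    lines.append("Answer:")  # left open for the model to complete
    return "\n".join(lines)


# Hypothetical English examples; the study itself works with Korean text.
demo_examples = [
    ("Great, another Monday morning meeting.", "ironic"),
    ("The meeting starts at nine.", "not ironic"),
]
prompt = build_few_shot_prompt(demo_examples, "What lovely weather for a flood.", k=2)
```

Varying `k` from 0 upward reproduces the kind of comparison the abstract describes, where adding examples did not noticeably improve ChatGPT's performance.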