• Title/Summary/Keyword: Trained Model

Language Model Adaptation Based on Topic Probability of Latent Dirichlet Allocation

  • Jeon, Hyung-Bae; Lee, Soo-Young
    • ETRI Journal, v.38 no.3, pp.487-493, 2016
  • Two new methods are proposed for unsupervised adaptation of a language model (LM) with a single sentence for automatic transcription tasks. In the training phase, training documents are clustered by latent Dirichlet allocation (LDA), and a domain-specific LM is trained for each cluster. In the test phase, an adapted LM is formed as a linear mixture of the trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize the trained LDA model to estimate the weight values assigned to the domain-specific LMs; the clustering and weight-estimation algorithms of the trained LDA model are therefore reliable. In continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.
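
To make the adaptation recipe concrete, here is a minimal, self-contained sketch of the general idea (not the authors' implementation): LDA clusters the training documents, toy unigram LMs stand in for the domain-specific LMs, and the LDA topic posterior of a single test sentence supplies the mixture weights.

```python
# Sketch of LDA-based LM mixture adaptation; toy unigram LMs for illustration.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

train_docs = ["stocks fell sharply today", "the team won the match",
              "markets rallied on earnings", "the striker scored twice"]
vec = CountVectorizer()
X = vec.fit_transform(train_docs)

# 1) Cluster training documents into topics with LDA.
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# 2) Train one domain-specific (here: unigram) LM per topic, weighting each
#    document by its topic posterior.
doc_topics = lda.transform(X)                  # shape: (docs, topics)
topic_word = doc_topics.T @ X.toarray() + 1.0  # add-one smoothing
domain_lms = topic_word / topic_word.sum(axis=1, keepdims=True)

# 3) At test time, infer topic weights for a single sentence and form the
#    adapted LM as a linear mixture of the trained domain-specific LMs.
test = vec.transform(["the match stocks"])
weights = lda.transform(test)[0]               # mixture weights from LDA
adapted_lm = weights @ domain_lms              # adapted unigram LM
print(dict(zip(vec.get_feature_names_out(), adapted_lm.round(3))))
```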

A Survey on Deep Learning-based Pre-Trained Language Models (딥러닝 기반 사전학습 언어모델에 대한 이해와 현황)

  • Sangun Park
    • The Journal of Bigdata, v.7 no.2, pp.11-29, 2022
  • Pre-trained language models are among the most important and widely used tools for natural language processing tasks. Because they have been pre-trained on large corpora, high performance can be expected even after fine-tuning on a small amount of data. Since the elements necessary for implementation, such as a pre-trained tokenizer and a deep learning model with pre-trained weights, are distributed together, the cost and development time of natural language processing have been greatly reduced. Transformer variants are the most representative pre-trained language models providing these advantages, and they are also being actively used in other fields such as computer vision and audio. To make it easier for researchers to understand pre-trained language models and apply them to natural language processing tasks, this paper defines the language model and the pre-trained language model, and discusses the development of pre-trained language models, focusing on representative Transformer variants.
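
As a concrete illustration of the workflow the survey describes, the sketch below loads a pre-trained tokenizer and model that are distributed together and runs a single fine-tuning step. The Hugging Face Transformers library and the bert-base-uncased checkpoint are assumptions for illustration, not something the paper specifies.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Tokenizer and model (with pre-trained weights) are distributed together.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # new head; body keeps pre-trained weights

# One fine-tuning step on a toy labeled example.
batch = tokenizer(["a great movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1])
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
print(float(loss))
```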

A Study of Fine Tuning Pre-Trained Korean BERT for Question Answering Performance Development (사전 학습된 한국어 BERT의 전이학습을 통한 한국어 기계독해 성능개선에 관한 연구)

  • Lee, Chi Hoon; Lee, Yeon Ji; Lee, Dong Hee
    • Journal of Information Technology Services, v.19 no.5, pp.83-91, 2020
  • Language models such as BERT have become an important factor in deep learning-based natural language processing. Pre-training transformer-based language models is computationally expensive, since they consist of deep and wide attention-based architectures and require huge amounts of training data. Hence, it has become standard practice to fine-tune large pre-trained language models released by Google or other companies that can afford the resources and cost. There are various techniques for fine-tuning language models, and this paper examines three of them: data augmentation, hyperparameter tuning, and partially reconstructing the network. For data augmentation, we use no-answer augmentation and back-translation. Useful combinations of hyperparameters are identified through a series of experiments. Finally, we add GRU and LSTM layers on top of the pre-trained BERT model to boost performance. By fine-tuning a pre-trained Korean language model with these methods, we push the F1 score from the baseline up to 89.66. Moreover, some failed attempts provide important lessons and point to promising directions for further work.
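
A minimal sketch of one of the three techniques, reconstructing part of the network by stacking a recurrent layer on top of a pre-trained BERT encoder before the QA span head. The checkpoint name and layer sizes are illustrative assumptions, not the paper's exact configuration (the paper uses a pre-trained Korean BERT).

```python
import torch.nn as nn
from transformers import AutoModel

class BertGruForQA(nn.Module):
    def __init__(self, checkpoint="bert-base-multilingual-cased"):
        super().__init__()
        self.bert = AutoModel.from_pretrained(checkpoint)
        hidden = self.bert.config.hidden_size
        # Bidirectional GRU re-encodes BERT's token representations.
        self.gru = nn.GRU(hidden, hidden // 2, batch_first=True,
                          bidirectional=True)
        self.qa_head = nn.Linear(hidden, 2)  # start/end logits

    def forward(self, input_ids, attention_mask):
        seq = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state
        seq, _ = self.gru(seq)
        start_logits, end_logits = self.qa_head(seq).split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)
```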

Medical Image Classification using Pre-trained Convolutional Neural Networks and Support Vector Machine

  • Ahmed, Ali
    • International Journal of Computer Science & Network Security, v.21 no.6, pp.1-6, 2021
  • Recently, pre-trained convolutional neural networks (CNNs) have been widely applied to medical image classification. These models can be utilized in three different ways: for feature extraction, by reusing the architecture of the pre-trained model, and by training some layers while freezing others. In this study, a pre-trained ResNet18 CNN is used for feature extraction, and a multi-class support vector machine is used as the main classifier. Our proposed classification method was evaluated on the Kvasir and PH2 medical image datasets. The overall accuracy was 93.38% and 91.67% for the Kvasir and PH2 datasets, respectively. The classification results and performance of our proposed method outperformed several related methods in this area of study.
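
The feature-extraction pipeline described above can be sketched as follows, with random tensors standing in for the Kvasir/PH2 images; the torchvision and scikit-learn APIs are tooling assumptions, not the paper's stated implementation.

```python
import torch
import numpy as np
from torchvision.models import resnet18, ResNet18_Weights
from sklearn.svm import SVC

backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # drop the ImageNet head; keep features
backbone.eval()

def extract(images):                # images: (N, 3, 224, 224) tensor
    with torch.no_grad():
        return backbone(images).numpy()   # (N, 512) feature vectors

# Placeholder data standing in for medical images and their class labels.
X_train = extract(torch.randn(32, 3, 224, 224))
y_train = np.random.randint(0, 8, size=32)

# Multi-class SVM (one-vs-rest) as the main classifier.
clf = SVC(kernel="rbf", decision_function_shape="ovr").fit(X_train, y_train)
X_test = extract(torch.randn(4, 3, 224, 224))
print(clf.predict(X_test))
```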

A Protein-Protein Interaction Extraction Approach Based on Large Pre-trained Language Model and Adversarial Training

  • Tang, Zhan; Guo, Xuchao; Bai, Zhao; Diao, Lei; Lu, Shuhan; Li, Lin
    • KSII Transactions on Internet and Information Systems (TIIS), v.16 no.3, pp.771-791, 2022
  • Protein-protein interaction (PPI) extraction from raw text is important for revealing the molecular mechanisms of biological processes. With the rapid growth of the biomedical literature, manually extracting PPIs has become increasingly time-consuming and laborious. Therefore, automatic PPI extraction from the raw literature through natural language processing technology has attracted the attention of many researchers. We propose a PPI extraction model based on a large pre-trained language model and adversarial training. It enhances the learning of semantic and syntactic features using BioBERT pre-trained weights, which are built on large-scale domain corpora, and adversarial perturbations are applied to the embedding layer to improve the robustness of the model. Experimental results showed that the proposed model achieved the highest F1 scores (83.93% and 90.31%) on the two corpora with large sample sizes, AIMed and BioInfer, respectively, compared with previous methods. It also achieved comparable performance on three corpora with small sample sizes: HPRD50, IEPA, and LLL.
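
The abstract does not give the exact perturbation scheme, but embedding-layer adversarial training is commonly done in the FGM style sketched below; treat the step function and its hyperparameters as illustrative assumptions. `model` is assumed to be any Transformers classifier, e.g. BioBERT with a task head.

```python
import torch

def adversarial_step(model, batch, labels, optimizer, epsilon=1.0):
    emb = model.get_input_embeddings().weight
    # 1) Clean forward/backward pass.
    loss = model(**batch, labels=labels).loss
    loss.backward()
    # 2) Perturb embeddings along the gradient direction (FGM).
    grad = emb.grad.detach()
    delta = epsilon * grad / (grad.norm() + 1e-8)
    emb.data.add_(delta)
    # 3) Adversarial forward/backward pass accumulates robust gradients.
    adv_loss = model(**batch, labels=labels).loss
    adv_loss.backward()
    # 4) Restore the embeddings, then update all parameters.
    emb.data.sub_(delta)
    optimizer.step()
    optimizer.zero_grad()
    return float(loss), float(adv_loss)
```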

Robust Sentiment Classification of Metaverse Services Using a Pre-trained Language Model with Soft Voting

  • Haein Lee; Hae Sun Jung; Seon Hong Lee; Jang Hyun Kim
    • KSII Transactions on Internet and Information Systems (TIIS), v.17 no.9, pp.2334-2347, 2023
  • Metaverse services generate text data, a form of ubiquitous-computing data, in real time, and analyzing user emotions from these data is an important task for such services. This study classifies user sentiments using deep learning and pre-trained language models based on the transformer architecture. Whereas previous studies collected data from a single platform, the current study incorporated review data matching the keyword "Metaverse" from both the YouTube and Google Play Store platforms for broader applicability. As a result, the Bidirectional Encoder Representations from Transformers (BERT) and Robustly optimized BERT approach (RoBERTa) models combined with a soft voting mechanism achieved the highest accuracy, 88.57%. In addition, the area under the curve (AUC) score of the ensemble comprising RoBERTa, BERT, and A Lite BERT (ALBERT) was 0.9458. These results demonstrate that ensembles built around the RoBERTa model perform well, so the RoBERTa model can be applied on platforms that provide metaverse services. The findings contribute to the advancement of natural language processing techniques in metaverse services, which are increasingly important in digital platforms and virtual environments. Overall, this study provides empirical evidence that sentiment analysis using deep learning and pre-trained language models is a promising approach to improving user experiences in metaverse services.
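
Soft voting itself is simple to state in code: average the predicted class probabilities of the individual classifiers and take the argmax. The sketch below assumes Transformers-style fine-tuned classifiers (e.g. BERT, RoBERTa, ALBERT) that return logits.

```python
import torch

def soft_vote(models, batch):
    # Each model returns logits; convert to probabilities and average them.
    probs = [torch.softmax(m(**batch).logits, dim=-1) for m in models]
    avg = torch.stack(probs).mean(dim=0)    # soft-voted class probabilities
    return avg.argmax(dim=-1)               # ensemble prediction

# Usage (assuming fine-tuned models and a tokenized batch):
#   preds = soft_vote([bert, roberta, albert], batch)
```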

Performance Evaluation of a Dynamic Inverse Model with EnergyPlus Model Simulation for Building Cooling Loads (건물냉방부하에 대한 동적 인버스 모델링기법의 EnergyPlus 건물모델 적용을 통한 성능평가)

  • Lee, Kyoung-Ho; Braun, James E.
    • Korean Journal of Air-Conditioning and Refrigeration Engineering, v.20 no.3, pp.205-212, 2008
  • This paper describes the application of an inverse building model to a calibrated forward building model built with the EnergyPlus program. Typically, inverse models are trained on measured data; in this study, however, an inverse building model was trained on data generated by an EnergyPlus model of an actual office building. The EnergyPlus model was calibrated using field data for the building. A training data set for the month of July was generated from the EnergyPlus model to train the inverse model, and the cooling-load predictions of the trained inverse model were then tested against another EnergyPlus data set for the month of August. Predicted cooling loads showed good agreement with the EnergyPlus cooling loads, with a root-mean-square error of 4.11%. In addition, different control strategies with dynamic cooling-setpoint variation were simulated using the inverse model, and peak and daily cooling loads were compared across the dynamic simulations.
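
The train-on-July, test-on-August protocol can be sketched as follows; the feature set, regressor, and synthetic "EnergyPlus" data below are illustrative stand-ins, not the paper's inverse model.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Placeholder simulator outputs: [outdoor temp, solar, hour] -> cooling load,
# one sample per hour for a 31-day month (744 hours).
X_july = rng.uniform([20, 0, 0], [35, 1000, 23], size=(744, 3))
y_july = 5 * X_july[:, 0] + 0.1 * X_july[:, 1] + rng.normal(0, 5, 744)
X_aug = rng.uniform([20, 0, 0], [35, 1000, 23], size=(744, 3))
y_aug = 5 * X_aug[:, 0] + 0.1 * X_aug[:, 1] + rng.normal(0, 5, 744)

# Train the inverse (data-driven) model on July, test on August.
inverse_model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                             random_state=0).fit(X_july, y_july)
pred = inverse_model.predict(X_aug)
rmse_pct = 100 * np.sqrt(np.mean((pred - y_aug) ** 2)) / y_aug.mean()
print(f"RMSE: {rmse_pct:.2f}% of mean load")
```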

KorPatELECTRA: A Pre-trained Language Model for Korean Patent Literature to improve performance in the field of natural language processing (Korean Patent ELECTRA)

  • Jang, Ji-Mo; Min, Jae-Ok; Noh, Han-Sung
    • Journal of the Korea Society of Computer and Information, v.27 no.2, pp.15-23, 2022
  • In the patent field, NLP (Natural Language Processing) is a challenging task due to the linguistic specificity of patent literature, so there is an urgent need for a language model optimized for Korean patent literature. Recently, there have been continuous attempts in NLP to build pre-trained language models for specific domains to improve performance on tasks in those fields. Among them, ELECTRA is a pre-trained language model released by Google after BERT; it uses a new method called RTD (Replaced Token Detection) to increase training efficiency. This paper proposes KorPatELECTRA, pre-trained on a large amount of Korean patent literature. Optimal pre-training was achieved by preprocessing the training corpus according to the characteristics of patent literature and applying a patent-specific vocabulary and tokenizer. To confirm its performance, KorPatELECTRA was tested on NER (Named Entity Recognition), MRC (Machine Reading Comprehension), and patent classification tasks using actual patent data, and it achieved the best performance on all three tasks compared with general-purpose language models.
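
The RTD objective ELECTRA trains with can be demonstrated with a public general-purpose discriminator checkpoint (KorPatELECTRA itself is the paper's model and is not publicly assumed here): the discriminator predicts, per token, whether that token was replaced by a generator.

```python
import torch
from transformers import ElectraTokenizerFast, ElectraForPreTraining

tok = ElectraTokenizerFast.from_pretrained("google/electra-small-discriminator")
disc = ElectraForPreTraining.from_pretrained("google/electra-small-discriminator")

# A corrupted sentence: "jumps" has been replaced with "eats".
corrupted = "the quick brown fox eats over the lazy dog"
inputs = tok(corrupted, return_tensors="pt")
logits = disc(**inputs).logits          # > 0 means "predicted replaced"
flags = (logits[0] > 0).int().tolist()
print(list(zip(tok.convert_ids_to_tokens(inputs.input_ids[0]), flags)))
```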

Robust Tracking Control Based on Intelligent Sliding-Mode Model-Following Position Controllers for PMSM Servo Drives

  • El-Sousy, Fayez F.M.
    • Journal of Power Electronics, v.7 no.2, pp.159-173, 2007
  • In this paper, an intelligent sliding-mode position controller (ISMC) is proposed for achieving favorable decoupling control and high-precision position tracking in permanent-magnet synchronous motor (PMSM) servo drives. The intelligent position controller consists of a sliding-mode position controller (SMC) in the position feedback loop and an online-trained fuzzy-neural-network model-following controller (FNNMFC) in the feedforward loop. It combines the merits of the SMC, with its robust characteristics, and the FNNMFC, with its online learning ability for periodic command tracking of a PMSM servo drive. The theoretical analysis of the sliding-mode position controller is presented for a second-order PID switching surface that is insensitive to parameter uncertainties and external load disturbances. To realize high dynamic performance in disturbance rejection and tracking, an online-trained FNNMFC is proposed; its connective weights and membership functions are trained online according to the model-following error between the outputs of the reference model and the PMSM servo drive system. The FNNMFC generates an adaptive control signal that is added to the SMC output to attain robust model-following characteristics under different operating conditions, regardless of parameter uncertainties and load disturbances. Computer simulations demonstrate the effectiveness of the proposed controller and confirm that the ISMC provides robust performance and precise tracking of the reference model regardless of load disturbances and PMSM parameter uncertainties.
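
As a toy sketch of the SMC component only: a sliding-mode law with a PID-type switching surface s = Kp*e + Ki*∫e + Kd*ė, using tanh instead of sign to soften chattering. The gains, the smoothing, and the first-order plant are illustrative assumptions unrelated to the paper's PMSM model or the FNNMFC.

```python
import numpy as np

def smc_pid(e, e_int, e_dot, Kp=5.0, Ki=5.0, Kd=0.5, eta=2.0, phi=0.1):
    s = Kp * e + Ki * e_int + Kd * e_dot        # PID switching surface
    return -eta * np.tanh(s / phi)              # smoothed switching control

# Toy first-order plant tracking a step reference.
x, ref, dt, e_int, e_prev = 0.0, 1.0, 1e-3, 0.0, 0.0
for _ in range(5000):
    e = x - ref
    e_int += e * dt
    u = smc_pid(e, e_int, (e - e_prev) / dt)
    e_prev = e
    x += (-x + u) * dt                          # plant: dx/dt = -x + u
print(round(x, 3))                              # settles near the reference
```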

Investigation on the nonintrusive multi-fidelity reduced-order modeling for PWR rod bundles

  • Kang, Huilun; Tian, Zhaofei; Chen, Guangliang; Li, Lei; Chu, Tianhui
    • Nuclear Engineering and Technology, v.54 no.5, pp.1825-1834, 2022
  • Performing high-fidelity computational fluid dynamics (HF-CFD) to predict the flow and heat transfer of the coolant in a reactor core is expensive, especially in scenarios that require extensive parameter searches, such as uncertainty analysis and design optimization. This work investigates a multi-fidelity reduced-order model (MF-ROM) for PWR rod bundle simulation. First, basis vectors and basis-vector coefficients of the high-fidelity and low-fidelity CFD results are extracted separately by the proper orthogonal decomposition (POD) approach. Second, a surrogate model is trained to map the relationship between the coefficients extracted from the different fidelity levels. In the prediction stage, the coefficients of the low-fidelity data under new operating conditions are extracted using the obtained POD basis vectors; the trained surrogate model then regresses the high-fidelity coefficients from the low-fidelity ones, and the predicted high-fidelity field is reconstructed as the product of the extracted basis vectors and the regressed coefficients. The effectiveness of the MF-ROM is evaluated on a flow and heat transfer problem in PWR fuel rod bundles. Two data-driven algorithms, Kriging and an artificial neural network (ANN), are trained as surrogate models to reconstruct the complex flow and heat transfer field downstream of the mixing vanes. The results show good agreement between the fields reconstructed with the trained MF-ROM and the high-fidelity CFD results, while the former requires only the computational burden of the low-fidelity simulation. The ANN model performs slightly better than the Kriging model when a large number of POD basis vectors is used for regression. Moreover, the results demonstrate the suitability of the proposed MF-ROM for initializing high-fidelity simulations to accelerate complex computations.
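
The nonintrusive MF-ROM workflow reduces to a few linear-algebra steps, sketched below with random matrices standing in for the CFD snapshot data; the MLP surrogate is an illustrative stand-in for the paper's ANN regressor, in an assumed configuration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_snap, n_lf, n_hf, r = 40, 200, 2000, 5   # snapshots, grid sizes, POD rank
S_lf = rng.normal(size=(n_lf, n_snap))     # low-fidelity snapshot matrix
S_hf = rng.normal(size=(n_hf, n_snap))     # high-fidelity snapshot matrix

# 1) POD bases via truncated SVD (columns are the POD basis vectors).
U_lf = np.linalg.svd(S_lf, full_matrices=False)[0][:, :r]
U_hf = np.linalg.svd(S_hf, full_matrices=False)[0][:, :r]

# 2) Project snapshots to get coefficients; learn the LF -> HF coefficient map.
a_lf = (U_lf.T @ S_lf).T                   # (n_snap, r)
a_hf = (U_hf.T @ S_hf).T
surrogate = MLPRegressor(hidden_layer_sizes=(64,), max_iter=5000,
                         random_state=0).fit(a_lf, a_hf)

# 3) Prediction: run only the cheap LF simulation for a new case, project,
#    regress the HF coefficients, and reconstruct the HF field.
s_lf_new = rng.normal(size=(n_lf, 1))
a_new = (U_lf.T @ s_lf_new).T
field_hf = U_hf @ surrogate.predict(a_new).T   # reconstructed HF field
print(field_hf.shape)                          # (n_hf, 1)
```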