• Title/Summary/Keyword: Pretraining

Search Result 33, Processing Time 0.032 seconds

Molecular Property Prediction with Deep-learning and Pretraining Strategy (사전학습 전략과 딥러닝을 활용한 분자의 특성 예측)

  • Lee, Seungbeom;Kim, Jiye;Kim, Dongwoo;Park, Jaesik;Ahn, Sungsoo
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.63-66 / 2022
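  • This paper presents a method that combines an effective pretraining strategy with a Transformer model to predict molecular properties accurately. Deep-learning research on molecular property prediction has struggled to generalize to molecules outside the training data, because labeled molecular data are scarce. The proposed model uses self-supervised training during pretraining, which avoids the problems caused by scarce labels. Trained on a large-scale molecular dataset, the model performs well on all four downstream datasets, showing that it generalizes well and yields effective molecular representations.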

Reinforcement learning-based control with application to the once-through steam generator system

  • Cheng Li;Ren Yu;Wenmin Yu;Tianshu Wang
    • Nuclear Engineering and Technology / v.55 no.10 / pp.3515-3524 / 2023
  • This paper proposes a reinforcement learning framework for controlling the outlet steam pressure of the once-through steam generator (OTSG). A double-layer controller using the Proximal Policy Optimization (PPO) algorithm is applied in the control structure of the OTSG. The PPO algorithm trains the neural networks continuously through interaction with the environment, and the trained controller can then achieve better control of the OTSG. Because reinforcement learning is difficult to apply to real-world objects, this paper also proposes a pretraining method to address the problem. The difficulty lies in training: the optimal strategy at each step is found through trial and error, so the training cost is very high. In this paper, an LSTM model is adopted as the training environment for pretraining, which saves training time and improves efficiency. The experimental results show that this method achieves self-adjustment of control parameters under various working conditions, with small overshoot, fast stabilization, and strong adaptive ability.
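
The abstract names PPO but gives no implementation details. As a minimal sketch of the clipped surrogate objective that PPO is generally known for (not the authors' double-layer controller), the per-sample term can be written as below. The function names and the clip range of 0.2 are illustrative assumptions.

```python
def ppo_clipped_term(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio     -- new-policy probability divided by old-policy probability
    advantage -- advantage estimate for the sampled action
    clip_eps  -- clip range (0.2 is an assumed, commonly used value)
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum keeps the pessimistic (lower) of the two estimates.
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean of the clipped terms over a batch; PPO maximizes this quantity."""
    terms = [ppo_clipped_term(r, a, clip_eps) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)
```

With a positive advantage, a ratio of 1.5 is clipped to 1.2, so the policy gains nothing from moving further in that direction. With a negative advantage and the same ratio, the unclipped (more negative) value is kept.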

Effect of 8 weeks aerobic dance training on the body composition, cardiopulmonary function and blood cholesterol concentration in young women (젊은여성에서 8주간의 aerobic dance 훈련이 체구성, 심폐기능, 혈중 콜레스테롤 농도에 미치는 효과)

  • 최명애
    • Journal of Korean Academy of Nursing / v.18 no.2 / pp.105-117 / 1988
  • To evaluate the training effect, eight female college students performed aerobic dance for 8 weeks. Body composition, cardiopulmonary function at rest and during maximal exercise, and blood cholesterol concentration at rest were determined before and after the 8 weeks of aerobic dance training. Maximal exercise was performed on a treadmill according to the Bruce protocol. Pre- to post-training differences were evaluated. The results were as follows: 1. After the training, skinfold thickness and total body fat decreased significantly (p<0.1), while lean body mass increased significantly (p<0.1). 2. Heart rate and arterial blood pressure at rest decreased after the training, but not significantly. 3. As a result of training, forced vital capacity and forced expiratory volume in one second increased significantly (p<0.01, p<0.1). 4. After the training period, heart rate at 3, 6, and 9 min during treadmill exercise was significantly lower than at pretraining (p<0.05). 5. After the training, systolic and diastolic blood pressure at 6 and 9 min during the exercise were significantly lower than at pretraining (p<0.025, p<0.1). 6. After the training, oxygen uptake at 3 and 6 min during the exercise was significantly greater than at pretraining (p<0.05). 7. As a result of training, maximal oxygen uptake during the exercise increased significantly (p<0.1). 8. After the training, expired air volume per minute at 3 and 6 min during the exercise was significantly greater than at pretraining (p<0.1). 9. After the training, the respiratory quotient during the exercise was lower than at pretraining, but not significantly. 10. After the training, blood HDL-cholesterol concentration increased significantly (p<0.1), and blood total cholesterol and triglyceride concentrations decreased significantly (p<0.1).
From these results, it may be concluded that 8 weeks of aerobic dance training reduces skinfold thickness and body fat content, improves cardiopulmonary function and tissue oxygen utilization, reduces blood cholesterol and triglyceride concentrations, and increases blood HDL-cholesterol concentration.

Korean automatic spacing using pretrained transformer encoder and analysis

  • Hwang, Taewook;Jung, Sangkeun;Roh, Yoon-Hyung
    • ETRI Journal / v.43 no.6 / pp.1049-1057 / 2021
  • Automatic spacing in Korean is used to correct spacing units in a given input sentence. The demand for automatic spacing has been increasing owing to frequent incorrect spacing in recent media, such as the Internet and mobile networks. Therefore, herein, we propose a transformer encoder that reads a sentence bidirectionally and can be pretrained using an out-of-task corpus. Notably, our model exhibited the highest character accuracy (98.42%) among the existing automatic spacing models for Korean. We experimentally validated the effectiveness of bidirectional encoding and pretraining for automatic spacing in Korean. Moreover, we conclude that pretraining is more important than fine-tuning and data size.
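
The abstract reports character accuracy but does not spell out how the task is encoded. A minimal sketch, assuming the common formulation in which each non-space character receives a binary label (1 when a space follows it), is shown below. The function names are hypothetical, and the transformer encoder itself is not modeled.

```python
def spacing_labels(spaced):
    """Strip spaces and return (characters, labels); label 1 means a space follows."""
    chars, labels = [], []
    for ch in spaced.strip():
        if ch == " ":
            if labels:
                labels[-1] = 1  # mark the preceding character
        else:
            chars.append(ch)
            labels.append(0)
    return "".join(chars), labels


def apply_labels(chars, labels):
    """Rebuild a spaced sentence from unspaced characters and per-character labels."""
    out = []
    for ch, label in zip(chars, labels):
        out.append(ch)
        if label == 1:
            out.append(" ")
    return "".join(out).rstrip()


def character_accuracy(gold, pred):
    """Fraction of characters whose predicted label matches the gold label."""
    if not gold or len(gold) != len(pred):
        raise ValueError("label sequences must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)
```

For the sentence "나는 학교에 간다", `spacing_labels` returns the unspaced string "나는학교에간다" with labels `[0, 1, 0, 0, 1, 0, 0]`. A prediction that misses one of the two spaces scores 6/7 under `character_accuracy`.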

PC-SAN: Pretraining-Based Contextual Self-Attention Model for Topic Essay Generation

  • Lin, Fuqiang;Ma, Xingkong;Chen, Yaofeng;Zhou, Jiajun;Liu, Bo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3168-3186 / 2020
  • Automatic topic essay generation (TEG) is a controllable text generation task that aims to generate informative, diverse, and topic-consistent essays based on multiple topics. To produce high-quality essays, a reasonable method should consider both diversity and topic consistency. Another essential issue is the intrinsic link among the topics, which helps the essays stay close to the semantics of the provided topics. However, it remains challenging for TEG to fill the semantic gap between source topic words and target output, and a more powerful model is needed to capture the semantics of the given topics. To this end, we propose a pretraining-based contextual self-attention (PC-SAN) model that is built upon the seq2seq framework. For the encoder of our model, we employ a dynamic weighted sum of layers from BERT to fully utilize the semantics of topics, which helps fill the gap and improve the quality of the generated essays. In the decoding phase, we also transform the target-side contextual history information into the query layers to alleviate the lack of context in typical self-attention networks (SANs). Experimental results on large-scale paragraph-level Chinese corpora verify that our model is capable of generating diverse, topic-consistent text and improves on strong baselines. Furthermore, extensive analysis validates the effectiveness of contextual embeddings from BERT and contextual history information in SANs.

Seismic Data Processing Using BERT-Based Pretraining: Comparison of Shotgather Arrays (BERT 기반 사전학습을 이용한 탄성파 자료처리: 송신원 모음 배열 비교)

  • Youngjae Shin
    • Geophysics and Geophysical Exploration / v.27 no.3 / pp.171-180 / 2024
  • The processing of seismic data involves analyzing earthquake wave data to understand the internal structure and characteristics of the Earth, which requires high computational power. Recently, machine learning (ML) techniques have been introduced to address these challenges and have been utilized in various tasks such as noise reduction and velocity model construction. However, most studies have focused on specific seismic data processing tasks, limiting the full utilization of similar features and structures inherent in the datasets. In this study, we compared the efficacy of using receiver-wise time-series data ("receiver array") and synchronized receiver signals ("time array") from shotgathers for pretraining a Bidirectional Encoder Representations from Transformers (BERT) model. To this end, shotgather data generated from a synthetic model containing faults was used to perform noise reduction, velocity prediction, and fault detection tasks. In the task of random noise reduction, both the receiver and time arrays showed good performance. However, for tasks requiring the identification of spatial distributions, such as velocity estimation and fault detection, the results from the time array were superior.
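
One way to read the two input layouts compared in the abstract is sketched below. It assumes a shotgather stored as a list of receiver traces (`shotgather[receiver][time_sample]`) and treats each row or column as one input token. Both the storage layout and the function names are assumptions for illustration, not the paper's implementation.

```python
def receiver_array(shotgather):
    """One token per receiver: each token is that receiver's full time series."""
    return [list(trace) for trace in shotgather]


def time_array(shotgather):
    """One token per time sample: each token holds every receiver's value at that instant."""
    return [list(samples) for samples in zip(*shotgather)]


# Toy gather: 2 receivers, 3 time samples each.
gather = [
    [1, 2, 3],  # receiver 0
    [4, 5, 6],  # receiver 1
]
print(receiver_array(gather))  # [[1, 2, 3], [4, 5, 6]]
print(time_array(gather))      # [[1, 4], [2, 5], [3, 6]]
```

Under this reading, each time-array token spans all receivers at one instant, which would be consistent with the abstract's finding that the time array did better on tasks that depend on spatial distribution.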

The Comparison of study Isokinetic Evaluation between shoulder muscles Dominant and Non-dominant in the Normal adults (정상성인의 견관절 우성 근력과 비우성 근력 비교 연구 - Cybex II + Isokinetic Dynamometer를 이용한 평가를 기준으로 -)

  • Moon Sung-Gi
    • The Journal of Korean Physical Therapy / v.11 no.1 / pp.111-119 / 1999
  • Thirty healthy subjects performed isokinetic exercise on the non-dominant side. Peak torque and total work of the dominant and non-dominant sides were then compared. 1. At low speed, peak torque of the shoulder muscles on the non-dominant side increased from pretraining through ten weeks of isokinetic exercise. Flexors and extensors were significantly higher than at pretraining at weeks 6, 8, and 10. Adductors and abductors were significantly higher than at pretraining at weeks 4, 6, 8, and 10. Internal and external rotators were significantly higher than at pretraining at weeks 2, 4, 6, 8, and 10. 2. At high speed, peak torque of the shoulder muscles on the non-dominant side increased from pretraining through ten weeks of isokinetic exercise. Flexors and extensors were significantly higher than at pretraining at weeks 4, 6, 8, and 10. Adductors and abductors were significantly higher than at pretraining at weeks 2, 4, 6, 8, and 10. Internal and external rotators were significantly higher than at pretraining at weeks 4, 6, 8, and 10.

The Bi-Cross Pretraining Method to Enhance Language Representation (Bi-Cross 사전 학습을 통한 자연어 이해 성능 향상)

  • Kim, Sung-ju;Kim, Seonhoon;Park, Jinseong;Yoo, Kang Min;Kang, Inho
    • Annual Conference on Human and Language Technology / 2021.10a / pp.320-325 / 2021
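  • BERT achieved strong performance on many natural-language downstream tasks by learning next sentence prediction and masked word prediction during pretraining. This study focuses on next sentence prediction among BERT's pretraining tasks. Next sentence prediction has been used to improve performance on tasks that model the relationship between two arbitrary sentences, such as natural language inference and question answering. However, BERT's next sentence prediction learns only the cross-encoding setup, in which two sentences are separated by a special token and given to the model as a single string, so it does not account for downstream tasks that use bi-encoding, in which each sentence is encoded separately. This paper introduces the Bi-Cross pretraining method, which extends BERT's next sentence prediction by additionally pretraining a bi-encoding version of the task, improving performance on single-sentence classification and on tasks that use sentence embeddings. On NSMC, a movie-review sentiment classification dataset, Bi-Cross pretraining improved performance by about 5 points over the model without it when only 0.1% of the training data was used. In the bi-encoding sentence-embedding evaluation on KorSTS, it improved performance by 1.5 points over the model without Bi-Cross pretraining.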
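
The difference between the two input setups described in the abstract can be sketched as follows. This is only an illustration of how the inputs are built: whitespace splitting stands in for a real subword tokenizer, the function names are hypothetical, and no model or Bi-Cross training objective is implemented.

```python
CLS, SEP = "[CLS]", "[SEP]"  # BERT-style special tokens


def cross_encoder_input(sent_a, sent_b):
    """Cross-encoding: both sentences go through the model together as one sequence."""
    return [CLS] + sent_a.split() + [SEP] + sent_b.split() + [SEP]


def bi_encoder_inputs(sent_a, sent_b):
    """Bi-encoding: each sentence is encoded on its own and compared afterwards."""
    return [CLS] + sent_a.split() + [SEP], [CLS] + sent_b.split() + [SEP]


print(cross_encoder_input("the movie was great", "i would watch it again"))
print(bi_encoder_inputs("the movie was great", "i would watch it again"))
```

In the cross-encoding case the model sees both sentences in a single sequence. In the bi-encoding case each sentence yields its own representation, which is the setting used for sentence-embedding tasks such as KorSTS.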

A Study on the Analysis of Variables Affecting Teacher Librarians' Practice Teaching in Korea (사서교사 교유실습의 영향요인에 관한 연구)

  • Kim, Sung-Jun
    • Journal of Korean Library and Information Science Society / v.42 no.1 / pp.183-203 / 2011
  • The purpose of this study is to propose ways to develop teacher librarians' practice teaching effectively and to identify how much practice teaching affects their expertise in Korea. For this, two research models are constructed. The first model analyzes the relationship between the effect of practice teaching and the following variables: pretraining, practice program, advisory teacher, training condition, and organizational atmosphere. The second model analyzes how much practice teaching affects role perception and expertise compared with training courses and field experiences. The results are as follows: among those variables, the training program has the greatest effect on teacher librarians' practice teaching, and practice teaching has a positive effect on the ability to perform the role. These results indicate that practice teaching is effective in developing teacher librarians' expertise. A more systematic practice teaching program, sufficient pretraining, and the efforts of the advisory teacher are necessary to enhance the effect of teacher librarians' practice teaching.

Prediction of multipurpose dam inflow utilizing catchment attributes with LSTM and transformer models (유역정보 기반 Transformer및 LSTM을 활용한 다목적댐 일 단위 유입량 예측)

  • Kim, Hyung Ju;Song, Young Hoon;Chung, Eun Sung
    • Journal of Korea Water Resources Association / v.57 no.7 / pp.437-449 / 2024
  • Rainfall-runoff prediction studies using deep learning while considering catchment attributes have been gaining attention. In this study, we selected two models: the Transformer model, which is suitable for large-scale data training through the self-attention mechanism, and the LSTM-based multi-state-vector sequence-to-sequence (LSTM-MSV-S2S) model with an encoder-decoder structure. These models were constructed to incorporate catchment attributes and predict the inflow of 10 multi-purpose dam watersheds in South Korea. The experimental design consisted of three training methods: Single-basin Training (ST), Pretraining (PT), and Pretraining-Finetuning (PT-FT). The input data for the models included 10 selected watershed attributes along with meteorological data. The inflow prediction performance was compared based on the training methods. The results showed that the Transformer model outperformed the LSTM-MSV-S2S model when using the PT and PT-FT methods, with the PT-FT method yielding the highest performance. The LSTM-MSV-S2S model showed better performance than the Transformer when using the ST method; however, it showed lower performance when using the PT and PT-FT methods. Additionally, the embedding layer activation vectors and raw catchment attributes were used to cluster watersheds and analyze whether the models learned the similarities between them. The Transformer model demonstrated improved performance among watersheds with similar activation vectors, proving that utilizing information from other pre-trained watersheds enhances the prediction performance. This study compared the suitable models and training methods for each multi-purpose dam and highlighted the necessity of constructing deep learning models using PT and PT-FT methods for domestic watersheds. Furthermore, the results confirmed that the Transformer model outperforms the LSTM-MSV-S2S model when applying PT and PT-FT methods.