• Title/Summary/Keyword: Annual training

NMT Training Method for Korean-English Idiom Machine Translation (한-영 관용구 기계번역을 위한 NMT 학습 방법)

  • Choi, Min-Joo;Lee, Chang-Ki
    • Annual Conference on Human and Language Technology / 2020.10a / pp.353-356 / 2020
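  • An idiom is a phrase in which two or more words combine to produce a specific meaning, and it is often mistranslated in machine translation. This exposes a limitation of machine translation, which cannot accurately translate the implied meaning an idiom carries. Therefore, to learn idioms effectively in neural machine translation (NMT), a translation-pair dataset and a training method specialized for idioms are needed. In this paper, using a dataset specialized for Korean-English idiom machine translation, we propose a method that inserts special tokens to indicate the position of an idiom in a sentence, so that NMT models learn idioms effectively. Experimental results show that training with the proposed method improved idiom translation quality for most of the NMT models.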
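
The token-insertion idea in this abstract can be illustrated with a minimal sketch. The marker strings `<idiom>` and `</idiom>`, the function name `mark_idiom`, and the example sentence are assumptions made for illustration; the abstract does not state which tokens the authors used or how idiom spans were located.

```python
def mark_idiom(tokens, start, end, open_tok="<idiom>", close_tok="</idiom>"):
    """Return a copy of `tokens` with marker tokens around tokens[start:end].

    The marker strings are hypothetical placeholders, not the paper's tokens.
    """
    if not (0 <= start < end <= len(tokens)):
        raise ValueError("idiom span out of range")
    return (
        tokens[:start]
        + [open_tok]
        + tokens[start:end]
        + [close_tok]
        + tokens[end:]
    )


# Illustrative source sentence; the idiom span (tokens 3..4) is given by hand.
source = ["그는", "그", "일에서", "손을", "떼었다"]
marked = mark_idiom(source, 3, 5)
print(" ".join(marked))
```

In practice the marker tokens would also need to be added to the model vocabulary so that the tokenizer does not split them; that step depends on the toolkit and is not shown here.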

Methodology for Overcoming the Problem of Position Embedding Length Limitation in Pre-training Models (사전 학습 모델의 위치 임베딩 길이 제한 문제를 극복하기 위한 방법론)

  • Minsu Jeong;Tak-Sung Heo;Juhwan Lee;Jisu Kim;Kyounguk Lee;Kyungsun Kim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.463-467 / 2023
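  • When a pre-trained model is fine-tuned on specific data, the maximum length must remain the maximum-length parameter used in pre-training. This is a drawback for some tasks that require processing relatively long sequences. This study presents a methodology for overcoming the performance degradation caused by the sequence length limit when pre-trained models are used for question answering (QA), a task that requires processing relatively long sequences. Performance was verified with BERT and RoBERTa on four datasets drawn from KorQuAD v1.0 and AIHub, and the experiments confirmed improved performance on data whose documents are long on average.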

Claim Detection and Stance Classification through Pattern Extraction Learning in Korean (패턴 추출 학습을 통한 한국어 주장 탐지 및 입장 분류)

  • Woojin Lee;Seokwon Jeong;Tae-il Kim;Sung-won Choi;Harksoo Kim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.234-238 / 2023
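  • Fine-tuning is used as the standard technique for pre-trained models in most research, but the recent emergence of very large models and problems such as environmental pollution call for more efficient ways to use pre-trained models. Pattern extraction learning is a method proposed for using pre-trained models efficiently, and in this paper we implement a model that applies it to Korean claim detection and stance classification. Through comparative experiments against conventional fine-tuned models, we show that the implemented Korean claim detection and stance classification model can effectively exploit the internal knowledge the model acquired during pre-training.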

A Study about Efficient Method for Training the Reward Model in RLHF (인간 피드백 기반 강화학습 (RLHF)에서 보상 모델의 효과적인 훈련 방법에 관한 연구)

  • Jeongwook Kim;Imatitikua Danielle Aiyanyo;Heuiseok Lim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.245-250 / 2023
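  • RLHF (Reinforcement Learning from Human Feedback) has recently been applied to many high-performance language models. The method uses a reward model and human feedback to make a language model generate responses that people are likely to prefer. However, for RLHF as applied to commercial language models, the implementation is not disclosed precisely. In particular, how the reward model is set up matters most, since it plays the role of the environment in reinforcement learning, yet open-source implementations each differ on this point. This study experiments on which of the two main branches of reward model training, 'ranking-based training' and 'classification-based training', is more efficient. It also offers a conjecture, based on analysis of the experimental results, about why the efficiency differs.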
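
The two training families named in this abstract can be contrasted with a small sketch. It assumes commonly used formulations: a pairwise loss of the form -log sigmoid(r_chosen - r_rejected) for ranking, and binary cross-entropy on a single labeled response for classification. The abstract does not give the paper's exact losses, so the function names and formulas below are illustrative assumptions.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def ranking_loss(reward_chosen, reward_rejected):
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    Small when the preferred response receives the higher reward.
    """
    return softplus(reward_rejected - reward_chosen)


def classification_loss(logit, label):
    """Binary cross-entropy for one response; label is 1 (good) or 0 (bad)."""
    return label * softplus(-logit) + (1 - label) * softplus(logit)


# Ranking needs a (chosen, rejected) pair; classification needs one label each.
print(ranking_loss(1.5, -0.5))
print(classification_loss(1.5, 1), classification_loss(-0.5, 0))
```

The practical difference is in the data each loss consumes: the ranking loss only sees relative preferences within a pair, while the classification loss requires an absolute good/bad label per response.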

Synonyms/Antonyms-Based Data Augmentation For Training TOEIC Problems Solving Model (토익 문제 풀이 모델 학습을 위한 유의어/반의어 기반 데이터 증강 기법)

  • Jeongwoo Lee;Aiyanyo Imatitikua Danielle;Heuiseok Lim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.333-335 / 2023
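  • Much recent research addresses understanding text and inferring answers, with machine reading comprehension as a representative example. Various datasets for machine reading comprehension are publicly available, but for TOEIC, which has long been widely used to assess people's English proficiency, there are almost no officially released datasets, and research on it is not active. This study therefore proposes a data augmentation technique for improving machine reading comprehension models in the current data-scarce situation. The proposed method uses WordNet to augment data on the basis of synonyms and antonyms, very simply yet efficiently, so that the result resembles real TOEIC questions, and experiments confirmed that the method is meaningful. Through this study we aim to ease the data shortage for TOEIC and to enable strong, human-level performance.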

Evaluation on Hearing Conservation Program in the Noisy Industries (소음발생 산업장에서의 청력보존프로그램 평가)

  • Kwak, M.S.;Lee, J.T.;Kim, J.H.;Urm, S.H.;Kim, D.H.;Shon, B.C.;Lee, C.H.
    • Journal of Preventive Medicine and Public Health / v.30 no.4 s.59 / pp.815-829 / 1997
  • This study was performed to help employers establish effective hearing conservation programs in noisy industries. The study subjects were the health care managers of each industry, and the industries were divided into two groups (Group I, 37 industries with workers diagnosed with noise-induced hearing loss; Group II, 41 industries without such workers). The survey was carried out through face-to-face interviews. The questionnaire on OSHA's hearing conservation program (HCP) consisted of seven components: 5 questions on monitoring of employee noise exposures (component 1), 6 questions on the institution of engineering, work practice, and administrative controls for excessive noise (component 2), 8 questions on the provision of each overexposed employee with an individually fitted hearing protector with an adequate noise reduction rating (component 3), 14 questions on employee training and education regarding noise hazards and protection measures (component 4), 9 questions on baseline and annual audiometry (component 5), 3 questions on procedures for preventing further occupational hearing loss by an employee whenever such an event has been identified (component 6), and 1 question on record keeping (component 7), for a total of 46 questions. The numbers of questions showing a statistically significant difference (p<0.05) between the two groups were 2 (25.0%) of the 8 questions in component 3, 10 (71.4%) of the 14 questions in component 4, 3 (33.3%) of the 9 questions in component 5, 2 (66.7%) of the 3 questions in component 6, and 17 (37.0%) of the 46 questions overall. These results showed that the level of HCP acceptance was lower in Group I than in Group II.
Thus the employer's understanding of the HCP should come first for an effective employee hearing conservation program, and adequate hearing protectors, training and education, baseline and annual audiometry, and procedures for preventing further occupational hearing loss should be more strongly emphasized.

Factors Affecting Burnout of Staff in Emergency Medical Service (Focusing on 119 rescuers in Busan and Gyeongnam) (응급의료종사자의 소진 영향요인 (부산, 경남 지역 119구급대원 중심으로))

  • Hong, Hui-Jeong;Sung, Mi Hae
    • Korean Journal of Occupational Health Nursing / v.21 no.2 / pp.164-173 / 2012
  • Purpose: This study aims to investigate the degree of burnout of 119 rescuers, to determine the factors influencing their burnout, and to provide basic data for developing intervention programs to prevent burnout. Methods: The subjects were 119 rescuers working at fire stations located in Busan and Gyeongnam. The data were collected from May 1, 2010 to June 30, 2010 and analyzed with SPSS/WIN 17.0. Results: When the difference in the degree of burnout by general characteristics was investigated, mean burnout differed significantly by age, period of service, position, license, annual salary, desire to continue working, and type of working hours. Job stress, job satisfaction, social support, self-efficacy, and self-esteem showed statistically significant correlations with burnout. Burnout was higher with more job stress and with lower job satisfaction, social support, self-efficacy, and self-esteem. As a significant factor affecting the degree of burnout of 119 rescuers, job stress alone had an explanatory power of 47.3%. The combined explanatory power of job stress, job satisfaction, self-efficacy, type of working hours, annual salary, and license was 62%, with job stress contributing the most. Conclusion: From these results, job stress, job satisfaction, and self-efficacy were found to be factors affecting the burnout of 119 rescuers. Therefore, intervention programs to reduce job stress and to improve job satisfaction, social support, self-efficacy, and self-esteem should be developed to lower the degree of burnout of 119 rescuers. In addition, further research analyzing the work of 119 rescuers, together with legal and institutional strategies to improve their treatment, is necessary, and supplementary training in various practices for different circumstances, based on standardized protocols, should be conducted.

Relation to use of oral hygiene devices in the adults (연령층별 성인의 개인구강관리보조용품 사용 여부와의 관련성)

  • Moon, Jung-Eun;Lee, Eun-Ju
    • Journal of Korean society of Dental Hygiene / v.16 no.3 / pp.427-434 / 2016
  • Objectives: This study aims to investigate the factors affecting the use of individual oral hygiene devices in adults by age group, to help community residents maintain healthy dental hygiene, and to provide educational materials for dental hygiene and basic data for program development. The purpose of the study is to investigate the relation to the use of oral hygiene devices in adults. Methods: The subjects were 9,073 adults from the sixth KNHANES, from January 2013 to December 2014. The study consisted of a questionnaire survey and a direct physical examination. The questionnaire covered the general characteristics and the oral health characteristics of the subjects. The general characteristics consisted of subjective perception of health and chronic diseases. The oral health characteristics consisted of subjective oral health perception, dental caries, periodontal disease, annual oral examination, toothbrushing, prosthetics, implant surgery, and use of individual oral hygiene devices. Results: Those aged 40 to 64 years were the top users of oral hygiene devices. They perceived their dental hygiene as normal because they did not have periodontal disease, but most of them had dental caries. They used oral hygiene devices three times a day and brushed their teeth more than three times a day. They had an annual dental checkup. Conclusions: It is necessary to promote the use of oral hygiene devices to prevent dental caries and periodontal disease. Continuous training for dental hygienists is very important because dental hygienists are the first line of prevention of dental caries and periodontal disease.

Towards Conservation of Omani Local Chicken: Phenotypic Characteristics, Management Practices and Performance Traits

  • Al-Qamashoui, B.;Mahgoub, O.;Kadim, I.;Schlecht, E.
    • Asian-Australasian Journal of Animal Sciences / v.27 no.6 / pp.767-777 / 2014
  • Characterizing local chicken types and their mostly rural production systems is a prerequisite for designing and implementing development and conservation programs. This study evaluated the management practices of small-scale chicken keepers and the phenotypic and production traits of their chickens in Oman, where conservation programs for local livestock breeds have recently started. Free-range scavenging was the dominant production system, and logistic regression analysis showed that socio-economic factors such as training in poultry keeping, household income, income from farming, and gender of chicken owners influenced feeding, housing, and health care practices (p<0.05). A large variation in plumage and shank colors, comb types, and other phenotypic traits was observed within and between Omani chicken populations. Male and female body weight differed (p<0.05), being $1.3 \pm 0.65$ kg and $1.1 \pm 0.86$ kg respectively. Flock size averaged $22 \pm 7.7$ birds per household, with 4.8 hens per cock. Clutch size was $12.3 \pm 2.85$ and annual production $64.5 \pm 2.85$ eggs per hen. Egg hatchability averaged $88 \pm 6.0\%$ and annual chicken mortality across all age and sex categories was $16 \pm 1.4\%$. The strong involvement of women in chicken keeping makes them key stakeholders in future development and conservation programs, but the latter should be preceded by a comprehensive study of the genetic diversity of the Omani chicken populations.

Estimation of extreme sea levels at tide-dominated coastal zone (조석이 지배적인 해역의 극치해면 산정)

  • Kang, Ju Whan;Kim, Yang-Seon;Cho, Hongyeon;Shim, Jae-Seol
    • Journal of Korean Society of Coastal and Ocean Engineers / v.24 no.6 / pp.381-389 / 2012
  • An EST-based method was developed for estimating extreme sea levels from short sea-level records in a tide-dominated coastal zone. In this method, the annual maximum tidal level is chosen from simulated 1-year tidal data composed of independent daily high water levels, short-term and long-term surge heights, and typhoon-induced surge heights. The high water levels are generated by considering not only spring/neap tides and the annual tide but also the 18.6-year lunar nodal cycle. Typhoon-induced surges are selected from a training set constructed from observed or simulated surge heights. This yearly simulation is repeated for many hundreds of years to yield the extreme tidal levels, and the whole process is repeated many hundreds of times to obtain robust statistics of the levels. In addition, the method is validated by comparing its results with those of other studies using the tidal data of Mokpo Harbor.
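
The simulate-a-year, take-the-maximum, repeat structure described in this abstract can be sketched as a toy Monte Carlo loop. This is not the EST method itself: the Gaussian high-water and surge models, the daily typhoon probability, the example surge values, and all function names are hypothetical simplifications, and the spring/neap, annual, and 18.6-year nodal components are not modeled.

```python
import random
import statistics


def simulate_annual_max(rng, typhoon_surges, days=365, typhoon_prob=0.01):
    """Annual maximum of (daily high water + surge) for one simulated year."""
    annual_max = float("-inf")
    for _ in range(days):
        high_water = rng.gauss(4.0, 0.5)  # hypothetical daily high water (m)
        surge = rng.gauss(0.0, 0.1)       # hypothetical non-typhoon surge (m)
        if rng.random() < typhoon_prob:
            # Resample a typhoon surge from the training set.
            surge += rng.choice(typhoon_surges)
        annual_max = max(annual_max, high_water + surge)
    return annual_max


def simulate_extremes(n_years, n_repeats, typhoon_surges, seed=0):
    """For each repeat, return the largest annual maximum over n_years."""
    rng = random.Random(seed)
    return [
        max(simulate_annual_max(rng, typhoon_surges) for _ in range(n_years))
        for _ in range(n_repeats)
    ]


# Made-up training set of typhoon surge heights in metres.
levels = simulate_extremes(n_years=100, n_repeats=10, typhoon_surges=[0.4, 0.8, 1.2])
print(statistics.mean(levels), statistics.stdev(levels))
```

Repeating the whole multi-year run, as the outer loop does, is what gives a spread for the estimated extreme level rather than a single value.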