Search | Korea Science

Evaluating Korean Machine Reading Comprehension Generalization Performance using Cross and Blind Dataset Assessment (기계독해 데이터셋의 교차 평가 및 블라인드 평가를 통한 한국어 기계독해의 일반화 성능 평가)

Lim, Joon-Ho;Kim, Hyunki
- Annual Conference on Human and Language Technology
- 2019.10a
- pp.213-218
- 2019
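Machine reading comprehension (MRC) is the task of finding the answer expressed within a passage, given a natural-language question and that passage. Recently, as in other natural language processing tasks, approaches that take a pre-trained language model such as BERT, XLNet, or RoBERTa and fine-tune it to predict the answer boundaries for an input question and passage have performed well in MRC; in particular, they reach an F1 of over 94% when trained and evaluated on the KorQuAD v1.0 dataset. This paper evaluates how well state-of-the-art MRC technology generalizes to general question-passage pairs, rather than to evaluation sets that resemble the training set. To this end, we first performed cross-dataset evaluation using the publicly available Korean KorQuAD v1.0 and NIA v2017 datasets and the Exobrain v2018 dataset built in the Exobrain project. The cross evaluation showed that generalization performance is related to dataset statistics such as answer length and the overlap ratio between question and passage. Next, we ran a blind evaluation of an MRC model trained with the KorBERT pre-trained language model on all 210,000 available MRC training examples. The blind evaluation measured the generalization performance of this general-domain model on a legal-domain evaluation set, and its MRC performance on an evaluation set built by writing the questions first and then retrieving the answer passages, rather than by writing questions after reading the answer passage. The model using a pre-trained language model generalized far better than an MRC model without one, but its performance remained at or below 80% on evaluation sets with long answers and a low lexical overlap ratio between question and passage. Our experiments confirm that, by the nature of the task, difficulty and generalization performance in MRC vary with the lexical overlap between question and answer and with answer length, and that developing MRC models for general questions and passages requires generalization evaluation on diverse types of evaluation sets.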
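The two quantities this abstract relies on, answer F1 and the question-passage overlap ratio, can be illustrated with a minimal sketch. It assumes whitespace tokenization and a SQuAD-style token-level F1; the paper's own tokenization and exact overlap definition are not given in the abstract, and the function names `token_f1` and `question_passage_overlap` are hypothetical.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer.

    Assumption: whitespace tokens, SQuAD-style overlap counting.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def question_passage_overlap(question: str, passage: str) -> float:
    """Fraction of question tokens that also occur in the passage.

    Assumption: one plausible definition of the overlap ratio.
    """
    q_tokens = question.split()
    if not q_tokens:
        return 0.0
    passage_vocab = set(passage.split())
    return sum(tok in passage_vocab for tok in q_tokens) / len(q_tokens)
```

Under these assumptions, a question written before its passage was retrieved would tend to share fewer tokens with the passage, which is the low-overlap condition the abstract associates with weaker generalization.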

A Training Feasibility Evaluation of Nuclear Safeguards Terms for the Large Language Model (LLM) (거대언어모델에 대한 원자력 안전조치 용어 적용 가능성 평가)

Sung-Ho Yoon
- Proceedings of the Korean Society of Computer Information Conference
- 2024.01a
- pp.479-480
- 2024
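This paper qualitatively evaluates the answers to safeguards-related questions given by a publicly available large language model (LLM) that was further trained on nuclear safeguards terminology using a fine-tuning algorithm. For questions within the scope of the training data, the fine-tuned model produced answers that added low-level reasoning drawn from the additional training data to the base model's answers. The evaluation yielded directions for improving the additional training, and the results are expected to be useful for building low-cost domain-specific language models.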

Data modeling and algorithms design for implementing Competency-based Learning Outcomes Assessment System (역량기반 학습성과 평가 시스템 구현을 위한 데이터 모델링 및 알고리즘 설계)

Chung, Hyun-Sook;Kim, Jung-Min
- Journal of Convergence for Information Technology
- v.11 no.11
- pp.335-344
- 2021
This paper develops course data models and learning-achievement computation algorithms that enable course-embedded assessment (CEA), which is essential to competency-based education in higher education. Previous work on CEA lacks a systematic solution for CEA computation. We propose data models and algorithms for implementing a competency-based assessment system. The data models consist of a layered architecture of learning outcomes, learning modules, and activities, together with an associative matrix of learning outcomes and activities. The proposed methods can serve as core modules in a course-embedded assessment system. We evaluated the models by applying them to a practical course, Java Programming, and the experimental results indicate that they can be used as a core module of such an assessment system.
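To make the idea of an associative matrix of learning outcomes and activities concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes each row of the matrix holds the weights linking one learning outcome to every activity, that activity scores are normalized to the range 0 to 1, and that achievement is the weighted average of those scores. The name `outcome_achievement` is hypothetical.

```python
def outcome_achievement(assoc, scores):
    """Weighted-average achievement per learning outcome.

    assoc[o][a] is the (assumed) weight linking outcome o to activity a;
    scores[a] is the learner's normalized score on activity a.
    """
    achievements = []
    for weights in assoc:
        total_weight = sum(weights)
        if total_weight == 0:
            # An outcome with no linked activity cannot be measured.
            achievements.append(0.0)
            continue
        weighted = sum(w * s for w, s in zip(weights, scores))
        achievements.append(weighted / total_weight)
    return achievements


# Two outcomes, three activities (quiz, lab, project).
assoc = [
    [1, 1, 0],  # outcome 0 is assessed by the quiz and the lab
    [0, 1, 3],  # outcome 1 is assessed mostly by the project
]
scores = [0.5, 1.0, 0.0]
print(outcome_achievement(assoc, scores))
```

A layered architecture could reuse the same computation one level up, treating the achievements of lower-level outcomes as the scores for higher-level ones, though that is again an assumption rather than something the abstract states.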
https://doi.org/10.22156/CS4SMB.2021.11.11.335

Measurement of Political Polarization in Korean Language Model by Quantitative Indicator (한국어 언어 모델의 정치 편향성 검증 및 정량적 지표 제안)

Jeongwook Kim;Gyeongmin Kim;Imatitikua Danielle Aiyanyo;Heuiseok Lim
- Annual Conference on Human and Language Technology
- 2022.10a
- pp.16-21
- 2022
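Pre-training corpora include not only Wikipedia articles but also text data from internet communities. Because such data contains linguistic preconceptions and socially biased information, both pre-trained and fine-tuned language models carry bias. This has created a need for indicators that can assess the neutrality of language models, but no measure yet exists for quantitatively evaluating the political neutrality of language AI models. This study proposes an indicator for quantitatively evaluating the political bias of language models and applies it to Korean language models. In the experiments, the language model trained on Wikipedia was the most politically neutral, the models trained on news comments and social review data leaned politically conservative, and the model trained on news articles leaned politically progressive. In addition, a stability check of the proposed evaluation method demonstrates that the political-bias results for each language model are consistent.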

A Study on the Quality Evaluation Model for Cyber Education Supporting System (가상학습 지원시스템의 품질평가 모델에 관한 연구)

강호영;박만곤
- Proceedings of the Korea Multimedia Society Conference
- 2000.11a
- pp.432-436
- 2000
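Virtual learning is emerging, at home and abroad, as a new education and training paradigm for the knowledge and information age: an open learning space that transcends time and space. In particular, with the rapid advance of information processing technology, formal and social educational institutions at home and abroad are actively adopting Internet-based virtual learning for education and training. Although there are many studies on courseware development and learning assessment for virtual learning, research on quality evaluation remains insufficient. This paper therefore presents a quality evaluation model for virtual learning support systems, intended to serve as a quality evaluation standard when such systems are built in educational settings.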

Analyze GPT sentence generation performance based on Image by training data capacity and number of iterations (학습 데이터 용량 및 반복 학습 횟수에 따른 이미지 기반 GPT 문장생성 및 성능 분석)

Dong-Hee Lee;Bong-Jun Choi
- Proceedings of the Korean Society of Computer Information Conference
- 2023.07a
- pp.363-364
- 2023
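Many people now use GPT for a variety of activities and research. When generating sentences with GPT, people consider the accuracy of the sentences important. However, the manner of expression, such as the style of the generated sentences, differs by use case. Consequently, judging whether a generated sentence is meaningful is highly subjective, which makes numerical evaluation difficult. To judge the meaningfulness of sentences generated by natural language processing models, this paper compares outputs according to the amount of training data and the number of training iterations for each model. We built four GPT models in total through fine-tuning. Evaluating the sentences generated by each model with the BLEU metric led to the conclusion that BLEU is unsuitable for this study. To address this, we created a questionnaire to evaluate the resulting models. As a result, the generated output received positive evaluations from human respondents.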
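One way to see why an n-gram metric can undervalue stylistically different but meaningful sentences is to look at clipped n-gram precision, the core component of BLEU. The sketch below is a simplified illustration under the assumption of whitespace tokenization and a single reference; full BLEU additionally combines several n-gram orders with a geometric mean and applies a brevity penalty. The function name `clipped_ngram_precision` and the example sentences are hypothetical, not taken from the paper.

```python
from collections import Counter


def clipped_ngram_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Share of candidate n-grams found in the reference, with counts clipped.

    Clipping means a candidate n-gram is credited at most as many times
    as it occurs in the reference.
    """
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return clipped / total


reference = "the cat sits on the mat"
paraphrase = "a kitten rests on the rug"
print(clipped_ngram_precision(paraphrase, reference, n=1))
print(clipped_ngram_precision(paraphrase, reference, n=2))
```

The paraphrase conveys nearly the same meaning as the reference yet shares only a few surface n-grams with it, which is consistent with the abstract's decision to fall back on a human questionnaire.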

Study on Course-Embedded Learning Achievement Evaluation and Adaptive Feedback (교과기반 학습성취 평가 및 적응형 피드백 시스템 설계)

Chung, Hyun-Sook;Kim, Jung-Min
- The Journal of the Convergence on Culture Technology
- v.8 no.6
- pp.553-560
- 2022
Course-embedded learning evaluation, which measures learners' competency by evaluating learning outcomes, has been studied as a basis for competency-based education in universities. In this paper, we propose a learning evaluation and adaptive feedback model based on learning outcomes, learning subjects, a learning concept graph, and an evaluation matrix. First, we define layered learning outcomes, a graph of learning subjects and concepts, and two association matrices. Second, we define algorithms that calculate the level of learning achievement and generate learning feedback for learners. We applied the proposed method to a specific course, "Java Programming", to validate its effectiveness. The experimental results show that the method can be useful for measuring learners' achievement and providing them with adaptive feedback.
https://doi.org/10.17703/JCCT.2022.8.6.553

Learning Contextual Meaning Representations of Named Entities for Correcting Factual Inconsistent Summary (개체명 문맥의미표현 학습을 통한 기계 요약의 사실 불일치 교정)

Park, Junmo;Noh, Yunseok;Park, Seyoung
- Annual Conference on Human and Language Technology
- 2020.10a
- pp.54-59
- 2020
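Factual inconsistency correction is the task of making the output of a machine summarization system consistent with the actual facts. The most common problem in summary generation research is that incorrect facts are produced when a summary is generated. This is one of the major obstacles to commercializing summarization models as real services. This paper proposes a method that takes named entities from the source document and rewrites the summary into sentences consistent with the facts. To do so, we train a language model to produce good contextual representations of named entities. We then use the trained model to compare the contextual representations of the named entities appearing in the source and in the summary, and resolve factual inconsistencies by replacing entities in the summary with the appropriate words. To evaluate the proposed model, we built training data from abstractive summarization data and trained on it, and to verify applicability in a realistic scenario we ran experiments on summaries produced by a model. In both automatic and human evaluation, the proposed model outperformed the comparison models.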
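The comparison step described here, matching a named entity in the summary against entities from the source by their contextual representations, can be sketched as a nearest-neighbor lookup. This is only an illustration: it assumes cosine similarity as the comparison measure and uses tiny hand-made vectors in place of a trained model's representations. The abstract does not specify the actual similarity measure or replacement rule, and the names `cosine` and `best_source_entity` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_source_entity(summary_vec, source_entities):
    """Return the source entity whose representation is closest to summary_vec.

    source_entities maps an entity string to its (assumed) contextual vector.
    """
    return max(source_entities, key=lambda name: cosine(summary_vec, source_entities[name]))


# Hypothetical 2-dimensional vectors standing in for model outputs.
source_entities = {"Seoul": [1.0, 0.0], "Busan": [0.0, 1.0]}
summary_entity_vec = [0.9, 0.1]  # context of the entity slot in the summary
print(best_source_entity(summary_entity_vec, source_entities))
```

If the selected source entity differs from the one written in the summary, the summary entity would be replaced with it, which is the correction the abstract describes.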

Theoretical Exploration of a Process-centered Assessment Model for STEAM Competency Based on Learning Progressions (학습발달과정에 근거한 과정중심 STEAM 역량 평가 모델에 대한 이론적 탐색)

Ryu, Suna;Kwak, Youngsun;Yang, Sung Ho
- Journal of Science Education
- v.42 no.2
- pp.132-147
- 2018
The goal of this research is to suggest a theoretical process-centered assessment model based on learning progressions (LP) of key competencies in the context of STEAM instruction. The "Process-Products Combined Module-type (P2CM) STEAM Assessment Model" (hereafter, the P2CM STEAM Assessment Model) can be used both as an instructional model and as an assessment model, and is applicable to various STEAM topics and instructional types. The model consists of three axes. The X axis represents the 4C competencies that should be emphasized through STEAM instruction. The Y axis represents the types and hierarchy of STEAM instruction. The Z axis represents the assessment standards based on LP. We also give an example of an assessment module that combines the creativity competency with creativity-based instruction, based on the model. Based on the research results, we suggest elaborating assessment models using Korean LP research outcomes, developing and distributing formative assessment models through field-based in-depth research, revising formative assessment models with the participation of teacher communities and in-service teachers, and conducting further research on assessment models for tracking LP.
https://doi.org/10.21796/jse.2018.42.2.132

Attendance Appraisal for Learner Participation Degree Based Virtual Lecture (학습자 참여도 정보기반 가상강좌 출석평가 모델)

Kim, Hyun-Ju
- Journal of the Korea Society of Computer and Information
- v.14 no.4
- pp.119-129
- 2009
The increasing use of computers and high-speed Internet networks has greatly influenced education, causing a shift away from the traditional way of delivering instruction. Specifically, Web-based multimedia technology, interactive activities on the Internet, and satellite broadcasting technology are accelerating the emergence of an educational model based on virtual lectures, which transcends time and space. Such virtual lectures make it possible for the entire teaching-learning process to take place in a virtual learning environment, which gives rise to problems regarding learning guidance, feedback, and appraisal. In this paper, we propose a system for attendance appraisal in virtual lectures based on the degree of learner participation, attendance being one appraisal element in virtual learning environments. The appraisal model lets the elements of a virtual learning environment be configured so that information about the learner's participation in class is reflected in the attendance appraisal of the opened virtual learning environment. In addition, the model motivates learners to participate actively in the virtual learning environment and supports instructors by performing the activities needed for attendance appraisal.
https://doi.org/10.9708/jksci.2009.14.4.119

Search Result 1,451, Processing Time 0.031 seconds
