• Title/Summary/Keyword: GPT-4-based ChatGPT (GPT-4기반의 ChatGPT)

Exploring automatic scoring of mathematical descriptive assessment using prompt engineering with the GPT-4 model: Focused on permutations and combinations (프롬프트 엔지니어링을 통한 GPT-4 모델의 수학 서술형 평가 자동 채점 탐색: 순열과 조합을 중심으로)

  • Byoungchul Shin;Junsu Lee;Yunjoo Yoo
    • The Mathematical Education
    • /
    • v.63 no.2
    • /
    • pp.187-207
    • /
    • 2024
  • In this study, we explored the feasibility of automatically scoring descriptive assessment items using GPT-4-based ChatGPT by comparing and analyzing the scoring results of teachers and of GPT-4-based ChatGPT. For this purpose, three descriptive items from the permutation and combination unit for first-year high school students were selected from the KICE (Korea Institute for Curriculum and Evaluation) website. Items 1 and 2 had only one problem-solving strategy, while Item 3 had more than two strategies. Two teachers, each with over eight years of teaching experience, graded answers from 204 students, and their scores were compared with the results from GPT-4-based ChatGPT. Various prompting techniques, such as few-shot chain-of-thought (Few-Shot-CoT), self-consistency (SC), structured, and iterative prompts, were used to construct scoring prompts, which were then input into GPT-4-based ChatGPT for scoring. The scoring results for Items 1 and 2 showed a strong correlation between the teachers' and GPT-4's scoring. For Item 3, which involved multiple problem-solving strategies, the student answers were first classified according to their strategies using prompts input into GPT-4-based ChatGPT. Following this classification, scoring prompts tailored to each type were applied and input into GPT-4-based ChatGPT for scoring, and these results also showed a strong correlation with the teachers' scoring. Through this, the potential for GPT-4 models utilizing prompt engineering to assist in teachers' scoring was confirmed, and the limitations of this study and directions for future research were presented.
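
A rough sketch of the kind of scoring pipeline described above, combining a few-shot chain-of-thought rubric prompt with self-consistency voting over several samples, is shown below. It assumes the OpenAI Python SDK's chat.completions interface; the rubric text, model name, and score-parsing convention are illustrative assumptions, not the authors' actual prompts.

```python
# Hypothetical Few-Shot-CoT + self-consistency (SC) scoring sketch (not the authors' prompts).
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

FEW_SHOT_RUBRIC = """You are grading a high-school answer on permutations and combinations.
Rubric: 2 points for a correct counting strategy, 1 point for correct arithmetic, 1 point for the final answer.
Example answer: "<worked example>" -> Reasoning: <step-by-step grading> -> Score: 3
Grade the next answer. Think step by step, then output 'Score: <0-4>' on the last line."""

def score_once(student_answer: str) -> int:
    """One chain-of-thought grading call; returns the integer score parsed from the last line."""
    resp = client.chat.completions.create(
        model="gpt-4",
        temperature=0.7,  # nonzero so self-consistency samples diverse reasoning paths
        messages=[
            {"role": "system", "content": FEW_SHOT_RUBRIC},
            {"role": "user", "content": student_answer},
        ],
    )
    last_line = resp.choices[0].message.content.strip().splitlines()[-1]
    return int(last_line.split(":")[-1].strip())

def score_with_self_consistency(student_answer: str, n_samples: int = 5) -> int:
    """Sample several reasoning paths and return the majority-vote (SC) score."""
    votes = Counter(score_once(student_answer) for _ in range(n_samples))
    return votes.most_common(1)[0][0]
```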

A Study on the University Education Plan Using ChatGPT for University Students (ChatGPT를 활용한 대학 교육 방안 연구)

  • Hyun-ju Kim;Jinyoung Lee
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.1
    • /
    • pp.71-79
    • /
    • 2024
  • ChatGPT, an interactive artificial intelligence (AI) chatbot developed by OpenAI in the U.S., is gaining popularity and causing great repercussions around the world. Some academics are concerned that students may use ChatGPT for plagiarism, but ChatGPT is also widely used in positive ways, such as writing marketing copy or website text. There is also an opinion that ChatGPT could be a new future for "search," and some analysts say that the focus should be on fostering it rather than on excessive regulation. This study analyzed college students' awareness of ChatGPT through a survey of their perceptions of it. In addition, an education support model combining ChatGPT with a plagiarism inspection system was prepared. Based on this, a university education support model using ChatGPT was constructed. The model is organized around text, digital, and art, with the detailed strategies required for the era of the Fourth Industrial Revolution arranged under each area. It also guides students to use ChatGPT within an allowed range: the instructor first determines how much ChatGPT-generated content is acceptable for the learning goal, and the ChatGPT detection function provided by the plagiarism inspection system is then used to keep students within that range. By linking ChatGPT and the plagiarism inspection system in this way, the model is expected to prevent situations in which ChatGPT's capabilities are abused in education.

A Qualitative Research on Exploring Consideration Factors for Educational Use of ChatGPT (ChatGPT의 교육적 활용 고려 요소 탐색을 위한 질적 연구)

  • Hyeongjong Han
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.4
    • /
    • pp.659-666
    • /
    • 2023
  • Among the tools based on generative artificial intelligence, the possibility of using ChatGPT is being explored. However, few studies have identified, based on learners' actual perceptions, which factors should be considered when it is used in education. Using a qualitative research method, this study derived factors to consider when using ChatGPT in education. The results showed five key factors: critical thinking about generated information, recognizing it as a tool that supports learning and avoiding dependent use, conducting prior training on ethical usage, generating clear and appropriate questions, and reviewing and synthesizing answers. It is necessary to develop an instructional design model that comprehensively incorporates these elements.

Can ChatGPT Pass the National Korean Occupational Therapy Licensure Examination? (ChatGPT는 한국작업치료사면허시험에 합격할 수 있을까?)

  • Hong, Junhwa;Kim, Nayeon;Min, Hyemin;Yang, Hamin;Lee, Sihyun;Choi, Seojin;Park, Jin-Hyuck
    • Therapeutic Science for Rehabilitation
    • /
    • v.13 no.1
    • /
    • pp.65-74
    • /
    • 2024
  • Objective: This study assessed ChatGPT, an artificial intelligence system based on a large language model, for its ability to pass the National Korean Occupational Therapy Licensure Examination (NKOTLE). Methods: Using NKOTLE questions from 2018 to 2022, provided by the Korea Health and Medical Personnel Examination Institute, this study employed English prompts to determine the accuracy of ChatGPT in providing correct answers. Two researchers independently conducted the entire process, and their average accuracy was used to determine whether ChatGPT passed each of the five years' examinations. The degree of agreement between the ChatGPT answers obtained by the two researchers was also assessed. Results: ChatGPT passed the 2020 examination but failed the examinations of the other four years. Specifically, its accuracy on questions related to medical regulations ranged from 25% to 57%, whereas its accuracy on other questions exceeded 60%. ChatGPT exhibited strong agreement between researchers, except for medical regulation questions, and this agreement was significantly correlated with accuracy. Conclusion: There are still limitations in applying ChatGPT to questions influenced by language or culture. Future studies should explore its potential as an educational tool for students majoring in occupational therapy through optimized prompts and continuous learning from the data.
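
A minimal sketch of the evaluation protocol described above, assuming multiple-choice items, two independent ChatGPT runs, and Cohen's kappa as the agreement measure; the kappa choice, the toy data, and the 60% cutoff are illustrative assumptions, not the study's exact procedure.

```python
# Per-run accuracy, averaged accuracy for the pass/fail decision, and inter-run agreement.
from sklearn.metrics import cohen_kappa_score

def evaluate_runs(answers_run1, answers_run2, answer_key):
    """answers_run*: options chosen by ChatGPT in each researcher's run; answer_key: correct options."""
    acc1 = sum(a == k for a, k in zip(answers_run1, answer_key)) / len(answer_key)
    acc2 = sum(a == k for a, k in zip(answers_run2, answer_key)) / len(answer_key)
    mean_acc = (acc1 + acc2) / 2                            # average accuracy across the two runs
    kappa = cohen_kappa_score(answers_run1, answers_run2)   # agreement between the two runs
    return mean_acc, kappa

# Toy example with three items and a hypothetical 60% passing cutoff
mean_acc, kappa = evaluate_runs(["1", "3", "2"], ["1", "3", "4"], ["1", "3", "2"])
print(f"mean accuracy={mean_acc:.2f}, agreement kappa={kappa:.2f}, passed={mean_acc >= 0.6}")
```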

A Study on Evaluating Summarization Performance Using a Generative AI Model (생성형 AI 모델을 활용한 요약 성능 평가 연구)

  • Gyuri Choi;Seoyoon Park;Yejee Kang;Hansaem Kim
    • Annual Conference on Human and Language Technology
    • /
    • 2023.10a
    • /
    • pp.228-233
    • /
    • 2023
  • Manual human evaluation inevitably suffers from limitations such as the time and cost involved, disagreement among annotators, and the quality of the evaluation results. This paper examines whether evaluating Korean summaries with ChatGPT, which can take context into account and handle long inputs and outputs, can replace or assist human evaluation. To this end, we conducted quantitative and qualitative evaluations of summaries generated by ChatGPT, using BERTScore as the quantitative metric and coherence, relevance, grammaticality, and fluency as the qualitative criteria. The results confirmed that ChatGPT-4 has the potential to assist manual human evaluation. Considering that ChatGPT is trained primarily on English data, we conducted an additional evaluation with erroneous Korean summaries to verify its error-detection performance. The results showed that the error-summary evaluation performance of ChatGPT-3.5 and ChatGPT-4 was unstable, indicating that it is still difficult for them to assist humans in this regard.
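
The two evaluation tracks mentioned above could be sketched roughly as follows: BERTScore as the quantitative metric and a GPT-4 prompt rating coherence, relevance, grammaticality, and fluency. The rating prompt and the 1-5 scale are illustrative assumptions, not the paper's actual instrument.

```python
# Sketch of quantitative (BERTScore) and qualitative (LLM-rated) summary evaluation.
from bert_score import score as bertscore
from openai import OpenAI

client = OpenAI()

def quantitative(candidates, references):
    """Mean BERTScore F1 for Korean candidate summaries against references."""
    _, _, f1 = bertscore(candidates, references, lang="ko")
    return f1.mean().item()

def qualitative(source_text, summary):
    """Ask GPT-4 to rate the summary on four criteria (1-5 each)."""
    prompt = (
        "Rate the following Korean summary of the source text on coherence, "
        "relevance, grammaticality, and fluency, each on a 1-5 scale, with a brief justification.\n"
        f"Source:\n{source_text}\n\nSummary:\n{summary}"
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```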

A Study on the Evaluation of LLM's Gameplay Capabilities in Interactive Text-Based Games (대화형 텍스트 기반 게임에서 LLM의 게임플레이 기능 평가에 관한 연구)

  • Dongcheul Lee
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.3
    • /
    • pp.87-94
    • /
    • 2024
  • We investigated the feasibility of using Large Language Models (LLMs) to play text-based games without prior training on game data. We adopted ChatGPT-3.5 and the more advanced ChatGPT-4 as the systems implementing the LLM. In addition, we added the persistent memory feature proposed in this paper to ChatGPT-4, yielding three game-player agents. We used Zork, one of the most famous text-based games, to see whether the agents could navigate complex locations, gather information, and solve puzzles. The results showed that the agent with persistent memory had the widest range of exploration and the best score among the three agents. However, all three agents were limited in solving puzzles, indicating that LLMs are vulnerable to problems that require multi-level reasoning. Nevertheless, the proposed agent was still able to visit 37.3% of the total locations and collect all the items in the locations it visited, demonstrating the potential of LLMs.
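
An illustrative sketch of a text-game agent with the kind of persistent memory the paper proposes: each turn, the game observation plus an accumulated note list is sent to the model, and the notes are updated so information survives beyond the context window. The memory format and prompts are assumptions, not the paper's implementation.

```python
# Hypothetical persistent-memory agent loop for an interactive text game.
from openai import OpenAI

client = OpenAI()

class PersistentMemoryAgent:
    def __init__(self, model="gpt-4"):
        self.model = model
        self.memory = []  # durable notes about visited locations, items, and puzzle attempts

    def act(self, observation: str) -> str:
        notes = "\n".join(self.memory) or "(no notes yet)"
        resp = client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": "You are playing the text adventure Zork. "
                                              "Reply with a single game command."},
                {"role": "user", "content": f"Notes so far:\n{notes}\n\nObservation:\n{observation}"},
            ],
        )
        command = resp.choices[0].message.content.strip()
        # Persist a compressed note so earlier discoveries remain available in later turns.
        self.memory.append(f"Saw: {observation[:80]} -> did: {command}")
        return command
```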

Development of ChatGPT-based Medical Text Augmentation Tool for Synthetic Text Generation (합성 텍스트 생성을 위한 ChatGPT 기반 의료 텍스트 증강 도구 개발)

  • Jin-Woo Kong;Gi-Youn Kim;Yu-Seop Kim;Byoung-Doo Oh
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.07a
    • /
    • pp.3-4
    • /
    • 2023
  • Natural language processing has great potential because it can extract meaningful information and patterns from the unstructured data of electronic medical records, in which vast amounts of information are collected, thereby supporting clinicians' decision-making and enabling better diagnosis and treatment for patients. However, electronic medical records contain a great deal of sensitive information, such as personal data, which makes them difficult to access and therefore difficult to obtain in sufficient quantity. In this paper, we therefore developed a ChatGPT-based medical text augmentation tool to generate reliable synthetic medical text. It generates synthetic medical data from real medical text entered by the user. To this end, we explored suitable prompts and preprocessing methods for medical text. We confirmed that the ChatGPT-based medical text augmentation tool preserved the key keywords of the input text well and could generate fact-based synthetic medical text.
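
A minimal sketch of ChatGPT-based medical text augmentation in the spirit of the tool above: the model is asked to paraphrase a de-identified clinical note while preserving its key clinical terms. The prompt wording, model name, and sampling settings are illustrative assumptions.

```python
# Hypothetical synthetic-text augmentation via paraphrasing that preserves clinical keywords.
from openai import OpenAI

client = OpenAI()

def augment_note(note: str, n_variants: int = 3) -> list[str]:
    """Generate synthetic variants of a clinical note that keep its core keywords."""
    resp = client.chat.completions.create(
        model="gpt-4",
        n=n_variants,        # several paraphrases per source note
        temperature=0.9,     # encourage lexical diversity across variants
        messages=[{
            "role": "user",
            "content": ("Paraphrase the following clinical note. Keep every diagnosis, "
                        "medication, and measurement exactly as written; change only the "
                        f"surrounding wording.\n\n{note}"),
        }],
    )
    return [choice.message.content for choice in resp.choices]
```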

An Exploratory Study on the Trustworthiness Analysis of Generative AI (생성형 AI의 신뢰도에 대한 탐색적 연구)

  • Soyon Kim;Ji Yeon Cho;Bong Gyou Lee
    • Journal of Internet Computing and Services
    • /
    • v.25 no.1
    • /
    • pp.79-90
    • /
    • 2024
  • This study focused on user trust in ChatGPT, a generative AI technology, and explored the factors that affect usage and the intention to continue using it, as well as whether the influence of trust varies depending on the purpose of use. For this purpose, a survey was conducted targeting people in their 20s and 30s, the group that uses ChatGPT the most. The statistical analysis was performed with IBM SPSS 27 and SmartPLS 4.0. A structural equation model was formulated on the foundation of Bhattacherjee's Expectation-Confirmation Model (ECM), employing path analysis and Multi-Group Analysis (MGA) for hypothesis validation. The main findings are as follows. First, ChatGPT is mainly used for specific needs or objectives rather than as a daily tool. The majority of users are aware of its hallucination effects; however, this did not hinder its use. Second, the hypothesis testing indicated that the independent variables, expectation-confirmation, perceived usefulness, and user satisfaction, all exert a positive influence on the dependent variable, continuance intention. Third, the influence of trust varied depending on the user's purpose in using ChatGPT: trust was significant when ChatGPT was used for information retrieval but not when it was used for creative purposes. This study can be used to address trust issues in the process of introducing generative AI into society and companies, and to establish policies and derive improvement measures for its successful adoption.
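
For orientation only: the study itself used IBM SPSS 27 and SmartPLS 4.0, but an ECM-style path model with expectation-confirmation, perceived usefulness, satisfaction, trust, and continuance intention could be specified in the open-source semopy package roughly as below. The variable names and paths are assumptions sketched from the abstract, not the authors' measurement model.

```python
# Illustrative ECM-based path model over composite survey scores (assumed column names).
import pandas as pd
import semopy

MODEL_DESC = """
usefulness ~ confirmation
satisfaction ~ confirmation + usefulness
continuance ~ satisfaction + usefulness + trust
"""

def fit_ecm(survey: pd.DataFrame) -> pd.DataFrame:
    """Fit the path model on composite survey scores and return parameter estimates."""
    model = semopy.Model(MODEL_DESC)
    model.fit(survey)
    return model.inspect()  # path coefficients, standard errors, p-values
```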

Analysis of Prompt Engineering Methodologies and Research Status to Improve Inference Capability of ChatGPT and Other Large Language Models (ChatGPT 및 거대언어모델의 추론 능력 향상을 위한 프롬프트 엔지니어링 방법론 및 연구 현황 분석)

  • Sangun Park;Juyoung Kang
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.4
    • /
    • pp.287-308
    • /
    • 2023
  • After launching its service in November 2022, ChatGPT has rapidly increased its number of users and is having a significant impact on all aspects of society, marking a major turning point in the history of artificial intelligence. In particular, the inference ability of large language models such as ChatGPT is improving at a rapid pace through prompt engineering techniques. This reasoning ability is an important consideration for companies that want to adopt artificial intelligence in their workflows and for individuals looking to utilize it. In this paper, we begin with an understanding of in-context learning, which enables inference in large language models, and explain the concept of prompt engineering, inference with in-context learning, and benchmark data. Moreover, we investigate the prompt engineering techniques that have rapidly improved the inference performance of large language models, and the relationships between those techniques.
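
A small sketch contrasting a plain prompt with zero-shot chain-of-thought prompting, one of the reasoning-oriented prompt engineering techniques surveyed in work like the above. The arithmetic question is a made-up example, and the model name is an assumption.

```python
# Direct prompting vs. zero-shot chain-of-thought ("Let's think step by step").
from openai import OpenAI

client = OpenAI()
QUESTION = "A shop sells pens in packs of 12. How many packs are needed for 150 pens?"

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

direct = ask(QUESTION)                               # direct answer
cot = ask(QUESTION + "\nLet's think step by step.")  # zero-shot CoT elicits intermediate reasoning
```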

Token-Based Classification and Dataset Construction for Detecting Modified Profanity (변형된 비속어 탐지를 위한 토큰 기반의 분류 및 데이터셋)

  • Sungmin Ko;Youhyun Shin
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.4
    • /
    • pp.181-188
    • /
    • 2024
  • Traditional profanity detection methods have limitations in identifying intentionally altered profanities. This paper introduces a new method based on Named Entity Recognition, a subfield of Natural Language Processing. We developed a profanity detection technique using sequence labeling, for which we constructed a dataset by labeling profanities in Korean malicious comments, and conducted experiments. Additionally, to enhance the model's performance, we augmented the dataset by labeling parts of a Korean hate speech dataset with ChatGPT, one of the large language models, and conducted training. In this process, we confirmed that simply having humans filter the dataset created by the large language model could improve performance. This suggests that human oversight is still necessary when augmenting datasets with large language models.
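
A rough sketch of sequence labeling for profanity spans in the NER style described above, assuming a BIO tag set and the Hugging Face transformers token-classification API. The checkpoint name and label set are placeholders, not the paper's actual model, and the classification head here is untrained until fine-tuned on the labeled comments.

```python
# Token-level BIO tagging of profanity spans with a generic Korean encoder.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

LABELS = ["O", "B-PROFANITY", "I-PROFANITY"]  # assumed BIO tag set

tokenizer = AutoTokenizer.from_pretrained("klue/bert-base")
model = AutoModelForTokenClassification.from_pretrained("klue/bert-base", num_labels=len(LABELS))

def tag_profanity(comment: str) -> list[tuple[str, str]]:
    """Return (token, label) pairs; labels are arbitrary until the head is fine-tuned."""
    enc = tokenizer(comment, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    preds = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return [(tok, LABELS[p]) for tok, p in zip(tokens, preds)]
```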