Exploration on the Feasibility of Utilization and Teacher Perceptions of Using ChatGPT for Student Assessment in Science

과학 교과의 학생 평가에서 ChatGPT의 활용 가능성 및 교사 인식 탐색

  • Dongwon Lee (Korea Institute for Curriculum and Evaluation)
  • Hyeon-Pyo Shim (Korea Institute for Curriculum and Evaluation)
  • Jongho Baek (Korea Institute for Curriculum and Evaluation)
  • Received : 2024.01.22
  • Accepted : 2024.02.08
  • Published : 2024.02.29

Abstract

This study explores the feasibility of using ChatGPT, a generative artificial intelligence, for student assessment in science. We developed assessment items, collected students' responses, and entered them into ChatGPT so that it carried out the assessment procedure. We then shared ChatGPT's assessment results with science teachers and compared them with the teachers' own assessment process. The scoring rubrics that ChatGPT generated were generally appropriate. However, the scores ChatGPT assigned showed relatively low consistency with the scores assigned by the teachers, and the discrepancy was more pronounced for items that included more assessment components and a more intricate rubric. ChatGPT's feedback on student responses was sometimes scientifically incorrect or beyond the scope of the curriculum, but it also had strengths, such as providing exemplary answers and additional examples that could help students learn further. Given these results, the teachers saw limitations in having ChatGPT conduct assessment because of its reliability, which is considered crucial in student assessment, but suggested that it could be used to support teachers' assessment. Finally, synthesizing these findings, we suggest implications for using ChatGPT in student assessment.
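The consistency between ChatGPT's scores and the teachers' scores could be quantified along the following lines. This is only a minimal sketch: the abstract does not say which agreement statistic the study used, so exact agreement and Cohen's kappa are assumptions chosen for illustration, and the names `teacher_scores` and `chatgpt_scores` and the scores themselves are hypothetical, not data from the study.

```python
from collections import Counter


def exact_agreement(scores_a, scores_b):
    """Proportion of responses that received the same score from both raters."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def cohens_kappa(scores_a, scores_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(scores_a)
    observed = exact_agreement(scores_a, scores_b)
    counts_a, counts_b = Counter(scores_a), Counter(scores_b)
    # Chance agreement: probability both raters pick the same category
    # if each scored independently according to its own score distribution.
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical scores (0-2 points) for eight student responses to one item.
teacher_scores = [2, 1, 0, 2, 1, 2, 0, 1]
chatgpt_scores = [2, 2, 0, 1, 1, 2, 1, 1]

print(f"exact agreement: {exact_agreement(teacher_scores, chatgpt_scores):.3f}")
print(f"Cohen's kappa:   {cohens_kappa(teacher_scores, chatgpt_scores):.3f}")
```

Computing such a statistic separately for each item would make it possible to check whether agreement falls as the rubric gains components, which is the pattern the abstract reports.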
