Can ChatGPT Pass the National Korean Occupational Therapy Licensure Examination?

  • Hong, Junhwa (Dept. of Occupational Therapy, Soonchunhyang University) ;
  • Kim, Nayeon (Dept. of Occupational Therapy, Soonchunhyang University) ;
  • Min, Hyemin (Dept. of Occupational Therapy, Soonchunhyang University) ;
  • Yang, Hamin (Dept. of Occupational Therapy, Soonchunhyang University) ;
  • Lee, Sihyun (Dept. of Occupational Therapy, Soonchunhyang University) ;
  • Choi, Seojin (Dept. of Occupational Therapy, Soonchunhyang University) ;
  • Park, Jin-Hyuck (Dept. of Occupational Therapy, College of Medical Science, Soonchunhyang University)
  • Received : 2023.09.25
  • Accepted : 2023.11.14
  • Published : 2024.02.28

Abstract

Objective : This study assessed whether ChatGPT, an artificial intelligence system based on a large language model, could pass the National Korean Occupational Therapy Licensure Examination (NKOTLE). Methods : This study used NKOTLE questions from 2018 to 2022 provided by the Korea Health Personnel Licensing Examination Institute, covering occupational therapy foundations, medical regulations, and occupational therapy; the unreleased practical questions were excluded. English prompts were composed to elicit the most appropriate answer to each question, and the answers ChatGPT provided were scored for accuracy. Two researchers independently conducted the entire process, and their averaged accuracy determined whether ChatGPT passed each of the five yearly examinations. The agreement between the ChatGPT answers recorded by the two researchers was also assessed. Results : ChatGPT passed the 2020 examination but failed those of the other four years. Specifically, its accuracy on questions related to medical regulations ranged from 25% to 57%, whereas its accuracy on the other questions exceeded 60%. ChatGPT's answers showed strong agreement between the researchers, except on medical regulation questions, and this agreement was significantly correlated with accuracy. Conclusion : ChatGPT still has limitations in answering questions influenced by language or culture. Future studies should explore its potential as an educational tool for students majoring in occupational therapy through optimized prompts and continuous learning from data.
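The scoring procedure described in the Methods can be sketched in a few lines: each researcher records ChatGPT's answers, accuracy is computed per subject, the two researchers' results are averaged, and pass/fail is judged against cutoff scores. This is a minimal illustrative sketch, not the authors' actual code; the 60%-overall and 40%-per-subject thresholds are an assumption about the licensure criteria, not stated in this abstract.

```python
def accuracy(answers, key):
    """Fraction of items answered correctly against the answer key."""
    return sum(a == k for a, k in zip(answers, key)) / len(key)

def percent_agreement(run1, run2):
    """Simple agreement between the two researchers' recorded ChatGPT answers."""
    return sum(a == b for a, b in zip(run1, run2)) / len(run1)

def passed(subject_scores, overall_cut=0.60, subject_cut=0.40):
    """Pass requires meeting the overall cutoff and every per-subject cutoff.

    subject_scores: dict mapping subject name to averaged accuracy (0-1).
    The cutoffs here are assumed values for illustration only.
    """
    overall = sum(subject_scores.values()) / len(subject_scores)
    return overall >= overall_cut and all(
        s >= subject_cut for s in subject_scores.values()
    )
```

For example, a year with subject accuracies of 0.70, 0.45, and 0.65 would pass under these assumed cutoffs (overall average 0.60), whereas dropping any subject below 0.40 would fail it regardless of the overall average.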

Acknowledgement

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2021S1A3A2A02096338). This work was supported by the Soonchunhyang University Research Fund.
