The Automated Scoring of Kinematics Graph Answers through the Design and Application of a Convolutional Neural Network-Based Scoring Model

Jae-Sang Han; Hyun-Joo Kim

doi:10.14697/jkase.2023.43.3.237

Journal of The Korean Association For Science Education

Volume 43, Issue 3, Pages 237-251, 2023
pISSN 1226-5187, eISSN 2288-8489

The Korean Association for Science Education

Jae-Sang Han (Korea National University of Education)
Hyun-Joo Kim (Korea National University of Education)

Received: 2023.04.14
Reviewed: 2023.06.08
Published: 2023.06.30

https://doi.org/10.14697/jkase.2023.43.3.237

Abstract

This study explored the feasibility of automated scoring for scientific graph answers by designing an automated scoring model based on a convolutional neural network and applying it to students' kinematics graph answers. Two datasets were used: 2,200 answers written by the researchers, split into 2,000 training and 200 validation items, and 202 student answers, split into 100 training and 102 test items. First, the model was designed and its performance validated on the researcher-written dataset, so that it was optimized for graph image classification. Next, the model was trained on several configurations of the training data and used to score the student test dataset; its performance improved as the training data grew in amount and diversity. In the final configuration, agreement with human scoring was 97.06%, with a kappa coefficient of 0.957 and a weighted kappa coefficient of 0.968. However, for answer types not represented in the training data, human scorers agreed almost completely with one another, whereas the automated scoring model's scores did not match theirs.
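The three agreement statistics reported in the abstract (agreement rate, kappa, weighted kappa) can be illustrated with a short Python sketch. This is not the authors' evaluation code: the function names, the toy score lists, and the choice of linear or quadratic weights are all assumptions made for illustration, since the abstract does not state which weighting scheme was used. As a side note, an agreement rate of 97.06% on 102 test answers is consistent with 99 matching scores, because 99/102 ≈ 0.9706.

```python
def percent_agreement(human, model):
    """Fraction of items on which the two raters gave the same score."""
    return sum(h == m for h, m in zip(human, model)) / len(human)


def cohen_kappa(human, model, categories, weight=None):
    """Cohen's kappa; weight is None, 'linear', or 'quadratic'.

    `categories` must be listed in their ordinal order, because the
    weighted variants penalise disagreements by category distance.
    """
    n = len(human)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Joint counts: rows are human scores, columns are model scores.
    counts = [[0] * k for _ in range(k)]
    for h, m in zip(human, model):
        counts[index[h]][index[m]] += 1

    row = [sum(counts[i]) / n for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) / n for j in range(k)]

    def penalty(i, j):
        if weight is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d

    observed = sum(penalty(i, j) * counts[i][j] / n
                   for i in range(k) for j in range(k))
    expected = sum(penalty(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    if expected == 0:  # both raters used a single identical category
        return 1.0
    return 1.0 - observed / expected


# Hypothetical toy scores on a 0-2 scale (not data from the study).
human_scores = [0, 0, 1, 1, 2, 2, 2, 1, 0, 2]
model_scores = [0, 0, 1, 1, 2, 2, 1, 1, 0, 2]

print(percent_agreement(human_scores, model_scores))
print(cohen_kappa(human_scores, model_scores, [0, 1, 2]))
print(cohen_kappa(human_scores, model_scores, [0, 1, 2], weight="linear"))
```

Kappa corrects the raw agreement rate for the agreement expected by chance, and the weighted form additionally gives partial credit when the two scores differ by only one level, which is why a weighted kappa can exceed the unweighted value on ordinal scores.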
