Using ChatGPT as a proof assistant in a mathematics pathways course

  • Hyejin Park (Mathematics, Drake University)
  • Eric D. Manley (Computer Science, Drake University)
  • Received: 2024.04.08
  • Reviewed: 2024.05.08
  • Published: 2024.05.31

Abstract

This study examines the capabilities of ChatGPT as a tool for supporting students in generating mathematical arguments that can be considered proofs. We engaged students enrolled in a mathematics pathways course in evaluating and revising their original arguments using ChatGPT feedback. Rather than asking students to prove a given formula, we asked them to explore a method for finding the area of a triangle from its side lengths and to justify why their method works. Students completed these ChatGPT-embedded proving activities as class homework, and we used their homework responses as the data for this study. We analyzed and compared the original arguments students constructed without ChatGPT assistance and the revised arguments they constructed with it. We also analyzed students' written responses about their perspectives on mathematical proof and proving and their thoughts on using ChatGPT as a proof assistant. Our analysis shows that participants' approaches to constructing, evaluating, and revising their arguments aligned with their perspectives on proof and proving. They saw ChatGPT's evaluations of their arguments as similar to how they usually evaluate their own and others' arguments. They mostly agreed with ChatGPT's suggestions for making their original arguments more proof-like and therefore revised their arguments accordingly, focusing on improving clarity, providing additional justifications, and showing the generality of their arguments. Further investigation is needed to explore how ChatGPT can be used effectively as a tool in teaching and learning mathematical proof and proof-writing.
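
The abstract does not say which methods students arrived at. Purely as an illustration of the task, the sketch below computes a triangle's area from its three side lengths using Heron's formula, one well-known method for this problem. The function name `triangle_area` and the choice of Heron's formula are assumptions made here, not details reported by the study.

```python
import math


def triangle_area(a: float, b: float, c: float) -> float:
    """Area of a triangle from its side lengths (Heron's formula).

    Hypothetical helper for illustration; not taken from the study.
    """
    # The sides must be positive and satisfy the strict triangle inequality.
    if min(a, b, c) <= 0 or a + b <= c or a + c <= b or b + c <= a:
        raise ValueError("side lengths do not form a triangle")
    s = (a + b + c) / 2  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


# 3-4-5 right triangle: the area is also (3 * 4) / 2 = 6.
print(triangle_area(3, 4, 5))  # 6.0
```

Agreement on a single triangle such as 3-4-5 is only empirical evidence. It does not justify that the method works for every triangle, which is the kind of generality the proving activity asked students to show.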
