Using ChatGPT as a proof assistant in a mathematics pathways course

Hyejin Park; Eric D. Manley

doi:10.7468/mathedu.2024.63.2.139

The Mathematical Education (Journal of the Korean Society of Mathematical Education, Series A)

Vol. 63, No. 2 / pp. 139-163 / 2024 / pISSN 1225-1380 / eISSN 2287-9633

Korean Society of Mathematical Education


Hyejin Park (Mathematics, Drake University);
Eric D. Manley (Computer Science, Drake University)

Received: 2024.04.08
Reviewed: 2024.05.08
Published: 2024.05.31

https://doi.org/10.7468/mathedu.2024.63.2.139

Abstract

The purpose of this study is to examine the capabilities of ChatGPT as a tool for supporting students in generating mathematical arguments that can be considered proofs. To this end, we engaged students enrolled in a mathematics pathways course in evaluating and revising their original arguments using ChatGPT feedback. Instead of directly asking students to prove a formula, we asked them to explore a method for finding the area of a triangle given the lengths of its sides and to justify why their methods work. Students completed these ChatGPT-embedded proving activities as class homework, and we used their homework responses as data for investigating the capabilities of ChatGPT as a proof tutor. We analyzed and compared the original and revised arguments that students constructed with and without ChatGPT assistance. We also analyzed students' written responses about their perspectives on mathematical proof and proving and their thoughts on using ChatGPT as a proof assistant. Our analysis shows that our participants' approaches to constructing, evaluating, and revising their arguments aligned with their perspectives on proof and proving. They saw ChatGPT's evaluations of their arguments as similar to how they usually evaluate their own and others' arguments. For the most part, they agreed with ChatGPT's suggestions for making their original arguments more proof-like. They therefore revised their original arguments following ChatGPT's suggestions, focusing on improving clarity, providing additional justifications, and showing the generality of their arguments. Further investigation is needed to explore how ChatGPT can be used effectively as a tool in teaching and learning mathematical proof and proof-writing.


