http://dx.doi.org/10.6109/jkiice.2022.26.11.1608

Improvement of early prediction performance of under-performing students using anomaly data  

Hwang, Chul-Hyun (Department of Big Data, Hanyang Women's University)
Abstract
As the recent decline in student numbers intensifies competition between universities, identifying underperforming students early and working to prevent dropouts has come to be seen as an essential task for universities. This requires a model that predicts student performance accurately. This paper proposes a method to improve prediction performance by removing or amplifying anomalous data in a classification model for identifying underperforming students. Existing anomaly-handling methods have mainly focused on deleting or ignoring such data; this paper instead presents a criterion for distinguishing noise from change indicators and improves the performance of predictive models by deleting or amplifying data accordingly. In an experiment on open learning performance data, conducted to verify the proposed method, we found a number of cases in which it improved classification performance over the existing method.
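The remove-or-amplify idea can be illustrated with a minimal sketch. The abstract does not state the paper's actual criterion, so everything here is an assumption made for illustration: the function name `rebalance_with_anomalies`, the z-score test for flagging anomalies, and the rule that anomalous rows of the majority class are dropped as noise while anomalous rows of the underperforming class are duplicated as possible change indicators.

```python
from statistics import mean, pstdev


def rebalance_with_anomalies(rows, labels, z_threshold=2.0, copies=2):
    """Drop or duplicate anomalous rows before training a classifier.

    rows   : list of numeric feature lists
    labels : 1 = underperforming student, 0 = other

    Hypothetical rule (NOT the paper's criterion): a row is anomalous when
    any feature lies more than z_threshold population standard deviations
    from that feature's mean.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero

    out_rows, out_labels = [], []
    for row, label in zip(rows, labels):
        is_anomaly = any(
            abs(x - m) / s > z_threshold for x, m, s in zip(row, means, stds)
        )
        if is_anomaly and label == 0:
            continue  # assumed noise: remove the row
        # Assumed change indicator: keep the row and add `copies` duplicates.
        repeat = 1 + copies if is_anomaly else 1
        for _ in range(repeat):
            out_rows.append(list(row))
            out_labels.append(label)
    return out_rows, out_labels


if __name__ == "__main__":
    rows = [[10]] * 8 + [[50], [-30]]
    labels = [0] * 8 + [0, 1]
    new_rows, new_labels = rebalance_with_anomalies(rows, labels)
    print(len(new_rows), new_labels.count(1))
```

Plain duplication stands in here for whatever amplification the paper uses; a synthetic oversampler such as SMOTE would be a natural alternative under the same assumed rule.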
Keywords
Student Success; Student Performance Prediction; Data Amplification; Anomaly Data; Data Imbalance