Utilizing Data Mining Techniques to Predict Students Performance using Data Log from MOODLE

  • Noora Shawareb (Computer Science Department, Arab American University) ;
  • Ahmed Ewais (Computer Science Department, Arab American University) ;
  • Fisnik Dalipi (Informatics Department, Linnaeus University)
  • Received : 2024.03.11
  • Accepted : 2024.07.09
  • Published : 2024.09.30

Abstract

Due to the COVID-19 pandemic, most educational institutions and schools changed from traditional teaching to online teaching and learning using well-known Learning Management Systems (LMS) such as Moodle, Canvas, and Blackboard. Accordingly, LMSs started to generate large volumes of data related to students' characteristics, achievements, and other course-related information, which makes it difficult for teachers to monitor students' behaviour and performance. Therefore, teachers need a tool that alerts them to students who might be at risk, based on the activities and achievements recorded in the LMS adopted by the school. This paper focuses on the benefits of using data recorded in LMS platforms, specifically Moodle, to predict students' performance by analysing their behavioural data and engagement activities using data mining techniques. As part of the overall process, this study addressed the tasks of extracting and selecting data features relevant to performance prediction, designing the framework, and choosing appropriate machine learning techniques. The collected data underwent pre-processing operations to remove random partitions, empty values, and duplicates, and to encode the data. Different machine learning techniques, including k-NN, decision tree, ensemble tree, SVM, and MLPNNs, were applied to the processed data. The results showed that the MLPNNs technique outperformed the other classification techniques, achieving a classification accuracy of 93%, while SVM and k-NN achieved 90% and 87% respectively. This indicates the possibility for future research to investigate incorporating other neural network methods for categorizing students using data from LMS.

1. Introduction

Education has a vital role in a country's culture and progress, impacting both individuals and society [1]. The COVID-19 pandemic has resulted in greater utilization of Learning Management Systems (LMS) in distance learning. With the growing user base and abundance of recorded course data in LMS, there is a demand for automated capabilities to assist administrators, learners, educators, and policymakers in the field of education [2]. Data-driven insights empower educators to make well-informed decisions regarding teaching approaches, showcasing the transformative capabilities of AI (Artificial Intelligence) in learning. AI and data mining methods are acknowledged in academic research for revolutionizing education through tailored and personalized learning, offering adaptable materials to learners worldwide, and employing automated assessment and instant feedback [3].

Historically, e-learning systems began as sites for course enrolment and test marks; they have since transformed into portals for individual profiles, sharing educational resources, and online connection [4]. Student data from the LMS, which includes information on students and their behaviours in courses, is utilized to predict students' academic success through the application of AI and data mining methods. Predicting accurately is a challenging process because of issues such as the accessibility of datasets, the appropriateness of algorithms, and the effectiveness of attributes or variables [5]. Thus, to achieve a definitive outcome, the research requires a vast dataset and the utilization of several methodologies based on different algorithms [6].

To predict the performance of learners in a course, it is also crucial to identify and categorize them based on their unique traits. LMS data is used to create datasets containing learners' features and course-related behaviours to tackle challenges faced by learners during their studies [2]. However, the capacity to create accurate predictions depends on factors including dataset availability, algorithm adaptability, and the effectiveness of traits or variables [7].

The widespread availability of the internet has significantly contributed to the emergence and growth of e-learning systems. These platforms have become particularly vital during the COVID-19 pandemic [8], transitioning from their original functions of course registration and test result access to more comprehensive services. They now support individual user profiles, facilitate the distribution of educational content, and enable online interactions between educators and students – especially for distance learning in higher education [4]. LMS data is used to predict students' performance using machine learning algorithms, but accurately predicting performance is challenging due to factors such as dataset availability, algorithm suitability, and the efficacy of attributes or variables. To produce conclusive results, a large dataset and multiple techniques based on various algorithms are required.

Therefore, this research will answer the following questions:

1. To what degree can the features of Moodle be regarded as a dependable indicator of students' academic performance?

2. Which machine learning algorithms can provide superior classification accuracy, surpassing other algorithms in terms of results?

This paper focuses on applying machine learning (ML) techniques to predict students' academic performance by analysing behavioural data and online interactions collected from Moodle. The goals are twofold: to enhance students' academic success and overall learning quality, and to pinpoint key factors that influence prediction accuracy.

Significant dataset features and algorithms that contribute to students' achievement and to producing prediction models are identified. To achieve this, hybrid feature selection methods are employed to improve prediction accuracy by optimizing the significance of each feature.

Furthermore, machine learning algorithms are utilized for overseeing students in e-learning and furnishing them with pertinent information and remedies [9]. This would facilitate students in anticipating their academic performance and enable instructors to efficiently monitor and support students.

The rest of the paper is organized as follows: Section 2 presents a background on machine learning in education, discusses the dataset, techniques, and algorithms used for predicting student performance, and summarizes the adopted algorithms. Section 3 explains the data collection and pre-processing phase, and describes the model-building phase and the selection of classification metrics. Section 4 presents the experiment and the evaluation of the results obtained by every algorithm, focusing on the comparison between the classification performance of MLPNNs and SVM. Section 5 discusses and compares the results in relation to the state of the art. Finally, in Section 6 we conclude the paper with a brief summary and discuss future work.

2. Related Work

The demand for IT skills has led to a focus on computer science in education, particularly in STEM (Science, Technology, Engineering, and Mathematics) subdisciplines like artificial intelligence and statistics. Machine learning, a subfield of artificial intelligence, can analyse and interpret data from online learning systems, providing insights for identifying students at risk, suggesting personalized learning materials, improving content delivery, and evaluating instructional strategies. This enhances online learning experiences and aids in the ongoing improvement of educational platforms, fostering a potent synergy between technology and education.

LMSs facilitate interaction between traditional teaching methods and digital resources, providing personalized e-learning opportunities [10]. As a result of the COVID-19 pandemic, educational institutions have had to adapt to limitations on in-person engagement, which have prevented most traditional methods of teaching and assessment [11].

Higher Education Institutions (HEIs) have effectively utilized LMSs, and a noticeable increase in the tools available for academic and educational purposes has followed [11]. Moodle, MOOCs, and Google Classroom were the most extensively studied learning platforms from 2015 to 2020 [8]. Moodle, an open-source, cloud-based LMS, is widely used in HEIs, including STEM education [12], with its user base increasing from 78 million in 2015 to over 368 million by 2023 [13].

One line of research investigated the correlation between students' learning performance and their cooperative programming learning behaviour. Students who performed well independently, improved themselves with assistance, gained confidence after gaining enlightenment, or imitated others had superior academic performance; those who plagiarized and performed inadequately without assistance fared worse. The study recommends identifying subpar behaviour after a problem has been solved, intervening to improve it, and providing more incentives for motivation and early problem-solving feedback [14].

Another study examines the relationship among learning styles, participation types, and learning performance in programming language learning using an online forum. Kolb's learning style inventory was used to identify learning types, while Social Learning Theory was used to define participation types. In total, 144 students participated in a half-semester ASP.NET course, and learning scores and satisfaction were measured. Results showed that different learning styles and participation types significantly impacted learning scores, with the 'Accommodator' style achieving superior scores. The study suggests that online forums and active participation can enhance learning performance [15].

On the other hand, the authors in [16] conducted a comprehensive assessment of 357 papers and categorized data sources into five distinct categories: demographic, personal, academic, behavioural, and institutional. Demographic information encompasses aspects such as age, gender, educational attainment, geographical location, health status, and family background. Personality data encompasses psychological and emotional information, such as self-control, self-confidence, and the five major personality traits. Academic data encompasses indicators such as course performance and previous academic achievements, for example performance in secondary school. Behavioural data encompasses metrics such as time spent on task and the frequency of interactions with the learning material. Institutional data refers to information related to teaching methods, the quality of high schools, and other relevant factors [17].

Numerous statistical and machine learning models have been used to predict students' performance, with the Moodle LMS being a popular choice for online teaching and learning in HEIs. However, research on Moodle's capabilities is scattered across the published literature. For example, Feiyue Qiu et al. developed an e-learning performance predictor based on the process-behaviour classification model (PBC model), incorporating behaviours like self-directed learning and system interaction. They used various algorithms (SVC(R), SVC(L), naïve Bayes, k-NN(U), k-NN(D), Softmax) with three data processing methods to reduce complexity while maintaining predictive performance, achieving a 95% accuracy rate [18]. Furthermore, Mélina Verger et al. studied a universal approach to predicting students' academic achievement using demographic features such as age and average grades, together with the CART decision tree algorithm. The experimental results showed a 95% accuracy rate while indicating the need for more data [19].

Amal Alhassan et al. performed a study to analyse how LMS evaluation grades and online activity data affect students' academic performance. The study makes use of several Moodle features such as file viewing, email communication, completed quizzes, and academic information. The methods used included decision tree, random forest, sequential minimal optimization, multilayer perceptron, and logistic regression. The researchers found that evaluation grades had the most influence, with the random forest algorithm achieving the best accuracy rate of 99.17% [1]. Moreover, Yeong Wook Yang et al. developed a technique to predict students' academic achievement by analysing procrastination tendencies through submission records and assignment scores. The study revealed that procrastination leads to lower homework marks, reduced free time, and increased sedentary behaviour, ultimately reducing the likelihood of graduating. The study employed L-SVM and R-SVM algorithms to examine features including free time, homework grades, and overall grades; the accuracy rates of the two algorithms were 84% and 69% respectively [20].

Additionally, Parisa Shayan et al. utilized online behaviours to predict students' academic success by employing the Decision Tree J48 and ID3 algorithms, achieving a 73% accuracy rate. The study used learning management system data, student and course details, and performance metrics; the main criteria were the midterm grade and LMS data before the midterm and final tests [21]. In another study, Rianne Conijn et al. tested using LMS data to predict student performance and isolate individual student variability. The researchers employed Pearson correlation and regression analyses on important features such as the number of mouse-overs, the quantity of used materials, normalized meeting time, disrupted study schedules, the longest period of inactivity, and the amount of time spent online. The results showed low transferability between classes and a high influence of individual and group factors on final exam grades, attaining an overall accuracy of 67% [22]. Table 1 presents a comparison between a number of studies that applied machine learning algorithms to datasets extracted from Moodle.

Table 1. Summary of articles that applied machine learning to datasets extracted from Moodle.


3. Research Approach

This section focuses on the research method conducted in this study, first describing the dataset utilized. It then presents the procedures for data preparation and the feature selection suitable for the research. Subsequently, it describes the suggested model and illustrates the methods that were implemented. Finally, the selection of appropriate classification metrics to measure the strength and efficiency of the proposed model is explained, and the results are presented and compared.

3.1 Dataset

The data for this study was obtained from the Moodle system of the Arab American University, situated in Jenin, Palestine. It includes student evaluation grades and Moodle activity data. The data utilized for the analysis consisted of students' online behaviours such as the system log and quiz grades, among others. The system log contains a record of every user click within the system. The dataset has 8220 observations and 20 attributes related to 411 active users in the system. The gathered data served as inputs for the machine learning algorithms used to predict students' performance, and the predictions were statistically tested to ascertain the accuracy of each algorithm.

3.2 Data Preparation and Feature Selection

Data pre-processing: The Moodle system was utilized for gathering online activity features from the log system, which are then processed in three stages (data cleaning, feature scaling, and data encoding) before analysis. These stages are mainly performed to improve the behaviour of the adopted ML algorithms [23], [24]. Table 2 displays the features that were analysed in this study.

Table 2. The features derived from the Moodle system.


a) Cleaning Data: This is the initial stage of data processing before the machine learning algorithms are applied. Inspection of the data revealed incomplete samples and other non-matching samples, which were removed from the obtained dataset [25].

b) Feature Scaling: This step is crucial for the proper functioning of classification algorithms by normalizing and organizing the collected data, as illustrated in equation (1) [26].

\(\begin{align}x^{\prime}=\frac{x-\operatorname{average}(x)}{\operatorname{std}(x)}\end{align}\)       (1)

c) Data Encoding: This process is crucial for neural networks to function efficiently, as they cannot handle values exceeding one unit; categorical values are therefore one-hot encoded, so that, for example, the four levels of a student's performance in an evaluation are coded as 0001, 0010, 0100, and 1000 [27].
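The study implemented these steps in MATLAB; as a rough illustration only, the following Python/pandas sketch shows equivalent cleaning, z-score scaling, and one-hot encoding operations. The file name and column names (e.g. moodle_log.csv, performance_class) are hypothetical placeholders, not names from the original dataset.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical export of the Moodle activity log and grades
log = pd.read_csv("moodle_log.csv")

# a) Cleaning: drop incomplete samples and duplicates
log = log.dropna().drop_duplicates()

# b) Feature scaling: z-score normalisation, x' = (x - average(x)) / std(x)
numeric_cols = log.select_dtypes(include="number").columns
log[numeric_cols] = StandardScaler().fit_transform(log[numeric_cols])

# c) Data encoding: one-hot code the categorical target
# (e.g. the four performance levels become 0001, 0010, 0100, 1000)
encoded_target = pd.get_dummies(log["performance_class"])
```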

3.3 Proposed Model

The proposed model considers ML techniques and hybrid models for the classification and prediction process in four stages, as described below and presented in Fig. 1:


Fig. 1. The general method procedure flow chart.

Notably, the process starts with collecting data from a Moodle dataset, which has to be very detailed and contain the intended student records. After that, data analysis techniques are employed to extract useful information, including identifying the features, characteristics, and outliers. In this study, feature selection is an important step for choosing the most important features, because it can affect the outcome of the algorithm used for analysis and model training. Pre-processing adjusts the data to make it cleaner and easier for machine learning algorithms to work with, for example by dealing with missing values, encoding categorical variables, and scaling numerical features.

Cross-validation is used to increase the robustness of the machine learning techniques: the models are tested on data that was not previously seen, to evaluate their ability to predict new, unseen cases. The data is used to train, validate, and test several machine learning techniques, and the best model, selected using the cross-validation results, is then evaluated on a separate test set to assess its effectiveness. In the end, the model can be transferred to production environments and integrated into existing systems for practical implementation.
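As a minimal sketch of this workflow (the study itself used MATLAB; this is an illustrative Python equivalent), the code below shows an 80:20 split with random shuffling followed by 5-fold cross-validation of a candidate model. Here X and y stand for the pre-processed features and performance classes, and the candidate model and its parameters are assumptions rather than the study's configuration.

```python
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC

# 80:20 random split into training and held-out test data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0)

# 5-fold cross-validation on the training data to compare candidate models
candidate = SVC(kernel="rbf")
cv_scores = cross_val_score(candidate, X_train, y_train, cv=5, scoring="accuracy")

# The selected model is refit on the full training set and evaluated once
# on the untouched test set before any deployment
candidate.fit(X_train, y_train)
test_accuracy = candidate.score(X_test, y_test)
```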

Based on the proposed model's procedural steps, the framework construction mechanism consists of four basic stages, as follows:

Stage one: This stage involves identifying the descriptive attributes. It also explores the potential of machine learning algorithms to improve decision-making in prediction, identifies the variables used, and identifies the data sources for these processes.

Stage two: This stage outlines the target attributes for the model to use from the dataset, addressing decisions and data types, with the chosen attributes being decisive.

Stage three: The proposed framework in this stage employs machine learning algorithms such as K-Nearest Neighbour (k-NN), Support Vector Machine (SVM), Decision Tree, Ensemble Techniques, and Multi-Layer Perceptron Neural Networks to train models for prediction processes. The adaptable algorithm pool enables the integration of new machine learning algorithms in data analysis. MATLAB is utilized to implement these models and analyse the outcomes.

Stage four: During this phase, the machine learning algorithms are assessed, and their effectiveness is tested using standard evaluation measures including the mean absolute percentage error (MAPE), mean square error (MSE), root mean square error (RMSE), and coefficient of determination (R²) or adjusted R². The outcomes are then analysed to determine the most effective method for the prediction process.

The machine learning algorithms applied to the dataset within this framework are:

a) K-Nearest Neighbour (k-NN): This algorithm is an example of nonparametric supervised learning that classifies or predicts unknown values in a dataset based on proximity, on the assumption that similar values lie near each other. The input variables for this algorithm consist of 17 values, and the number of views is 553. A 5-fold cross-validation is conducted, using 10 blocks and four output classes. The k-NN classifier uses the Euclidean distance between points to assign a class to a new, unknown instance. A weighting technique determines the predicted class: the inverse of each distance is computed and divided by the sum of the inverses, and the class with the maximum resulting weight is chosen as the prediction. This technique is useful in this research for rating students' performance in the portal, and it is suitable for multi-class classification. It showed good results in the first and second classes but poorer results in the third class.
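A minimal scikit-learn stand-in for the MATLAB k-NN classifier described above is sketched below; the neighbourhood size k is an assumption (the study does not report it), and X, y denote the prepared features and the four performance classes.

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Euclidean distance with inverse-distance weighting: closer neighbours
# receive larger weights when the class of a new instance is decided
knn = KNeighborsClassifier(n_neighbors=5,        # k is an assumed value
                           metric="euclidean",
                           weights="distance")

# 5-fold cross-validated accuracy over the four output classes
knn_accuracy = cross_val_score(knn, X, y, cv=5, scoring="accuracy").mean()
```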

b) Support Vector Machine (SVM): The input variables for this technique consist of 17 values, and the number of views is 553. A 5-fold cross-validation is conducted, using 10 blocks and four output classes. This technique divides data by identifying a hyperplane that separates the distributed data based on their characteristics. The main goal is to obtain a high-quality hyperplane by maximizing the distance between points with similar characteristics and the proposed hyperplane [28]. SVM is a powerful tool for identifying and ignoring extreme elements, allowing it to handle outliers and separate scattered elements. It can adapt to complex data through kernel methods and can handle both linear and non-linear data. The technique uses a loss function to compare the expected and actual outputs of the system; the loss function updates the weights using a gradient, which calculates the partial derivatives with respect to the weights associated with each data point, and a wrong prediction incurs a classification loss. This technique is useful in this research for classifying students and checking the quality of the classification, and it is suitable for multi-class classification. As with the k-NN classifier, it showed good results in the first and second classes but poorer results in the third class.
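Similarly, a multi-class SVM comparable to the one described can be sketched as follows; the kernel choice and regularisation strength are assumptions rather than values reported in the study.

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# RBF kernel to handle non-linear data; one-vs-one pairs cover the four classes
svm = SVC(kernel="rbf", C=1.0, decision_function_shape="ovo")
svm_accuracy = cross_val_score(svm, X, y, cv=5, scoring="accuracy").mean()
```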

c) Decision Tree: This is a popular technique for prediction and classification. It simulates a tree structure, with nodes representing attributes and results displayed on branches. The input variables for this technique consist of 17 values, and the number of views is 553. A 5-fold cross-validation is conducted, using 10 blocks and four output classes. The tree learns by dividing the collected data into small subsets based on the operations and tests within its nodes, and the splitting continues until the target variable and subset values are equal. The tree consists of layers with specific data and can capture similarities between data. The technique can split data in different directions but requires large amounts of data for effective classification. The tree grows from a root node, divides the data into branches, and finds the best attribute at each node. Advanced techniques such as ensemble methods and bagging can enhance results by combining the strengths of multiple models [29]. This technique is useful in this research because it supports multidimensional data and can quickly classify data without prior knowledge of parameters, and it is suitable for multi-class classification. As with the k-NN and SVM classifiers, it showed good results in the first and second classes but poorer results in the third class.

d) Ensemble Techniques: Merging multiple models in machine learning can lead to more accurate classifications than using a single model. This approach requires more complex mathematical processes and calculations, but the obtained results are more accurate. Researchers often combine fast algorithms such as decision trees with slower but, in some applications, more accurate algorithms such as deep neural networks. Two methods for implementing multiple models are sequential and parallel: sequential implementation strengthens the relationships and dependence between base learners, while parallel implementation encourages greater independence between the algorithms. Each model uses a different training dataset to determine the error rate [29]. This technique is useful in this research because it improves the tree algorithm by employing intricate mathematical procedures to assess the model; in addition, the ensemble verifies the accuracy of the prediction, and accuracy is enhanced through the integration of many tree algorithms. The input variables consist of 17 values, and the number of views is 553. A 5-fold cross-validation is conducted using 10 blocks and four output classes, and the technique is suitable for multi-class classification.

The ensemble procedures rely on the following techniques:

1. Bagging techniques: Bagging (bootstrap aggregating) aims to avoid the issue of training all models on the same dataset, which would yield similar results [30]. It separates the dataset into sub-datasets, using bootstrap sampling to capture the data distribution. The process involves constructing subsets of the original dataset, creating base learning models, training them independently, and merging their predictions to obtain the final prediction [31].

2. Random Forest Ensemble technique: This method uses bagging and decision tree techniques to create the final algorithm. It involves splitting the dataset into subsets, selecting features for the decision trees, iterating over multiple variables, checking each decision tree, and selecting the final result from the obtained results [32].

3. Boosting: This is a method for minimizing the errors of weak learner algorithms, such as decision trees, by increasing the weight of misclassified data points. It involves creating a dataset, loading the weights, feeding the model, and increasing the weight of misclassified data points; if the desired result is not achieved, the process is reiterated [33].
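As a rough illustration of these three variants (again in Python rather than the MATLAB tooling used in the study, and with illustrative hyperparameters), the sketch below builds bagged, random-forest, and boosted tree ensembles on the training split assumed in the earlier workflow sketch.

```python
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)

# 1. Bagging: bootstrap sub-datasets, independently trained base trees,
#    merged predictions (a decision tree is the default base learner)
bagging = BaggingClassifier(n_estimators=50)

# 2. Random forest: bagging plus random feature selection at each split
forest = RandomForestClassifier(n_estimators=100)

# 3. Boosting: each round increases the weight of misclassified points
#    before fitting the next shallow tree
boosted = AdaBoostClassifier(n_estimators=100)

for model in (bagging, forest, boosted):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))
```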

e) Multi-Layer Perceptron Neural Networks (MLPNNs): Neural networks are increasingly utilized in scientific investigations and research because they replicate the processes of neural networks in the human brain. These networks utilize machine learning techniques to forecast accurate outcomes with a minimal margin of error [34]. There are two approaches: supervised and unsupervised learning. Neural networks play a vital role in diverse sectors such as education, medicine, agriculture, and manufacturing. The models are trained and fine-tuned by analysing the discrepancies in the obtained outcomes, guaranteeing a robust framework for precise forecasting and categorization. Neural networks are comprised of interconnected neurons whose weights are modified according to the error between the expected and actual outputs, in order to minimize errors and achieve precise outcomes [35].

Multi-layer neural networks are composed of three primary layer types: the input layer, the output layer, and the hidden layer. The input layer receives data representing the properties and qualities designed for a particular purpose. The output layer presents the training results, which are then compared with the real outcomes to determine the prediction error. The hidden layer adjusts the synaptic weights between neurons to maintain precision in the outcomes. During forward propagation, the neural network compares the expected outputs with the accurate ones, and the error produced by the prediction process is used to adjust the weights. The method is iterated until the minimum error is reached, or halted if the error no longer decreases [36].

In this research, the classification process is executed via a neural network approach, particularly with a set of 10 neurons. These 10 neurons were chosen after multiple trials with different numbers of neurons, including 5, 10, 15, and 20; the objective is to obtain optimal outcomes while minimizing the number of neurons required. Furthermore, these metrics seek to assess the performance of neural networks in comparison to the classification learner algorithms, in order to determine which algorithm exhibits superior capabilities in the classification and prediction tasks specific to our case.
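A hedged sketch of such an MLP with a single hidden layer, trialling the neuron counts mentioned above, is shown below; the solver, iteration budget, and the use of scikit-learn instead of the original MATLAB implementation are assumptions.

```python
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Trial hidden-layer sizes of 5, 10, 15 and 20 neurons; the study selected 10
for n_neurons in (5, 10, 15, 20):
    mlp = MLPClassifier(hidden_layer_sizes=(n_neurons,),
                        max_iter=2000, random_state=0)
    acc = cross_val_score(mlp, X, y, cv=5, scoring="accuracy").mean()
    print(n_neurons, acc)
```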

3.4 Classification Metrics Selection

Classification metrics are crucial for measuring the strength and efficiency of machine learning algorithms. Criteria are used to evaluate the feasibility of implementing machine learning models and to predict efficient results. Common criteria are based on binary classification, where the expected output is either positive or negative. However, models can make prediction errors, leading to the confusion matrix analysis shown in Table 3.

Table 3. Confusion matrix


In this study, a set of mathematical equations is used to measure the performance of the applied machine learning techniques by analysing the resulting classifications and the amount of data in each class [37]. The equations are:

a) Accuracy: The accuracy measure is a commonly used tool to express the ratio of correctly predicted results to the total obtained results, but high accuracy does not guarantee model efficiency or integrity [38]. This metric is quantified using equation (2):

\(\begin{align}Accuracy=\frac{TP+TN}{TP+FP+FN+TN}\end{align}\)       (2)

b) Precision: This metric calculates the ratio of correctly predicted positive results to the total predicted positive results; precision is high when few false positives are predicted [39]. This metric is quantified using equation (3):

\(\begin{align}Precision=\frac{TP}{TP+FP}\end{align}\)       (3)

c) Recall: This metric calculates the ratio of correctly predicted positive results to the actual total positive results, using equation (4):

\(\begin{align}Recall=\frac{TP}{TP+FN}\end{align}\)       (4)

d) Specificity: This metric calculates the ratio of true negative results to the sum of true negative and false positive results, using equation (5):

\(\begin{align}Specificity=\frac{TN}{TN+FP}\end{align}\)       (5)

e) F1 score: This metric is the weighted harmonic mean of precision and recall, taking both false positive and false negative results into account. It is most informative when false positives and false negatives carry similar weight, and together these metrics help ensure model efficiency and results integrity [39]. This metric is quantified using equation (6):

\(\begin{align}F1\;score=\frac{2 \times (Recall \times Precision)}{Recall+Precision}\end{align}\)       (6)

f) Receiver Operating Characteristic (ROC): This metric is utilized to analyse and evaluate the quality of machine learning models, for multi-class models as well as binary ones, by analysing the graphical curves obtained during training.

For the multi-class setting, the per-class values of the above metrics are combined into a weighted average, as in equation (7):

\(\begin{align}Weighted=\frac{W(C_1) \times Metric_1 + W(C_2) \times Metric_2 + \cdots + W(C_n) \times Metric_n}{n}\end{align}\)       (7)
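To make the per-class computation concrete, the following sketch derives the metrics above from a confusion matrix and combines them in the spirit of equation (7); y_true and y_pred are assumed to hold the actual and predicted classes of a fitted model, and using each class's share of samples as its weight is an assumption of this sketch.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred)          # rows: actual class, columns: predicted class
per_class_f1 = []
for c in range(cm.shape[0]):
    tp = cm[c, c]
    fp = cm[:, c].sum() - tp                   # predicted as c but actually another class
    fn = cm[c, :].sum() - tp                   # actually c but predicted as another class
    tn = cm.sum() - tp - fp - fn
    precision = tp / (tp + fp)                 # equation (3)
    recall = tp / (tp + fn)                    # equation (4)
    specificity = tn / (tn + fp)               # equation (5)
    per_class_f1.append(2 * precision * recall / (precision + recall))  # equation (6)

accuracy = np.trace(cm) / cm.sum()             # equation (2)

# Weighted combination of the per-class values, in the spirit of equation (7),
# with each class weighted by its share of the samples
weights = cm.sum(axis=1) / cm.sum()
weighted_f1 = float(np.sum(weights * np.array(per_class_f1)))
```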

4. Experiment and Results

This section presents the results obtained after the machine learning techniques were applied for classification using MATLAB 2018b on a 9th-generation HP workstation. Four techniques, namely SVM, k-NN, Tree, and Ensemble Tree, are applied for multi-class classification. The neural network algorithm, using 10 neurons, is used to achieve the best results with the least number of neurons. The goal is to compare the neural network with the classification learner algorithms to determine the strongest one for this case. Similar to previous studies, the dataset was split into training and testing sets: through random shuffling, the original dataset was divided into training and testing data with an 80:20 ratio. The results from applying the selected techniques and the ROC curve for each algorithm are shown in Tables 4-8 and Figs. 2-6.

Table 4 shows the classification results obtained from applying the k-NN technique for the different classes, and Fig. 2 presents the k-NN ROC curve, which is plotted between sensitivity and specificity for each class. Additionally, Table 5 displays the classification results obtained after applying the SVM technique, and Fig. 3 presents the SVM ROC curve.

Table 4. k-NN classification metrics


Fig. 2. The k-NN ROC curve.

Table 5. SVM classification metrics


Fig. 3. The SVM ROC curve.

Moreover, the classification results of the DT technique are shown in Table 6, while its ROC curve is shown in Fig. 4.

Table 6. DT classification metrics.


Fig. 4. The DT ROC curve.

On the other hand, Table 7 shows the ES boosted tree classification results, and Fig. 5 presents its ROC curve.

Table 7. ES boosted tree classification metrics


Fig. 5. The ES boosted tree ROC curve.

Finally, the classification results of the MLP classification technique with 10 neurons are shown in Table 8, while its ROC curve is presented in Fig. 6.

Table 8. MLP 10 neuron classification metrics


Fig. 6. The MLP 10 neuron ROC curve.

Comparing the different classification learner algorithms is crucial for determining the most suitable one for an educational dataset relating to student performance in courses delivered through a portal. Therefore, a comparison was made between the four classification learner algorithms employed, highlighting their distinctions. According to the results, SVM has superior performance compared to the other classification learner algorithms; Ensemble Tree follows, while the k-NN technique trails behind in classification, as illustrated in Table 9 and Fig. 7.

Table 9. Classification learner comparison


Fig. 7. Classification comparison chart.

On the other hand, the performance of the MLPNNs is evaluated by comparing their results with those of the top classification learner, namely SVM. The evaluation is based on metrics such as accuracy, F1 score, macro-F1, and weighted-F1 for each technique. The superiority of the neural network technique is demonstrated by the data presented in Table 10 and Fig. 8. These findings highlight the robustness of neural networks in performing prediction and classification tasks.
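As a small illustrative sketch (not the authors' code), the comparison metrics could be computed as follows, assuming y_test holds the true classes and svm_pred / mlp_pred hold the predictions of the two fitted models:

```python
from sklearn.metrics import accuracy_score, f1_score

for name, pred in (("SVM", svm_pred), ("MLPNN", mlp_pred)):
    print(name,
          accuracy_score(y_test, pred),
          f1_score(y_test, pred, average="macro"),     # Macro-F1: unweighted mean over classes
          f1_score(y_test, pred, average="weighted"))  # Weighted-F1: support-weighted mean
```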

Table 10. SVM vs. MLP 10 neurons


Fig. 8. SVM vs. MLPNNs.

5. Discussion

Assessing machine learning algorithms such as k-NN, SVM, ensembled trees, decision trees, and MLPNNs requires considering their capabilities and the problem at hand, with the optimal algorithm selected based on the specific circumstances and data [40]. When determining the optimal machine learning model, it is widely acknowledged that feature selection is a crucial step in eliminating unnecessary or redundant predictors that inflate the standard error of the estimated regression coefficients and diminish model performance. This results in limited predictive capacity when the model is trained on noisy data and the number of predictors is close to the sample size [41]. The significance and utility of feature selection before modelling were apparent in this study.

Based on our study results, MLPNNs exhibited the best prediction performance, with a classification accuracy reaching 93%, followed by SVM with 90%. Multi-layer perceptron neural networks (MLPNNs) are adept at capturing complex and non-linear relationships in the data, making them ideal for attaining the study's goals; their success is attributed to their adaptability and intrinsic capacity for learning. Previous studies have also been conducted to predict and examine students' academic performance within the context of the Moodle LMS [42], [43], [44], [45]. The authors in [42] showed the potentially superior predictive performance of SVM over other ML algorithms when analysing Moodle data and identifying the most influential features for developing the predictive model; furthermore, they suggest that there is a strong relationship between users' online activities and their performance. Compared with [43], where the accuracy is only 83.95%, our study exhibits better prediction performance. Similarly, with respect to [44], our model performs better in terms of accuracy, which is 78.02% in that study. Furthermore, the results from our study are also superior to those presented in [45]; more specifically, our method shows better accuracy, which is 88.3% in that paper.

It is worth mentioning that, to the authors’ knowledge, no previous study adopted the deep learning model MLPNN in predicting students’ performance at HEI settings using LMS. This indicates the contribution achieved by the suggested approach in properly predicting student academic performance.

Nevertheless, it is crucial to acknowledge that MLPNNs might not always make the most optimal selection. The variability of the dataset's properties is contingent upon the attributes that are selected and the scenario that is chosen. The variations in the prediction models, both in this study and in [46], may also be attributed to the learning designs, since previous research has shown that the nature of activities offered in a LMS might impact the frequency of LMS visits [47]. We only included features or predictors from modules that were present in all courses. Our results, which use a broader range of factors compared to earlier studies, still validate the effectiveness of distinct predictors in various courses. It is likely that a more detailed separation of the content of modules may alter the findings, but this would need a broader series of courses. Based on our observations, merging data from numerous courses for a thorough analysis proves challenging. Additionally, developing generic models capable of accurately predicting outcomes beyond the available data’s scope presents a challenge as well.

5.1 Implications and limitations

Due to the rapid emergence of online learning caused by the COVID-19 pandemic, it is crucial to examine how students perceive the new educational approach. The primary objective of this work was to establish accurate correlations between ML methods and students' academic performance. Subsequently, the chosen framework was used to predict students' engagement in different courses offered through Moodle, while also identifying significant predictors and features. The study offered valuable insights into ML techniques and pertinent factors used to determine student interaction and performance in HEI settings. The study's empirical results, obtained through a more accurate predictive model, offer valuable insights to teaching faculties, managerial staff, and administrative personnel. These insights highlight the areas that should be prioritized in order to optimize the learning experience and teaching performance using online teaching and learning during a higher education crisis.

It is important to mention that a number of limitations were experienced in this study. For instance, extracting and compiling the dataset from the Moodle database servers is a complex task that requires intensive effort from a database engineer, because of the complex entity relationships in the database. Another limitation was the need to interpret the data stored in the database to derive the intended features, such as the longest period of inactivity, the total time spent on assignments, etc. However, this task (extracting and interpreting data from the database) could be performed via an AI tool able to understand the database schema and entity relationships and to create the dataset with the required features. Also, due to the limited scope of the study, performed at a single university in Palestine, the findings cannot be generalized to other universities, countries, or other online teaching methods such as blended/hybrid or flipped learning. Moreover, this research only presented results from the students' side, neglecting the inclusion of professors and other auxiliary personnel.

6. Conclusion and Future Work

Educational data mining is an effective analytical technique for evaluating educational data and making predictions about students' academic performance in order to make informed plans and take appropriate actions. Understanding student experiences on LMS platforms is also crucial for policymakers to support educational growth. This research used data from 500 students enrolled in three semester courses to construct a predictive model. Confidentiality is ensured, and the data was collected based on university administration transactions. The performance of the various ML models was assessed by considering accuracy values from the comparison of students' behavioural features, and it was concluded that the deep learning model MLPNN had better predictive performance than the other models in this study. Moreover, in the existing state of the art, no prior research has leveraged the MLPNN algorithm for this purpose.

Future improvements aim to enhance the system's efficiency and improve student performance classification results. This includes selecting more features and using new technologies, particularly hybrid ones, for more accurate predictions and classification. Moreover, obtaining larger datasets from universities and other institutions will improve the system's ability to classify data samples more accurately. Additionally, integrating machine learning techniques with software will help decision-makers evaluate and assess results. This provides a reference for future improvements in line with the current situation of Palestinian universities.

References

  1. A. Alhassan, B. Zafar, and A. Mueen, "Predict Students' Academic Performance based on their Assessment Grades and Online Activity Data," Int. J. Adv. Comput. Sci. Appl., vol.11, no.4, 2020.
  2. A. El-Halees, "Mining Students Data To Analyze Learning Behavior : a Case Study Educational Systems," Work, 2008.
  3. A. L. Samuel, "Some Studies in Machine Learning Using the Game of Checkers," IBM J. Res. Dev., vol.3, no.3, pp.210-229, 1959.
  4. B. Mbouzao, M. C. Desmarais, and I. Shrier, "Early Prediction of Success in MOOC from Video Interaction Features," in Proc. of 21st International Conference, Artificial Intelligence in Education, vol.12164, pp.191-196, 2020.
  5. C. Costa, H. Alvelos, and L. Teixeira, "The Use of Moodle e-learning Platform: A Study in a Portuguese University," Procedia Technol., vol.5, pp.334-343, 2012.
  6. E. Galy, C. Downey, and J. Johnson, "The Effect of Using E-Learning Tools in Online and Campus-based Classrooms on Student Performance," J. Inf. Technol. Educ., vol.10, pp.209-230, Jan. 2011.
  7. A. M. Shahiri, W. Husain, and N. A. Rashid, "A Review on Predicting Student's Performance Using Data Mining Techniques," Procedia Comput. Sci., vol.72, pp.414-422, 2015.
  8. B. Albreiki, N. Zaki, and H. Alashwal, "A Systematic Literature Review of Student' Performance Prediction Using Machine Learning Techniques," Educ. Sci., vol.11, no.9, 2021.
  9. M. S. Aliero, M. F. Pasha, D. T. Smith, I. Ghani, M. Asif, S. R. Jeong, M. Samuel, "NonIntrusive Room Occupancy Prediction Performance Analysis Using Different Machine Learning Techniques," Energies, vol.15, no.23, 2022.
  10. S. A. Aljawarneh, "Reviewing and exploring innovative ubiquitous learning tools in higher education," J. Comput. High. Educ., vol.32, no.1, pp.57-73, 2020.
  11. K. G. Byrnes, P. A. Kiely, C. P. Dunne, K. W. Mcdermott, and J. C. Coffey, "Communication, collaboration and contagion: "Virtualisation" of anatomy during COVID-19," Clin. Anat., vol.34, no.1, pp.82-89, Jan. 2021.
  12. B. M. Henrick et al., "Elevated Fecal pH Indicates a Profound Change in the Breastfed Infant Gut Microbiome Due to Reduction of Bifidobacterium over the Past Century," mSphere, vol.3, no.2, 2018.
  13. D. Bigler and G. Hagel, "Technical Report: Define a customized course and import it into Moodle without changes to the configuration of the Moodle system," in Proc. of ECSEE '23: Proceedings of the 5th European Conference on Software Engineering Education, pp.180-183, 2023.
  14. W.-Y. Hwang, R. Shadiev, C.-Y. Wang, and Z.-H. Huang, "A pilot study of cooperative programming learning behavior and its relationship with students' learning performance," Comput. Educ., vol.58, no.4, pp.1267-1281, 2012.
  15. R.-S. Shaw, "A study of the relationships among learning styles, participation types, and performance in programming language learning supported by online forums," Comput. Educ., vol.58, no.1, pp.111-120, Jan. 2012.
  16. A. Hellas et al., "Predicting academic performance: A systematic literature review," in Proc. of ITiCSE 2018 Companion: Proceedings Companion of the 23rd Annual ACM Conference on Innovation and Technology in Computer Science Education, pp.175-199, 2018.
  17. D. E. Goldberg, "Real-coded Genetic Algorithms, Virtual Alphabets, and Blocking," Complex Syst., vol.5, no.2, pp.139-167, 1991.
  18. F. Qiu et al., "Predicting students' performance in e-learning using learning process and behaviour data," Sci. Rep., vol.12, no.1, pp.1-16, 2022.
  19. M. Verger and H. J. Escalante, "Predicting students' performance in online courses using multiple data sources," arXiv:2109.07903, 2021.
  20. Y. Yang, D. Hooshyar, M. Pedaste, M. Wang, Y.-M. Huang, and H. Lim, "Predicting course achievement of university students based on their procrastination behaviour on Moodle," Soft Comput., vol.24, pp.18777-18793, Dec. 2020.
  21. P. Shayan and M. van Zaanen, "Predicting Student Performance from Their Behavior in Learning Management Systems," Int. J. Inf. Educ. Technol., vol.9, no.5, pp.337-341, 2019.
  22. R. Conijn, C. Snijders, A. Kleingeld, and U. Matzat, "Predicting Student Performance from LMS Data: A Comparison of 17 Blended Courses Using Moodle LMS," IEEE Trans. Learn. Technol., vol.10, no.1, pp.17-29, 2017.
  23. J. H. Koo, L. M. Hwang, H. H. Kim, T. H. Kim, J. H. Kim, and H. S. Song, "Machine learning-based nutrient classification recommendation algorithm and nutrient suitability assessment questionnaire," KSII Trans. Internet Inf. Syst., vol.17, no.1, pp.16-30, 2023.
  24. T. Alhmiedat and M. Alotaibi, "The Investigation of Employing Supervised Machine Learning Models to Predict Type 2 Diabetes Among Adults," KSII Trans. Internet Inf. Syst., vol.16, no.9, pp.2904-2926, Sep. 2022.
  25. I. F. Ilyas and T. Rekatsinas, "Machine Learning and Data Cleaning: Which Serves the Other?," J. Data Inf. Qual., vol.14, no.3, pp.1-11, Jul. 2022.
  26. Y. Kayode Saheed, A. Idris Abiodun, S. Misra, M. Kristiansen Holone, and R. Colomo-Palacios, "A machine learning-based intrusion detection for detecting internet of things network attacks," Alexandria Eng. J., vol.61, no.12, pp.9395-9409, 2022.
  27. N. Biswas, K. M. M. Uddin, S. T. Rikta, and S. K. Dey, "A comparative analysis of machine learning classifiers for stroke prediction: A predictive analytics approach," Healthc. Anal., vol.2, 2022 .
  28. B. Mohammadi, M. J. S. Safari, and S. Vazifehkhah, "IHACRES, GR4J and MISD-based multi conceptual-machine learning approach for rainfall-runoff modeling," Sci. Rep., vol.12, no.1, 2022.
  29. P. Kumar Mall, R. Kumar Yadav, A. Kumar Rai, V. Narayan, and S. Srivastava, "Early Warning Signs Of Parkinson's Disease Prediction Using Machine Learning Technique," J. Pharm. Negat. Results, vol.13, no.10, pp.4784-4792, 2022.
  30. S. S. Zehra, M. Magarini, R. Qureshi, S. M. N. Mustafa, and F. Farooq, "Proactive approach for preamble detection in 5G-NR PRACH using supervised machine learning and ensemble model," Sci. Rep., vol.12, no.1, 2022.
  31. M. Jamei, M. Karbasi, M. Ali, A. Malik, X. Chu, and Z. M. Yaseen, "A novel global solar exposure forecasting model based on air temperature: Designing a new multi-processing ensemble deep learning paradigm," Expert Syst. Appl., vol.222, 2023.
  32. R. Benazir Begam, M. Palanivelan, and Preethi S, "An Ensemble Machine Learning Algorithm To Diagnose Alzheimer's Disease," in Proc. of 2023 International Conference on Recent Advances in Electrical, Electronics, Ubiquitous Communication, and Computational Intelligence (RAEEUCCI), pp.1-6, 2023.
  33. O. D. Okey et al., "BoostedEnML: Efficient Technique for Detecting Cyberattacks in IoT Systems Using Boosted Ensemble Machine Learning," Sensors, vol.22, no.19. 2022.
  34. Z. Chang et al., "Landslide susceptibility prediction using slope unit-based machine learning models considering the heterogeneity of conditioning factors," J. Rock Mech. Geotech. Eng., vol.15, no.5, pp.1127-1143, 2023.
  35. S. Ghimire, R. C. Deo, D. Casillas-Perez, S. Salcedo-Sanz, E. Sharma, and M. Ali, "Deep learning CNN-LSTM-MLP hybrid fusion model for feature optimizations and daily solar radiation prediction," Measurement, vol.202, 2022.
  36. S. S. Sivasankari, J. Surendiran, N. Yuvaraj, M. Ramkumar, C. N. Ravi, and R. G. Vidhya, "Classification of Diabetes using Multilayer Perceptron," in Proc. of 2022 IEEE International Conference on Distributed Computing and Electrical Circuits and Electronics (ICDCECE), pp.1-5, 2022.
  37. P. E. A. Ayawah et al., "A review and case study of Artificial intelligence and Machine learning methods used for ground condition prediction ahead of tunnel boring Machines," Tunn. Undergr. Sp. Technol., vol.125, 2022.
  38. J. Niyogisubizo, L. Liao, E. Nziyumva, E. Murwanashyaka, and P. C. Nshimyumukiza, "Predicting student's dropout in university classes using two-layer ensemble machine learning approach: A novel stacked generalization," Comput. Educ. Artif. Intell., vol.3, 2022.
  39. S. Memis, S. Enginoglu, and U. Erkan, "A classification method in machine learning based on soft decision-making via fuzzy parameterized fuzzy soft matrices," Soft Comput., vol.26, no.3, pp.1165-1180, 2022.
  40. G. Naidu, T. Zuva, and E. M. Sibanda, "A Review of Evaluation Metrics in Machine Learning Algorithms, " in Proc. of the 12th Computer Science On-line Conference 2023, Artificial Intelligence Application in Networks and Systems, vol.3, pp.15-25, 2023.
  41. M. Krzywinski and N. Altman, "Multiple linear regression," Nat. Methods, vol.12, pp.1103-1104, 2015.
  42. S. Shrestha and M. Pokharel, "Educational data mining in moodle data," Int. J. Informatics Commun. Technol., vol.10, no.1, pp.9-18, 2021.
  43. P. Prasertisirikul, S. Laohakiat, R. Trakunphutthirak, and S. Sukaphat, "A Predictive Model for Student Academic Performance in Online Learning System," in Proc. of 2022 International Conference on Digital Government Technology and Innovation (DGTi-CON), pp.76-79, 2022.
  44. A. H. Nabizadeh, D. Goncalves, S. Gama, and J. Jorge, "Early Prediction of Students' Final Grades in a Gamified Course," IEEE Trans. Learn. Technol., vol.15, no.3, pp.311-325, 2022.
  45. R. Hasan, S. Palaniappan, S. Mahmood, A. Abbas, K. U. Sarker, and M. U. Sattar, "Predicting Student Performance in Higher Educational Institutions Using Video Learning Analytics and Data Mining Techniques," Applied Sciences, vol.10, no.11, 2020.
  46. R. Conijn, C. Snijders, A. Kleingeld, and U. Matzat, "Predicting Student Performance from LMS Data: A Comparison of 17 Blended Courses Using Moodle LMS," IEEE Trans. Learn. Technol., vol.10, no.1, pp.17-29, 2017.
  47. B. Rienties, L. Toetenel, and A. Bryan, ""Scaling up" learning design: impact of learning design activities on LMS behavior and performance," in Proc. of the Fifth International Conference on Learning Analytics And Knowledge, pp.315-319, 2015.