Ensemble Deep Learning Model using Random Forest for Patient Shock Detection

Minsu Jeong;Namhwa Lee;Byuk Sung Ko;Inwhee Joe;

doi:10.3837/tiis.2023.04.003

KSII Transactions on Internet and Information Systems (TIIS)

Volume 17, Issue 4 / Pages 1080-1099 / 2023 / 1976-7277 (pISSN) / 1976-7277 (eISSN)

Korean Society for Internet Information

Minsu Jeong (Department of Computer Science, Hanyang University) ;
Namhwa Lee (Department of Computer Science, Hanyang University) ;
Byuk Sung Ko (Department of Emergency Medicine, College of Medicine, Hanyang University) ;
Inwhee Joe (Department of Computer Science, Hanyang University)

Received: 2022.10.19
Reviewed: 2023.04.04
Published: 2023.04.30

https://doi.org/10.3837/tiis.2023.04.003

Abstract

Digital healthcare, which combines telemedicine services with digital technology and AI, is developing rapidly. Digital healthcare research is being conducted on many conditions, including shock. However, the causes of shock are diverse, and the treatment is very complicated, requiring a high level of medical knowledge. In this paper, we propose a shock detection method based on the correlation between shock and data extracted from hemodynamic monitoring equipment. From the various parameters displayed by this equipment, four parameters closely related to patient shock were used as the input data, that is, the feature values, of a random forest-based ensemble machine learning model. The value of the mean arterial pressure was used as the correct answer, the so-called label value, to detect the patient's shock state. The performance was then compared with that of decision tree and logistic regression models using a confusion matrix. The average accuracy of the random forest model was 92.80%, which is superior to the other models. We look forward to our work helping medical staff by making recommendations for the diagnosis and treatment of complex and difficult cases of shock.

1. Introduction

Shock is a syndrome that occurs when the blood flow required for the normal functioning of the main organs of the human body is insufficient or when the body's cells cannot metabolize nutrients normally. In other words, it refers to a state in which blood supply is insufficient because of a disorder in the organs responsible for circulation. Typical symptoms of shock include confusion and irregular pulse and breathing. However, these symptoms occur not only in shock patients but also in patients with other conditions. Moreover, it is not known in advance when shock will occur, so patients require continuous monitoring by medical staff. As a result, digital healthcare combined with telemedicine is rapidly evolving [1-4] for many conditions, including shock, in the form of a fusion of digital technology and AI.

Shock is divided into four major categories according to its cause. The first type is hemorrhagic [5-6] and hypovolemic [7-8] shock, which occurs due to bleeding, burns, or dehydration. The second type is cardiogenic shock, which occurs due to conditions such as myocardial infarction, arrhythmia, cardiac pressure, and acute coronary syndrome [9-12]. The third type, neurogenic shock [13-14], occurs due to spinal anesthesia, spinal cord injury, anaphylaxis, and similar conditions. The final type is septic shock [15-16], in which hypotension caused by sepsis leads to shock. Shock thus has a number of causes, and a patient with severe symptoms must be admitted to a hospital and continuously monitored by medical staff. The only way to monitor the patient's condition is to connect various sensors attached to the patient's body to a hemodynamic monitoring device that displays data such as the patient's blood pressure and oxygen saturation, and the medical staff check directly for changes in the data. However, the number of medical personnel is currently insufficient to manage all seriously ill patients. To solve this problem, digital healthcare, which combines technologies such as AI and IoT, has recently been in the spotlight, and research on various diseases, including shock, is being actively conducted. If an AI neural network model can detect the possibility of shock and the timing of its occurrence, and if an initial response can be taken quickly, more effective observation and treatment are possible with fewer medical staff.

In this study, data was extracted from the hemodynamic monitoring equipment that displays the status information of the shock patient, and the characteristics of the data were identified and used to detect the shock interval. Among the parameters, the heart rate (HR) [17], left stroke work index (LSWI), stroke index (SI) [18], and stroke systemic vascular resistance index (SSVRI) [19] frequently show specific patterns around the occurrence of shock. We propose a method to detect the onset of shock using these parameters [20-21].

Just as there are various types of data, there are also various types of shock, and the symptoms shown by a patient differ depending on the type of shock. Therefore, different treatment methods must be applied depending on the type of shock. In addition, since the hemodynamic parameters indicating a patient's vital signs differ for each patient, even among patients with the same type of shock, the type of drug to be administered must also differ for each patient. Although this paper is limited to MAP, a hemodynamic parameter related to shock diagnosis, we propose a random forest-based patient shock detection technique, a machine learning-based algorithm, with scalability to various types of shock in mind. With this scalability, the type of shock could be predicted and detected through a random forest classifier, and the appropriate treatment method could be identified for individual cases.

The structure of this paper is as follows. In Section 2, background knowledge related to shock and research on various machine learning models used to predict or detect shock are introduced, and in Section 3, our random forest ensemble model for shock detection using data from hemodynamic monitoring equipment is introduced. Based on this model, Section 4 discusses the results. Section 5 concludes with a discussion of the results and future research directions.

2. Related Work

2.1 Mean Arterial Pressure

The goal of shock management is to maintain an appropriate blood pressure. One of the most important types of patient biometric data displayed by hemodynamic monitoring equipment is the mean arterial pressure (MAP), which is the most important parameter for monitoring the patient's shock state [22]. Fig. 1 shows the change in MAP value per second for a specific patient. For the hemodynamic monitoring equipment used in this paper, the normal range of MAP is above 70 mmHg. Table 1 below summarizes the normal ranges of the parameters of the hemodynamic monitoring equipment that are most relevant to patient shock diagnosis.

Fig. 1. Change in the MAP value per second of a patient

Table 1. Types of hemodynamic parameters and their normal categories

2.2 Shock Treatment

Shock refers to a condition in which there is insufficient blood flow to body tissues due to circulatory disorders. Shock can occur in a variety of ways, so it is very important to treat it according to the abnormality that the patient exhibits. For example, when the hemodynamic data show that blood pressure is decreased and cardiac output per minute is normal but the systemic vascular index is decreased, a vasopressor is administered. Conversely, when blood pressure rises and cardiac output per minute and the systemic vascular index are normal but the intravascular volume is decreased, the rate of fluid administration is increased. Fig. 2 below shows part of the manual provided by BioZ, a manufacturer of hemodynamic monitoring equipment. Based on the MAP and SI displayed by the hemodynamic monitoring equipment, the patient's state is divided into 9 zones, and the drug injection guide for each zone is shown.

Fig. 2. Drug injection guide according to the detailed area

• Class 1: MAP may be normal, increased, or decreased. Fluids are administered only when MAP is decreased.

• Class 2: Because MAP is over 70 mmHg, the patient is not in shock and no special treatment is required. However, SSVRI is increased, so vasodilators are used.

• Class 3: Because MAP is over 70 mmHg, the patient is not in shock and no special treatment is required. However, SSVRI and SI are increased, so vasodilators or diuretics are administered.

• Class 4: Because MAP is over 70 mmHg, the patient is not in shock and no special treatment is required. However, SI is increased, so diuretics are administered.

• Class 5: The ideal situation, as MAP and SI are normal.

• Class 6: MAP and SI are decreased, so fluids and dobutamine are administered.

• Class 7: MAP, SI, and SSVRI are decreased. Norepinephrine, dobutamine, and fluids are administered.

• Class 8: MAP and SSVRI are decreased. Norepinephrine is administered.

• Class 9: MAP may be normal, increased, or decreased. Norepinephrine is administered only if MAP is decreased.

2.3 Parameters Related to MAP

As mentioned above, when the value of the MAP parameter expressed by the hemodynamic monitoring equipment is under 70 mmHg, it is judged that the patient is in shock.
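
The threshold rule above can be sketched as a small labeling function. This is only an illustration, assuming MAP samples arrive as a plain list of values in mmHg; the names `label_shock` and `MAP_SHOCK_THRESHOLD_MMHG` are hypothetical and do not come from the paper.

```python
# Threshold stated in the text: a MAP under 70 mmHg is judged as shock.
MAP_SHOCK_THRESHOLD_MMHG = 70.0  # illustrative constant name


def label_shock(map_values, threshold=MAP_SHOCK_THRESHOLD_MMHG):
    """Return 1 (shock) where MAP is under the threshold, else 0 (normal)."""
    return [1 if value < threshold else 0 for value in map_values]


# Made-up MAP samples, one per second (not patient data).
labels = label_shock([82.0, 75.5, 70.0, 69.9, 64.0])
print(labels)
```

A sample exactly at 70 mmHg is labeled normal here, which follows the "under 70 mmHg" wording.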

Fig. 3 shows the changes in the values of SI and LSWI before the value of MAP decreases. The vertical line drawn in the middle of the graph indicates the section where the value of MAP starts to decrease. When checking the left section based on the vertical line, it can be seen that the values of SI and LSWI decrease.

Fig. 3. Changes in SI and LSWI before MAP decrease

Fig. 4 is a graph showing the changes in the values of HR and SSVRI before the values of MAP decrease. If you check the left section based on the vertical line in the middle of the graph, you can see that the values of HR and SSVRI increase.

Fig. 4. Changes in HR and SSVRI before MAP decrease

In this paper, we use the four parameters, SI, LSWI, HR, and SSVRI as the input data of the machine learning model to predict whether the value of MAP will decrease or increase, that is, to detect the occurrence of a shock state.

2.4 Patient Shock Monitoring

The field of patient shock monitoring continuously evaluates the flow of the hemodynamic biometric data of patients. Because the patient's vital signs are continuously observed, the goal is to detect the occurrence of shock and minimize damage while maximizing the treatment effect by quickly taking initial measures. A representative example is the IoT-based automatic shock treatment system developed in collaboration with Hanyang University Hospital. The letters and numbers of the patient's biometric data displayed by the hemodynamic monitoring equipment are recognized using optical character recognition (OCR), and the recognized values are stored in the database in real time [23]. The stored data is analyzed by the server computer, and when the value of the hemodynamic parameter that determines patient shock crosses the threshold, an infusion pump automatically injects the drugs according to the patient's physical condition.

However, this OCR-based detection algorithm has the disadvantage of being less versatile because it focuses only on hypotensive shock and detects only MAP among the various hemodynamic parameters. In this paper, since a machine learning-based random forest algorithm is used to determine the shock state of a patient, the method can be applied not only to hypotensive shock but also to cardiogenic shock, neurogenic shock, and septic shock. Therefore, it is more versatile.

2.5 Patient Shock Prediction

Whereas patient shock detection aims to accurately detect when a shock occurs in a patient, patient shock prediction differs in that it predicts the possibility of a patient's shock. In other words, detection identifies when a patient falls into a shock state, and prediction determines the probability that the patient may fall into a shock state. Patient shock prediction does not learn the pattern at the time of the shock; instead, it learns the pattern of changes in the hemodynamic parameters in a specific section before the shock in order to predict the possibility of shock and help prevent it. This allows measures such as drug injection to be taken more rapidly. Lindberg et al. [24] predicted septic shock using ensemble techniques such as random forest and XGBoost. Nemati et al. [25] utilized high-resolution time series data from 4 to 12 hours before the onset to predict septic shock onset. Kim et al. [26] identified the possibility of predicting septic shock within 24 hours using a machine learning-based model. Finally, Lin et al. [27] conducted a study to predict septic shock using a convolutional-LSTM model.

3. Shock Detection Techniques Based On Random Forest

In this paper, we focus on the detection of hypotensive shock among the existing types of shock. The purpose of the shock detection method is to support more accurate monitoring, diagnosis, and treatment by medical staff when hypotensive shock occurs. Currently, medical staff mostly check whether a patient is in shock through the hemodynamic parameters displayed by the hemodynamic monitor. Among the many hemodynamic parameters, the most important parameter for judging a patient's shock state is MAP. MAP is an indicator of the patient's mean arterial pressure: when the value displayed by the monitor is 70 mmHg or more, the patient is in a normal state, and when the value is under 70 mmHg, the patient is judged to be in shock. Diagnosis of the shock state depends most heavily on MAP and SI. Parameters such as SSVRI and LSWI are calculated from MAP and SI together with CVP, PAOP, and other values. Equation 1 and Equation 2 below express the calculation of SSVRI and LSWI from MAP and SI.

SSVRI = ((MAP - CVP)/SI) * 80 (1)

LSWI = (MAP - PAOP) * SI * 0.014 (2)

CVP stands for central venous pressure, and PAOP stands for pulmonary artery occlusion pressure. MAP is the mean arterial pressure over one cardiac cycle, and SI is the amount of blood pumped out of the heart per beat divided by the body surface area.
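
Equations 1 and 2 translate directly into code. The sketch below is illustrative: the function and argument names are assumptions, the pulmonary artery occlusion pressure is passed as `paop_mmhg`, and the inputs are made-up numbers rather than patient data.

```python
def ssvri(map_mmhg, cvp_mmhg, si):
    """Equation 1: SSVRI = ((MAP - CVP) / SI) * 80."""
    if si == 0:
        raise ValueError("SI must be non-zero")
    return (map_mmhg - cvp_mmhg) / si * 80


def lswi(map_mmhg, paop_mmhg, si):
    """Equation 2: LSWI = (MAP - PAOP) * SI * 0.014."""
    return (map_mmhg - paop_mmhg) * si * 0.014


# Illustrative inputs only.
example_ssvri = ssvri(map_mmhg=90.0, cvp_mmhg=10.0, si=40.0)   # (80 / 40) * 80
example_lswi = lswi(map_mmhg=90.0, paop_mmhg=10.0, si=40.0)    # 80 * 40 * 0.014
print(example_ssvri, round(example_lswi, 3))
```

Both functions use the constants exactly as printed in the equations (80 and 0.014).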

3.1 System Model

Fig. 5 is a system model diagram of the shock state detection method proposed in this paper. The dataset was provided by Hanyang University Hospital and consists of values measured by the hemodynamic monitoring device for actual shock patients. First, since the data measured by the hemodynamic monitoring equipment is raw data, the necessary pre-processing was performed. The hemodynamic monitor displays about 60 kinds of patient vital data, and among these, the four parameters most closely related to the diagnosis of shock were selected to construct the dataset. In case of a large difference in the unit value between 0 and 1, the normal interval and the shock interval were divided for each patient's data, because detecting a shock state requires information on when a normal interval or a shock interval starts and ends. Finally, the entire dataset is composed of a training dataset to be input to the random forest model, a validation dataset for the training dataset, and a test dataset for performance testing of the trained model.

In addition, to use the preprocessed hemodynamic parameter-based dataset as a training dataset for the random forest model, the data must match the model's input format, and the correct answer for the input data, that is, the label data, is required. Based on the start point of the shock interval for each patient, the preceding normal interval is cut to the same duration as the shock interval. For example, if the shock interval lasts 30 min, the 30 min before it is taken as the normal interval. A dataset in which the normal and shock sections are mixed is generated in this way and then used as the input of the random forest model. The random forest model learns the changing pattern of the input data and aggregates the predictions of the individual decision trees to detect the patient's normal state and shock state.

Fig. 5. System Model Diagram

3.2 Hemodynamic Dataset

In this paper, an experiment was conducted based on the vital-sign data of shock patients provided by Hanyang University Hospital. The hemodynamic parameters were measured from a total of 25 patients, and about 150 shock sections were observed. To measure the patient's hemodynamic parameters, a blood pressure monitor connected to the hemodynamic monitoring device was worn on the patient's arm, and the ICG sensors were then placed on the patient's neck and chest. Fig. 6 shows the ICG sensor attachment guide provided by BioZ, a manufacturer of hemodynamic monitoring devices, to measure the patient's hemodynamic parameters. Among the ICG sensors that are directly attached to the patient's body, the blue and purple sensors were attached over the artery in the patient's neck, and the green and orange sensors were attached to the chest.

Fig. 6. ICG sensor attachment guide

In this paper, the dataset of shock patients provided by Hanyang University Hospital was filtered using the following exclusion criteria. First, patient data in which nothing was measured, or in which at least one of the five parameters MAP, HR, SI, SSVRI, and LSWI was not measured, was excluded from the dataset. The patient's vital sign data were measured by attaching various sensor devices, including a blood pressure monitor, directly to the patient's body in order to ensure accuracy. If MAP is not properly measured, it is impossible to determine whether the patient's current state is shock or normal, and if the data is not correct, the subsequent model training may not proceed properly. Second, patients for whom more than 50% of the hemodynamic parameter values were missing were excluded from the dataset. In general, in machine learning and deep learning, when missing values exist in a dataset, they are filled in with the average value of the entire data or, in the case of sequence data, with the values before or after the missing values. However, if more than half of the data is missing, the result is closer to randomly generated dummy data than to genuine patient biometric information, so it is not appropriate for use as experimental data in this paper. Finally, based on the MAP value of each patient, data with no shock period or with only a continuous shock state was excluded from the experimental data. Since the detection method proposed in this paper detects the moment when a patient with hemodynamic values in the normal range falls into a shock state, data containing only a shock interval or only normal hemodynamic values would not match the nature of the experiment.

Based on the criteria listed above, among the data of 25 patients with shock, data from patients No. 1, No. 2, No. 5, No. 6, No. 8, No. 9, No. 10, No. 11, and No. 12 were used in this paper. Table 2 below contains the shock interval information for each patient used in the experiment.

Table 2. Shock interval information for each patient

3.3 Data Pre-Processing

The hemodynamic monitoring equipment used in this paper displays about 60 patient vital signs. Five of these 60 hemodynamic parameters are used in this paper: MAP, HR, SI, SSVRI, and LSWI. Based on the patient data selection criteria described above, the data to be used were selected using the value of MAP. After that, the four parameters to be used as inputs of the random forest model were extracted. The data was measured by directly attaching the ICG sensor connected to the hemodynamic monitoring device to the patient's body. Due to the characteristics of the sensor, there are a number of missing values caused by factors such as the sensor not being properly attached to the patient's body. In addition, since the hemodynamic parameter value is recorded as a sequence once per second, the same value is recorded continuously for a few seconds, and when a change occurs, the changed value is again recorded in succession. Fig. 7 is a graph showing the changing pattern per unit sequence of HR, which is one of the hemodynamic parameters of a specific patient used in this paper. The x-axis is the unit sequence recorded per second, and the y-axis is the HR value. The graph shows that the HR value (y-axis) does not change at every moment as time passes (x-axis) but holds the same value for a certain time. As mentioned earlier, the hemodynamic parameter value is recorded once per second, so if the patient's biometric value does not change, the same value is inevitably recorded for a certain time.

Fig. 7. HR data diagram that changes per second

In general, in machine learning and deep learning, missing values in a specific column are handled by filling in the average value of the entire sequence. However, for the hemodynamic parameters used in this paper, the pattern of change differs from one part of the sequence to another, so instead of this general approach, an average value is obtained for each section, and the missing values in that section are filled with it. Even in this case, the filled values on the y-axis (the average value for each interval) remain constant along the x-axis over the span of the missing values.

In this paper, the point where a run of missing values starts and the point where it ends were found. The value immediately before the start of the missing run and the value immediately after its end were averaged, and the missing values were filled with this average. As mentioned above, the same value continues for a certain time after a change because of the characteristics of the hemodynamic parameters. If a missing value in a parameter with these characteristics were filled with the average of the entire sequence, it could be replaced by a value completely different from the vital sign actually produced by the patient's body.
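
The gap-filling rule described here can be sketched in plain Python. This is a minimal illustration rather than the authors' code: missing samples are represented as `None`, the function name is hypothetical, and the handling of a gap at the very start or end of a recording (copying the single available neighbor) is an assumption the text does not cover.

```python
def fill_gaps_with_neighbor_mean(values):
    """Fill each run of None with the mean of the two values bounding the run."""
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i                          # first missing index of this run
        while i < n and filled[i] is None:
            i += 1
        end = i                            # first index after the run
        before = filled[start - 1] if start > 0 else None
        after = filled[end] if end < n else None
        if before is not None and after is not None:
            fill = (before + after) / 2    # rule described in the text
        elif before is not None:           # assumption: gap at the end
            fill = before
        elif after is not None:            # assumption: gap at the start
            fill = after
        else:
            continue                       # everything is missing; leave as is
        for j in range(start, end):
            filled[j] = fill
    return filled


# Made-up HR samples, one per second, with two gaps.
hr = [80, 80, None, None, 84, 84, None, 90]
hr_filled = fill_gaps_with_neighbor_mean(hr)
print(hr_filled)
```

Each gap receives a single constant value, which matches the observation that the filled values stay flat across the missing span.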

3.4 Dataset

A dataset was created based on the shock interval for each patient shown in Table 2. First, the shock interval information is read from the CSV file that summarizes the shock intervals for each patient. Based on the start point of each shock interval read from the CSV file, a preceding normal interval of the same length as the shock interval is determined. Finally, a dataset for each shock interval is generated from the identified normal interval and shock interval information. As mentioned above, the generated dataset consists of normal and shock intervals in a 1:1 ratio. Fig. 8 shows the creation of a training dataset and a test dataset based on the shock intervals of each patient. The training dataset and test dataset consist of X_train and Y_train, and X_test and Y_test, respectively. X holds the feature information, and Y holds the label information. These datasets are structured for each patient as follows. For example, when the data of patient 1 is used for testing, the data of the patients other than patient 1 is used as the training data. That is, all the data of the remaining patients is bundled into a single set of training data and used as the input data for the random forest model. The data of patient 1 is then used as the test dataset of the trained model to detect whether the patient's current state is shock or normal. For a fair comparison, the input data of the logistic regression and decision tree models used for performance comparison with the random forest model was preprocessed in the same way, and the datasets were created in the same way.
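
The two steps in this subsection, pairing each shock interval with an equally long preceding normal interval and holding out one patient for testing, can be sketched as follows. The function names, the dictionary layout, and the toy feature rows (ordered here as HR, LSWI, SI, SSVRI) are all assumptions for illustration; clipping the normal interval at the start of the recording is likewise an assumption.

```python
def balanced_window(shock_start, shock_end):
    """Return (normal_start, shock_end): the shock interval plus an equally
    long normal interval immediately before it (1:1 ratio)."""
    length = shock_end - shock_start
    normal_start = max(0, shock_start - length)  # assumption: clip at second 0
    return normal_start, shock_end


def leave_one_patient_out(data_by_patient, test_id):
    """Split {patient_id: (X_rows, y_labels)} into training and test sets."""
    X_train, y_train = [], []
    for patient_id, (X, y) in data_by_patient.items():
        if patient_id == test_id:
            continue                      # held-out patient is not trained on
        X_train.extend(X)
        y_train.extend(y)
    X_test, y_test = data_by_patient[test_id]
    return X_train, y_train, list(X_test), list(y_test)


# Toy rows, not patient data: label 0 = normal, 1 = shock.
toy = {
    1: ([[70, 3.1, 35, 180]], [0]),
    2: ([[95, 2.0, 28, 260], [72, 3.0, 36, 175]], [1, 0]),
    5: ([[99, 1.8, 25, 280]], [1]),
}
X_train, y_train, X_test, y_test = leave_one_patient_out(toy, test_id=1)
window = balanced_window(shock_start=3600, shock_end=5400)  # seconds
print(window, len(X_train), y_test)
```

Repeating the split once per patient gives one test result per patient, which is how the per-patient tables in the evaluation can be read.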

Fig. 8. Dataset creation for each patient

3.5 Random Forest Model

In this paper, we used the random forest classification model provided by the Python sklearn package, one of the bagging-based ensemble machine learning models, to detect shock states in patients. The random forest classification model does not use all existing variables at each node of a decision tree but randomly selects some of the input variables, creating decision trees with different characteristics. This reduces the correlation between the decision trees and mitigates overfitting, which is pointed out as a weakness of the decision tree. In addition, it increases the accuracy when predicting whether the patient's state, used as the label value in this paper, is a shock state or a normal state.
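
A minimal sketch of fitting scikit-learn's `RandomForestClassifier` on four features is shown below. The data is synthetic and only mimics the direction of change described in Section 2.3 (HR and SSVRI higher, SI and LSWI lower in the shock class); the means, spreads, and column order are assumptions, not values from the hospital dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the four features, ordered HR, LSWI, SI, SSVRI.
n = 200
X_normal = rng.normal(loc=[75, 50, 40, 150], scale=5, size=(n, 4))
X_shock = rng.normal(loc=[100, 30, 25, 250], scale=5, size=(n, 4))
X = np.vstack([X_normal, X_shock])
y = np.array([0] * n + [1] * n)          # 0 = normal, 1 = shock

# Library defaults apart from the fixed seed, which is only for repeatability.
model = RandomForestClassifier(random_state=0)
model.fit(X, y)

# Two unseen rows, one near each synthetic class center.
pred = model.predict(np.array([[74, 51, 41, 148], [102, 29, 24, 255]]))
print(pred)
```

Each fitted tree is available in `model.estimators_`, and `predict` combines their outputs into a single class decision.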

4. Performance Evaluation

In this section, we show some tests and results to check whether our proposed method would work for shock state detection.

4.1 Experimental Setup

In this paper, a workstation environment with the specifications shown in Table 3 was established for simulation. The OS was Windows 10. The CPU was an Intel® Core™ i5-9600KF, with a total of 16 GB of DDR4 memory. The GPU was an NVIDIA GeForce GTX 1660 (6 GB). For software, the simulation environment was built using Python 3.8.13 and CUDA 11.1-based TensorFlow 2.8.0 and Keras 2.8.0.

Table 3. Workstation Specification Used in Experiments

The simulation was conducted by comparing the results of the patient's shock state detection using three models: a logistic regression model, a decision tree classifier model, and a random forest classifier model. To secure the reliability of the experimental results, all the hyperparameters for each model were set to the default values, and then the simulations were performed. After evaluating the performance indicators of each model, hyperparameter tuning was performed to improve the performance of the random forest.

Table 4 shows the hyperparameter information of the decision tree and random forest models. The min_samples_split parameter is the minimum number of samples required to split a node and is used to control overfitting. Because the default is 2 for both the decision tree and the random forest, nodes keep splitting down to very small sample counts, which increases the number of splits and the possibility of overfitting. The min_samples_leaf parameter is the minimum number of samples required at a leaf node. It is used together with min_samples_split to control overfitting. In the case of non-uniform input data, that is, if the data is concentrated in a specific class, this parameter needs to be set small. The max_features parameter is the maximum number of features considered when searching for the best split. For the decision tree classifier, the default value of max_features is None, but for the random forest classifier, it is automatically set. The max_depth parameter sets the maximum depth of the tree. By default, a tree keeps splitting until the classes of the input data are completely separated or until a node holds fewer samples than min_samples_split. However, if the tree becomes too deep, there is a possibility of overfitting, so it is very important to find an appropriate value. The parameter specific to the random forest is n_estimators, which specifies the number of decision trees. The default value is given as 10 (the default in scikit-learn releases before 0.22; later releases use 100), and the higher it is set, the better the expected performance. However, as the number of trees increases, the learning time also increases.
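
The defaults discussed above can be read from an installed scikit-learn with `get_params()`, as in the short sketch below. The printed values for `max_features` and `n_estimators` depend on the scikit-learn release, so no particular output is assumed for them.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Default-constructed estimators expose their hyperparameters as a dict.
tree_params = DecisionTreeClassifier().get_params()
forest_params = RandomForestClassifier().get_params()

shared = ["min_samples_split", "min_samples_leaf", "max_features", "max_depth"]
for name in shared:
    print(f"{name}: tree={tree_params[name]!r}, forest={forest_params[name]!r}")

# n_estimators exists only for the forest; its default varies by release.
print(f"n_estimators: forest={forest_params['n_estimators']!r}")
```

Checking the values this way avoids relying on a table that may describe a different library version.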

Table 4. Hyperparameter information of the decision tree and random forest

4.2 Experimental Results

To verify the results of the experiments conducted in this paper, we compared the performance of the logistic regression model, the decision tree model, and the random forest model using the confusion matrix. The Actual Val in Table 5 is the actual value of the test dataset used for the trained model, and the Predict Val is the value predicted by the trained model. True Positive (TP), False Positive (FP), False Negative (FN), and True Negative (TN), respectively, based on the criteria for shock patient detection, are as follows.

Table 5. Confusion matrix for evaluating model performance

• TP: Success with positive predictions. That is, the trained model predicted that the patient was in shock, and the actual state of the patient is a shock state.

• TN: Success with negative predictions. That is, the trained model predicted that the patient was not in shock, and the actual state of the patient is not in shock.

• FP: Failed positive predictions. That is, although the trained model predicted a patient in shock, the actual patient’s condition was not in shock.

• FN: Failed negative predictions. That is, although the trained model predicted that the patient was not in shock, the actual state of the patient was shock.

Then, the performance of the actual machine learning model was evaluated using the confusion matrix. The metrics were as follows:

Recall (Sensitivity): This indicator expresses how well the actual positive values were predicted. It is the proportion of actual shock patients that the trained model predicted as shock patients. Therefore, the higher the Recall value, the better the model detects shock. Expressed using the confusion matrix, it is as shown in Equation 3 below.

Recall(Sensitivity) = TP/(TP + FN) (3)

Specificity: This indicator expresses how well the actual negative values were predicted. In other words, it measures how well the trained model predicted that a person who is not actually a shock patient is not a shock patient, and it is the counterpart of Recall for the negative class. Expressed using the confusion matrix, it is as shown in Equation 4 below.

Specificity = TN/(TN + FP) (4)

Precision: This is the proportion of the values predicted as positive that were actually positive. Expressed using the confusion matrix, it is as shown in Equation 5 below.

Precision = TP/(TP + FP) (5)

Accuracy: This index indicates the percentage of all predictions that were correct. Expressed using the confusion matrix, it is as shown in Equation 6 below.

Accuracy = (TP + TN)/(TP + TN + FP + FN) (6)

F1-Score: This is the harmonic mean of Precision and Recall. It is used as an evaluation index when the classes are unbalanced, because Accuracy alone can be misleading when the data bias is large. Expressed using the confusion matrix, it is as shown in Equation 7 below.

F1 Score = (2 * Precision * Recall)/(Precision + Recall) (7)
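
Equations 3 to 7 can be evaluated together from the four confusion-matrix counts. The sketch below is illustrative only: the function name `classification_metrics`, the helper `_ratio`, and the example counts are assumptions, and returning 0.0 for an empty denominator is one common convention rather than something specified in the paper.

```python
def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Compute the five indicators of Equations 3-7 from confusion-matrix counts."""
    recall = _ratio(tp, tp + fn)                                 # Equation 3
    specificity = _ratio(tn, tn + fp)                            # Equation 4
    precision = _ratio(tp, tp + fp)                              # Equation 5
    accuracy = _ratio(tp + tn, tp + tn + fp + fn)                # Equation 6
    f1 = _ratio(2 * precision * recall, precision + recall)      # Equation 7
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these example counts, Recall is 40/50 and Specificity is 45/50, which shows how the two indicators look at the positive and negative classes separately.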

Fig. 9 shows the performance indicators for predicting the test data of a specific patient for the three models used in the experiment: logistic regression, decision tree, and random forest. As described above, all the hyperparameters were left at their default values for this simulation. The random forest model shows better performance than the other models in all performance indicators, including Accuracy. Table 6, Table 7, and Table 8 below show the performance indicators for the simulation results of all patients for the logistic regression, decision tree, and random forest models, respectively. Fig. 10 is a graphical representation of the average performance indicators over all patients for the three models. To obtain a reliable comparison, all of these simulations were performed with the hyperparameters set to their default values, and the random forest model was confirmed to perform best.

We then attempted to improve the performance of the random forest model through hyperparameter tuning. In this paper, we searched for the optimal hyperparameters using the GridSearch module supported by sklearn. Given a set of candidate values for each random forest hyperparameter, the GridSearch module evaluates every combination of those values and returns the best one. As a result of tuning, it was confirmed that the best learning result occurred when training was carried out with the default values.
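
The exhaustive search that a grid search performs can be sketched in a few lines of plain Python. This is not the sklearn implementation and not the authors' tuning code: `grid_search`, `param_grid`, and `toy_score` are hypothetical names, and `toy_score` is an arbitrary stand-in for training a random forest and measuring its validation score. The hyperparameter names `n_estimators` and `max_depth` are real `RandomForestClassifier` parameters, but the candidate values are assumptions.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:  # keep the first combination with the highest score
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical candidate values; the paper does not list the grid it used.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}


def toy_score(params):
    """Made-up scoring function, NOT a measured result.

    In a real experiment this would train a model with `params`
    and return its validation accuracy.
    """
    score = 0.9
    score -= abs(params["n_estimators"] - 100) / 1000
    score -= 0.0 if params["max_depth"] is None else 0.01
    return score


best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

In sklearn itself, the `GridSearchCV` class plays this role and scores each combination by cross-validation, so the cost grows with the product of the numbers of candidate values.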

[Image: E1KOBZ_2023_v17n4_1080_f0009.png]

Fig. 9. Performance indicator result for each model

[Image: E1KOBZ_2023_v17n4_1080_f0010.png]

Fig. 10. Average performance indicators for each model for all patient data

Table 6. Logistic Regression model performance indicator results of all patient data

[Image: E1KOBZ_2023_v17n4_1080_t0006.png]

Table 7. Decision Tree model performance indicator results of all patient data

[Image: E1KOBZ_2023_v17n4_1080_t0007.png]

Table 8. Random Forest model performance indicator results of all patient data

[Image: E1KOBZ_2023_v17n4_1080_t0008.png]

5. Discussion and Conclusion

In this paper, the patient's shock state was detected by using the hemodynamic parameter dataset obtained through a hemodynamic monitoring device as the input of a random forest ensemble model. To detect the patient's shock, the MAP, HR, SI, LSWI, and SSVRI parameters were extracted from the numerous hemodynamic parameters, the missing values were processed, and the data were normalized. The four parameters HR, SI, LSWI, and SSVRI were then used as the features of the training and test datasets, and labeling was carried out based on the MAP. These datasets were used as input for the random forest model. To compare the results, simulations were performed using the same datasets as input for the logistic regression model and the decision tree model.

As a result of the simulation, the average performance of the random forest model was 92.80% for Accuracy, 96.75% for Precision, 88.73% for Recall, 92.32% for F1-score, and 98.00% for Specificity. In comparison, the average performance of the logistic regression model was 84.15% for Accuracy, 84.36% for Precision, 85.96% for Recall, 84.39% for F1-score, and 82.34% for Specificity. The average performance of the decision tree model was 91.31% for Accuracy, 94.46% for Precision, 87.25% for Recall, 90.33% for F1-score, and 94.53% for Specificity. These results confirm that the random forest outperformed the decision tree and logistic regression models on all five indicators.

Since this study was conducted only on hypotensive shock patients among the various types of shock, it is difficult to say that the results represent all shock patients. However, if data on other types of shock, such as septic shock and psychogenic shock, are collected in the future and their patterns are identified and used for research, the approach may yield a more versatile model. In addition, it may become possible to respond quickly to a patient's shock by administering an appropriate drug for each type of shock as soon as the shock is detected. Furthermore, if research on detecting a patient's shock state is successful, follow-up studies could use various hemodynamic parameters to identify prognostic signs that appear before shock and to make predictions based on these data.

Acknowledgement

This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-00107, Development of technology to automate the recommendations for big data analytic models that define data characteristics and problems).

