• Title/Summary/Keyword: Predictive Accuracy

Leveraging LLMs for Corporate Data Analysis: Employee Turnover Prediction with ChatGPT (대형 언어 모델을 활용한 기업데이터 분석: ChatGPT를 활용한 직원 이직 예측)

  • Sungmin Kim;Jee Yong Chung
    • Knowledge Management Research / v.25 no.2 / pp.19-47 / 2024
  • Organizational ability to analyze and utilize data plays an important role in knowledge management and decision-making. This study aims to investigate the potential application of large language models in corporate data analysis. Focusing on the field of human resources, the research examines the data analysis capabilities of these models. Using the widely studied IBM HR dataset, the study reproduces machine learning-based employee turnover prediction analyses from previous research through ChatGPT and compares its predictive performance. Unlike past research methods that required advanced programming skills, ChatGPT-based machine learning data analysis, conducted through the analyst's natural language requests, offers the advantages of being much easier and faster. Moreover, its prediction accuracy was found to be competitive compared to previous studies. This suggests that large language models could serve as effective and practical alternatives in the field of corporate data analysis, which has traditionally demanded advanced programming capabilities. Furthermore, this approach is expected to contribute to the popularization of data analysis and the spread of data-driven decision-making (DDDM). The prompts used during the data analysis process and the program code generated by ChatGPT are also included in the appendix for verification, providing a foundation for future data analysis research using large language models.

Application of Point Shearwave Elastography to Breast Ultrasonography: Initial Experience Using "S-Shearwave" in Differential Diagnosis (Point Shearwave Elastography의 유방 초음파에서의 적용: "S-Shearwave"를 이용한 감별진단의 초기경험)

  • Myung Hwan Lee;Eun-Kyung Kim;Eun Ju Lee;Ha Yan Kim;Jung Hyun Yoon
    • Journal of the Korean Society of Radiology / v.81 no.1 / pp.157-165 / 2020
  • Purpose: To evaluate the optimal measurement location, cut-off value, and diagnostic performance of S-Shearwave in the differential diagnosis of breast masses seen on ultrasonography (US). Materials and Methods: During the study period, 225 breast masses in 197 women were included. S-Shearwave measurements were made by applying a square region of interest automatically generated by the US machine. Shearwave elasticity was measured three times at four different locations of the mass, and the highest shearwave elasticity was used for calculating the optimal cut-off value. Diagnostic performance was evaluated by using the area under the receiver operating characteristic curve (AUC). Results: Of the 225 breast masses, 156 (69.3%) were benign and 69 (30.7%) were malignant. Mean S-Shearwave values were significantly higher for malignant masses (108.0 ± 70.0 kPa vs. 43.4 ± 38.3 kPa; p < 0.001). No significant differences were seen among AUC values at different measurement locations. With a cut-off value of 41.9 kPa, S-Shearwave showed 85.7% sensitivity, 63.9% specificity, 70.7% accuracy, and positive and negative predictive values of 51.7% and 90.8%, respectively. The AUCs for US and S-Shearwave did not show significant differences (p = 0.179). Conclusion: S-Shearwave shows diagnostic performance comparable to that of grayscale US and can be applied to the differential diagnosis of breast masses seen on US.

Study on the Agreement Values of Pulmonary Arterial Hypertension Measured by Cardiac Sonographers (심장초음파 검사자 간의 폐동맥고혈압 진단 측정값 일치도 분석 연구)

  • Seol Hwa KIM;Sundo JUNG
    • Korean Journal of Clinical Laboratory Science / v.55 no.4 / pp.269-275 / 2023
  • Echocardiography is a non-invasive method that is useful for diagnosing pulmonary arterial hypertension. Its results are known to depend on the experience, education, and knowledge level of the cardiac sonographer. This study aimed to compare the agreement values between cardiac sonographers with different levels of practical experience in the diagnosis of pulmonary arterial hypertension using echocardiography. Three readers re-evaluated the echocardiography images of 148 patients who were diagnosed with pulmonary arterial hypertension at the S Medical Center from January 1, 2020, to December 31, 2020. The echocardiography values measured by each reader were compared and analyzed. The analysis of discrete variables revealed that the agreement values of the cardiac sonographers showed excellent consistency for both reader 3 and the cardiologist group, indicating that more experience leads to better predictive accuracy for diagnosis of the condition. Furthermore, in terms of continuous variables, all the cardiac sonographers demonstrated good agreement in the measured values of the right atrium, which was easier to assess and clearer than the structurally complex right ventricle. This study represents the first analysis in Korea of the agreement values measured by medical technologists who are cardiac sonographers.

Automated Versus Handheld Breast Ultrasound for Evaluating Axillary Lymph Nodes in Patients With Breast Cancer

  • Sun Mi Kim;Mijung Jang;Bo La Yun;Sung Ui Shin;Jiwon Rim;Eunyoung Kang;Eun-Kyu Kim;Hee-Chul Shin;So Yeon Park;Bohyoung Kim
    • Korean Journal of Radiology / v.25 no.2 / pp.146-156 / 2024
  • Objective: Automated breast ultrasound (ABUS) is a relevant imaging technique for early breast cancer diagnosis and is increasingly being used as a supplementary tool for mammography. This study compared the performance of ABUS and handheld ultrasound (HHUS) in detecting and characterizing the axillary lymph nodes (LNs) in patients with breast cancer. Materials and Methods: We retrospectively reviewed the medical records of women with recently diagnosed early breast cancer (≤ T2) who underwent both ABUS and HHUS examinations for axilla (September 2017-May 2018). ABUS and HHUS findings were compared using pathological outcomes as reference standards. Diagnostic performance in predicting any axillary LN metastasis and heavy nodal-burden metastases (i.e., ≥ 3 LNs) was evaluated. The ABUS-HHUS agreement for visibility and US findings was calculated. Results: The study included 377 women (53.1 ± 11.1 years). Among 385 breast cancers in 377 patients, 101 had axillary LN metastases and 30 had heavy nodal burden metastases. ABUS identified benign-looking or suspicious axillary LNs (average, 1.4 ± 0.8) in 246 axillae (63.9%, 246/385). According to the per-breast analysis, the sensitivity, specificity, positive and negative predictive values, and accuracy of ABUS in predicting axillary LN metastases were 43.6% (44/101), 95.1% (270/284), 75.9% (44/58), 82.6% (270/327), and 81.6% (314/385), respectively. The corresponding results for HHUS were 41.6% (42/101), 95.1% (270/284), 75.0% (42/56), 82.1% (270/329), and 81.0% (312/385), respectively, which were not significantly different from those of ABUS (P ≥ 0.53). The performance results for heavy nodal-burden metastases were 70.0% (21/30), 89.6% (318/355), 36.2% (21/58), 97.3% (318/327), and 88.1% (339/385), respectively, for ABUS and 66.7% (20/30), 89.9% (319/355), 35.7% (20/56), 97.0% (319/329), and 88.1% (339/385), respectively, for HHUS, also not showing significant difference (P ≥ 0.57). 
The ABUS-HHUS agreement was 95.9% (236/246; Cohen's kappa = 0.883). Conclusion: Although ABUS showed limited sensitivity in diagnosing axillary LN metastasis in early breast cancer, it was still useful as the performance was comparable to that of HHUS.
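
The percentages in the abstract above can be reproduced from the fractions it reports. The following is a minimal sketch, not the authors' analysis code: the function name `diagnostic_metrics` is hypothetical, and the four cell counts are derived from the reported fractions for predicting any axillary LN metastasis (for ABUS, 44/101 true positives and 270/284 true negatives).

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return standard 2x2 diagnostic metrics as percentages (1 decimal)."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": round(100 * tp / (tp + fn), 1),  # detected among diseased
        "specificity": round(100 * tn / (tn + fp), 1),  # cleared among non-diseased
        "ppv": round(100 * tp / (tp + fp), 1),          # positive predictive value
        "npv": round(100 * tn / (tn + fn), 1),          # negative predictive value
        "accuracy": round(100 * (tp + tn) / total, 1),
    }


# Counts derived from the fractions reported in the abstract (per-breast analysis).
abus = diagnostic_metrics(tp=44, fn=101 - 44, tn=270, fp=284 - 270)
hhus = diagnostic_metrics(tp=42, fn=101 - 42, tn=270, fp=284 - 270)

print(abus)
print(hhus)
```

Both dictionaries match the values quoted in the abstract, which shows why sensitivity is limited (fewer than half of the 101 metastatic axillae are flagged) while accuracy stays above 80% (the 284 non-metastatic axillae dominate the denominator).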

Development and Validation of MRI-Based Radiomics Models for Diagnosing Juvenile Myoclonic Epilepsy

  • Kyung Min Kim;Heewon Hwang;Beomseok Sohn;Kisung Park;Kyunghwa Han;Sung Soo Ahn;Wonwoo Lee;Min Kyung Chu;Kyoung Heo;Seung-Koo Lee
    • Korean Journal of Radiology / v.23 no.12 / pp.1281-1289 / 2022
  • Objective: Radiomic modeling using multiple regions of interest in MRI of the brain to diagnose juvenile myoclonic epilepsy (JME) has not yet been investigated. This study aimed to develop and validate radiomics prediction models to distinguish patients with JME from healthy controls (HCs), and to evaluate the feasibility of a radiomics approach using MRI for diagnosing JME. Materials and Methods: A total of 97 JME patients (25.6 ± 8.5 years; female, 45.5%) and 32 HCs (28.9 ± 11.4 years; female, 50.0%) were randomly split (7:3 ratio) into a training set (n = 90) and a test set (n = 39). Radiomic features were extracted from 22 regions of interest in the brain, selected based on clinical evidence, using T1-weighted MRI. Predictive models were trained on the radiomic features of the training set using seven modeling methods: light gradient boosting machine, support vector classifier, random forest, logistic regression, extreme gradient boosting, gradient boosting machine, and decision tree. The performance of the models was validated and compared in the test set. The model with the highest area under the receiver operating characteristic curve (AUROC) was chosen, and important features in the model were identified. Results: The seven tested radiomics models (light gradient boosting machine, support vector classifier, random forest, logistic regression, extreme gradient boosting, gradient boosting machine, and decision tree) showed AUROC values of 0.817, 0.807, 0.783, 0.779, 0.767, 0.762, and 0.672, respectively. The light gradient boosting machine, which had the highest AUROC, albeit without statistically significant differences from the other models in pairwise comparisons, had accuracy, precision, recall, and F1 scores of 0.795, 0.818, 0.931, and 0.871, respectively. Radiomic features from regions including the putamen and ventral diencephalon were ranked as the most important for suggesting JME. Conclusion: Radiomic models using MRI were able to differentiate JME from HCs.
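
The F1 score reported for the light gradient boosting machine is the harmonic mean of its precision and recall, so it can be checked from the other two figures. A minimal sketch follows, assuming the standard definition of F1; the function name `f1_score` here is a local helper, not a call into any particular library.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_score(0.818, 0.931)
print(round(f1, 3))  # 0.871
```

The result agrees with the reported F1 of 0.871. Because F1 ignores true negatives, it can sit well above accuracy (0.795 here) when the positive class, JME patients, is much larger than the control group.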

Diagnostic Performance of 18F-Fluorodeoxyglucose Positron Emission Tomography/CT for Chronic Empyema-Associated Malignancy

  • Miju Cheon;Jang Yoo;Seung Hyup Hyun;Kyung Soo Lee;Hojoong Kim;Jhingook Kim;Jae Il Zo;Young Mog Shim;Joon Young Choi
    • Korean Journal of Radiology / v.20 no.8 / pp.1293-1299 / 2019
  • Objective: The purpose of this study was to evaluate the diagnostic performance of 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) for chronic empyema-associated malignancy (CEAM). Materials and Methods: We retrospectively reviewed the 18F-FDG PET/CT images of 33 patients with chronic empyema, and analyzed the following findings: 1) shape of the empyema cavity, 2) presence of fistula, 3) maximum standardized uptake value (SUV) of the empyema cavity, 4) uptake pattern of the empyema cavity, 5) presence of a protruding soft tissue mass within the empyema cavity, and 6) involvement of adjacent structures. Final diagnosis was determined based on histopathology or clinical follow-up for at least 6 months. The abovementioned findings were compared between the 18F-FDG PET/CT images of CEAM and chronic empyema. A receiver operating characteristic (ROC) analysis was also performed. Results: Six lesions were histopathologically proven as malignant; there were three cases of diffuse large B-cell lymphoma, two of squamous cell carcinoma, and one of poorly differentiated carcinoma. Maximum SUV within the empyema cavity (p < 0.001), presence of a protruding soft tissue mass (p = 0.002), and involvement of the adjacent structures (p < 0.001) were significantly different between the CEAM and chronic empyema images. The maximum SUV exhibited the highest diagnostic performance, with the highest specificity (96.3%, 26/27), positive predictive value (85.7%, 6/7), and accuracy (97.0%, 32/33) among all criteria. On ROC analysis, the area under the curve of maximum SUV was 0.994. Conclusion: 18F-FDG PET/CT can be useful for diagnosing CEAM in patients with chronic empyema. The maximum SUV within the empyema cavity is the most accurate 18F-FDG PET/CT diagnostic criterion for CEAM.

Assessment of Two Clinical Prediction Models for a Pulmonary Embolism in Patients with a Suspected Pulmonary Embolism (폐색전증이 의심된 환자에서 두 가지 폐색전증 진단 예측 모형의 평가)

  • Park, Jae Seok;Choi, Won-Il;Min, Bo Ram;Park, Jie Hae;Chae, Jin Nyeong;Jeon, Young June;Yu, Ho Jung;Kim, Ji-Young;Kim, Gyoung-Ju;Ko, Sung-Min
    • Tuberculosis and Respiratory Diseases / v.64 no.4 / pp.266-271 / 2008
  • Background: Clinical prediction rules for estimating the probability of an acute pulmonary embolism (PE) in patients with a suspected PE are well established in North America and Europe. However, these prediction rules have not been clearly assessed in Korea. The aim of this study was to assess the prediction rules in patients with a suspected PE in Korea. Methods: We performed a retrospective study of 210 inpatients or emergency ward patients with a suspected PE who underwent computed tomography pulmonary angiography at a single institution between January 2005 and March 2007. Simplified Wells rules and revised Geneva rules were used to estimate the clinical probability of a PE based on information from medical records. Results: Of the 210 patients with a suspected PE, 49 (19.5%) had a confirmed diagnosis of PE. The Wells rules and the Geneva rules classified 1% and 21% of patients as low probability, 62.5% and 76.2% as intermediate probability, and 33.8% and 2.8% as high probability, respectively. The prevalence of PE under the Wells rules and the Geneva rules was 100% and 4.5% in the low probability category, 18.2% and 22.5% in the intermediate category, and 19.7% and 50% in the high category, respectively. Receiver operating characteristic curve analysis showed that the revised Geneva rules had a higher accuracy than the Wells rules in detecting PE. Concordance between the two prediction rules was poor ($\kappa$ coefficient = 0.06). Conclusion: In the present study, the two prediction rules differed in their predictive accuracy for pulmonary embolism. Applying the revised Geneva rules to inpatients and emergency ward patients suspected of having PE may allow a more effective diagnostic process than the use of the Wells rules.

Impact of Initial Helical Abdominal Computed Tomography on the Diagnosis of Hollow Viscus Injury and Blunt Abdominal Trauma (복부 둔상 및 유강장기 손상에 있어서 초기 나선형 복부전산화 단층촬영의 진단적 가치)

  • Cho, Young-Duck;Hong, Yun-Sik;Lee, Sung-Woo;Choi, Sung-Hyuk;Yoon, Young-Hoon;Lim, Sung-Ik;Jang, Ik-Jin;Baek, Seung-Won
    • Journal of Trauma and Injury / v.21 no.1 / pp.28-35 / 2008
  • Purpose: This study was conducted to examine the clinical significance of IV contrast-enhanced helical abdominal computed tomography (CT) as a diagnostic screening tool for evaluating hollow viscus injury in blunt abdominal trauma patients. Methods: This retrospective study included 108 patients who presented to the Korea University Medical Center (KUMC) Emergency Department (ED) from January 2007 to December 2007 with an initial CT finding suggestive of intra-abdominal injury. An initial non-enhanced abdominal CT was taken, followed by an enhanced CT with intravenous contrast. Patients' demographic data and mechanisms of injury were obtained. The initial diagnosis made by specialized radiologists was combined with postoperative (post-OP) findings and additional CT findings acquired during the hospital stay to arrive at the final diagnosis. Initial CT findings were then compared with the final diagnosis, yielding values for sensitivity, specificity, and accuracy, as well as positive and negative predictive values. Patients were further divided into two groups: those who underwent surgery and those who did not. The initial CT findings of each group were subsequently compared and analyzed. Results: Initial CT scans revealed abnormal findings in a total of 212 cases, with solid organ injury the most common finding, observed in 97 cases. Free fluid accumulation was evident in another 69 cases. Based on the CT findings, 77 cases (71.3%) were initially diagnosed as solid organ injury, 20 cases (18.5%) as combined (solid organ + hollow viscus) injury, and 11 cases (10.2%) as isolated hollow viscus injury. The final diagnoses, however, were somewhat different, with only 67 cases (62.0%) attributed to solid organ injury, 31 cases (28.7%) to combined injury (solid + hollow), and 10 cases (9.3%) to hollow viscus injury. 
The sensitivity (95% CI) of the initial helical CT in diagnosing hollow viscus injury was 75.6%, and its specificity was 100%. The accuracy in diagnosing hollow viscus injury was meaningfully lower than that in diagnosing solid organ injury. Among patients initially diagnosed with solid organ injuries, 10 patients (2 identified on follow-up CT and 8 on post-OP findings) turned out to have combined injuries. A total of 38 patients underwent an operation, and the proportion of initial CT findings suggesting free air, mesenteric hematoma, or bowel wall thickening was significantly higher in the operation group. Conclusion: Abdominal CT was a meaningful screening test for hollow viscus injury, but its sensitivity was significantly lower for hollow viscus injury than for solid organ injury. This calls for special consideration and careful observation by ED physicians when dealing with cases of blunt abdominal trauma.

Prediction of Air Temperature and Relative Humidity in Greenhouse via a Multilayer Perceptron Using Environmental Factors (환경요인을 이용한 다층 퍼셉트론 기반 온실 내 기온 및 상대습도 예측)

  • Choi, Hayoung;Moon, Taewon;Jung, Dae Ho;Son, Jung Eek
    • Journal of Bio-Environment Control / v.28 no.2 / pp.95-103 / 2019
  • Temperature and relative humidity are important factors in crop cultivation and should be properly controlled to improve crop yield and quality. To control the environment accurately, we need to predict how it will change in the future. The objective of this study was to predict air temperature and relative humidity at a future time by using a multilayer perceptron (MLP). The data required to train the MLP were collected every 10 min from Oct. 1, 2016 to Feb. 28, 2018 in an eight-span greenhouse ($1{,}032\,\mathrm{m}^2$) cultivating mango (Mangifera indica cv. Irwin). The inputs for the MLP were environmental data from inside and outside the greenhouse, and the set-up and operating values of the environment control devices. Using these data, the MLP was trained to predict the air temperature and relative humidity 10 to 120 min into the future. To reflect the four typical seasons in Korea, three days of data from each season were used as test data. The MLP was optimized with four hidden layers and 128 nodes for air temperature ($R^2=0.988$) and with four hidden layers and 64 nodes for relative humidity ($R^2=0.990$). Due to the characteristics of the MLP, the accuracy decreased as the prediction time became longer. However, air temperature and relative humidity were properly predicted regardless of seasonal changes in the environment. For specific conditions such as spray irrigation, however, the amount of training data was too small, resulting in poor predictive accuracy. In this study, air temperature and relative humidity were appropriately predicted through optimization of the MLP, but the results were limited to the experimental greenhouse. Therefore, it is necessary to collect more data from greenhouses at various locations and to modify the structure of the neural network for generalization.
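
With readings logged every 10 min, a prediction horizon of 10 to 120 min corresponds to a target 1 to 12 samples ahead of the input. The sketch below illustrates only that pairing step; it is not the authors' pipeline. The helper `make_pairs` and the temperature values are hypothetical, and a single variable stands in for the full set of greenhouse inputs described in the abstract.

```python
SAMPLE_INTERVAL_MIN = 10  # sampling interval stated in the abstract


def make_pairs(series, horizon_min):
    """Pair each reading with the reading `horizon_min` minutes later."""
    if horizon_min % SAMPLE_INTERVAL_MIN != 0:
        raise ValueError("horizon must be a multiple of the sampling interval")
    steps = horizon_min // SAMPLE_INTERVAL_MIN
    # Each tuple is (input at time t, target at time t + horizon).
    return [(series[i], series[i + steps]) for i in range(len(series) - steps)]


# Hypothetical air temperatures in degrees Celsius, one per 10 min.
temps = [20.0, 20.5, 21.0, 21.5, 22.0]

pairs_10 = make_pairs(temps, 10)  # targets one sample ahead
pairs_30 = make_pairs(temps, 30)  # targets three samples ahead
print(pairs_10)
print(pairs_30)
```

Longer horizons leave fewer usable pairs from the same log and separate each input further from its target, which is one intuitive reason accuracy tends to fall as the prediction time grows.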

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Anomaly detection is becoming increasingly important for maintaining ICT infrastructure and preventing failures. System monitoring data is multidimensional time series data, and it is difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, correlations between variables must be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is typically preprocessed with a sliding window and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-standing research field; statistical methods and regression analysis were used in its early days, and machine learning and artificial neural networks are now being actively studied. Statistical methods are difficult to apply when data is non-homogeneous, and they do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and detect anomalies by comparing the predicted value with the actual value. Their performance degrades when the model is not robust or when the data contains noise or outliers, which restricts them to training data free of noise and outliers. An autoencoder built from artificial neural networks is trained to produce output as similar as possible to its input. It has many advantages over existing probabilistic and linear models, cluster analysis, and supervised learning. It can be applied to data that does not satisfy a probability distribution or linearity assumption. 
In addition, it can be trained in an unsupervised manner, without labeled data. However, anomaly detection on multidimensional data is limited in identifying local outliers, and the characteristics of time series data greatly increase the dimension of the data. In this study, we propose a CMAE (Conditional Multimodal Autoencoder) that enhances anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to address the limitations of local outlier identification in multidimensional data. Multimodal models are commonly used to learn from different types of input, such as voice and images. The different modalities share the bottleneck of the autoencoder, through which correlations between them are learned. In addition, a CAE (Conditional Autoencoder) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are usually categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was verified by comparison with the Unimodal Autoencoder (UAE) and the Multimodal Autoencoder (MAE). The reconstruction performance of the autoencoder for 41 variables was examined in the proposed model and the comparison models. Reconstruction performance differs by variable; reconstruction works well for the Memory, Disk, and Network modalities, with small loss values in all three autoencoder models. The Process modality showed no significant difference among the three models, and the CPU modality showed excellent performance in CMAE. ROC curves were prepared to evaluate the anomaly detection performance of the proposed model and the comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, and UAE. 
In particular, recall was 0.9828 for CMAE, indicating that it detects nearly all anomalies. Accuracy also improved, to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. From a practical standpoint, the proposed model offers advantages beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the dimensional increase they cause can slow down inference. The proposed model is easy to apply to practical tasks in terms of inference speed and model management.
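
The dimensionality argument in this abstract can be made concrete with a small calculation. The sketch below is illustrative only and does not reproduce the CMAE architecture: the window length of 6 is a hypothetical choice, and encoding the hour of day as a sine/cosine pair is an assumed way of feeding time as a condition, since the abstract does not state how time is encoded.

```python
import math

N_VARIABLES = 41  # number of monitored variables stated in the abstract


def sliding_window_dim(n_vars, window):
    """Input size when `window` consecutive samples are concatenated."""
    return n_vars * window


def time_condition(hour_of_day):
    """Encode the hour as a point on the unit circle (assumed encoding)."""
    angle = 2 * math.pi * hour_of_day / 24
    return (math.sin(angle), math.cos(angle))


def conditional_input_dim(n_vars):
    """Input size when one sample is paired with the 2-value time condition."""
    return n_vars + len(time_condition(0))


print(sliding_window_dim(N_VARIABLES, 6))  # 246 inputs with a 6-sample window
print(conditional_input_dim(N_VARIABLES))  # 43 inputs with a time condition
```

A cyclic encoding places 23:00 next to 00:00, so daily periodicity can be expressed with two extra inputs, whereas a sliding window multiplies the input size by the window length.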