Title/Summary/Keyword: EVALUATION


A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems, v.27 no.3, pp.57-73, 2021
  • Maintenance and failure prevention through anomaly detection of ICT infrastructure are becoming important. System monitoring data is multidimensional time series data, and handling it is difficult because the characteristics of multidimensional data and of time series data must both be considered. For multidimensional data, correlations between variables should be considered, but existing approaches such as probability- and linear-model-based methods and distance-based methods degrade due to the curse of dimensionality. Time series data, in turn, is typically preprocessed with sliding windows and time series decomposition for autocorrelation analysis, and these techniques further increase the dimensionality of the data, so they need to be supplemented. Anomaly detection is a long-standing research field: statistical methods and regression analysis were used in the early days, and machine learning and artificial neural network techniques are now being actively applied. Statistical methods are difficult to apply when data is non-homogeneous and do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and detect anomalies by comparing predicted and actual values; their performance drops when the model is not robust or when the data contains noise or outliers, so they are restricted to training data without noise or outliers. An autoencoder, an artificial neural network trained to reconstruct its input as closely as possible, has many advantages over existing probabilistic and linear models, cluster analysis, and supervised learning: it can be applied to data that does not satisfy probability-distribution or linearity assumptions, and it can be trained without labeled data. However, it is limited in identifying local outliers in multidimensional data, and the dimensionality of the data still increases greatly due to the time series preprocessing described above. In this study, we propose a Conditional Multimodal Autoencoder (CMAE) that improves anomaly detection performance by considering local outliers and time series characteristics. First, a Multimodal Autoencoder (MAE) is applied to mitigate the limitation in identifying local outliers of multidimensional data. Multimodal models are commonly used to learn different types of inputs, such as voice and image; the modalities share the autoencoder's bottleneck and thereby learn the correlations between them. In addition, a Conditional Autoencoder (CAE) is used to learn the characteristics of time series data effectively without increasing the dimensionality of the data. Conditional inputs are usually categorical variables, but in this study time is used as the condition so that periodicity can be learned (a minimal sketch of this architecture follows the abstract). The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). Reconstruction performance for 41 variables was examined for the proposed and comparison models. Reconstruction quality differs by variable: the loss values for the Memory, Disk, and Network modalities are small in all three autoencoder models, so reconstruction works well there. The Process modality showed no significant difference across the three models, while the CPU modality showed the best performance in CMAE. ROC curves were prepared to evaluate anomaly detection performance, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, UAE. In particular, the recall of CMAE was 0.9828, confirming that it detects almost all anomalies. Accuracy also improved, to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. In practical terms, the proposed model has an advantage beyond the performance improvement: techniques such as time series decomposition and sliding windows add procedures that must be managed, and the dimensional increase they cause slows inference. The proposed model avoids both, making it easy to apply in practice with respect to inference speed and model management.
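As an illustration of the architecture described above, here is a minimal sketch of a conditional multimodal autoencoder in PyTorch. The modality sizes, layer widths, and sinusoidal time encoding are illustrative assumptions; the paper does not specify its exact configuration.

```python
# A minimal sketch of a Conditional Multimodal Autoencoder (CMAE), assuming
# PyTorch. The modality split of the 41 variables, layer widths, and time
# encoding below are hypothetical choices, not the paper's settings.
import math
import torch
import torch.nn as nn

class CMAE(nn.Module):
    def __init__(self, modal_dims, cond_dim, latent_dim=8):
        super().__init__()
        # One encoder per modality (e.g. CPU, Memory, Disk, Network, Process).
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 16), nn.ReLU()) for d in modal_dims
        )
        # Shared bottleneck: concatenated modal features + time condition.
        total = 16 * len(modal_dims) + cond_dim
        self.bottleneck = nn.Sequential(nn.Linear(total, latent_dim), nn.ReLU())
        # One decoder per modality, each also conditioned on time.
        self.decoders = nn.ModuleList(
            nn.Linear(latent_dim + cond_dim, d) for d in modal_dims
        )

    def forward(self, modals, cond):
        feats = [enc(x) for enc, x in zip(self.encoders, modals)]
        z = self.bottleneck(torch.cat(feats + [cond], dim=1))
        return [dec(torch.cat([z, cond], dim=1)) for dec in self.decoders]

def anomaly_score(model, modals, cond):
    # Anomaly score = total reconstruction error across modalities.
    recons = model(modals, cond)
    return sum(((r - x) ** 2).mean(dim=1) for r, x in zip(recons, modals))

# Usage: 5 modalities summing to 41 variables; time fed as sin/cos of the
# hour of day, one plausible way to give the model periodicity.
model = CMAE(modal_dims=[9, 8, 8, 8, 8], cond_dim=2)
modals = [torch.randn(4, d) for d in [9, 8, 8, 8, 8]]
hour = torch.rand(4, 1) * 24
cond = torch.cat([torch.sin(2 * math.pi * hour / 24),
                  torch.cos(2 * math.pi * hour / 24)], dim=1)
score = anomaly_score(model, modals, cond)  # higher = more anomalous
```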

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems, v.27 no.3, pp.139-156, 2021
  • The biggest reason for using a deep learning model in image classification is that it can consider the relationships between regions by extracting each region's features from the overall information of the image. However, a CNN model may not be suitable for emotional image data that lacks distinctive regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures tailored to emotion images every year. Studies on the relationship between color and human emotion have also been conducted, showing that different emotions are induced by different colors, and several deep learning studies have applied color information to image sentiment classification. Training with the image's color information in addition to the image itself improves the accuracy of classifying image emotions. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion; both modify the result value based on statistics over the colors of the picture (the color-extraction step is sketched after the abstract). In the first method, the two-color combinations most prevalent across all training data were found before training; at test time, the most prevalent two-color combination was found for each test image, and the result values were corrected according to the color-combination distribution. The second method weights the result value obtained after classification using an expression built from logarithmic and exponential functions. Emotion6, labeled with six emotions, and ArtPhoto, labeled with eight categories, were used as image data. DenseNet169, MnasNet, ResNet101, ResNet152, and VGG19 architectures were used as CNN models, and performance was compared before and after applying the two-stage learning. Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve accuracy by modifying the result values based on color. Sixteen colors, each carrying its own meaning in color psychology, were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black. Using scikit-learn's K-means clustering, the seven colors most prevalent in each image were extracted; the RGB coordinates of these colors were then compared with the RGB coordinates of the 16 reference colors, and each was converted to the closest reference color. If combinations of three or more colors are selected, too many distinct combinations occur and the distribution becomes scattered, so each combination has less influence on the result value; to avoid this, two-color combinations were used and weighted into the model. Before training, the most prevalent color combinations were found for all training images, and the distribution of color combinations per class was stored as a Python dictionary for use during testing. During testing, the most prevalent two-color combination is found for each test image, its distribution in the training data is looked up, and the result is corrected. Several equations were devised to weight the model's result value based on the extracted colors as described above.
The dataset was randomly split 80:20, and 20% of the data was held out as a test set. The remaining 80% was split into five folds for 5-fold cross-validation, so the model was trained five times with different validation sets, and final performance was checked on the held-out test set. Adam was used as the optimizer, with a learning rate of 0.01. Training ran for up to 20 epochs and was stopped early if the validation loss did not decrease for five consecutive epochs; early stopping was set to load the model with the best validation loss. Classification accuracy was better when the extracted color information was used together with the CNN than when the CNN architecture was used alone.
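The color-extraction step lends itself to a short sketch. The following is a minimal illustration, assuming scikit-learn K-means with seven clusters as described; the exact RGB coordinates of the 16 reference colors and the correction equations are not given in the abstract, so those parts are hypothetical.

```python
# A minimal sketch of the K-means RGB color-feature step: cluster pixels
# into 7 dominant colors, snap each to the nearest reference color, and
# return the two most dominant reference colors as an order-free pair.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical RGB coordinates for a subset of the 16 reference colors.
REFERENCE = {
    "red": (255, 0, 0), "orange": (255, 165, 0), "yellow": (255, 255, 0),
    "green": (0, 128, 0), "blue": (0, 0, 255), "white": (255, 255, 255),
    "black": (0, 0, 0), "gray": (128, 128, 128),
}

def dominant_color_pair(image_rgb: np.ndarray) -> tuple:
    pixels = image_rgb.reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=7, n_init=10, random_state=0).fit(pixels)
    names = list(REFERENCE)
    ref = np.array(list(REFERENCE.values()), dtype=float)
    counts = {}
    sizes = np.bincount(km.labels_, minlength=7)
    for centroid, size in zip(km.cluster_centers_, sizes):
        nearest = names[np.argmin(np.linalg.norm(ref - centroid, axis=1))]
        counts[nearest] = counts.get(nearest, 0) + int(size)
    top_two = sorted(counts, key=counts.get, reverse=True)[:2]
    return tuple(sorted(top_two))  # e.g. ("blue", "gray")

# Usage: the pair indexes a per-class distribution (stored in a dict during
# training) that is then used to reweight the CNN's class scores.
img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
print(dominant_color_pair(img))
```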

The Evaluation of Non-Coplanar Volumetric Modulated Arc Therapy for Brain stereotactic radiosurgery (뇌 정위적 방사선수술 시 Non-Coplanar Volumetric Modulated Arc Therapy의 유용성 평가)

  • Lee, Doo Sang;Kang, Hyo Seok;Choi, Byoung Joon;Park, Sang Jun;Jung, Da Ee;Lee, Geon Ho;Ahn, Min Woo;Jeon, Myeong Soo
    • The Journal of Korean Society for Radiation Therapy, v.30 no.1_2, pp.9-16, 2018
  • Purpose : Brain stereotactic radiosurgery can treat, non-invasively, diseases whose surgical treatment carries a high rate of complications. However, because it uses radiation, it may be accompanied by radiation-induced side effects, as in fractionated radiation therapy. The effects of Coplanar Volumetric Modulated Arc Therapy (C-VMAT) and Non-Coplanar Volumetric Modulated Arc Therapy (NC-VMAT) on surrounding normal tissue have been analyzed for fractionated radiation therapy of sites such as the head and neck in order to reduce such side effects, but they have not been analyzed for brain stereotactic radiosurgery. In this study, we evaluated the usefulness of NC-VMAT by comparing C-VMAT and NC-VMAT in patients who underwent brain stereotactic radiosurgery. Methods and materials : Thirteen treatment plans for brain stereotactic radiosurgery were established with both C-VMAT and NC-VMAT. The planning target volume ranged from 0.78 cc to 12.26 cc, and prescription doses were between 15 and 24 Gy. The treatment machine was a TrueBeam STx (Varian Medical Systems, USA), and the energy used was 6 MV flattening filter free (6FFF) X-rays. The C-VMAT plans used two half arcs or two full arcs, while the NC-VMAT plans used 3 to 7 arcs of 40 to 190 degrees, with the couch planned at 3 to 7 angles. Results : The mean maximum dose was 105.1 ± 1.37% in C-VMAT and 105.8 ± 1.71% in NC-VMAT. The conformity index of C-VMAT was 1.08 ± 0.08 and its homogeneity index 1.03 ± 0.01; the conformity index of NC-VMAT was 1.17 ± 0.1 and its homogeneity index 1.04 ± 0.01 (the two indices are sketched after the abstract). V2, V8, V12, V18, and V24 of the brain were 176 ± 149.36 cc, 31.50 ± 25.03 cc, 16.53 ± 12.63 cc, 8.60 ± 6.87 cc, and 4.03 ± 3.43 cc in C-VMAT, and 135.55 ± 115.93 cc, 24.34 ± 17.68 cc, 14.74 ± 10.97 cc, 8.55 ± 6.79 cc, and 4.23 ± 3.48 cc in NC-VMAT. Conclusions : The maximum dose, conformity index, and homogeneity index showed no significant difference between C-VMAT and NC-VMAT. V2 to V18 of the brain differed by 0.5% to 48%, and V19 to V24 by 0.4% to 4.8%. Comparing the mean V12, the level at which radionecrosis begins to develop, NC-VMAT was about 12.2% lower than C-VMAT. These results suggest that using NC-VMAT can reduce the volumes from V2 to V18, which in turn can reduce radionecrosis.
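The two plan-quality metrics compared above can be stated in a few lines. This is a minimal sketch using the common RTOG-style definitions (conformity index = prescription isodose volume / target volume; homogeneity index = maximum dose / prescription dose); the paper does not state which exact definitions it used, so these formulas are an assumption.

```python
# Hedged sketch of conformity and homogeneity indices (RTOG-style).
def conformity_index(prescription_isodose_volume_cc: float,
                     target_volume_cc: float) -> float:
    # >1 means the prescription isodose spills beyond the target.
    return prescription_isodose_volume_cc / target_volume_cc

def homogeneity_index(max_dose_gy: float, prescription_dose_gy: float) -> float:
    # Closer to 1 means a more homogeneous dose within the target.
    return max_dose_gy / prescription_dose_gy

# Example on the reported scale: CI ~ 1.08, HI ~ 1.03.
print(conformity_index(10.8, 10.0))   # 1.08
print(homogeneity_index(20.6, 20.0))  # 1.03
```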


The Application of 3D Bolus with Neck in the Treatment of Hypopharynx Cancer in VMAT (Hypopharynx Cancer의 VMAT 치료 시 Neck 3D Bolus 적용에 대한 유용성 평가)

  • An, Ye Chan;Kim, Jin Man;Kim, Chan Yang;Kim, Jong Sik;Park, Yong Chul
    • The Journal of Korean Society for Radiation Therapy, v.32, pp.41-52, 2020
  • Purpose: To evaluate clinical applicability, we investigated the dosimetric usefulness, setup reproducibility, and efficiency of applying a 3D bolus by comparing two treatment plans in which a commercial bolus (CB) and a 3D bolus produced by 3D printing were each applied to the neck during VMAT treatment of hypopharynx cancer. Materials and Methods: Based on the CT image of the RANDO phantom with the CB applied, a 3D bolus of the same shape was fabricated. The 3D bolus was printed in a polyurethane acrylate resin with a density of 1.2 g/㎤ using the SLA technique on an OMG SLA 660 printer with Materialise Magics software. Based on the two CT image sets using the CB and the 3D bolus, treatment plans were established assuming VMAT treatment of hypopharynx cancer. CBCT images were obtained 18 times for each of the two treatment plans, and treatment efficiency was evaluated by measuring the setup time each session. Based on the acquired CBCT images, adaptive plans were generated in the Pinnacle treatment planning system to evaluate target and normal-organ doses and changes in bolus volume. Results: The setup time was reduced by an average of 28 sec in the 3D bolus plan compared with the CB plan. Over the treatment course, the bolus volume was 86.1 ± 2.70 ㎤ against 83.9 ㎤ in the CB initial plan, and 99.8 ± 0.46 ㎤ against 92.2 ㎤ in the 3D bolus initial plan. The CTV minimum dose was 167.4 ± 19.38 cGy against 191.6 cGy in the CB initial plan and 149.5 ± 18.27 cGy against 167.3 cGy in the 3D bolus initial plan; the CTV mean dose was 228.3 ± 0.38 cGy against 227.1 cGy (CB) and 227.7 ± 0.30 cGy against 225.9 cGy (3D bolus). The PTV minimum dose was 74.9 ± 19.47 cGy against 128.5 cGy (CB) and 83.2 ± 12.92 cGy against 139.9 cGy (3D bolus); the PTV mean dose was 226.2 ± 0.83 cGy against 225.4 cGy (CB) and 225.8 ± 0.33 cGy against 224.1 cGy (3D bolus). The maximum dose to the spinal cord, the organ at risk, averaged the same 135.6 cGy in every session. Conclusion: The results show that applying a 3D bolus to an irregular body surface is dosimetrically more useful than applying a commercial bolus, with excellent setup reproducibility and efficiency. If further case studies and research on the diversity of 3D printing materials are conducted, the application of 3D boluses in radiation therapy is expected to proceed more actively.

A Study on the Effect of Network Centralities on Recommendation Performance (네트워크 중심성 척도가 추천 성능에 미치는 영향에 대한 연구)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems, v.27 no.1, pp.23-46, 2021
  • Collaborative filtering, often used in personalized recommendation, is recognized as a very useful technique for finding similar customers and recommending products to them based on their purchase history. However, the traditional collaborative filtering technique has difficulty calculating similarity for new customers or products, because similarities are calculated from direct connections and common features among customers. For this reason, hybrid techniques have been designed that additionally use content-based filtering. Another line of work tries to solve these problems by applying the structural characteristics of social networks, calculating similarities indirectly through other customers placed between the two customers of interest. This means creating a customer network based on purchase data and computing the similarity between two customers from the features of the network that indirectly connects them. Such similarity can be used as a measure to predict whether the target customer will accept a recommendation, and the centrality metrics of the network can be utilized to calculate it. Different centrality metrics are important in that they may affect recommendation performance differently; furthermore, in this study, the effect of these centrality metrics on recommendation performance may vary depending on the recommender algorithm. In addition, recommendation techniques using network analysis can be expected to increase recommendation performance not only for new customers or products but across all customers and products. By treating a customer's purchase of an item as a link between the customer and the item on the network, predicting user acceptance of a recommendation becomes predicting whether a new link will be created. Since classification models fit this binary link-prediction problem, decision tree, k-nearest neighbors (KNN), logistic regression, artificial neural network, and support vector machine (SVM) models were selected for the research. The performance evaluation used order data collected from an online shopping mall over four years and two months: the first three years and eight months of records were used to construct the social network, and the following four months of records were used to train and evaluate the recommender models. Experiments applying the centrality metrics to each model show that the recommendation acceptance rates of the centrality metrics differ by algorithm at a meaningful level. This work analyzed the four most commonly used centrality metrics: degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality (computed as in the sketch following the abstract). Eigenvector centrality recorded the lowest performance in all models except the support vector machine. Closeness centrality and betweenness centrality showed similar performance across all models. Degree centrality ranked in the middle across models, while betweenness centrality always ranked higher than degree centrality. Finally, closeness centrality was characterized by distinct performance differences according to the model: it ranked first, with numerically high performance, in logistic regression, artificial neural network, and decision tree, but recorded very low rankings, with low performance, in the support vector machine and k-nearest neighbors. As the experimental results reveal, in a classification model, centrality metrics over the subnetwork connecting two nodes can effectively predict the connectivity between those nodes in a social network, and each metric performs differently depending on the classification model. This implies that choosing appropriate metrics for each algorithm can lead to higher recommendation performance. In general, betweenness centrality can guarantee a high level of performance in any model, and introducing closeness centrality could be considered to obtain higher performance for certain models.
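The four centrality metrics are standard and easy to compute with networkx. The following is a minimal sketch on a toy customer-item purchase graph; the paper's exact feature design for the link-prediction classifiers is not reproduced here.

```python
# Minimal sketch: the four centrality metrics from the study, computed on a
# toy customer-item graph (customers c*, purchased items i*).
import networkx as nx

G = nx.Graph([("c1", "i1"), ("c1", "i2"), ("c2", "i2"),
              ("c2", "i3"), ("c3", "i3"), ("c3", "i1")])

metrics = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
}

# A candidate link (customer, item) can then be described by the
# centralities of its two endpoints and fed to a binary classifier
# (decision tree, KNN, logistic regression, ANN, SVM).
for name, values in metrics.items():
    print(name, round(values["c1"], 3), round(values["i2"], 3))
```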

Evaluation of the Usefulness of Exactrac in Image-guided Radiation Therapy for Head and Neck Cancer (두경부암의 영상유도방사선치료에서 ExacTrac의 유용성 평가)

  • Baek, Min Gyu;Kim, Min Woo;Ha, Se Min;Chae, Jong Pyo;Jo, Guang Sub;Lee, Sang Bong
    • The Journal of Korean Society for Radiation Therapy, v.32, pp.7-15, 2020
  • Purpose: In modern radiotherapy, several image guided radiation therapy (IGRT) methods are used to deliver accurate doses to tumor targets and normal organs, including CBCT (Cone Beam Computed Tomography) mounted on the linear accelerator and separate devices such as the ExacTrac System. Previous studies comparing the two systems analyzed positional errors retrospectively using offline review, or evaluated only translations along the X, Y, and Z axes plus a yaw rotation. In this study, for 6 degree-of-freedom (DoF) online IGRT in a treatment center equipped with both systems, we evaluate the difference between the setup correction values reported by each system, the time taken for patient setup, and the imaging radiation dose of each device. Materials and Methods: Eleven head and neck cancer patients treated from March to October 2017 were analyzed to assess the difference in positional corrections between the two systems and the time taken from setup to just before IGRT; glass dosimeters and a Rando phantom were used to measure the imaging exposure dose. CBCT and ExacTrac were used for IGRT of all patients. An average of 10 CBCT and ExacTrac images were obtained per patient over the treatment course, and the difference in 6D online automatic matching values between the two systems was calculated within the ROI setting; the region of interest in the CBCT image was fixed to the same anatomical structure as in the ExacTrac image. The differences in positional values on the six axes (translation group: SI, AP, LR; rotation group: pitch, roll, Rtn), the total time from patient setup to just before IGRT, and the exposure dose were measured and compared using the Rando phantom. Results: The setup error in phantom and patients was less than 1 mm in the translation group and less than 1.5° in the rotation group, and the RMS values of all axes except Rtn were less than 1 mm and 1° (an RMS summary of this kind is sketched after the abstract). The time taken to correct the setup error was on average 256 ± 47.6 sec for IGRT using CBCT and 84 ± 3.5 sec for ExacTrac. Of the seven measurement sites in the head and neck region, the imaging dose per treatment at the oral mucosa was 2.468 mGy for CBCT and 0.066 mGy for ExacTrac, i.e., about 37 times higher for CBCT. Conclusion: Through 6D online automatic positioning between the CBCT and ExacTrac systems, the setup error was found to be less than 1 mm and 1.02°, including patient movement (random error) as well as the systematic error of the two systems. This error range is considered reasonable given that the PTV margin was 3 mm for the head and neck IMRT treatments in this study. However, considering changes in the target and organs at risk due to changes in patient weight during the treatment period, ExacTrac is considered appropriate when used in combination with CBCT.
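The per-axis RMS summary used in the results can be illustrated in a few lines. This is a minimal sketch of how repeated 6-DoF corrections (translations in mm, rotations in degrees) are reduced to per-axis RMS values; the numbers are placeholders, not the study's data.

```python
# Sketch: per-axis RMS of repeated setup corrections.
import numpy as np

# Rows = imaging sessions, columns = (SI, AP, LR, pitch, roll, rtn).
corrections = np.array([
    [0.4, -0.6, 0.2, 0.3, -0.1, 0.8],
    [-0.5, 0.3, -0.4, -0.2, 0.4, -0.6],
    [0.2, 0.5, 0.1, 0.1, -0.3, 0.9],
])

rms = np.sqrt((corrections ** 2).mean(axis=0))
for axis, value in zip(["SI", "AP", "LR", "pitch", "roll", "rtn"], rms):
    print(f"{axis}: {value:.2f}")
```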

Edge to Edge Model and Delay Performance Evaluation for Autonomous Driving (자율 주행을 위한 Edge to Edge 모델 및 지연 성능 평가)

  • Cho, Moon Ki;Bae, Kyoung Yul
    • Journal of Intelligence and Information Systems, v.27 no.1, pp.191-207, 2021
  • Mobile communications have evolved rapidly over the decades, from 2G to 5G, mainly focusing on speed to meet growing data demands. With the start of the 5G era, efforts are being made to provide services such as IoT, V2X, robots, artificial intelligence, augmented and virtual reality, and smart cities, which are expected to change our living and industrial environments as a whole. To deliver those services, reduced latency and reliability on top of high data speed are critical for real-time services. Accordingly, 5G has paved the way for service delivery with a maximum speed of 20 Gbps, a delay of 1 ms, and 10⁶ connected devices per ㎢. In particular, in intelligent traffic control systems and services using vehicle-based Vehicle-to-X (V2X) communication, such as traffic control, reduction of delay and reliability for real-time services are very important in addition to high data speed. 5G communication uses high frequencies of 3.5 GHz and 28 GHz. These high-frequency waves support high speeds thanks to their straight-line propagation, but their short wavelength and small diffraction angle limit their reach and prevent them from penetrating walls, restricting indoor use. It is therefore difficult to overcome these constraints under existing networks. The underlying centralized SDN also has limited capability for delay-sensitive services, because communication with many nodes overloads its processing. SDN, an architecture that separates control plane signaling from data plane packets, must control the delay-related tree structure available in an emergency during autonomous driving. In such scenarios, the network architecture that handles in-vehicle information is a major variable in delay. Since SDNs with a conventional centralized structure have difficulty meeting the desired delay level, studies on the optimal size of an SDN for information processing are needed: SDNs should be partitioned at a certain scale into a new type of network that can respond efficiently to dynamically changing traffic and provide high-quality, flexible services. The structure of such networks is closely related to ultra-low latency, high reliability, and hyper-connectivity, and should be based on a new form of split SDN rather than the existing centralized structure, even under worst-case conditions. In these SDN-structured networks, where automobiles pass through small 5G cells very quickly, the information update cycle, the round trip delay (RTD), and the data processing time of the SDN are highly correlated with the overall delay. Of these, RTD is not a significant factor because the link is fast enough and contributes less than 1 ms, but the information update cycle and the SDN's data processing time greatly affect the delay. Especially in an emergency in a self-driving environment linked to an ITS (Intelligent Traffic System), which requires low latency and high reliability, information must be transmitted and processed very quickly; this is a case in point where delay plays a very sensitive role. In this paper, we study the SDN architecture for emergencies during autonomous driving and analyze, through simulation, the correlation with the cell layer from which the vehicle should request relevant information according to the information flow.
For the simulation, since the 5G data rate is high enough, we assume that the information supporting neighboring vehicles reaches the car without errors. Furthermore, we assumed 5G small cells with radii of 50 to 250 m and vehicle speeds of 30 to 200 km/h in order to examine the network architecture that minimizes delay (a back-of-the-envelope dwell-time check for this regime is sketched after the abstract).
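The simulated regime can be bounded with a quick calculation: how long a vehicle dwells in a small cell for the assumed radii and speeds, which constrains how often cell-level information must be refreshed. This is an illustrative back-of-the-envelope sketch, not the paper's simulation model.

```python
# Sketch: worst-case cell dwell time for the assumed 5G small-cell regime.
def dwell_time_s(cell_radius_m: float, speed_kmh: float) -> float:
    """Straight pass through the cell: diameter / speed."""
    speed_ms = speed_kmh / 3.6
    return 2 * cell_radius_m / speed_ms

for radius in (50, 250):      # assumed cell radii, m
    for speed in (30, 200):   # assumed vehicle speeds, km/h
        t = dwell_time_s(radius, speed)
        print(f"r={radius} m, v={speed} km/h -> dwell {t:.1f} s")
```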

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems, v.27 no.1, pp.83-102, 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, providing the public a great opportunity through disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea and is strongly committed to backing export companies with various systems; nevertheless, there are still few cases of realized business models based on big-data analysis. In this situation, this paper aims to develop a new business model for ex-ante prediction of the likelihood of a credit guarantee insurance accident. We utilize internal data from KSURE, which supports export companies in Korea, apply machine learning models, and compare performance among the predictive models: Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), based on multiple discriminant analysis and still widely used in both research and practice; it uses five key financial ratios to predict the probability of bankruptcy within the next two years. Ohlson (1980) introduced the logit model to complement some limitations of previous models, and Elmer and Borowski (1988) developed and examined a rule-based, automated system for the financial analysis of savings and loans. Since the 1980s, researchers in Korea have also examined the prediction of financial distress or bankruptcy: Kim (1987) analyzed financial ratios and developed a prediction model; Han et al. (1995, 1996, 1997, 2003, 2005, 2006) constructed prediction models using various techniques including artificial neural networks; Yang (1996) introduced multiple discriminant analysis and the logit model; and Kim and Kim (2001) utilized artificial neural networks for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely with models such as Random Forest or SVM. One major distinction of our research from previous work is that we focus on the predicted probability of default for each sample case, not only on the classification accuracy of each model over the entire sample. Most predictive models in this paper show classification accuracy of about 70% on the entire sample; specifically, the LightGBM model shows the highest accuracy of 71.1% and the logit model the lowest of 69%. However, these results are open to multiple interpretations. In the business context, more emphasis must be placed on minimizing type 2 errors, which cause the more harmful operating losses for the guarantee company. We therefore also compare classification accuracy by splitting the predicted probability of default into ten equal intervals (as in the sketch following the abstract).
Examining the classification accuracy for each interval, the logit model has the highest accuracy, 100%, for the 0-10% interval of predicted default probability, but a relatively low accuracy of 61.5% for the 90-100% interval. On the other hand, Random Forest, XGBoost, LightGBM, and DNN show more desirable results, with higher accuracy for both the 0-10% and 90-100% intervals but lower accuracy around 50% predicted probability. Regarding the distribution of samples across intervals, both the LightGBM and XGBoost models place a relatively large number of samples in the 0-10% and 90-100% intervals. Although the Random Forest model has an advantage in classification accuracy on a small number of cases, LightGBM or XGBoost could be more desirable models since they classify a large number of cases into the two extreme intervals, even allowing for their relatively lower classification accuracy there. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance, followed by Random Forest and LightGBM, while logistic regression performs worst. However, each predictive model has a comparative advantage under different evaluation standards; for instance, the Random Forest model shows almost 100% accuracy for samples expected to have a high probability of default. Collectively, a more comprehensive ensemble that contains multiple classification models and conducts majority voting could be constructed to maximize overall performance.
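The interval-wise evaluation described above is straightforward to reproduce. The following is a minimal sketch that splits predicted default probabilities into ten equal bins and reports accuracy and sample count per bin; the data here is synthetic for illustration.

```python
# Sketch: decile-wise classification accuracy over predicted probabilities.
import numpy as np

def accuracy_by_decile(y_true: np.ndarray, y_prob: np.ndarray) -> None:
    bins = np.clip((y_prob * 10).astype(int), 0, 9)  # 0-10%, ..., 90-100%
    for b in range(10):
        mask = bins == b
        if not mask.any():
            continue
        pred = (y_prob[mask] >= 0.5).astype(int)
        acc = (pred == y_true[mask]).mean()
        print(f"{b*10:3d}-{(b+1)*10}%: n={mask.sum():4d}, accuracy={acc:.3f}")

# Synthetic example: labels drawn consistently with the probabilities, so
# accuracy is high at the extremes and low near 50%, as in the study.
rng = np.random.default_rng(0)
y_prob = rng.uniform(size=1000)
y_true = (rng.uniform(size=1000) < y_prob).astype(int)
accuracy_by_decile(y_true, y_prob)
```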

Comparative analysis of Glomerular Filtration Rate measurement and estimated glomerular filtration rate using 99mTc-DTPA in kidney transplant donors (신장이식 공여자에서 99mTc-DTPA를 이용한 Glomerular Filtration Rate 측정과 추정사구체여과율의 비교분석)

  • Cheon, Jun Hong;Yoo, Nam Ho;Lee, Sun Ho
    • The Korean Journal of Nuclear Medicine Technology, v.25 no.2, pp.35-40, 2021
  • Purpose Glomerular filtration rate (GFR) is an important indicator for the diagnosis, treatment, and follow-up of kidney disease, and is also used in healthy individuals for drug dosing and for evaluating kidney function in donors. The gold-standard GFR test measures continuously infused inulin, an exogenous marker, but it takes a long time and the procedure is complicated, so in practice the serum creatinine concentration is measured and the estimated glomerular filtration rate (eGFR) is used instead. However, creatinine is known to be affected by age, gender, muscle mass, and other factors. The eGFR formulas currently in use include the Cockcroft-Gault formula, the Modification of Diet in Renal Disease (MDRD) formula, and the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) formula for adults; for children, the Schwartz formula is used. Measurement of GFR using 51Cr-EDTA (ethylenediaminetetraacetic acid) or 99mTc-DTPA (diethylenetriaminepentaacetic acid) can replace inulin and is currently in use. We therefore compared the GFR measured using 99mTc-DTPA with the eGFR from the CKD-EPI formula (a sketch of the formula follows the abstract). Materials and Methods 200 kidney transplant donors who visited Asan Medical Center were studied (96 males, 104 females; mean age 47.3 ± 12.7 years). GFR was measured from plasma samples (two-plasma-sample method, TPSM) obtained after intravenous administration of 99mTc-DTPA (0.5 mCi, 18.5 MBq). eGFR was derived using the CKD-EPI formula based on serum creatinine concentration. Results The mean GFR measured using 99mTc-DTPA for the 200 kidney transplant donors was 97.27 ± 19.46 ml/min/1.73m², the mean eGFR using the CKD-EPI formula was 96.84 ± 17.74 ml/min/1.73m², and the serum creatinine concentration was 0.84 ± 0.39 mg/dL. The regression of 99mTc-DTPA GFR on serum creatinine-based eGFR was Y = 0.5073X + 48.186, with a correlation coefficient of 0.698 (P<0.01); the mean difference was 1.52 ± 18.28%. Conclusion The correlation between 99mTc-DTPA GFR and creatinine-based eGFR was confirmed to be moderate, presumably because eGFR is affected by external factors such as age, gender, and muscle mass and uses formulas derived for kidney disease patients. Measuring GFR with 99mTc-DTPA can provide reliable results for the diagnosis, treatment, and follow-up of kidney disease and for kidney evaluation of kidney transplant patients.
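The CKD-EPI equation used for eGFR in the study can be written out directly. This is a minimal sketch of the 2009 creatinine-based CKD-EPI equation; the race coefficient is included as in the original 2009 formula, though the abstract does not say which variant was applied, so that choice is an assumption.

```python
# Sketch: CKD-EPI (2009) creatinine equation for eGFR.
def ckd_epi_egfr(scr_mg_dl: float, age: int, female: bool,
                 black: bool = False) -> float:
    """eGFR in mL/min/1.73 m^2 from serum creatinine (mg/dL)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    egfr = (141
            * min(scr_mg_dl / kappa, 1.0) ** alpha
            * max(scr_mg_dl / kappa, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr

# Example near the cohort averages (Scr 0.84 mg/dL, age 47).
print(round(ckd_epi_egfr(0.84, age=47, female=True), 1))
```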

Evaluation of Ovary Dose of Childbearing age Woman with Breast cancer in Radiation therapy (가임기 여성의 방사선 치료 시 난소 선량 평가)

  • Park, Sung Jun;Lee, Yeong Cheol;Kim, Seon Myeong;Kim, Young Bum
    • The Journal of Korean Society for Radiation Therapy, v.33, pp.145-153, 2021
  • Purpose: The purpose of this study is to evaluate the ovarian dose during radiation therapy for breast cancer in women of childbearing age. The ovarian dose is evaluated by comparing the dose calculated in the treatment planning system for each treatment technique with the dose measured using thermoluminescent dosimeters (TLDs), and the clinical usefulness of a lead (Pb) apron is investigated through dose analysis according to whether or not it is used. Materials and Methods: A Rando humanoid phantom was used for measurement, and wedge-filter radiation therapy, 3D conformal radiation therapy, and intensity modulated radiation therapy were used as treatment techniques. A treatment plan was established so that 95% of the prescribed dose could be delivered to the right breast of the 3D image of the Rando humanoid phantom obtained with the CT simulator. TLDs were inserted at the surface and at depth in the virtual ovary of the Rando humanoid phantom, and the phantom was irradiated. Measurements were made at a total of nine points: the treatment center; a point moved 2 cm from the center toward the opposite breast; points 5 cm, 10 cm, 12.5 cm, 15 cm, 17.5 cm, and 20 cm inferior to the boundary of the right breast; and the surface and depth of the right ovary. For the dose comparison in the treatment planning system, plans for the two wedge-filter techniques, three-dimensional conformal radiotherapy, and intensity-modulated radiation therapy were established and compared, and dose measurements with and without a lead apron were compared and analyzed for intensity-modulated radiation therapy. The measured value at each point was calculated by averaging three TLD readings and converting them using the TLD calibration value, giving the mean point dose. To compare the treatment plan values with the measured values, absolute doses were compared at each point as a percent difference (%Diff; the calculation is sketched after the abstract). Results: At Point A, the treatment center, a maximum of 201.7 cGy was obtained in the treatment planning system and a maximum of 200.6 cGy with the TLDs. All treatment plans calculated 0 cGy at Point G, the point 17.5 cm inferior to the breast boundary. With the TLDs, a maximum of 2.6 cGy was measured at Point G and a maximum of 0.9 cGy at Point J, the ovarian dose, with absolute dose differences of 0.3%~1.3%. The difference in dose according to the use of the lead apron ranged from a maximum of 2.1 cGy to a minimum of 0.1 cGy, with %Diff values of 0.1%~1.1%. Conclusion: In the treatment planning system, the dose differences among the three treatment plans were not significant, ranging from 0.85% to 2.45%. For the ovary, the difference between the Rando humanoid phantom's treatment planning system values and the measured doses was within 0.9%, with the measured dose slightly higher. This is thought to be because the treatment planning system does not accurately reflect scattered radiation, while the measurements include the scattered dose and the dose from CBCT taken with the TLDs inserted. In the dosimetry with and without the lead apron, when an apron was used, the shielding was more effective the closer the point was to the treatment field.
Although pregnancy or artificial insemination during radiotherapy is not clinically appropriate, the dose delivered to the ovaries during treatment is not expected to significantly affect the reproductive function of women of childbearing age after radiotherapy. Since women of childbearing age nevertheless experience constant anxiety, presenting the data from this study may promote psychological stability.
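The point-dose comparison described above reduces to two small formulas. This is a minimal sketch of averaging three TLD readings, applying a calibration factor, and reporting the percent difference (%Diff) against the treatment planning system value; the numbers are placeholders, not the study's measurements.

```python
# Sketch: TLD point-dose evaluation against the treatment planning system.
def measured_dose(tld_readings, calibration_factor: float) -> float:
    # Mean of three TLD readings, converted with the calibration value.
    return sum(tld_readings) / len(tld_readings) * calibration_factor

def percent_diff(planned_cgy: float, measured_cgy: float) -> float:
    return abs(planned_cgy - measured_cgy) / planned_cgy * 100

measured = measured_dose([198.2, 201.5, 199.9], calibration_factor=1.003)
print(f"{percent_diff(201.7, measured):.2f} %Diff")  # vs. planned 201.7 cGy
```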