• Title/Summary/Keyword: 부스팅 알고리즘

Search Result 54, Processing Time 0.019 seconds

A Study on the Prediction of Disc Cutter Wear Using TBM Data and Machine Learning Algorithm (TBM 데이터와 머신러닝 기법을 이용한 디스크 커터마모 예측에 관한 연구)

  • Tae-Ho, Kang;Soon-Wook, Choi;Chulho, Lee;Soo-Ho, Chang
    • Tunnel and Underground Space
    • /
    • v.32 no.6
    • /
    • pp.502-517
    • /
    • 2022
  • As the use of TBM increases, research has recently increased to to analyze TBM data with machine learning techniques to predict the exchange cycle of disc cutters, and predict the advance rate of TBM. In this study, a regression prediction of disc cutte wear of slurry shield TBM site was made by combining machine learning based on the machine data and the geotechnical data obtained during the excavation. The data were divided into 7:3 for training and testing the prediction of disc cutter wear, and the hyper-parameters are optimized by cross-validated grid-search over a parameter grid. As a result, gradient boosting based on the ensemble model showed good performance with a determination coefficient of 0.852 and a root-mean-square-error of 3.111 and especially excellent results in fit times along with learning performance. Based on the results, it is judged that the suitability of the prediction model using data including mechanical data and geotechnical information is high. In addition, research is needed to increase the diversity of ground conditions and the amount of disc cutter data.

The Prediction of Survival of Breast Cancer Patients Based on Machine Learning Using Health Insurance Claim Data (건강보험 청구 데이터를 활용한 머신러닝 기반유방암 환자의 생존 여부 예측)

  • Doeggyu Lee;Kyungkeun Byun;Hyungdong Lee;Sunhee Shin
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.28 no.2
    • /
    • pp.1-9
    • /
    • 2023
  • Research using AI and big data is also being actively conducted in the health and medical fields such as disease diagnosis and treatment. Most of the existing research data used cohort data from research institutes or some patient data. In this paper, the difference in the prediction rate of survival and the factors affecting survival between breast cancer patients in their 40~50s and other age groups was revealed using health insurance review claim data held by the HIRA. As a result, the accuracy of predicting patients' survival was 0.93 on average in their 40~50s, higher than 0.86 in their 60~80s. In terms of that factor, the number of treatments was high for those in their 40~50s, and age was high for those in their 60~80s. Performance comparison with previous studies, the average precision was 0.90, which was higher than 0.81 of the existing paper. As a result of performance comparison by applied algorithm, the overall average precision of Decision Tree, Random Forest, and Gradient Boosting was 0.90, and the recall was 1.0, and the precision of multi-layer perceptrons was 0.89, and the recall was 1.0. I hope that more research will be conducted using machine learning automation(Auto ML) tools for non-professionals to enhance the use of the value for health insurance review claim data held by the HIRA.

A Study on Risk Parity Asset Allocation Model with XGBoos (XGBoost를 활용한 리스크패리티 자산배분 모형에 관한 연구)

  • Kim, Younghoon;Choi, HeungSik;Kim, SunWoong
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.135-149
    • /
    • 2020
  • Artificial intelligences are changing world. Financial market is also not an exception. Robo-Advisor is actively being developed, making up the weakness of traditional asset allocation methods and replacing the parts that are difficult for the traditional methods. It makes automated investment decisions with artificial intelligence algorithms and is used with various asset allocation models such as mean-variance model, Black-Litterman model and risk parity model. Risk parity model is a typical risk-based asset allocation model which is focused on the volatility of assets. It avoids investment risk structurally. So it has stability in the management of large size fund and it has been widely used in financial field. XGBoost model is a parallel tree-boosting method. It is an optimized gradient boosting model designed to be highly efficient and flexible. It not only makes billions of examples in limited memory environments but is also very fast to learn compared to traditional boosting methods. It is frequently used in various fields of data analysis and has a lot of advantages. So in this study, we propose a new asset allocation model that combines risk parity model and XGBoost machine learning model. This model uses XGBoost to predict the risk of assets and applies the predictive risk to the process of covariance estimation. There are estimated errors between the estimation period and the actual investment period because the optimized asset allocation model estimates the proportion of investments based on historical data. these estimated errors adversely affect the optimized portfolio performance. This study aims to improve the stability and portfolio performance of the model by predicting the volatility of the next investment period and reducing estimated errors of optimized asset allocation model. As a result, it narrows the gap between theory and practice and proposes a more advanced asset allocation model. In this study, we used the Korean stock market price data for a total of 17 years from 2003 to 2019 for the empirical test of the suggested model. The data sets are specifically composed of energy, finance, IT, industrial, material, telecommunication, utility, consumer, health care and staple sectors. We accumulated the value of prediction using moving-window method by 1,000 in-sample and 20 out-of-sample, so we produced a total of 154 rebalancing back-testing results. We analyzed portfolio performance in terms of cumulative rate of return and got a lot of sample data because of long period results. Comparing with traditional risk parity model, this experiment recorded improvements in both cumulative yield and reduction of estimated errors. The total cumulative return is 45.748%, about 5% higher than that of risk parity model and also the estimated errors are reduced in 9 out of 10 industry sectors. The reduction of estimated errors increases stability of the model and makes it easy to apply in practical investment. The results of the experiment showed improvement of portfolio performance by reducing the estimated errors of the optimized asset allocation model. Many financial models and asset allocation models are limited in practical investment because of the most fundamental question of whether the past characteristics of assets will continue into the future in the changing financial market. However, this study not only takes advantage of traditional asset allocation models, but also supplements the limitations of traditional methods and increases stability by predicting the risks of assets with the latest algorithm. There are various studies on parametric estimation methods to reduce the estimated errors in the portfolio optimization. We also suggested a new method to reduce estimated errors in optimized asset allocation model using machine learning. So this study is meaningful in that it proposes an advanced artificial intelligence asset allocation model for the fast-developing financial markets.

Building battery deterioration prediction model using real field data (머신러닝 기법을 이용한 납축전지 열화 예측 모델 개발)

  • Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.243-264
    • /
    • 2018
  • Although the worldwide battery market is recently spurring the development of lithium secondary battery, lead acid batteries (rechargeable batteries) which have good-performance and can be reused are consumed in a wide range of industry fields. However, lead-acid batteries have a serious problem in that deterioration of a battery makes progress quickly in the presence of that degradation of only one cell among several cells which is packed in a battery begins. To overcome this problem, previous researches have attempted to identify the mechanism of deterioration of a battery in many ways. However, most of previous researches have used data obtained in a laboratory to analyze the mechanism of deterioration of a battery but not used data obtained in a real world. The usage of real data can increase the feasibility and the applicability of the findings of a research. Therefore, this study aims to develop a model which predicts the battery deterioration using data obtained in real world. To this end, we collected data which presents change of battery state by attaching sensors enabling to monitor the battery condition in real time to dozens of golf carts operated in the real golf field. As a result, total 16,883 samples were obtained. And then, we developed a model which predicts a precursor phenomenon representing deterioration of a battery by analyzing the data collected from the sensors using machine learning techniques. As initial independent variables, we used 1) inbound time of a cart, 2) outbound time of a cart, 3) duration(from outbound time to charge time), 4) charge amount, 5) used amount, 6) charge efficiency, 7) lowest temperature of battery cell 1 to 6, 8) lowest voltage of battery cell 1 to 6, 9) highest voltage of battery cell 1 to 6, 10) voltage of battery cell 1 to 6 at the beginning of operation, 11) voltage of battery cell 1 to 6 at the end of charge, 12) used amount of battery cell 1 to 6 during operation, 13) used amount of battery during operation(Max-Min), 14) duration of battery use, and 15) highest current during operation. Since the values of the independent variables, lowest temperature of battery cell 1 to 6, lowest voltage of battery cell 1 to 6, highest voltage of battery cell 1 to 6, voltage of battery cell 1 to 6 at the beginning of operation, voltage of battery cell 1 to 6 at the end of charge, and used amount of battery cell 1 to 6 during operation are similar to that of each battery cell, we conducted principal component analysis using verimax orthogonal rotation in order to mitigate the multiple collinearity problem. According to the results, we made new variables by averaging the values of independent variables clustered together, and used them as final independent variables instead of origin variables, thereby reducing the dimension. We used decision tree, logistic regression, Bayesian network as algorithms for building prediction models. And also, we built prediction models using the bagging of each of them, the boosting of each of them, and RandomForest. Experimental results show that the prediction model using the bagging of decision tree yields the best accuracy of 89.3923%. This study has some limitations in that the additional variables which affect the deterioration of battery such as weather (temperature, humidity) and driving habits, did not considered, therefore, we would like to consider the them in the future research. However, the battery deterioration prediction model proposed in the present study is expected to enable effective and efficient management of battery used in the real filed by dramatically and to reduce the cost caused by not detecting battery deterioration accordingly.