Search | Korea Science

Kernel-Trick Regression and Classification

Huh, Myung-Hoe
- Communications for Statistical Applications and Methods / v.22 no.2 / pp.201-207 / 2015
The support vector machine (SVM) is a well-known kernel-trick supervised learning tool. This study proposes a working scheme for kernel-trick regression and classification (KtRC) as an SVM alternative. KtRC fits the model on a number of random subsamples and selects the best model. Empirical examples and a simulation study indicate that KtRC's performance is comparable to that of SVM.
https://doi.org/10.5351/CSAM.2015.22.2.201
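
As a rough illustration of the subsample-and-select idea described in the abstract above, the following Python sketch fits a kernel ridge regressor on several random subsamples and keeps the fit with the lowest error on the full data. The RBF kernel, the ridge penalty, the subsample size, the mean-squared-error criterion, and all function names are assumptions made for illustration; the abstract does not specify them, and this is not the paper's implementation.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Squared Euclidean distances between every row of a and every row of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_on_subsample(x, y, idx, gamma=1.0, ridge=1e-3):
    # Kernel ridge fit that uses only the subsample rows as kernel centers.
    centers = x[idx]
    k = rbf_kernel(centers, centers, gamma)
    alpha = np.linalg.solve(k + ridge * np.eye(len(idx)), y[idx])
    return centers, alpha

def predict(model, x_new, gamma=1.0):
    centers, alpha = model
    return rbf_kernel(x_new, centers, gamma) @ alpha

def select_best(x, y, n_models=20, m=30, gamma=1.0, seed=0):
    # Fit one model per random subsample and keep the one with the
    # lowest mean squared error on the full data (assumed criterion).
    rng = np.random.default_rng(seed)
    best_model, best_mse = None, np.inf
    for _ in range(n_models):
        idx = rng.choice(len(x), size=m, replace=False)
        model = fit_on_subsample(x, y, idx, gamma)
        mse = np.mean((predict(model, x, gamma) - y) ** 2)
        if mse < best_mse:
            best_model, best_mse = model, mse
    return best_model, best_mse

# Synthetic data for demonstration only.
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=200)
model, mse = select_best(x, y)
```

Because each candidate is fit on only `m` points, the linear system solved per fit stays small regardless of the full sample size, which is one plausible motivation for working with subsamples.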

Predictive Model of Optimal Continuous Positive Airway Pressure for Obstructive Sleep Apnea Patients with Obesity by Using Machine Learning (비만 폐쇄수면무호흡 환자에서 기계학습을 통한 적정양압 예측모형)

Kim, Seung Soo;Yang, Kwang Ik
- Journal of Sleep Medicine / v.15 no.2 / pp.48-54 / 2018
Objectives: The aim of this study was to develop a model predicting the optimal continuous positive airway pressure (CPAP) for obstructive sleep apnea (OSA) patients with obesity by using machine learning. Methods: We retrospectively investigated the medical records of 162 OSA patients who had obesity [body mass index (BMI) ≥ 25] and had undergone a successful CPAP titration study. We randomly divided the data into a training set (90%) and a test set (10%). We built a random forest model and a least absolute shrinkage and selection operator (lasso) regression model to predict the optimal pressure using the training set, and then applied our models and previously reported equations to the test set. To compare the fit of each model, we used the correlation coefficient (CC) and the mean absolute error (MAE). Results: The random forest model showed the best performance {CC 0.78 [95% confidence interval (CI) 0.43-0.93], MAE 1.20}. The lasso regression model also showed an improved result [CC 0.78 (95% CI 0.42-0.93), MAE 1.26] compared to the Hoffstein equation [CC 0.68 (95% CI 0.23-0.89), MAE 1.34] and Choi's equation [CC 0.72 (95% CI 0.30-0.90), MAE 1.40]. Conclusions: Our random forest model and lasso model ($26.213 + 0.084 \times \text{BMI} + 0.004 \times \text{apnea-hypopnea index} + 0.004 \times \text{oxygen desaturation index} - 0.215 \times \text{mean oxygen saturation}$) showed improved performance compared to the previously reported equations. Further study of other subgroups or phenotypes of OSA is required.
https://doi.org/10.13078/jsm.18012
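
The lasso equation quoted in the abstract above can be evaluated directly. The sketch below encodes its coefficients as given and adds a plain mean absolute error helper of the kind used for the comparison. The function names, the patient values, and the measured pressures are hypothetical, and the abstract does not state the units of the inputs or of the predicted pressure.

```python
def predict_cpap(bmi, ahi, odi, mean_spo2):
    # Coefficients copied from the lasso equation in the abstract.
    # ahi: apnea-hypopnea index, odi: oxygen desaturation index,
    # mean_spo2: mean oxygen saturation.
    return 26.213 + 0.084 * bmi + 0.004 * ahi + 0.004 * odi - 0.215 * mean_spo2

def mean_absolute_error(y_true, y_pred):
    # Average absolute difference between observed and predicted values.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical patients (not data from the study).
patients = [
    {"bmi": 30.0, "ahi": 40.0, "odi": 35.0, "mean_spo2": 93.0},
    {"bmi": 27.0, "ahi": 20.0, "odi": 15.0, "mean_spo2": 95.0},
]
predicted = [predict_cpap(**p) for p in patients]

# Hypothetical titrated pressures, used only to show the MAE computation.
observed = [9.0, 8.0]
mae = mean_absolute_error(observed, predicted)
```

Note that the intercept and the mean oxygen saturation term largely offset each other, so BMI contributes most of the remaining variation across typical inputs.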

Bayesian Inference for Censored Panel Regression Model

Lee, Seung-Chun;Choi, Byongsu
- Communications for Statistical Applications and Methods / v.21 no.2 / pp.193-200 / 2014
Some researchers have recognized that the disturbance variance in a censored regression model is frequently underestimated by the maximum likelihood method. This underestimation has implications for the estimation of marginal effects and asymptotic standard errors. For instance, the actual coverage probability of the confidence interval based on a maximum likelihood estimate can be significantly smaller than the nominal confidence level; consequently, Bayesian estimation is considered to overcome this difficulty. The behaviors of the maximum likelihood and Bayesian estimators of the disturbance variance are examined in a fixed effects panel regression model with a limited dependent variable, which is known to have the incidental parameter problem. Behavior under the random effects assumption is also investigated.
https://doi.org/10.5351/CSAM.2014.21.2.193

System Identification of a Diesel Engine -Throttle-Smoke Response- (디젤 기관(機關)의 계통식별(系統識別) -연료주입율(燃料注入率) 대(對) 매연반응(煤煙反應)-)

Cho, H.K.
- Journal of Biosystems Engineering / v.16 no.2 / pp.111-117 / 1991
An empirical model for diesel engine control was obtained using a system identification method. A pseudo-random binary sequence was used as the input signal. Spectral analysis was used to find the frequency response of the system. Model parameters of the transfer functions were obtained using nonlinear regression.
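
The abstract above does not say how the pseudo-random binary sequence was generated. One common construction is a linear feedback shift register; the sketch below assumes a 7-bit register with feedback polynomial x^7 + x^6 + 1 (often called PRBS7). The function name and the mapping to two input levels are hypothetical.

```python
def prbs7(n_bits, seed=0x7F):
    """Return n_bits of a PRBS7 sequence (polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero, or the register stays at zero")
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of the two highest register bits.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        # Shift left, insert the feedback bit, keep 7 bits.
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits

bits = prbs7(254)

# Map {0, 1} to two input levels, e.g. low and high throttle (assumed levels).
signal = [1.0 if b else -1.0 for b in bits]
```

The sequence repeats every 127 bits, so an identification experiment using such an input would typically record at least one full period.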

Genetic Parameters for Milk Yield and Lactation Persistency Using Random Regression Models in Girolando Cattle

Canaza-Cayo, Ali William;Lopes, Paulo Savio;da Silva, Marcos Vinicius Gualberto Barbosa;de Almeida Torres, Robledo;Martins, Marta Fonseca;Arbex, Wagner Antonio;Cobuci, Jaime Araujo
- Asian-Australasian Journal of Animal Sciences / v.28 no.10 / pp.1407-1418 / 2015
A total of 32,817 test-day milk yield (TDMY) records from the first lactation of 4,056 Girolando cows, daughters of 276 sires, collected from 118 herds between 2000 and 2011, were used to estimate the genetic parameters for TDMY via random regression models (RRM) using Legendre polynomial functions whose orders varied from 3 to 5. In addition, nine measures of persistency in milk yield ($PS_i$) and the genetic trend of 305-day milk yield (305MY) were evaluated. The fit quality criteria indicated that the RRM employing Legendre polynomials of orders 3 and 5 for fitting the additive genetic and permanent environment effects, respectively, was the best model. The heritability and genetic correlation for TDMY throughout the lactation, obtained with the best model, varied from 0.18 to 0.23 and from -0.03 to 1.00, respectively. The heritability and genetic correlation for persistency and 305MY varied from 0.10 to 0.33 and from -0.98 to 1.00, respectively. The use of $PS_7$ would be the most suitable option for the evaluation of Girolando cattle. The estimated breeding values for 305MY of sires and cows showed significant and positive genetic trends. Thus, the use of selection indices would be indicated in the genetic evaluation of Girolando cattle for both traits.
https://doi.org/10.5713/ajas.14.0620

Nonparametric Estimators for Percentile Regression Functions

Jee, Eun-Sook
- The Mathematical Education / v.30 no.1 / pp.47-50 / 1991
We consider the regression model $H = h(x) + E$, where $h$ is an unknown smooth regression function and $E$ is the random error with unknown distribution $F$. In this context we present and examine the asymptotic behavior of some nonparametric estimators for the percentile functions $\zeta_p(x) + \zeta_p$, where $0 < p < 1$ and $\zeta_p = \inf \{x : F(x) \geq p\}$.

A Design and Implement of Efficient Agricultural Product Price Prediction Model

Im, Jung-Ju;Kim, Tae-Wan;Lim, Ji-Seoup;Kim, Jun-Ho;Yoo, Tae-Yong;Lee, Won Joo
- Journal of the Korea Society of Computer and Information / v.27 no.5 / pp.29-36 / 2022
In this paper, we propose an efficient agricultural product price prediction model based on a dataset provided by DACON. The model uses XGBoost and CatBoost, which are gradient boosting algorithms whose average accuracy and execution time are superior to those of the existing logistic regression and random forest. Based on these advantages, we design a machine learning model that predicts prices 1 week, 2 weeks, and 4 weeks ahead from the previous prices of agricultural products. The XGBoost model can achieve the best performance by tuning hyperparameters using the XGBoost Regressor library, which is a regression model. The implemented model is verified using the API provided by DACON, and performance evaluation is performed for each model. Because XGBoost applies its own regularization against overfitting, it achieves excellent performance despite the small dataset, but its performance was found to be lower than that of LGBM in terms of temporal measures such as learning time and prediction time.
https://doi.org/10.9708/jksci.2022.27.05.029

Comparative study of prediction models for corporate bond rating (국내 회사채 신용 등급 예측 모형의 비교 연구)

Park, Hyeongkwon;Kang, Junyoung;Heo, Sungwook;Yu, Donghyeon
- The Korean Journal of Applied Statistics / v.31 no.3 / pp.367-382 / 2018
Prediction models for corporate bond ratings in existing studies have been developed using various models such as linear regression, ordered logit, and random forest. These prediction models are built on financial characteristics that are expected to be included in the rating-assignment models of the bond rating agencies. However, the ranges of bond ratings in existing studies vary from 5 to 20, and the prediction models were developed with samples in which the target companies and the observation periods differ. Thus, a simple comparison of the prediction accuracies reported in each study cannot determine the best prediction model. In order to conduct a fair comparison, this study collected corporate bond ratings and financial characteristics from 2013 to 2017 and applied the prediction models to them. In addition, we applied the elastic-net penalty to the linear regression, the ordered logit, and the ordered probit. Our comparison shows that data-driven variable selection using the elastic-net improves prediction accuracy in each corresponding model, and that the random forest is the most appropriate model in terms of prediction accuracy, achieving 69.6% average accuracy in exact rating prediction under 5-fold cross-validation.
https://doi.org/10.5351/KJAS.2018.31.3.367

Care Cost Prediction Model for Orphanage Organizations in Saudi Arabia

Alhazmi, Huda N;Alghamdi, Alshymaa;Alajlani, Fatimah;Abuayied, Samah;Aldosari, Fahd M
- International Journal of Computer Science & Network Security / v.21 no.4 / pp.84-92 / 2021
Care services are a significant asset in human life. Care in its overall nature focuses on human needs and covers several aspects such as health care, homes, personal care, and education. In fact, care deals with many dimensions: physical, psychological, and social interconnections. Very little information is available on estimating the cost of care services provided to orphans and abandoned children. Prediction of the cost of the care system delivered by governmental or non-governmental organizations to support orphans and abandoned children is increasingly needed. The purpose of this study is to analyze the care cost for orphanage organizations in Saudi Arabia in order to forecast the cost and to explore the most influential factor on the cost. Using a business analytics process that applies statistical and machine learning techniques, we propose a model that includes simple linear regression, Naive Bayes classifier, and Random Forest algorithms. The findings of our predictive model show that Naive Bayes achieved the highest accuracy, 87%, in predicting the total care cost. Our model offers a predictive approach from the perspective of business analytics.
https://doi.org/10.22937/IJCSNS.2021.21.4.13

A Study on the Development of Model for Estimating the Thickness of Clay Layer of Soft Ground in the Nakdong River Estuary (낙동강 조간대 연약지반의 지역별 점성토층 두께 추정 모델 개발에 관한 연구)

Seongin, Ahn;Dong-Woo, Ryu
- Tunnel and Underground Space / v.32 no.6 / pp.586-597 / 2022
In this study, a model was developed for estimating the location-specific thickness of the upper clay layer, to be used for consolidation vulnerability evaluation in the Nakdong river estuary. To estimate the ground layer thickness, we developed four spatial estimation models: three using machine learning algorithms, namely RF (Random Forest), SVR (Support Vector Regression), and GPR (Gaussian Process Regression), and one using a geostatistical technique, Ordinary Kriging. Among the 4,712 borehole records collected in the study area for model development, the 2,948 records with an upper clay layer were used, and the Pearson correlation coefficient and mean squared error were used to quantitatively evaluate the performance of the developed models. In addition, for qualitative evaluation, each model was applied throughout the study area to estimate the upper clay layer, and the resulting thickness distributions were compared with each other.
https://doi.org/10.7474/TUS.2022.32.6.586
