• Title/Summary/Keyword: Random regression model

An Analysis on Determinants of the Capesize Freight Rate and Forecasting Models (케이프선 시장 운임의 결정요인 및 운임예측 모형 분석)

  • Lim, Sang-Seop;Yun, Hee-Sung
    • Journal of Navigation and Port Research / v.42 no.6 / pp.539-545 / 2018
  • In recent years, research on shipping market forecasting using non-linear AI models has attracted significant interest. In previous studies, input variables were selected with reference to past papers or by relying on the researchers' intuition. This paper attempts to address this issue by applying the stepwise regression model and the random forest model to the Capesize bulk carrier market. The Capesize market was selected because of the simplicity of its supply and demand structure. The preliminary selection of determinants yielded 16 variables. In the next stage, 8 features from the stepwise regression model and 10 features from the random forest model were identified as important determinants. The chosen variables were used to test both models. The analysis showed that the random forest model outperforms the stepwise regression model. This research is significant because it provides a scientific basis for identifying determinants in shipping market forecasting and for using a machine-learning model in the process. The results can support the decisions of chartering desks by offering a guideline for market analysis.

Predicting claim size in the auto insurance with relative error: a panel data approach (상대오차예측을 이용한 자동차 보험의 손해액 예측: 패널자료를 이용한 연구)

  • Park, Heungsun
    • The Korean Journal of Applied Statistics / v.34 no.5 / pp.697-710 / 2021
  • Relative error prediction is preferred over ordinary prediction methods when relative/percentile errors are regarded as important, especially in econometrics, software engineering, and government official statistics. Relative error prediction techniques have been developed for linear and nonlinear regression, nonparametric regression using a kernel regression smoother, and stationary time series models. However, random effect models have not been used in relative error prediction. The purpose of this article is to extend relative error prediction to some generalized linear mixed models (GLMMs) with panel data, namely random effect models based on the gamma, lognormal, or inverse Gaussian distribution. For illustration, real auto insurance data are used to predict claim size, and the best predictor and the best relative error predictor are compared.
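
The difference between the best predictor and the best relative error predictor can be made concrete with a minimal sketch. It assumes the relative squared error loss E[((Y - p)/Y)^2], whose minimizer is E[1/Y] / E[1/Y^2]. The loss, the lognormal example, and all function names are illustrative assumptions and are not taken from the article.

```python
import math


def lognormal_mean(mu, sigma):
    """Best predictor under squared error: E[Y] for Y ~ LogNormal(mu, sigma^2)."""
    return math.exp(mu + sigma ** 2 / 2)


def lognormal_relative_error_predictor(mu, sigma):
    """Minimizer of E[((Y - p) / Y)^2], which equals E[1/Y] / E[1/Y^2].

    For a lognormal variable, E[Y^k] = exp(k*mu + k^2 * sigma^2 / 2).
    """
    e_inv = math.exp(-mu + sigma ** 2 / 2)         # E[1/Y]
    e_inv_sq = math.exp(-2 * mu + 2 * sigma ** 2)  # E[1/Y^2]
    return e_inv / e_inv_sq


def empirical_relative_error_predictor(ys):
    """Sample analogue: sum(1/y) / sum(1/y^2) for strictly positive observations."""
    if any(y <= 0 for y in ys):
        raise ValueError("observations must be strictly positive")
    return sum(1 / y for y in ys) / sum(1 / y ** 2 for y in ys)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    print(lognormal_mean(0.0, 1.0))
    print(lognormal_relative_error_predictor(0.0, 1.0))
```

Under this loss the relative error predictor lies below the mean for a right-skewed distribution, because over-prediction of small values is penalized heavily.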

Kernel Regression Estimation for Permutation Fixed Design Additive Models

  • Baek, Jangsun;Wehrly, Thomas E.
    • Journal of the Korean Statistical Society / v.25 no.4 / pp.499-514 / 1996
  • Consider an additive regression model of $Y$ on $X = (X_1, X_2, \ldots, X_p)$, $Y = \sum_{j=1}^{p} f_j(X_j) + \varepsilon$, where the $f_j$'s are smooth functions to be estimated and $\varepsilon$ is a random error. If the $X_j$'s are fixed design points, we call it the fixed design additive model. Since the response variable $Y$ is observed at fixed $p$-dimensional design points, the behavior of the nonparametric regression estimator depends on the design. We propose a fixed design called permutation fixed design, and fit the regression function by the kernel method. The estimator in the permutation fixed design achieves the univariate optimal rate of convergence in mean squared error for any $p \geq 2$.

Random Regression Models Are Suitable to Substitute the Traditional 305-Day Lactation Model in Genetic Evaluations of Holstein Cattle in Brazil

  • Padilha, Alessandro Haiduck;Cobuci, Jaime Araujo;Costa, Claudio Napolis;Neto, Jose Braccini
    • Asian-Australasian Journal of Animal Sciences / v.29 no.6 / pp.759-767 / 2016
  • The aim of this study was to compare two random regression models (RRM) fitted with fourth-order ($RRM_4$) and fifth-order ($RRM_5$) Legendre polynomials with a lactation model (LM) for evaluating Holstein cattle in Brazil. Two datasets with the same animals were prepared for this study. To apply the test-day RRMs and the LM, 262,426 test-day records and 30,228 lactation records covering 305 days were prepared, respectively. The lowest values of Akaike's information criterion, the Bayesian information criterion, and the -2 log-likelihood statistic (-2LogL) were obtained for $RRM_4$. Heritability for 305-day milk yield (305MY) was 0.23 ($RRM_4$), 0.24 ($RRM_5$), and 0.21 (LM). Across days in milk, test-day heritability ranged from 0.16 to 0.27, additive genetic variance from 3.76 to 6.88, and permanent environmental variance from 11.12 to 20.21. Additive genetic correlations between test days ranged from 0.20 to 0.99. Permanent environmental correlations between test days were between 0.07 and 0.99. Standard deviations of average estimated breeding values (EBVs) for 305MY from $RRM_4$ and $RRM_5$ were 11% to 30% higher for bulls and around 28% higher for cows than those from the LM. Rank correlations between RRM EBVs and LM EBVs ranged from 0.86 to 0.96 for bulls and from 0.80 to 0.87 for cows. The average gain in reliability of EBVs for 305-day yield was 4% to 17% for bulls and 23% to 24% for cows when EBVs from the RRMs were compared with those from the LM. The random regression model fitted with fourth-order Legendre polynomials is recommended for genetic evaluations of Brazilian Holstein cattle because of its higher reliability in the estimation of breeding values.
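
As a rough illustration of how Legendre polynomial covariates are typically built for a test-day random regression model, the sketch below maps days in milk (DIM) onto [-1, 1] and evaluates the polynomials by Bonnet's recurrence. The DIM range of 5 to 305, the unnormalized form of the polynomials, and the reading of "fourth order" as degree 4 are all assumptions made for the example; the abstract does not specify them.

```python
def legendre_values(x, order):
    """Return [P_0(x), ..., P_order(x)] computed with Bonnet's recurrence."""
    values = [1.0]
    if order >= 1:
        values.append(float(x))
    for n in range(1, order):
        # (n + 1) * P_{n+1}(x) = (2n + 1) * x * P_n(x) - n * P_{n-1}(x)
        nxt = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
        values.append(nxt)
    return values


def standardize_dim(dim, dim_min=5.0, dim_max=305.0):
    """Map days in milk onto [-1, 1], the domain of the Legendre polynomials.

    dim_min and dim_max are hypothetical bounds, not values from the study.
    """
    return 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0


def legendre_covariates(dim, order=4, dim_min=5.0, dim_max=305.0):
    """Covariate row for one test-day record at the given days in milk."""
    return legendre_values(standardize_dim(dim, dim_min, dim_max), order)


if __name__ == "__main__":
    for dim in (5.0, 155.0, 305.0):
        print(dim, legendre_covariates(dim, order=4))
```

Each animal's additive genetic and permanent environmental effects are then modeled as random coefficients on these covariates, which is what lets the variances change along the lactation.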

A Survival Prediction Model of Rats in Uncontrolled Acute Hemorrhagic Shock Using the Random Forest Classifier (랜덤 포리스트를 이용한 비제어 급성 출혈성 쇼크의 흰쥐에서의 생존 예측)

  • Choi, J.Y.;Kim, S.K.;Koo, J.M.;Kim, D.W.
    • Journal of Biomedical Engineering Research / v.33 no.3 / pp.148-154 / 2012
  • Hemorrhagic shock is a leading cause of injury-related death worldwide. Although many studies have tried to accurately diagnose hemorrhagic shock at an early stage, such attempts have not been successful because of human compensatory mechanisms. The objective of this study was to construct a survival prediction model of rats in acute hemorrhagic shock using a random forest (RF) model. Heart rate (HR), mean arterial pressure (MAP), respiration rate (RR), lactate concentration (LC), and peripheral perfusion (PP) measured in rats were used as input variables for the RF model, and its performance was compared with that of a logistic regression (LR) model. Before constructing the models, we performed 5-fold cross validation for RF variable selection and forward stepwise variable selection for the LR model to examine which variables were important for the models. For the LR model, sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (ROC-AUC) were 0.83, 0.95, 0.88, and 0.96, respectively. For the RF model, sensitivity, specificity, accuracy, and AUC were 0.97, 0.95, 0.96, and 0.99, respectively. In conclusion, the RF model was superior to the LR model for survival prediction in the rat model.
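
For readers less familiar with the reported metrics, the following sketch shows how sensitivity, specificity, and accuracy are derived from a binary confusion matrix. The counts in the usage example are made up for illustration and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp: true positives, fp: false positives,
    tn: true negatives, fn: false negatives.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives detected
        "specificity": tn / (tn + fp),  # share of actual negatives detected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts: 10 actual positives, 10 actual negatives.
    print(classification_metrics(tp=9, fp=2, tn=8, fn=1))
```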

Prediction on Busan's Gross Product and Employment of Major Industry with Logistic Regression and Machine Learning Model (로지스틱 회귀모형과 머신러닝 모형을 활용한 주요산업의 부산 지역총생산 및 고용 효과 예측)

  • Chae-Deug Yi
    • Korea Trade Review / v.47 no.2 / pp.69-88 / 2022
  • This paper aims to predict Busan's regional product and employment using logistic regression models and machine learning models. The main findings of the empirical analysis are as follows. First, the OLS regression model shows that main industries such as electricity and electronics, machine and transport, and finance and insurance affect Busan's income positively. Second, the binomial logistic regression models show that Busan's strategic industries, such as the future transport machinery, life-care, and smart marine industries, contribute to Busan's income, in descending order. Third, the multinomial logistic regression models show that Korea's main industries, such as precision machinery, transport equipment, and machinery, influence Busan's economy positively. In addition, Korea's exports and depreciation can affect Busan's economy more positively at higher employment levels. Fourth, the voting ensemble model shows higher predictive power than the artificial neural network and support vector machine models. Furthermore, the gradient boosting model and the random forest model show higher predictive power than the voting model, in descending order.
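
A voting ensemble combines the predictions of several base models. The sketch below shows the simplest variant, hard (majority) voting over class labels. The abstract does not say whether hard or soft voting was used, so this is only one possible reading, and the function name and inputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine class labels from several models by hard (majority) voting.

    predictions: list of per-model label lists, all of the same length.
    Ties go to the label that appears first among the models, because
    Counter.most_common keeps first-seen order for equal counts.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    voted = []
    for i in range(n_samples):
        counts = Counter(model_preds[i] for model_preds in predictions)
        voted.append(counts.most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    # Three hypothetical models predicting three samples.
    print(majority_vote([[1, 0, 1], [1, 1, 0], [0, 1, 1]]))
```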

Analysis of Break in Presence During Game Play Using a Linear Mixed Model

  • Chung, Jae-Yong;Yoon, Hwan-Jin;Gardner, Henry J.
    • ETRI Journal / v.32 no.5 / pp.687-694 / 2010
  • Breaks in presence (BIP) are those moments during virtual environment (VE) exposure in which participants become aware of their real world setting and their sense of presence in the VE becomes disrupted. In this study, we investigate participants' experience when they encounter technical anomalies during game play. We induced four technical anomalies and compared the BIP responses of a navigation mode game to that of a combat mode game. In our analysis, we applied a linear mixed model (LMM) and compared the results with those of a conventional regression model. Results indicate that participants felt varied levels of impact and recovery when experiencing the various technical anomalies. The impact of BIPs was clearly affected by the game mode, whereas recovery appears to be independent of game mode. The results obtained using the LMM did not differ significantly from those obtained using the general regression model; however, it was shown that treatment effects could be improved by consideration of random effects in the regression model.

Ensemble Deep Learning Model using Random Forest for Patient Shock Detection

  • Minsu Jeong;Namhwa Lee;Byuk Sung Ko;Inwhee Joe
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.4 / pp.1080-1099 / 2023
  • Digital healthcare combined with telemedicine services in the form of convergence with digital technology and AI is developing rapidly. Digital healthcare research is being conducted on many conditions including shock. However, the causes of shock are diverse, and the treatment is very complicated, requiring a high level of medical knowledge. In this paper, we propose a shock detection method based on the correlation between shock and data extracted from hemodynamic monitoring equipment. From the various parameters expressed by this equipment, four parameters closely related to patient shock were used as the input data for a machine learning model in order to detect the shock. Using the four parameters as input data, that is, feature values, a random forest-based ensemble machine learning model was constructed. The value of the mean arterial pressure was used as the correct answer value, the so called label value, to detect the patient's shock state. The performance was then compared with the decision tree and logistic regression model using a confusion matrix. The average accuracy of the random forest model was 92.80%, which shows superior performance compared to other models. We look forward to our work playing a role in helping medical staff by making recommendations for the diagnosis and treatment of complex and difficult cases of shock.

Dirichlet Process Mixtures of Linear Mixed Regressions

  • Kyung, Minjung
    • Communications for Statistical Applications and Methods / v.22 no.6 / pp.625-637 / 2015
  • We develop a Bayesian clustering procedure based on a Dirichlet process prior with cluster specific random effects. Gibbs sampling of a normal mixture of linear mixed regressions with a Dirichlet process was implemented to calculate posterior probabilities when the number of clusters was unknown. Our approach (unlike its counterparts) provides simultaneous partitioning and parameter estimation with the computation of the classification probabilities. A Monte Carlo study of curve estimation results showed that the model was useful for function estimation. We find that the proposed Dirichlet process mixture model with cluster specific random effects detects clusters sensitively by combining vague edges into different clusters. Examples are given to show how these models perform on real data.

Asymptotic Distribution of the LM Test Statistic for the Nested Error Component Regression Model

  • Jung, Byoung-Cheol;Myoungshic Jhun;Song, Seuck-Heun
    • Journal of the Korean Statistical Society / v.28 no.4 / pp.489-501 / 1999
  • In this paper, we consider the panel data regression model in which the disturbances have a nested error component structure. We derive a Lagrange multiplier (LM) test that jointly tests for the presence of random individual effects and nested effects under the normality assumption on the disturbances. This test extends the earlier work of Breusch and Pagan (1980) and Baltagi and Li (1991). Further, it is shown that this LM test has the same asymptotic distribution without the normality assumption on the disturbances.
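
For orientation, a nested error component model is commonly written as below. The notation is an assumption chosen for this illustration (group $i$, subgroup $j$ nested in $i$, time $t$) and may differ from the symbols used in the paper.

```latex
y_{ijt} = x_{ijt}'\beta + u_{ijt}, \qquad
u_{ijt} = \mu_i + \nu_{ij} + \epsilon_{ijt},
\qquad
\mu_i \sim (0, \sigma_\mu^2), \quad
\nu_{ij} \sim (0, \sigma_\nu^2), \quad
\epsilon_{ijt} \sim (0, \sigma_\epsilon^2)

% Joint null hypothesis of no random individual effects and no nested effects:
H_0 : \sigma_\mu^2 = \sigma_\nu^2 = 0
```

Under this reading, the joint LM test checks whether both variance components vanish, in which case the model reduces to a pooled regression with a single error term.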
