• Title/Summary/Keyword: Random regression model

Estimation of Genetic Parameters for Milk Production Traits Using a Random Regression Test-day Model in Holstein Cows in Korea

  • Kim, Byeong-Woo;Lee, Deukhwan;Jeon, Jin-Tae;Lee, Jung-Gyu
    • Asian-Australasian Journal of Animal Sciences / v.22 no.7 / pp.923-930 / 2009
  • This study compared three models for evaluating the genetic ability of Holstein cows in Korea: two random regression models (RRMs), with and without heterogeneity in the residual variances, and a lactation model (LM). Two datasets were prepared. For the test-day random regression models, 94,390 test-day records were prepared from 15,263 cows. The second dataset consisted of 14,704 lactation records covering milk production over 305 days. Raw milk yield and composition data were collected from 1998 to 2002 by the dairy cattle improvement center of the National Agricultural Cooperative Federation through its nationwide milk testing program. The pedigree information was collected by the Korean Animal Improvement Association. The RRMs are single-trait animal models that consider each lactation record as an independent trait. The covariance estimates were assumed to differ from one another. To account for heterogeneity of residual variance, test-days were classified into 29 classes. When heterogeneity of residual variance was considered, variation in lactation performance was higher in the early lactation classes than in the middle classes, and lower in the late lactation classes than in the other two. This may be due to the feeding management system and the physiological properties of Holstein cows in Korea. Over classes e6 to e26 (covering 61 to 270 days in milk, DIM), there was little change in residual variance, suggesting that a model with homogeneous residual variance could be used if the data were restricted to these days. Estimates of heritability for milk yield ranged from 0.154 to 0.455 and varied with the lactation period. Most of the heritabilities for milk yield from the RRM were higher than those from the lactation model, and the estimated genetic variance of milk yield was lower in the late lactation period than in the early or middle periods.
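
As a rough illustration of why heritability from a random regression model changes over the lactation, the sketch below evaluates the genetic variance at a given DIM as a quadratic form in Legendre polynomial covariates and divides it by the sum of genetic and stage-specific residual variance. The abstract does not state which basis functions were used, so the second-order Legendre basis, the DIM range of 5 to 305, the covariance matrix `G`, and the residual variances are all assumptions made for illustration, and permanent environmental effects are left out.

```python
# Sketch only: G and the residual variances are made-up numbers,
# and permanent environmental effects are omitted for brevity.

def legendre_covariates(dim, dim_min=5, dim_max=305):
    """Legendre polynomials P0..P2 evaluated at DIM rescaled to [-1, 1]."""
    x = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    return [1.0, x, 0.5 * (3.0 * x * x - 1.0)]


def genetic_variance(dim, G):
    """Additive genetic variance at one DIM, computed as z' G z."""
    z = legendre_covariates(dim)
    n = len(z)
    return sum(z[i] * G[i][j] * z[j] for i in range(n) for j in range(n))


def heritability(dim, G, residual_var):
    """Genetic variance divided by genetic plus residual variance."""
    g = genetic_variance(dim, G)
    return g / (g + residual_var)


# Hypothetical covariance matrix of the random regression coefficients.
G = [[4.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 0.5]]

# Hypothetical residual variances for early, middle, and late lactation.
residual_by_dim = {30: 12.0, 155: 8.25, 290: 7.0}

for dim, res_var in residual_by_dim.items():
    print(dim, round(heritability(dim, G, res_var), 3))
```

Because both the genetic variance and the residual variance depend on the stage of lactation, the ratio is a curve rather than a single number, which is consistent with the range of estimates reported above.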

Logistic regression model for major separation rate

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society / v.13 no.2 / pp.129-138 / 2002
  • This paper deals with logistic regression models for analysing separation rates from majors. The model-building procedure shows how to incorporate the effects of factors arising from a three-way nested sampling scheme and discusses which characteristics should be considered as independent variables directly affecting the rates.

City Gas Pipeline Pressure Prediction Model (도시가스 배관압력 예측모델)

  • Chung, Won Hee;Park, Giljoo;Gu, Yeong Hyeon;Kim, Sunghyun;Yoo, Seong Joon;Jo, Young-do
    • The Journal of Society for e-Business Studies / v.23 no.2 / pp.33-47 / 2018
  • City gas pipelines are buried underground, which makes them hard to manage and easy to damage. This research proposes a real-time prediction system that helps experts make decisions about pressure anomalies. The gas pipeline pressure data of Jungbu City Gas Company, one of the domestic city gas suppliers, are analysed together with time variables and environment variables. Regression models that predict pipeline pressure at minute resolution are proposed. Random forest, support vector regression (SVR), and long short-term memory (LSTM) algorithms are used to build the pressure prediction models. A comparison of the models' performances shows that the LSTM model was the best. The LSTM model for Asan-si has a root mean square error (RMSE) of 0.011 and a mean absolute percentage error (MAPE) of 0.494. The LSTM model for Cheonan-si has an RMSE of 0.015 and a MAPE of 0.668.
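
The two error measures quoted above can be computed as in the following minimal sketch. The pressure values are invented for illustration, and the abstract does not say whether MAPE is reported as a fraction or a percentage, so expressing it as a percentage here is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical observed and predicted pressures (arbitrary units).
observed = [2.10, 2.12, 2.08, 2.15]
forecast = [2.11, 2.10, 2.09, 2.13]

print("RMSE:", rmse(observed, forecast))
print("MAPE (%):", mape(observed, forecast))
```

RMSE is in the units of the pressure signal and penalises large misses more heavily, while MAPE is scale-free, which is why the two are often reported together.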

Prediction of fine dust PM10 using a deep neural network model (심층 신경망모형을 사용한 미세먼지 PM10의 예측)

  • Jeon, Seonghyeon;Son, Young Sook
    • The Korean Journal of Applied Statistics / v.31 no.2 / pp.265-285 / 2018
  • In this study, we applied a deep neural network model to predict four grades of fine dust $PM_{10}$ ('Good', 'Moderate', 'Bad', 'Very Bad') and two grades ('Good or Moderate', 'Bad or Very Bad'). The deep neural network model and existing classification techniques (neural network model, multinomial logistic regression model, support vector machine, and random forest) were applied to daily fine dust data observed from 2010 to 2015 in six major metropolitan areas of Korea. The data analysis shows that the deep neural network model outperforms the others in terms of accuracy.

ASYMPTOTIC NORMALITY OF ESTIMATOR IN NON-PARAMETRIC MODEL UNDER CENSORED SAMPLES

  • Niu, Si-Li;Li, Qian-Ru
    • Journal of the Korean Mathematical Society / v.44 no.3 / pp.525-539 / 2007
  • Consider the regression model $Y_i = g(x_i) + e_i$ for $i = 1, 2, \ldots, n$, where: (1) the $x_i$ are fixed design points, (2) the $e_i$ are independent random errors with mean zero, and (3) $g(\cdot)$ is an unknown regression function defined on $[0, 1]$. When the $Y_i$ are randomly censored, we discuss the asymptotic normality of the weighted kernel estimators of $g$ when the censoring distribution function is known or unknown.
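
To make the idea of a weighted kernel estimator on fixed design points concrete, here is a minimal sketch for the uncensored case only. It uses Priestley-Chao type weights with an Epanechnikov kernel; both choices are assumptions made for illustration, and the censoring adjustment studied in the paper is not implemented.

```python
def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def priestley_chao(x, design, responses, bandwidth):
    """Kernel estimate of g(x) from sorted fixed design points in (0, 1].

    Each response is weighted by the spacing to the previous design point,
    scaled by the kernel evaluated at the distance from x.
    """
    estimate = 0.0
    previous = 0.0  # x_0 is taken to be the left end of [0, 1]
    for x_i, y_i in zip(design, responses):
        weight = (x_i - previous) / bandwidth * epanechnikov((x - x_i) / bandwidth)
        estimate += weight * y_i
        previous = x_i
    return estimate


# Hypothetical noiseless data from the constant function g(x) = 2.
n = 200
design = [i / n for i in range(1, n + 1)]
responses = [2.0 for _ in design]

print(priestley_chao(0.5, design, responses, bandwidth=0.1))
```

With a constant regression function the interior estimate should be close to the constant, since the weights approximate the integral of the kernel, which equals one.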

Variable Selection in Linear Random Effects Models for Normal Data

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society / v.27 no.4 / pp.407-420 / 1998
  • This paper is concerned with selecting the covariates to be included in linear random effects models designed to analyze clustered normal response data. It takes a Bayesian approach and develops a procedure that uses probabilistic considerations for selecting promising subsets of covariates. The approach reformulates the linear random effects model as a hierarchical normal and point mass mixture model by introducing a set of latent variables that identify subset choices. The hierarchical model is flexible enough to accommodate sign constraints on a number of the regression coefficients. Using the Gibbs sampler, the appropriate posterior probability of each subset of covariates is obtained. Thus, in this procedure, the most promising subset of covariates can be identified as the one with the highest posterior probability. The procedure is illustrated through a simulation study.

Investment, Export, and Exchange Rate on Prediction of Employment with Decision Tree, Random Forest, and Gradient Boosting Machine Learning Models (투자와 수출 및 환율의 고용에 대한 의사결정 나무, 랜덤 포레스트와 그래디언트 부스팅 머신러닝 모형 예측)

  • Chae-Deug Yi
    • Korea Trade Review / v.46 no.2 / pp.281-299 / 2021
  • This paper analyzes the feasibility of using machine learning methods to forecast employment. Decision tree and artificial neural network models, together with ensemble models such as random forest and gradient boosting regression tree, were used to forecast employment in the Busan regional economy. The main findings from the comparison of their predictive abilities are as follows. First, machine learning methods can predict employment well. Second, the employment forecasts from decision tree models differed somewhat according to the depth of the trees. Third, the artificial neural network model, however, does not show high predictive power. Fourth, ensemble models such as random forest and gradient boosting regression tree show higher predictive power. Thus, since machine learning methods can accurately predict employment, they should be used to improve the accuracy of employment forecasts.

Development of the Machine Learning-based Employment Prediction Model for Internship Applicants (인턴십 지원자를 위한 기계학습기반 취업예측 모델 개발)

  • Kim, Hyun Soo;Kim, Sunho;Kim, Do Hyun
    • Journal of the Semiconductor & Display Technology / v.21 no.2 / pp.138-143 / 2022
  • The employment prediction model proposed in this paper uses 16 independent variables, including the self-introductions of M University students who applied for IPP and work-study internships, and a dependent variable with 3 categories: large companies, mid-sized companies, and unemployment. The employment prediction model for large companies was developed using Random Forest and Word2Vec, with an F1_Weighted score of 82.4%. The employment prediction model for medium-sized companies and above was developed using Logistic Regression and Word2Vec, with an F1_Weighted score of 73.24%. These two models can be used in the future to predict employment at large and medium-sized companies for M University students.
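
The F1_Weighted figures above presumably refer to a support-weighted average of per-class F1 scores, as in the `f1_weighted` scoring option of scikit-learn; that reading is an assumption, since the abstract does not define the metric. A minimal sketch of the calculation, with invented labels:

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's true count."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2.0 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += (count / total) * f1
    return score


# Hypothetical outcomes for four applicants.
actual = ["large", "large", "mid", "none"]
predicted = ["large", "mid", "mid", "none"]

print(weighted_f1(actual, predicted))
```

Weighting by class frequency keeps a rare class from dominating the score, which matters when the outcome categories are unbalanced.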

LM Tests in Nested Serially Correlated Error Components Model with Panel Data

  • Song, Seuck-Heun;Jung, Byoung-Cheol;Myoungshic Jhun
    • Journal of the Korean Statistical Society / v.30 no.4 / pp.541-550 / 2001
  • This paper considers a panel data regression model in which the disturbances follow a nested error components structure with serial correlation. Given this model, the paper derives several Lagrange Multiplier (LM) tests for the presence of serial correlation as well as random individual effects and nested effects, and for the existence of serial correlation given random individual and nested effects.

Effects of Number of Incomplete Data in Latest Generation on the Breeding Value Estimated by Random Regression Model (임의회귀 모형 사용시 마지막 세대의 불완전한 기록이 추정육종가에 미치는 효과)

  • ;;;;;;;;Salces, A.J.
    • Journal of Animal Science and Technology / v.48 no.2 / pp.143-150 / 2006
  • The data were collected in the dairy herd improvement program from January 2000 to July 2005. The test data comprised 825,157 first-parity records, and only animals with both parents known were included. This study aimed to describe the effect of incomplete lactation records in the latest generation on changes in sires' breeding values when a random regression model (RRM) is used in genetic evaluation. Genetic parameters and sire breeding values were estimated with the REMLF90 and BLUPF90 programs. Phenotypic values did not differ greatly among the groups defined by the number of test-day records (TD11, TD8, TD5, TD2). Across all groups, heritability of test-day milk yield ranged from 0.30 to 0.36; however, the TD2 group, which had the fewest test-day records in the latest generation, showed low heritability. Correlations were above 50% between test day and TD11 (0.610), TD8 (0.616), TD5 (0.661), and TD2 (0.682), which differ in the number of records in the latest generation. The ranking of sires by breeding value varied widely depending on the number of lactation records available up to the latest generation. The study showed that breeding value rankings changed with the number of daughters' test-day records, so at least 5 test-day records should be available. The use of the RRM in dairy cattle genetic evaluation would be desirable when complete lactation records are available for the latest-generation daughters of young bulls at the time proven bulls are selected. The RRM requires at least 5 test-day lactation records.