• Title/Summary/Keyword: Bayesian linear regression

Search Result 68, Processing Time 0.031 seconds

Estimation of genetic parameters and trends for production traits of dairy cattle in Thailand using a multiple-trait multiple-lactation test day model

  • Buaban, Sayan;Puangdee, Somsook;Duangjinda, Monchai;Boonkum, Wuttigrai
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.33 no.9
    • /
    • pp.1387-1399
    • /
    • 2020
  • Objective: The objective of this study was to estimate the genetic parameters and trends for milk, fat, and protein yields in the first three lactations of Thai dairy cattle using a 3-trait,-3-lactation random regression test-day model. Methods: Data included 168,996, 63,388, and 27,145 test-day records from the first, second, and third lactations, respectively. Records were from 19,068 cows calving from 1993 to 2013 in 124 herds. (Co) variance components were estimated by Bayesian methods. Gibbs sampling was used to obtain posterior distributions. The model included herd-year-month of testing, breed group-season of calving-month in tested milk group, linear and quadratic age at calving as fixed effects, and random regression coefficients for additive genetic and permanent environmental effects, which were defined as modified constant, linear, quadratic, cubic and quartic Legendre coefficients. Results: Average daily heritabilities ranged from 0.36 to 0.48 for milk, 0.33 to 0.44 for fat and 0.37 to 0.48 for protein yields; they were higher in the third lactation for all traits. Heritabilities of test-day milk and protein yields for selected days in milk were higher in the middle than at the beginning or end of lactation, whereas those for test-day fat yields were high at the beginning and end of lactation. Genetics correlations (305-d yield) among production yields within lactations (0.44 to 0.69) were higher than those across lactations (0.36 to 0.68). The largest genetic correlation was observed between the first and second lactation. The genetic trends of 305-d milk, fat and protein yields were 230 to 250, 25 to 29, and 30 to 35 kg per year, respectively. Conclusion: A random regression model seems to be a flexible and reliable procedure for the genetic evaluation of production yields. It can be used to perform breeding value estimation for national genetic evaluation in the Thai dairy cattle population.

A Study on Regionalization of Parameters of Continuous Rainfall-Runoff Model (연속 강우-유출모형의 매개변수 지역화에 관한 연구)

  • Jeong, Ga-In;Kim, Tae-Jeong;Kwon, Hyun-Han
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2015.05a
    • /
    • pp.182-182
    • /
    • 2015
  • 우리나라에서는 강우관측시스템의 지역적 불균형으로 상대적으로 소규모 저수지의 경우 미계측유역의 특성을 가지며, 신뢰성 있는 강우량, 유출량, 증발량 자료가 매우 부족한 실정이다. 다목적댐 유역과 같은 계측유역의 경우 상류유역의 유입량 자료의 확보가 용이하지만 대부분의 유역의 경우 계측장비가 부족하여 신뢰성이 확보된 유입량 자료를 얻는데 많은 어려움이 있다. 본 연구에서는 미계측유역의 유입량 산정을 위하여 계측유역을 대상으로 강우-유출 모형의 매개변수를 산정하였으며, 산정된 매개변수를 유역특성인자와의 상관성을 토대로 다중선형회귀분석기법(multiple linear regression, MLR)을 적용하여 지역화(regionalization)를 위한 회귀식을 도출하였다. 이를 위해 양질의 유량자료가 확보된 K-water 17개 댐 유역을 대상으로 매개변수를 산정하였으며 이 중 2개의 댐 유역을 미계측유역으로 간주하여 개발된 모형을 검증하였다. 대부분의 통계 지표에서 우수한 모의능력을 확인하였으며, 본 연구를 통하여 개발된 지역화 기법을 미계측유역에 활용한다면 보다 정량적이고 효율적인 수자원 계획이 가능할 것으로 판단된다. 향후 연구로는 불확실성을 고려한 Bayesian GLM 모형을 이용한 지역화기법을 개발하여 매개변수의 불확실성까지 고려할 수 있는 방안을 모색하고자 한다.

  • PDF

Power consumption prediction model based on artificial neural networks for seawater source heat pump system in recirculating aquaculture system fish farm (순환여과식 양식장 해수 열원 히트펌프 시스템의 전력 소비량 예측을 위한 인공 신경망 모델)

  • Hyeon-Seok JEONG;Jong-Hyeok RYU;Seok-Kwon JEONG
    • Journal of the Korean Society of Fisheries and Ocean Technology
    • /
    • v.60 no.1
    • /
    • pp.87-99
    • /
    • 2024
  • This study deals with the application of an artificial neural network (ANN) model to predict power consumption for utilizing seawater source heat pumps of recirculating aquaculture system. An integrated dynamic simulation model was constructed using the TRNSYS program to obtain input and output data for the ANN model to predict the power consumption of the recirculating aquaculture system with a heat pump system. Data obtained from the TRNSYS program were analyzed using linear regression, and converted into optimal data necessary for the ANN model through normalization. To optimize the ANN-based power consumption prediction model, the hyper parameters of ANN were determined using the Bayesian optimization. ANN simulation results showed that ANN models with optimized hyper parameters exhibited acceptably high predictive accuracy conforming to ASHRAE standards.

Comparison of genome-wide association and genomic prediction methods for milk production traits in Korean Holstein cattle

  • Lee, SeokHyun;Dang, ChangGwon;Choy, YunHo;Do, ChangHee;Cho, Kwanghyun;Kim, Jongjoo;Kim, Yousam;Lee, Jungjae
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.32 no.7
    • /
    • pp.913-921
    • /
    • 2019
  • Objective: The objectives of this study were to compare identified informative regions through two genome-wide association study (GWAS) approaches and determine the accuracy and bias of the direct genomic value (DGV) for milk production traits in Korean Holstein cattle, using two genomic prediction approaches: single-step genomic best linear unbiased prediction (ss-GBLUP) and Bayesian Bayes-B. Methods: Records on production traits such as adjusted 305-day milk (MY305), fat (FY305), and protein (PY305) yields were collected from 265,271 first parity cows. After quality control, 50,765 single-nucleotide polymorphic genotypes were available for analysis. In GWAS for ss-GBLUP (ssGWAS) and Bayes-B (BayesGWAS), the proportion of genetic variance for each 1-Mb genomic window was calculated and used to identify informative genomic regions. Accuracy of the DGV was estimated by a five-fold cross-validation with random clustering. As a measure of accuracy for DGV, we also assessed the correlation between DGV and deregressed-estimated breeding value (DEBV). The bias of DGV for each method was obtained by determining regression coefficients. Results: A total of nine and five significant windows (1 Mb) were identified for MY305 using ssGWAS and BayesGWAS, respectively. Using ssGWAS and BayesGWAS, we also detected multiple significant regions for FY305 (12 and 7) and PY305 (14 and 2), respectively. Both single-step DGV and Bayes DGV also showed somewhat moderate accuracy ranges for MY305 (0.32 to 0.34), FY305 (0.37 to 0.39), and PY305 (0.35 to 0.36) traits, respectively. The mean biases of DGVs determined using the single-step and Bayesian methods were $1.50{\pm}0.21$ and $1.18{\pm}0.26$ for MY305, $1.75{\pm}0.33$ and $1.14{\pm}0.20$ for FY305, and $1.59{\pm}0.20$ and $1.14{\pm}0.15$ for PY305, respectively. Conclusion: From the bias perspective, we believe that genomic selection based on the application of Bayesian approaches would be more suitable than application of ss-GBLUP in Korean Holstein populations.

Mapping Landslide Susceptibility Based on Spatial Prediction Modeling Approach and Quality Assessment (공간예측모형에 기반한 산사태 취약성 지도 작성과 품질 평가)

  • Al, Mamun;Park, Hyun-Su;JANG, Dong-Ho
    • Journal of The Geomorphological Association of Korea
    • /
    • v.26 no.3
    • /
    • pp.53-67
    • /
    • 2019
  • The purpose of this study is to identify the quality of landslide susceptibility in a landslide-prone area (Jinbu-myeon, Gangwon-do, South Korea) by spatial prediction modeling approach and compare the results obtained. For this goal, a landslide inventory map was prepared mainly based on past historical information and aerial photographs analysis (Daum Map, 2008), as well as some field observation. Altogether, 550 landslides were counted at the whole study area. Among them, 182 landslides are debris flow and each group of landslides was constructed in the inventory map separately. Then, the landslide inventory was randomly selected through Excel; 50% landslide was used for model analysis and the remaining 50% was used for validation purpose. Total 12 contributing factors, such as slope, aspect, curvature, topographic wetness index (TWI), elevation, forest type, forest timber diameter, forest crown density, geology, landuse, soil depth, and soil drainage were used in the analysis. Moreover, to find out the co-relation between landslide causative factors and incidents landslide, pixels were divided into several classes and frequency ratio for individual class was extracted. Eventually, six landslide susceptibility maps were constructed using the Bayesian Predictive Discriminant (BPD), Empirical Likelihood Ratio (ELR), and Linear Regression Method (LRM) models based on different category dada. Finally, in the cross validation process, landslide susceptibility map was plotted with a receiver operating characteristic (ROC) curve and calculated the area under the curve (AUC) and tried to extract success rate curve. The result showed that Bayesian, likelihood and linear models were of 85.52%, 85.23%, and 83.49% accuracy respectively for total data. Subsequently, in the category of debris flow landslide, results are little better compare with total data and its contained 86.33%, 85.53% and 84.17% accuracy. It means all three models were reasonable methods for landslide susceptibility analysis. The models have proved to produce reliable predictions for regional spatial planning or land-use planning.

Bayesian Network Analysis for the Dynamic Prediction of Financial Performance Using Corporate Social Responsibility Activities (베이지안 네트워크를 이용한 기업의 사회적 책임활동과 재무성과)

  • Sun, Eun-Jung
    • Management & Information Systems Review
    • /
    • v.34 no.5
    • /
    • pp.71-92
    • /
    • 2015
  • This study analyzes the impact of Corporate Social Responsibility (CSR) activities on financial performances using Bayesian Network. The research tries to overcome the issues of the uniform assumption of a linear function between financial performance and CSR activities in multiple regression analysis widely used in previous studies. It is required to infer a causal relationship between activities of CSR which have an impact on the financial performances. Identifying the relationship would empower the firms to improve their financial performance by informing the decision makers about the different CSR activities that influence the financial performance of the firms. This research proposes General Bayesian Network (GBN) and presents Markov Blanket induced from GBN. It is empirically demonstrated that all the proposals presented in this study are statistically significant by the results of the research conducted by Korean Economic Justice Institute (KEJI) under Citizen's Coalition for Economic Justice (CCEJ) which investigated approximately 200 companies in Korea based on Korean Economic Justice Institute Index (KEJI index) from 2005 to 2011. The Bayesian Network to effectively infer the properties affecting financial performances through the probabilistic causal relationship. Moreover, I found that there is a causal relationship among CSR activities variable; that is Environment protection is related to Customer protection, Employee satisfaction, and firm size; Soundness is related to Total CSR Evaluation Score, Debt-Assets Ratio. Though the what-if analysis, I suggest to the sensitive factor among the explanatory variables.

  • PDF

The Risk Assessment and Prediction for the Mixed Deterioration in Cable Bridges Using a Stochastic Bayesian Modeling (확률론적 베이지언 모델링에 의한 케이블 교량의 복합열화 리스크 평가 및 예측시스템)

  • Cho, Tae Jun;Lee, Jeong Bae;Kim, Seong Soo
    • Journal of the Korea institute for structural maintenance and inspection
    • /
    • v.16 no.5
    • /
    • pp.29-39
    • /
    • 2012
  • The main objective is to predict the future degradation and maintenance budget for a suspension bridge system. Bayesian inference is applied to find the posterior probability density function of the source parameters (damage indices and serviceability), given ten years of maintenance data. The posterior distribution of the parameters is sampled using a Markov chain Monte Carlo method. The simulated risk prediction for decreased serviceability conditions are posterior distributions based on prior distribution and likelihood of data updated from annual maintenance tasks. Compared with conventional linear prediction model, the proposed quadratic model provides highly improved convergence and closeness to measured data in terms of serviceability, risky factors, and maintenance budget for bridge components, which allows forecasting a future performance and financial management of complex infrastructures based on the proposed quadratic stochastic regression model.

Application of deep learning with bivariate models for genomic prediction of sow lifetime productivity-related traits

  • Joon-Ki Hong;Yong-Min Kim;Eun-Seok Cho;Jae-Bong Lee;Young-Sin Kim;Hee-Bok Park
    • Animal Bioscience
    • /
    • v.37 no.4
    • /
    • pp.622-630
    • /
    • 2024
  • Objective: Pig breeders cannot obtain phenotypic information at the time of selection for sow lifetime productivity (SLP). They would benefit from obtaining genetic information of candidate sows. Genomic data interpreted using deep learning (DL) techniques could contribute to the genetic improvement of SLP to maximize farm profitability because DL models capture nonlinear genetic effects such as dominance and epistasis more efficiently than conventional genomic prediction methods based on linear models. This study aimed to investigate the usefulness of DL for the genomic prediction of two SLP-related traits; lifetime number of litters (LNL) and lifetime pig production (LPP). Methods: Two bivariate DL models, convolutional neural network (CNN) and local convolutional neural network (LCNN), were compared with conventional bivariate linear models (i.e., genomic best linear unbiased prediction, Bayesian ridge regression, Bayes A, and Bayes B). Phenotype and pedigree data were collected from 40,011 sows that had husbandry records. Among these, 3,652 pigs were genotyped using the PorcineSNP60K BeadChip. Results: The best predictive correlation for LNL was obtained with CNN (0.28), followed by LCNN (0.26) and conventional linear models (approximately 0.21). For LPP, the best predictive correlation was also obtained with CNN (0.29), followed by LCNN (0.27) and conventional linear models (approximately 0.25). A similar trend was observed with the mean squared error of prediction for the SLP traits. Conclusion: This study provides an example of a CNN that can outperform against the linear model-based genomic prediction approaches when the nonlinear interaction components are important because LNL and LPP exhibited strong epistatic interaction components. Additionally, our results suggest that applying bivariate DL models could also contribute to the prediction accuracy by utilizing the genetic correlation between LNL and LPP.

Bayesian inference in finite population sampling under measurement error model

  • Goo, You Mee;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.6
    • /
    • pp.1241-1247
    • /
    • 2012
  • The paper considers empirical Bayes (EB) and hierarchical Bayes (HB) predictors of the finite population mean under a linear regression model with measurement errors We discuss how to calculate the mean squared prediction errors of the EB predictors using jackknife methods and the posterior standard deviations of the HB predictors based on the Markov Chain Monte Carlo methods. A simulation study is provided to illustrate the results of the preceding sections and compare the performances of the proposed procedures.

Generalized Linear Model with Time Series Data (비정규 시계열 자료의 회귀모형 연구)

  • 최윤하;이성임;이상열
    • The Korean Journal of Applied Statistics
    • /
    • v.16 no.2
    • /
    • pp.365-376
    • /
    • 2003
  • In this paper we reviewed a variety of non-Gaussian time series models, and studied the model selection criteria such as AIC and BIC to select proper models. We also considered the likelihood ratio test and applied it to analysis of Polio data set.