• Title/Summary/Keyword: Bayesian regression


Differences of Cold-heat Patterns between Healthy and Disease Group (건강군과 질환군의 한열지표 차이에 관한 고찰)

  • Kim Ji-Eun;Lee Seung-Gi;Ryu Hwa-Seung;Park Kyung-Mo
    • Journal of Physiology & Pathology in Korean Medicine / v.20 no.1 / pp.224-228 / 2006
  • Pattern identification of exterior-interior and cold-heat syndromes is one of the most frequently used diagnostic methods in Oriental medicine. However, no systematic study has analyzed how the 'exterior-interior and cold-heat' characteristics differ between healthy and disease groups. In this study, cold-heat pattern, blood pressure, pulse rate, height, and weight were recorded from 100 healthy subjects and 196 disease subjects aged 30 to 59 years. Descriptive statistics were used to analyze the differences between the two groups, and a linear regression function, a linear support vector machine, and a Bayesian classifier were used to distinguish the healthy group from the disease group. The scores for both exterior-heat and interior-cold were higher in the healthy group than in the disease group. This means that subjects in the disease group tend to have a colder exterior and a hotter interior. These results were unrelated to age. However, the attempt to separate the healthy group from the disease group using the exterior-interior and cold-heat scores together with the other vital signs did not perform well. Thus, although the two groups show different trends, this information alone could not classify subjects as healthy or diseased.

Machine Learning-based Data Analysis for Designing High-strength Nb-based Superalloys (고강도 Nb기 초내열 합금 설계를 위한 기계학습 기반 데이터 분석)

  • Eunho Ma;Suwon Park;Hyunjoo Choi;Byoungchul Hwang;Jongmin Byun
    • Journal of Powder Materials / v.30 no.3 / pp.217-222 / 2023
  • Machine learning-based data analysis approaches have been employed to overcome the limitations in accurately analyzing data and to predict the results of the design of Nb-based superalloys. In this study, a database containing the composition of the alloying elements and their room-temperature tensile strengths was prepared based on a previous study. After computing the correlation between the tensile strength at room temperature and the composition, a material science analysis was conducted on the elements with high correlation coefficients. These alloying elements were found to have a significant effect on the variation in the tensile strength of Nb-based alloys at room temperature. Through this process, a model was derived to predict the properties using four machine learning algorithms. The Bayesian ridge regression algorithm proved to be the optimal model when Y, Sc, W, Cr, Mo, Sn, and Ti were used as input features. This study demonstrates the successful application of machine learning techniques to effectively analyze data and predict outcomes, thereby providing valuable insights into the design of Nb-based superalloys.

Application of deep learning with bivariate models for genomic prediction of sow lifetime productivity-related traits

  • Joon-Ki Hong;Yong-Min Kim;Eun-Seok Cho;Jae-Bong Lee;Young-Sin Kim;Hee-Bok Park
    • Animal Bioscience / v.37 no.4 / pp.622-630 / 2024
  • Objective: Pig breeders cannot obtain phenotypic information at the time of selection for sow lifetime productivity (SLP). They would benefit from obtaining genetic information on candidate sows. Genomic data interpreted using deep learning (DL) techniques could contribute to the genetic improvement of SLP to maximize farm profitability because DL models capture nonlinear genetic effects such as dominance and epistasis more efficiently than conventional genomic prediction methods based on linear models. This study aimed to investigate the usefulness of DL for the genomic prediction of two SLP-related traits: lifetime number of litters (LNL) and lifetime pig production (LPP). Methods: Two bivariate DL models, convolutional neural network (CNN) and local convolutional neural network (LCNN), were compared with conventional bivariate linear models (i.e., genomic best linear unbiased prediction, Bayesian ridge regression, Bayes A, and Bayes B). Phenotype and pedigree data were collected from 40,011 sows that had husbandry records. Among these, 3,652 pigs were genotyped using the PorcineSNP60K BeadChip. Results: The best predictive correlation for LNL was obtained with CNN (0.28), followed by LCNN (0.26) and conventional linear models (approximately 0.21). For LPP, the best predictive correlation was also obtained with CNN (0.29), followed by LCNN (0.27) and conventional linear models (approximately 0.25). A similar trend was observed with the mean squared error of prediction for the SLP traits. Conclusion: This study provides an example of a CNN that can outperform linear model-based genomic prediction approaches when nonlinear interaction components are important, as LNL and LPP exhibited strong epistatic interaction components. Additionally, our results suggest that applying bivariate DL models could also improve prediction accuracy by utilizing the genetic correlation between LNL and LPP.

Examining Impact of Weather Factors on Apple Yield (사과생산량에 영향을 미치는 기상요인 분석)

  • Kim, Mi Ri;Kim, Seung Gyu
    • Korean Journal of Agricultural and Forest Meteorology / v.16 no.4 / pp.274-284 / 2014
  • Crops and varieties are affected mainly by temperature, precipitation, and duration of sunshine. This study aims to identify, among a series of daily weather variables measured during the growing season, the weather factors that directly influence apple yield. To identify them, we consider 1) a priori natural scientific knowledge about the growth stages of apples and 2) purely statistical approaches that minimize bias due to the subjective selection of variables. Each result estimated by panel regression with fixed- and random-effects models is evaluated in terms of suitability (i.e., Akaike information criterion and Bayesian information criterion) and predictability (i.e., mean absolute error, root mean square error, and mean absolute percentage error). For the case study, panel data on apple yield and weather factors were collected from fifteen major apple-producing areas in Korea from 2006 to 2013. The results show that variable selection using factor analysis, one of the statistical approaches applied, improves predictability and suitability the most. This may imply that all the weather factors are important for predicting apple yield, provided that statistical problems such as multicollinearity and the loss of degrees of freedom caused by too many explanatory variables in the regression are controlled effectively. This may be because all growth stages, such as germination, florescence, fruit setting, fattening, ripening, coloring, and harvesting, are affected by weather.
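
The suitability and predictability measures named in this abstract can be sketched in a few lines. This is a minimal illustration that assumes the standard textbook definitions of each measure; the function names are hypothetical and are not taken from the paper.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood
```

Lower values are better for all five measures. BIC penalizes additional parameters more heavily than AIC once the sample has more than about seven observations, since ln(n) then exceeds 2.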

Prediction of genomic breeding values of carcass traits using whole genome SNP data in Hanwoo (Korean cattle) (한우에 있어서 유전체 육종가 추정)

  • Lee, Seung Hwan;Kim, Heong Cheul;Lim, Dajeong;Dang, Chang Gwan;Cho, Yong Min;Kim, Si Dong;Lee, Hak Kyo;Lee, Jun Heon;Yang, Boh Suk;Oh, Sung Jong;Hong, Seong Koo;Chang, Won Kyung
    • Korean Journal of Agricultural Science / v.39 no.3 / pp.357-364 / 2012
  • Genomic breeding value (GEBV) has recently become available in the beef cattle industry. Genomic selection methods are exceptionally valuable for selecting traits, such as marbling, that are difficult to measure until later in life. One method to utilize information from sparse marker panels is the Bayesian model selection method with RJMCMC. The accuracy of prediction varies between a multiple SNP model with RJMCMC (0.47 to 0.73) and a least squares method (0.11 to 0.41) when using SNP information, while the accuracy of prediction increases in the multiple SNP (0.56 to 0.90) and least squares methods (0.21 to 0.63) when including a polygenic effect. In the multiple SNP model with the RJMCMC model selection method, the accuracy ($r^2$) of GEBV for marbling predicted based only on SNP effects was 0.47, while the $r^2$ of GEBV predicted by SNP plus polygenic effect was 0.56. The accuracies of GEBV predicted using only SNP information were 0.62, 0.68 and 0.73 for CWT, EMA and BF, respectively. However, when polygenic effects were included, the accuracies of GEBV increased to 0.89, 0.90 and 0.89 for CWT, EMA and BF, respectively. Our data demonstrate that SNP information alone misses genetic variation that contributes to phenotypes for carcass traits, and that polygenic effects compensate for the genetic variation that whole genome SNP data do not explain. Overall, the multiple SNP model with the RJMCMC model selection method provides a better prediction of GEBV than does the least squares method (single marker regression).

Forecasting of Customer's Purchasing Intention Using Support Vector Machine (Support Vector Machine 기법을 이용한 고객의 구매의도 예측)

  • Kim, Jin-Hwa;Nam, Ki-Chan;Lee, Sang-Jong
    • Information Systems Review / v.10 no.2 / pp.137-158 / 2008
  • Rapid development of various information technologies creates new opportunities in online and offline markets. In this changing market environment, customers have various demands for new products and services, and their power and influence on the markets grow stronger each year. Companies have therefore paid great attention to customer relationship management (CRM). In particular, personalized product recommendation systems, which recommend products and services based on a customer's private information or in-store purchasing behavior, are an important asset to most companies. CRM is one of the important business processes in which reliable information is mined from customer databases. Data mining techniques such as artificial intelligence are popular tools for extracting useful information and knowledge from these customer databases. In this research, we propose a recommendation system that predicts a customer's purchase intention. The customer's purchasing intention for a specific product is predicted by applying data mining techniques to a receipt data set. The performance of the suggested method is compared with that of other data mining technologies.

Genomic partitioning of growth traits using a high-density single nucleotide polymorphism array in Hanwoo (Korean cattle)

  • Park, Mi Na;Seo, Dongwon;Chung, Ki-Yong;Lee, Soo-Hyun;Chung, Yoon-Ji;Lee, Hyo-Jun;Lee, Jun-Heon;Park, Byoungho;Choi, Tae-Jeong;Lee, Seung-Hwan
    • Asian-Australasian Journal of Animal Sciences / v.33 no.10 / pp.1558-1565 / 2020
  • Objective: The objective of this study was to characterize the number of loci affecting growth traits and the distribution of single nucleotide polymorphism (SNP) effects on growth traits, and to understand the genetic architecture for growth traits in Hanwoo (Korean cattle) using genome-wide association study (GWAS), genomic partitioning, and hierarchical Bayesian mixture models. Methods: GWAS: A single-marker regression-based mixed model was used to test the association between SNPs and causal variants. A genotype relationship matrix was fitted as a random effect in this linear mixed model to correct for the genetic structure of a sire family. Genomic restricted maximum likelihood and BayesR: A priori information included setting the fixed additive genetic variance to a pre-specified value; the first mixture component was set to zero, the second to $0.0001\times\sigma_g^2$, the third to $0.001\times\sigma_g^2$, and the fourth to $0.01\times\sigma_g^2$. Under this fixed a priori information, each SNP in the mixture distribution accounted for no more than 1% of the genetic variance. Results: The GWAS revealed common 2 Mb genomic regions on bovine chromosomes 14 (BTA14) and 3 with a moderate effect that may contain causal variants for body weight at 6, 12, 18, and 24 months. This genomic region explained approximately 10% of the variance against total additive genetic variance and body weight heritability at 12, 18, and 24 months. BayesR identified the exact genomic region containing causal SNPs on BTA14, 3, and 22. However, the genetic variance explained by each chromosome or SNP was estimated to be very small compared to the total additive genetic variance. Causal SNPs for growth traits on BTA14 explained only 0.04% to 0.5% of the genetic variance. Conclusion: Segregating mutations have a moderate effect on BTA14, 3, and 19; many other loci with small effects on growth traits at different ages were also identified.
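
The four-component mixture prior described in the Methods can be illustrated by drawing SNP effects from it. This is only a sketch of the prior, not of the BayesR sampler itself: the mixing proportions, the function name, and the seed are hypothetical, and only the variance scales (0, 0.0001, 0.001, 0.01 times the additive genetic variance) come from the abstract.

```python
import math
import random

# Variance of each mixture component as a fraction of the additive
# genetic variance, as listed in the abstract.
VARIANCE_SCALES = [0.0, 1e-4, 1e-3, 1e-2]

def sample_snp_effects(n_snps, sigma_g2, proportions, seed=0):
    """Draw SNP effects from a four-component normal mixture prior.

    proportions: hypothetical mixing weights for the four components.
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(n_snps):
        # Pick a component, then draw from N(0, scale * sigma_g2).
        k = rng.choices(range(len(VARIANCE_SCALES)), weights=proportions)[0]
        variance = VARIANCE_SCALES[k] * sigma_g2
        if variance == 0.0:
            effects.append(0.0)  # first component: SNP has no effect
        else:
            effects.append(rng.gauss(0.0, math.sqrt(variance)))
    return effects
```

With most of the weight on the first component, for example `sample_snp_effects(50000, 1.0, [0.95, 0.03, 0.015, 0.005])`, the large majority of sampled effects are exactly zero, which matches the idea that few SNPs carry a non-negligible effect.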

Potential influence of κ-casein and β-lactoglobulin genes in genetic association studies of milk quality traits

  • Zepeda-Batista, Jose Luis;Saavedra-Jimenez, Luis Antonio;Ruiz-Flores, Agustin;Nunez-Dominguez, Rafael;Ramirez-Valverde, Rodolfo
    • Asian-Australasian Journal of Animal Sciences / v.30 no.12 / pp.1684-1688 / 2017
  • Objective: From a review of published information on genetic association studies, a meta-analysis was conducted to determine the influence of the genes κ-casein (CSN3) and β-lactoglobulin (LGB) on milk yield traits in Holstein, Jersey, Brown Swiss, and Fleckvieh. Methods: The GLIMMIX procedure was used to analyze milk production and percentage of protein and fat in milk. Models included the main effects and all their possible two-way interactions; non-estimable effects and non-significant (p>0.05) two-way interactions were dropped from the models. The three traits were analyzed using a Poisson distribution and a log link function and were determined with the Interactive Data Analysis of SAS software. Least square means and multiple mean comparisons were obtained and performed for significant main effects and their interactions (p<0.0255). Results: Interaction of breed by gene showed that Holstein and Fleckvieh were the breeds on which CSN3 (6.01% ± 0.19% and 5.98% ± 0.22%) and LGB (6.02% ± 0.19% and 5.70% ± 0.22%) have the greatest influence. Interaction of breed by genotype nested in the analyzed gene indicated that Holstein and Jersey showed greater influence of the CSN3 AA genotype, 6.04% ± 0.22% and 5.59% ± 0.31%, than the other genotypes, while the LGB AA genotype had the largest influence on the traits analyzed, 6.05% ± 0.20% and 5.60% ± 0.19%, respectively. Furthermore, interaction of type of statistical model by genotype nested in the analyzed gene indicated that CSN3 and LGB genes had similar behavior, maintaining a difference of more than 7% across analyzed genotypes. These results could indicate that both Holstein and Jersey have had a lower allele substitution effect in selection programs that include CSN3 and LGB genes than Brown Swiss and Fleckvieh. Conclusion: Breed determined which genotypes had the greatest association with analyzed traits. The mixed model based on Bayesian or Ridge Regression was the best alternative to analyze CSN3 and LGB gene effects on milk yield and protein and fat percentages.

Safety Impacts of Red Light Enforcement on Signalized Intersections (교차로 신호위반 단속카메라 설치가 차량사고에 미치는 영향)

  • Lee, Sang Hyuk;Lee, Yong Doo;Do, Myung Sik
    • Journal of Korean Society of Transportation / v.30 no.6 / pp.93-102 / 2012
  • The frequency and severity of traffic accidents at signalized intersections in urban areas are greater than those on arterial segments and at crosswalks. In particular, accidents involving injuries and fatalities have been caused by traffic signal violations within intersections. Therefore, many countries, including Korea, have installed red light enforcement (RLE) cameras to reduce traffic accidents associated with signal violations. Meanwhile, many methodologies have been studied for estimating the safety impacts of red light enforcement, although such estimation is not easy to conduct. In this study, safety impacts were estimated for intersections in the downtown Chicago area using safety performance function (SPF) models and the empirical Bayes (EB) approach. As a result, for all crash types and for target crash types such as "angle", "rear end", "sideswipe in the same and other directions", "turn", and "head on", fatal crashes were reduced by 26% and 38%, respectively. However, RLE may increase property-damage-only crashes by 3.23% and 1.16%, respectively.
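
The abstract does not give the formulas used, so the following is only a simplified sketch of how an SPF prediction and an observed crash count are commonly combined in an empirical Bayes before-after evaluation. It assumes the widely used weighting w = 1 / (1 + k·μ), where μ is the SPF-predicted crash count and k is the negative binomial overdispersion parameter. The function names are hypothetical, and a full evaluation would also adjust for traffic volume and period length.

```python
def eb_expected_crashes(observed, spf_predicted, overdispersion):
    """Empirical Bayes estimate of expected crashes at a site.

    Shrinks the observed count toward the SPF prediction; the weight on
    the SPF prediction falls as overdispersion * spf_predicted grows.
    """
    weight = 1.0 / (1.0 + overdispersion * spf_predicted)
    return weight * spf_predicted + (1.0 - weight) * observed

def percent_change(expected_without_treatment, observed_after):
    """Percent change in crashes relative to the no-treatment expectation."""
    return 100.0 * (observed_after - expected_without_treatment) / expected_without_treatment
```

For instance, with 10 observed crashes, an SPF prediction of 5, and an overdispersion of 0.2, the weight is 0.5 and the EB estimate is 7.5 crashes. A negative `percent_change` indicates a reduction after camera installation.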

Allometric equation for estimating aboveground biomass of Acacia-Commiphora forest, southern Ethiopia

  • Wondimagegn Amanuel;Chala Tadesse;Moges Molla;Desalegn Getinet;Zenebe Mekonnen
    • Journal of Ecology and Environment / v.48 no.2 / pp.196-206 / 2024
  • Background: Most biomass equations were developed using sample trees collected mainly from pan-tropical and tropical regions and may therefore over- or underestimate biomass. Site-specific models would improve the accuracy of biomass estimates and enhance the country's measurement, reporting, and verification activities. The aim of the study is to develop site-specific biomass estimation models and to validate and evaluate both the existing generic models developed for pan-tropical forests and the newly developed allometric models. A total of 140 trees, drawn from each diameter class, were harvested for biomass model development. Data were analyzed using SAS procedures. All relevant statistical tests (normality, multicollinearity, and heteroscedasticity) were performed. Data were log-transformed, and multiple linear regression techniques were used to develop models to estimate aboveground biomass (AGB). The root mean square error (RMSE) was used to measure model bias, precision, and accuracy. The coefficient of determination ($R^2$ and adjusted [adj]-$R^2$), the Akaike Information Criterion (AIC), and the Schwarz Bayesian Information Criterion were employed to select the most appropriate models. Results: For the general total AGB models, adj-$R^2$ ranged from 0.71 to 0.85, and model 9, with diameter at stump height at 10 cm (DSH10), ρ, and crown width (CW) as predictor variables, performed best according to RMSE and AIC. For the merchantable stem models, adj-$R^2$ varied from 0.73 to 0.82, and model 8, with a combination of ρ, diameter at breast height and height (H), CW, and DSH10 as predictor variables, was best in terms of RMSE and AIC. The results showed that a best-fit model for the aboveground biomass of tree components was developed.
$\mathrm{AGB}_{\mathrm{Stem}} = \exp\{-1.8296 + 0.4814 \ln(\rho D^2 H) + 0.1751 \ln(\mathrm{CW}) + 0.4059 \ln(\mathrm{DSH}_{30})\}$
$\mathrm{AGB}_{\mathrm{Branch}} = \exp\{-131.6 + 15.0013 \ln(\rho D^2 H) + 13.176 \ln(\mathrm{CW}) + 21.8506 \ln(\mathrm{DSH}_{30})\}$
$\mathrm{AGB}_{\mathrm{Foliage}} = \exp\{-0.9496 + 0.5282 \ln(\mathrm{DSH}_{30}) + 2.3492 \ln(\rho) + 0.4286 \ln(\mathrm{CW})\}$
$\mathrm{AGB}_{\mathrm{Total}} = \exp\{-1.8245 + 1.4358 \ln(\mathrm{DSH}_{30}) + 1.9921 \ln(\rho) + 0.6154 \ln(\mathrm{CW})\}$
Conclusions: The results demonstrated that the development of local models derived from an appropriate sample of representative species can greatly improve the estimation of total AGB.
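
As a worked illustration, the stem and total AGB equations above can be evaluated directly. The coefficients are copied from the abstract; the function and parameter names are hypothetical. The abstract does not state measurement units, so inputs must use whatever units the original study used.

```python
import math

def agb_total(dsh30, wood_density, crown_width):
    """Total aboveground biomass from the AGBTotal equation in the abstract.

    dsh30: diameter at stump height (DSH30); wood_density: rho;
    crown_width: CW. All inputs must be positive.
    """
    return math.exp(
        -1.8245
        + 1.4358 * math.log(dsh30)
        + 1.9921 * math.log(wood_density)
        + 0.6154 * math.log(crown_width)
    )

def agb_stem(dsh30, wood_density, dbh, height, crown_width):
    """Stem biomass from the AGBStem equation, using the compound term rho * D^2 * H."""
    return math.exp(
        -1.8296
        + 0.4814 * math.log(wood_density * dbh ** 2 * height)
        + 0.1751 * math.log(crown_width)
        + 0.4059 * math.log(dsh30)
    )
```

Because the models are linear in the logarithms, each coefficient acts as an elasticity: doubling DSH30 multiplies the predicted total AGB by 2^1.4358, about 2.7, with the other predictors held fixed.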