• Title/Summary/Keyword: Mixture regression model

A Bayesian Method for Narrowing the Scope of Variable Selection in Binary Response Logistic Regression

  • Kim, Hea-Jung;Lee, Ae-Kyung
    • Journal of Korean Society for Quality Management / v.26 no.1 / pp.143-160 / 1998
  • This article is concerned with the selection of subsets of predictor variables to be included in building the binary response logistic regression model. It is based on a Bayesian approach, intended to propose and develop a procedure that uses probabilistic considerations for selecting promising subsets. This procedure reformulates the logistic regression setup as a hierarchical normal mixture model by introducing a set of hyperparameters that are used to identify subset choices. The reformulation relies on the fact that the cdf of the logistic distribution is approximately equal to that of the $t_{(8)}/0.634$ distribution. The appropriate posterior probability of each subset of predictor variables is obtained by the Gibbs sampler, which samples indirectly from the multinomial posterior distribution on the set of possible subset choices. Thus, in this procedure, the most promising subset of predictors can be identified as the one with the highest posterior probability. To highlight the merit of this procedure, a couple of illustrative numerical examples are given.

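
The approximation mentioned in the abstract above can be checked numerically. The following is a minimal sketch, not the authors' procedure: it compares the logistic cdf with the cdf of $T/0.634$, where $T$ follows a $t$ distribution with 8 degrees of freedom. The function names (`t_pdf`, `t_cdf`, `scaled_t8_cdf`, `max_abs_gap`) and the Simpson's-rule integration are illustrative assumptions, chosen so that only the standard library is needed.

```python
import math

def t_pdf(t, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu, steps=2000):
    """CDF of Student's t: composite Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1 / (1 + math.exp(-x))

def scaled_t8_cdf(x, scale=0.634):
    """CDF of T / scale with T ~ t_8, using P(T / scale <= x) = P(T <= scale * x)."""
    return t_cdf(scale * x, 8)

def max_abs_gap(lo=-8.0, hi=8.0, n=161):
    """Largest absolute difference between the two cdfs on an evenly spaced grid."""
    xs = [lo + i * (hi - lo) / (n - 1) for i in range(n)]
    return max(abs(logistic_cdf(x) - scaled_t8_cdf(x)) for x in xs)

if __name__ == "__main__":
    print(f"max |logistic - t8/0.634| on [-8, 8]: {max_abs_gap():.4f}")
```

Because a $t$ variate is a scale mixture of normals, this closeness is what allows a logistic model to be recast as a hierarchical normal mixture of the kind the abstract describes.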

Statistical Methods to Control Response Bias in Nursing Activity Surveys (간호활동시간 조사 시 응답편이 통제를 위한 통계적 접근 방안)

  • Lim, Ji-Young;Park, Chang-Gi
    • Journal of Korean Academy of Nursing / v.42 no.1 / pp.48-55 / 2012
  • Purpose: The aim of this study was to compare statistical methods to control response bias in nursing activity surveys. Methods: Data were collected at a medical unit of a general hospital. The number of nursing activities and consumed activity time were measured using self-report questionnaires. Descriptive statistics were used to identify general characteristics of the units. Average, Z-standardization, gamma regression, finite mixture model, and stochastic frontier model were adopted to estimate true activity time controlling for response bias. Results: The nursing activity time data were highly skewed and had non-normal distributions. Among the 4 different methods, only gamma regression and the stochastic frontier model controlled response bias effectively, and the estimated total nursing activity time did not exceed total work time. However, in gamma regression, the estimated total nursing activity time was too small to use in real clinical settings. Thus, the stochastic frontier model was the most appropriate method to control response bias when compared with the other methods. Conclusion: According to these results, we recommend the use of a stochastic frontier model to estimate true nursing activity time when using self-report surveys.

Optimization of Surfactant Mixture Composition for Cleansing Using Mixture Experiment Design (혼합물 실험 계획법을 활용한 세정용 계면활성제 혼합물 조성의 최적화)

  • Song, Maria;Jin, Byung Suk
    • Applied Chemistry for Engineering / v.32 no.5 / pp.574-580 / 2021
  • The main goal of this study was to find an optimal surfactant mixture composition for the development of the best performing cleansing products. Three surfactants, sodium cocoyl alaninate (SCoA), cocamidopropyl betaine (CPB), and decyl glucoside (DG), were selected because they showed excellent detergency, foaming height, and contamination rate in preliminary experiments. Experiments on the surfactant mixtures were performed according to a simplex-centroid design matrix, and regression analysis was conducted on the experimental data. Response surface model equations that were statistically significant (p < 0.05) were obtained. The optimal composition of the surfactant mixture was determined to be SCoA (0.22), CPB (0.78), and DG (0.00) from simultaneous optimization of the three response variables.
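
A simplex-centroid design for q components consists of every equal-proportion blend of each non-empty subset of the components, giving 2^q - 1 runs. The sketch below enumerates those runs for the three surfactants named in the abstract. It is an illustration under stated assumptions: the function name `simplex_centroid` is hypothetical, and the abstract does not say whether the authors added replicates or extra check points.

```python
from fractions import Fraction
from itertools import combinations

def simplex_centroid(components):
    """Return the 2^q - 1 blends of a simplex-centroid design.

    Each run is a tuple of proportions, one per component, summing to 1.
    """
    q = len(components)
    design = []
    for k in range(1, q + 1):                  # subset size: pure, binary, ...
        for subset in combinations(range(q), k):
            share = Fraction(1, k)             # equal share within the subset
            design.append(tuple(share if i in subset else Fraction(0)
                                for i in range(q)))
    return design

components = ["SCoA", "CPB", "DG"]             # names taken from the abstract
design = simplex_centroid(components)

if __name__ == "__main__":
    print(" / ".join(components))
    for run in design:
        print(" / ".join(str(x) for x in run))
```

For three components this yields seven runs: three pure components, three 50:50 binary blends, and the overall centroid.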

Gas detonation cell width prediction model based on support vector regression

  • Yu, Jiyang;Hou, Bingxu;Lelyakin, Alexander;Xu, Zhanjie;Jordan, Thomas
    • Nuclear Engineering and Technology / v.49 no.7 / pp.1423-1430 / 2017
  • Detonation cell width is an important parameter in hydrogen explosion assessments. The experimental data on gas detonation are statistically analyzed to establish a universal method to numerically predict detonation cell widths. It is commonly understood that detonation cell width, $\lambda$, is highly correlated with the characteristic reaction zone width, $\delta$. Classical parametric regression methods were widely applied in earlier research to build an explicit semiempirical correlation for the ratio $\lambda/\delta$. The obtained correlations formulate the dependency of the ratio $\lambda/\delta$ on a dimensionless effective chemical activation energy and a dimensionless temperature of the gas mixture. In this paper, support vector regression (SVR), which is based on nonparametric machine learning, is applied to obtain functions that fit the experimental data better and predict more accurately. Furthermore, a third parameter, dimensionless pressure, is considered as an additional independent variable. It is found that three-parameter SVR can significantly improve the performance of the fitting function. SVR also provides better adaptability, and the model functions can be easily renewed when the experimental database is updated or new regression parameters are considered.

Bayes Prediction Density in Linear Models

  • Kim, S.H.
    • Communications for Statistical Applications and Methods / v.8 no.3 / pp.797-803 / 2001
  • This paper obtains the Bayes prediction density for the spatial linear model with a non-informative prior. It shows that predictive inference is completely unaffected by departures from the normality assumption in the direction of the elliptical family, and that the structure of the prediction density is unchanged when there is more than one additional future observation.


Enhancing prediction accuracy of concrete compressive strength using stacking ensemble machine learning

  • Yunpeng Zhao;Dimitrios Goulias;Setare Saremi
    • Computers and Concrete / v.32 no.3 / pp.233-246 / 2023
  • Accurate prediction of concrete compressive strength can minimize the need for extensive, time-consuming, and costly mixture optimization testing and analysis. This study attempts to enhance the prediction accuracy of compressive strength using stacking ensemble machine learning (ML) with feature engineering techniques. Seven alternative ML models of increasing complexity were implemented and compared: linear regression, SVM, decision tree, multilayer perceptron, random forest, XGBoost, and AdaBoost. To further improve the prediction accuracy, an ML pipeline was proposed in which the feature engineering technique was implemented and a two-layer stacked model was developed. The k-fold cross-validation approach was employed to optimize model parameters and train the stacked model. The stacked model showed superior performance in predicting concrete compressive strength, with a coefficient of determination ($R^2$) of 0.985. Feature (i.e., variable) importance was determined to demonstrate how useful the synthetic features are in prediction and to provide better interpretability of the data and the model. The methodology in this study promotes a more thorough assessment of alternative ML algorithms rather than a focus on any single ML model type for concrete compressive strength prediction.

Convergence Study on the Optimization for Suppression of Starch Hydrolysis using Rutin, Quercetin and Dietary Fiber Mixture Design (루틴, 퀘르세틴, 식이섬유 혼합물 설계를 이용한 전분소화 지연 효과의 최적화에 대한 융합 연구)

  • Oh, Imkyung;Bae, In Young
    • Journal of the Korea Convergence Society / v.11 no.5 / pp.35-41 / 2020
  • This study was conducted to develop an efficient system for suppressing starch hydrolysis using rutin, quercetin, and dietary fiber through a statistical mixture design. The three components replaced wheat flour at the level of 10%, and the mixed gel with the three components was characterized by in vitro starch digestion. The mixture design followed a simplex-centroid experimental model. The quadratic model ($R^2 = 0.86$) fitted well, and the obtained regression equation indicated that a significant positive effect was observed for the quercetin and fiber mixture. Based on the statistical results, the best mixing ratio of quercetin to fiber was 72:28, which led to the lowest predicted glycemic index (pGI). Their interactions on the pGI of starch digestibility were clearly visualized in the 3D surface plot. These results suggest that the mixture of quercetin and fiber interacts strongly with wheat flour, consequently retarding starch hydrolysis by 15%.
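
For reference, a quadratic model for a three-component mixture is commonly written in Scheffé's canonical form. The abstract does not state the exact parameterization used, so the form below is an assumption, with $x_1$, $x_2$, $x_3$ standing for the proportions of rutin, quercetin, and dietary fiber within the blend and $\hat{y}$ for the predicted response (here, pGI):

$$\hat{y} = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3, \qquad x_1 + x_2 + x_3 = 1, \quad x_i \ge 0.$$

The model has no intercept because the proportions sum to one. The cross-product coefficients $\beta_{ij}$ capture blending effects between pairs of components, which is where an interaction such as the quercetin and fiber effect reported above would appear.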

Firework Plot as a Graphical Exploratory Data Analysis Tool to Evaluate the Impact of Outliers in a Mixture Experiment (혼합물 실험에서 특이값의 영향을 평가하기 위한 그래픽 탐색적 자료분석 도구로서의 불꽃그림)

  • Jang, Dae-Heung;Ahn, SoJin;Kim, Youngil
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.629-643 / 2014
  • It is common to check the validity of an assumed model with heavy use of diagnostic tools when conducting data analysis with regression techniques; however, outliers and influential data points often distort the regression output in an undesired manner. Jang and Anderson-Cook (2013) proposed a graphical method called a firework plot for exploratory analysis that visualizes the trace of the impact of possible outlying and/or influential data points on individual regression coefficients and on the overall residual sum of squares (SSE). They developed a 3-D plot as well as a pairwise plot for the appropriate measures of interest. In this paper, the approach is extended further to demonstrate its strength; in addition, a more meaningful interpretation is made possible by adding a measure not mentioned in their paper. The approach is applied to a mixture experiment because a detailed analysis of the sensitivity of statistical measures is required in a small experiment.

Mixture Optimization of Hwangdo Peach (Prunus persica L. Batsch) Dressing by Mixture Experimental Design (혼합물 실험계획법에 의한 황도복숭아 드레싱 재료혼합비의 최적화)

  • Park, Jung Eun;Kim, Yong-Sik
    • Culinary Science and Hospitality Research / v.23 no.7 / pp.20-30 / 2017
  • This study was conducted to optimize the ingredients of a salad dressing made with Hwangdo peach (Prunus persica L. Batsch). The experiment followed a D-optimal mixture design, which included 14 experimental points with 4 replicates for three independent variables (olive oil 40~65%, peach puree 27~50%, vinegar 8~20%). The linear regression models for pH, viscosity, and color value, and the quadratic regression models for emulsion stability and all sensory evaluation items, were shown to be valid by the F-test for the overall significance of the regression model at the 5% level. Viscosity and pH of the products increased with olive oil content. Color value, viscosity, and pH of the products increased with peach puree content. The pH, viscosity, redness, and yellowness of the products decreased with vinegar content. Sensory evaluation showed that overall preference for the products rose as ingredient contents increased and then declined once they exceeded the optimum levels. Consequently, according to the results of the first stage of the experiment, the optimum ratios of the raw materials were set at olive oil 52.43%, peach puree 35.07%, and vinegar 13.91% for ingredients of apricot dressing. These results indicate that peach can be used to prepare a dressing and provide baseline data for the development of new dressings. This is also presumed to meet the demands of customers who are always in pursuit of new products.

Quantitative Analysis of Indomethacin by the Portable Near-Infrared (NIR) System (근적외분광분석법을 이용한 인도메타신의 정량분석)

  • 김도형;우영아;김효진
    • YAKHAK HOEJI / v.47 no.5 / pp.261-265 / 2003
  • A near-infrared (NIR) system was used for the rapid and simple determination of indomethacin in buffer solution for dissolution testing of tablets and capsules. Indomethacin standards were prepared at concentrations from 10 to 50 ppm using a mixture of phosphate buffer (pH 7.2) and water (1:4). The NIR transmittance spectra of the indomethacin standard solutions were collected using quartz cells with 1 mm and 2 mm pathlengths. Partial least squares regression (PLSR) was used to develop calibration models over the spectral range 1100 to 1700 nm. The model using the 1 mm quartz cell was better than that using the 2 mm quartz cell. The PLSR models developed gave a standard error of prediction (SEP) of 0.858 ppm. To validate the developed calibration model, routine analysis was performed using other standard solutions. The NIR routine analysis showed good correlation with the actual values. The SEP was 1.414 ppm for 7 indomethacin samples in routine analysis, and this error was permissible under the regulation of the Korean Pharmacopoeia (VII). These results show the potential of real-time monitoring of indomethacin during a dissolution test.