• Title/Abstract/Keywords: statistical regression modeling

Search results: 192 (processing time: 0.021 s)

MARS Modeling for Ordinal Categorical Response Data: A Case Study

  • Kim, Ji-Hyun
    • Communications for Statistical Applications and Methods / Vol. 7, No. 3 / pp. 711-720 / 2000
  • This case study models ordinal categorical response data with the MARS method. It analyzes the effects of personal characteristics and socioeconomic status on teenage marijuana use. The MARS method gave new insight into the data set.


Computing the Repurchase Index Based on Statistical Modeling

  • Bae, Wha-Soo;Jung, Woo-Seok;Lee, Young-Bae
    • 응용통계연구 (The Korean Journal of Applied Statistics) / Vol. 23, No. 4 / pp. 739-745 / 2010
  • This paper computes a repurchase index based on statistical modeling. Using the transaction records of a certain product, the repurchase index is obtained by fitting a Poisson regression model. Customers are classified into five groups based on the index, which indicates their propensity to repurchase.
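
The Poisson regression step can be illustrated with a minimal sketch. The abstract does not define the index, so the code below assumes, purely for illustration, that the index is the fitted expected purchase count and that the five groups are quintiles of that index. The covariate `recency`, the simulated data, and the function name `fit_poisson` are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Fit a log-link Poisson regression by Newton-Raphson.

    Assumes column 0 of X is an intercept column of ones.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())               # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                # fitted means
        grad = X.T @ (y - mu)                # score vector
        hess = X.T @ (X * mu[:, None])       # Fisher information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical transaction data: one covariate per customer.
rng = np.random.default_rng(0)
n = 500
recency = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), recency])
purchases = rng.poisson(np.exp(X @ np.array([0.5, -1.0])))

beta_hat = fit_poisson(X, purchases)

# Assumed index: the fitted expected number of purchases.
index = np.exp(X @ beta_hat)

# Assumed grouping: quintiles of the index, labelled 1 (lowest) to 5 (highest).
cuts = np.quantile(index, [0.2, 0.4, 0.6, 0.8])
group = np.digitize(index, cuts) + 1
```

Quintile cut points are only one possible grouping rule; the paper may use different thresholds.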

A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods / Vol. 24, No. 6 / pp. 543-559 / 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

An Algorithm for Hannan and Rissanen's ARMA Modeling Method

  • Chul Eung Kim;Byoung Seon Choi
    • Communications for Statistical Applications and Methods / Vol. 2, No. 2 / pp. 85-93 / 1995
  • Hannan and Rissanen proposed an innovation regression method of ARMA modeling, which is composed of three stages. Its second stage chooses the orders of the ARMA model using the BIC, which requires heavy computation because several regression models must be estimated. We present a simple and efficient algorithm for the second stage that uses a special property of triangular Toeplitz matrices.

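
For orientation, the first two stages of the Hannan and Rissanen method described in the abstract above can be sketched by brute force. This is not the paper's Toeplitz-based algorithm; it simply refits every candidate regression, which is the cost the paper aims to reduce. The fixed long-AR order of 20, the order limits, and all function names are assumptions made for illustration.

```python
import numpy as np

def lagmat(x, lags, start):
    """Matrix with columns x[t-1], ..., x[t-lags] for t = start, ..., len(x)-1."""
    cols = [x[start - k:len(x) - k] for k in range(1, lags + 1)]
    return np.column_stack(cols) if cols else np.empty((len(x) - start, 0))

def hannan_rissanen_orders(x, max_p=3, max_q=3, long_ar=20):
    """Pick ARMA orders (p, q) by minimizing a BIC over innovation regressions."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)

    # Stage 1: fit a long autoregression by least squares and keep its
    # residuals as estimates of the innovations.
    A = lagmat(x, long_ar, long_ar)
    phi, *_ = np.linalg.lstsq(A, x[long_ar:], rcond=None)
    e = np.zeros(n)
    e[long_ar:] = x[long_ar:] - A @ phi

    # Stage 2: for each (p, q), regress x_t on p lags of x and q lags of
    # the estimated innovations, then compare BIC values.
    start = long_ar + max(max_p, max_q)
    y = x[start:]
    m = len(y)
    best = None
    for p in range(max_p + 1):
        for q in range(max_q + 1):
            Z = np.hstack([lagmat(x, p, start), lagmat(e, q, start)])
            if Z.shape[1] > 0:
                coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
                resid = y - Z @ coef
            else:
                resid = y
            sigma2 = resid @ resid / m
            bic = np.log(sigma2) + (p + q) * np.log(m) / m
            if best is None or bic < best[0]:
                best = (bic, p, q)
    return best[1], best[2]

# Hypothetical example: a simulated AR(1) series with coefficient 0.7.
rng = np.random.default_rng(1)
eps = rng.standard_normal(2000)
series = np.zeros(2000)
for t in range(1, 2000):
    series[t] = 0.7 * series[t - 1] + eps[t]

p_hat, q_hat = hannan_rissanen_orders(series)
```

The double loop solves (max_p + 1)(max_q + 1) separate least-squares problems, which makes the computational burden mentioned in the abstract concrete.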

Bayesian Analysis for a Functional Regression Model with Truncated Errors in Variables

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society / Vol. 31, No. 1 / pp. 77-91 / 2002
  • This paper considers a functional regression model with truncated errors in explanatory variables. We show that the ordinary least squares (OLS) estimators produce bias in regression parameter estimates under misspecified models with ignored errors in the explanatory variable measurements, and then propose methods for analyzing the functional model. Fully parametric frequentist approaches for analyzing the model are intractable and thus Bayesian methods are pursued using a Markov chain Monte Carlo (MCMC) sampling based approach. Necessary theories involved in modeling and computation are provided. Finally, a simulation study is given to illustrate and examine the proposed methods.

Quantitative Analysis for Plasma Etch Modeling Using Optical Emission Spectroscopy: Prediction of Plasma Etch Responses

  • Jeong, Young-Seon;Hwang, Sangheum;Ko, Young-Don
    • Industrial Engineering and Management Systems / Vol. 14, No. 4 / pp. 392-400 / 2015
  • Monitoring of plasma etch processes for fault detection is one of the hallmark procedures in semiconductor manufacturing. Optical emission spectroscopy (OES) has been considered a gold standard for modeling plasma etching processes for on-line diagnosis and monitoring. However, statistical quantitative methods for processing OES data are still lacking, and a method that can handle high-dimensional OES data is urgently needed to improve the quality of etched wafers. We therefore propose a robust relevance vector machine (RRVM) for regression with statistical quantitative features to model etch rate and uniformity in plasma etch processes using OES data. To deal effectively with the complexity of OES data, we identify seven statistical features to extract from the raw OES data, which reduces the data dimensionality. The experimental results demonstrate that the proposed approach is more suitable for high-accuracy monitoring of plasma etch responses obtained from OES.

Semiparametric Bayesian Estimation under Structural Measurement Error Model

  • Hwang, Jin-Seub;Kim, Dal-Ho
    • Communications for Statistical Applications and Methods / Vol. 17, No. 4 / pp. 551-560 / 2010
  • This paper considers a Bayesian approach to modeling a flexible regression function under a structural measurement error model. The regression function is modeled by semiparametric regression with penalized splines. Model fitting and parameter estimation are carried out in a hierarchical Bayesian framework using Markov chain Monte Carlo methodology. The performance of the resulting estimators is compared with that of the estimators under a structural measurement error model without a semiparametric component.

Modeling of Indium Tin Oxide (ITO) Film Deposition Process Using Neural Network (신경회로망을 이용한 ITO 박막 성장 공정의 모형화)

  • 민철홍;박성진;윤능구;김태선
    • 한국전기전자재료학회논문지 (Journal of the Korean Institute of Electrical and Electronic Material Engineers) / Vol. 22, No. 9 / pp. 741-746 / 2009
  • Compared with conventional indium tin oxide (ITO) film deposition methods, the cesium-assisted sputtering method has shown superior electrical, mechanical, and optical film properties. However, the method is not easy to use, because ITO film properties are very sensitive to the condition of the cesium-assisted equipment and the underlying mechanism is not yet clearly defined physically or mathematically. An accurate and reliable process model is therefore essential for optimizing the characteristics of deposited ITO films. In this work, we developed an ITO film deposition process model using neural networks and design of experiments (DOE). The model's predictions were compared with those of a conventional statistical regression model, and the neural process model showed superior prediction results for ITO film thickness, sheet resistance, and transmittance.

A Bayesian joint model for continuous and zero-inflated count data in developmental toxicity studies

  • Hwang, Beom Seuk
    • Communications for Statistical Applications and Methods / Vol. 29, No. 2 / pp. 239-250 / 2022
  • In many applications, we encounter multiple correlated outcomes measured on the same subject. Joint modeling of such outcomes can improve the efficiency of inference compared with modeling them independently. For instance, in developmental toxicity studies, fetal weight and the number of malformed pups are measured on pregnant dams exposed to different levels of a toxic substance, and the association between these outcomes should be taken into account in the model. The number of malformations may contain many zeros, which should be analyzed with zero-inflated count models. Motivated by applications in developmental toxicity studies, we propose a Bayesian joint modeling framework for continuous and count outcomes with excess zeros. In our model, a zero-inflated Poisson (ZIP) regression model describes the count data, and subject-specific random effects account for the correlation between the two outcomes. We implement a Bayesian approach using an MCMC procedure with data augmentation and adaptive rejection sampling. We apply the proposed model to dose-response analysis in a developmental toxicity study to estimate the benchmark dose in a risk assessment.
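
As a brief illustration of the zero-inflated Poisson component only, the standard ZIP probability mass function can be written as a short sketch. The function name `zip_pmf` and the parameter names `lam` (Poisson mean) and `pi` (probability of a structural zero) are illustrative choices; the paper's joint model, random effects, and MCMC sampler are not reproduced here.

```python
import math

def zip_pmf(y, lam, pi):
    """P(Y = y) under a zero-inflated Poisson distribution.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi * (y == 0) + (1 - pi) * poisson

# In a ZIP regression, lam would typically depend on covariates through a
# log link, e.g. lam = exp(x' beta); here fixed values are used instead.
probs = [zip_pmf(y, lam=2.0, pi=0.3) for y in range(60)]
zip_mean = sum(y * p for y, p in enumerate(probs))
```

The mean of this distribution is (1 - pi) * lam, which is lower than the Poisson mean because of the additional zeros.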

Modeling clustered count data with discrete weibull regression model

  • Yoo, Hanna
    • Communications for Statistical Applications and Methods / Vol. 29, No. 4 / pp. 413-420 / 2022
  • In this study we adapt the discrete Weibull regression model to clustered count data. The discrete Weibull regression model has the attractive feature that it can handle both under-dispersed and over-dispersed data. We analyzed the eighth Korean National Health and Nutrition Examination Survey (KNHANES VIII) from 2019 to assess the factors influencing the 1-month outpatient stay in 17 different regions. We compared the results of the clustered discrete Weibull regression model with those of the Poisson, negative binomial, generalized Poisson, and Conway-Maxwell Poisson regression models, which are widely used in count data analyses. The results show that the clustered discrete Weibull regression model with a random intercept gives the best fit. A simulation study is also conducted to investigate the performance of the clustered discrete Weibull model under various dispersion settings and zero-inflation probabilities. The paper shows that a discrete Weibull regression with a random effect can flexibly model count data with various degrees of dispersion without the risk of making wrong assumptions about the data dispersion.
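
The claim that the discrete Weibull distribution covers both under-dispersion and over-dispersion can be checked numerically with a small sketch. It uses the common type I parameterization, P(Y = y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ..., which is assumed here because the abstract does not state the parameterization. The function names and parameter values are illustrative, and no clustering or regression structure is included.

```python
import numpy as np

def dweibull_pmf(y, q, beta):
    """Type I discrete Weibull pmf with 0 < q < 1 and beta > 0."""
    y = np.asarray(y, dtype=float)
    return q ** (y ** beta) - q ** ((y + 1) ** beta)

def mean_var(q, beta, upper=1000):
    """Mean and variance computed by summing the pmf over 0, ..., upper - 1."""
    y = np.arange(upper)
    p = dweibull_pmf(y, q, beta)
    mean = np.sum(y * p)
    return mean, np.sum(y ** 2 * p) - mean ** 2

# beta = 1 reduces to a geometric distribution (variance above the mean);
# a larger beta shortens the tail and can push the variance below the mean.
over_mean, over_var = mean_var(q=0.5, beta=1.0)
under_mean, under_var = mean_var(q=0.9, beta=3.0)
```

Comparing each variance with its mean shows the two regimes: the first setting is over-dispersed and the second is under-dispersed relative to a Poisson distribution with the same mean.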