• Title/Abstract/Keywords: Multivariate Regression Model

Search results: 412 items; processing time: 0.025 seconds

Use of partial least squares analysis in concrete technology

  • Tutmez, Bulent
    • Computers and Concrete
    • /
    • Vol. 13, No. 2
    • /
    • pp.173-185
    • /
    • 2014
  • Multivariate analysis is a statistical technique that investigates the relationship between multiple predictor variables and a response variable, and it is very commonly used in the cement and concrete industry. During the model-building stage, however, many predictor variables are included in the model, and possible collinearity among these predictors is generally ignored. This study investigates the use of partial least squares (PLS) analysis for evaluating the relationships among cement and concrete properties. This regression method is known to reduce model complexity by reducing the number of predictor variables while still yielding accurate and reliable predictions. The experimental studies showed that the method can be used effectively in multivariate problems in the cement and concrete industry.

A GEE approach for the semiparametric accelerated lifetime model with multivariate interval-censored data

  • Maru Kim;Sangbum Choi
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 30, No. 4
    • /
    • pp.389-402
    • /
    • 2023
  • Multivariate or clustered failure time data often occur in medical, epidemiological, and socio-economic studies when survival data are collected from several research centers. If the data are observed periodically, as in a longitudinal study, survival times are often subject to various types of interval-censoring, creating multivariate interval-censored data. The event times of interest may then be correlated among individuals who come from the same cluster. In this article, we propose a unified linear regression method for analyzing multivariate interval-censored data. We consider a semiparametric multivariate accelerated failure time model as a statistical analysis tool and develop a generalized Buckley-James method that makes inferences by imputing interval-censored observations with their conditional mean values. Since the study population consists of several heterogeneous clusters, where subjects in the same cluster may be related, we propose a generalized estimating equations approach to accommodate potential dependence within clusters. Our simulation results confirm that the proposed estimator is robust to misspecification of the working covariance matrix and that statistical efficiency can increase when the working covariance structure is close to the truth. The proposed method is applied to a dataset from a diabetic retinopathy study.

다중선형회귀모형에서의 변수선택기법 평가 (Evaluating Variable Selection Techniques for Multivariate Linear Regression)

  • 류나현;김형석;강필성
    • 대한산업공학회지
    • /
    • Vol. 42, No. 5
    • /
    • pp.314-326
    • /
    • 2016
  • The purpose of variable selection techniques is to select a subset of relevant variables for a particular learning algorithm in order to improve both the accuracy and the efficiency of the prediction model. We conduct an empirical analysis to evaluate and compare seven well-known variable selection techniques for the multiple linear regression model, which is one of the most commonly used regression models in practice. The variable selection techniques we apply are forward selection, backward elimination, stepwise selection, genetic algorithm (GA), ridge regression, lasso (Least Absolute Shrinkage and Selection Operator), and elastic net. Based on experiments with 49 regression data sets, we find that GA resulted in the lowest error rates, while lasso reduced the number of variables most significantly. In terms of computational efficiency, forward selection, backward elimination, and lasso require less time than the other techniques.
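
The abstract above notes that lasso removes the most variables. As a minimal sketch of why that happens (not the authors' experimental code), the following pure-Python coordinate descent minimises (1/2n)·||y − Xb||² + λ·||b||₁. The function names, the toy data, and the choice λ = 0.5 are assumptions made purely for illustration.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values inside [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1.

    X is a list of rows and y a list of responses; no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Average product of column j with the residual that leaves column j out.
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            # Mean squared value of column j (assumed non-zero).
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam) / z
    return b


# Hypothetical toy data: the response depends only on the first column.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
y = [2.0, 4.0, 6.0, 8.0]
coef = lasso_coordinate_descent(X, y, lam=0.5)
print(coef)
```

On this toy data the second coefficient is set exactly to zero, while the first is shrunk slightly below its least-squares value of 2. The soft-thresholding step is what produces exact zeros, which is how lasso performs variable selection.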

On a Bayesian Estimation of Multivariate Regression Models with Constrained Coefficient Matrix

  • Kim, Hea-Jung
    • 품질경영학회지
    • /
    • Vol. 26, No. 4
    • /
    • pp.151-165
    • /
    • 1998
  • Consider the linear multivariate regression model $Y=X_1B_1+X_2B_2+U$, where $\mathrm{Vec}(U) \sim N(0, \Sigma \otimes I_N)$. This paper is concerned with Bayesian inference for the model when the elements of $B_2$ are suspected to be constrained to intervals. The use of the Gibbs sampler to calculate the Bayesian marginal posterior densities of the parameters under a generalized conjugate prior is developed. It is shown that the approach is straightforward to specify distributionally and to implement computationally, with output readily adapted for the required inference summaries. The method is applied to a real problem.
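
As a reading aid for the error assumption in the abstract above (a standard property of Kronecker-structured normal covariances, not a result taken from the paper), suppose $U$ is $N \times p$ with entries $u_{ij}$, Vec stacks the columns of $U$, and $\Sigma = (\sigma_{jl})$ is $p \times p$. The dimension symbol $p$ and the entry notation are assumptions introduced here. Under those assumptions the condition says that the rows of $U$ are independent and share the covariance $\Sigma$:

$$
\mathrm{Vec}(U) \sim N(0,\ \Sigma \otimes I_N)
\iff
\mathrm{Cov}(u_{ij}, u_{kl}) = \sigma_{jl}\,\delta_{ik}
\iff
u_{1\cdot}, \dots, u_{N\cdot} \overset{\text{iid}}{\sim} N_p(0, \Sigma),
$$

where $u_{i\cdot}$ denotes the $i$-th row of $U$, $\delta_{ik}$ is the Kronecker delta, and all three statements take $U$ to be jointly normal with mean zero.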

Use of GIS to Develop a Multivariate Habitat Model for the Leopard Cat (Prionailurus bengalensis) in Mountainous Region of Korea

  • Rho, Paik-Ho
    • Journal of Ecology and Environment
    • /
    • Vol. 32, No. 4
    • /
    • pp.229-236
    • /
    • 2009
  • A habitat model was developed to delineate potential habitat of the leopard cat (Prionailurus bengalensis) in a mountainous region of Kangwon Province, Korea. Between 1997 and 2005, 224 leopard cat presence sites were recorded in the province in the Nationwide Survey on Natural Environments. Fifty percent of the sites were used to develop a habitat model, and the remaining sites were used to test the model. Fourteen environmental variables related to topographic features, water resources, vegetation and human disturbance were quantified for 112 of the leopard cat presence sites and an equal number of randomly selected sites. Statistical analyses (e.g., t-tests, and Pearson correlation analysis) showed that elevation, ridges, plains, % water cover, distance to water source, vegetated area, deciduous forest, coniferous forest, and distance to paved road differed significantly (P < 0.01) between presence and random sites. Stepwise logistic regression was used to develop a habitat model. Landform type (e.g., ridges vs. plains) is the major topographic factor affecting leopard cat presence. The species also appears to prefer deciduous forests and areas far from paved roads. The habitat map derived from the model correctly classified 93.75% of data from an independent sample of leopard cat presence sites, and the map at a regional scale showed that the cat's habitats are highly fragmented. Protection and restoration of connectivity of critical habitats should be implemented to preserve the leopard cat in mountainous regions of Korea.

다변량분석을 이용한 터널에서의 효율적인 암반분류에 관한 연구 (A Study of Efficient Rock Mass Rating for Tunnel Using Multivariate Analysis)

  • 위용곤;노상림;윤지선
    • 한국터널지하공간학회 논문집
    • /
    • Vol. 2, No. 2
    • /
    • pp.41-49
    • /
    • 2000
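  • Rock mass classification is widely applied to rock engineering problems such as underground tunnel excavation. However, because survey methods are not systematized, even tunnel geology experts have considerable difficulty classifying rock masses. In this study, we propose an objective and easy-to-use rock mass classification method based on multivariate analysis. The RMR factors were ranked in importance in the order RQD, joint condition, groundwater, strength, adjustment, and joint spacing, and an optimal multiple regression equation for RMR is presented for each stage.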

New Dispersion Function in the Rank Regression

  • Choi, Young-Hun
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 9, No. 1
    • /
    • pp.101-113
    • /
    • 2002
  • In this paper we introduce a new score generating function for rank regression in the linear regression model. The score function compares the $\gamma$'th and $s$'th powers of the tail probabilities of the underlying probability distribution. We show that the rank estimate is asymptotically multivariate normal. Further, we derive the asymptotic Pitman relative efficiencies and the most efficient values of $\gamma$ and $s$ under symmetric distributions such as the uniform, normal, Cauchy, and double exponential distributions, and under asymmetric distributions such as the exponential and lognormal distributions.

Marginal Likelihoods for Bayesian Poisson Regression Models

  • Kim, Hyun-Joong;Balgobin Nandram;Kim, Seong-Jun;Choi, Il-Su;Ahn, Yun-Kee;Kim, Chul-Eung
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 11, No. 2
    • /
    • pp.381-397
    • /
    • 2004
  • The marginal likelihood has become an important tool for model selection in Bayesian analysis because it can be used to rank the models. We discuss the marginal likelihood for Poisson regression models that are potentially useful in small area estimation. Computation in these models is intensive and it requires an implementation of Markov chain Monte Carlo (MCMC) methods. Using importance sampling and multivariate density estimation, we demonstrate a computation of the marginal likelihood through an output analysis from an MCMC sampler.
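
To make the idea of estimating a marginal likelihood by importance sampling concrete, here is a deliberately simplified sketch. It is not the authors' method: it uses a single Poisson rate with a conjugate Gamma prior instead of a Poisson regression, and it draws directly from a hand-picked Gamma proposal instead of building the proposal from MCMC output. The data, prior, proposal parameters, and function names are all assumptions. The conjugate setup is chosen only because the exact marginal likelihood is then available for comparison.

```python
import math
import random


def log_poisson_lik(lam, ys):
    """Log-likelihood of i.i.d. Poisson counts ys at rate lam."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in ys)


def log_gamma_pdf(x, shape, rate):
    """Log-density of a Gamma(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)


def log_marginal_is(ys, a, b, q_shape, q_rate, n_draws, rng):
    """Importance-sampling estimate of log m(y) using a Gamma proposal."""
    log_w = []
    for _ in range(n_draws):
        # random.gammavariate takes (shape, scale), so pass 1/rate as the scale.
        lam = rng.gammavariate(q_shape, 1.0 / q_rate)
        log_w.append(log_poisson_lik(lam, ys)
                     + log_gamma_pdf(lam, a, b)
                     - log_gamma_pdf(lam, q_shape, q_rate))
    top = max(log_w)  # log-sum-exp trick for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in log_w) / n_draws)


def log_marginal_exact(ys, a, b):
    """Closed-form log marginal likelihood for the Poisson-Gamma model."""
    n, s = len(ys), sum(ys)
    return (a * math.log(b) - math.lgamma(a) + math.lgamma(a + s)
            - (a + s) * math.log(b + n)
            - sum(math.lgamma(y + 1) for y in ys))


# Hypothetical counts, with an assumed Gamma(2, 1) prior on the Poisson rate.
ys = [3, 5, 4]
rng = random.Random(0)
estimate = log_marginal_is(ys, a=2.0, b=1.0, q_shape=12.0, q_rate=3.5,
                           n_draws=20000, rng=rng)
exact = log_marginal_exact(ys, a=2.0, b=1.0)
print(estimate, exact)
```

The proposal is placed near the posterior so that the importance weights stay well behaved. In a model without a closed form, the exact value would not be available and the estimate would be checked by other means, such as repeating with different proposals.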

선형다변회귀모델과 LP-PSOLA 합성방식을 이용한 음성변환 (Voice Conversion Using Linear Multivariate Regression Model and LP-PSOLA Synthesis Method)

  • 권홍석;배건성
    • 한국음향학회지
    • /
    • Vol. 20, No. 3
    • /
    • pp.15-23
    • /
    • 2001
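  • This paper describes voice conversion, a technique that makes speech uttered by one person sound as if it were uttered by another, and tests a conversion method that transforms the vocal tract parameters and the excitation signal parameters between speakers independently. The vocal tract parameters are converted by extracting the LPC (Linear Predictive Coefficient) cepstrum from the input speech signal and applying it to a linear multivariate regression model. The excitation signal parameters are converted by extracting the residual signal and transforming the average pitch period between speakers with the LP-PSOLA (Linear Predictive-Pitch Synchronous Overlap and Add) synthesis method. The experimental results show that speech converted with the linear multivariate regression model and the LP-PSOLA synthesis method is similar to the target speaker's speech.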

Analyzing Operation Deviation in the Deasphalting Process Using Multivariate Statistics Analysis Method

  • Park, Joo-Hwang;Kim, Jong-Soo;Kim, Tai-Suk
    • 한국멀티미디어학회논문지
    • /
    • Vol. 17, No. 7
    • /
    • pp.858-865
    • /
    • 2014
  • In systems such as an MES (Manufacturing Execution System), various sensors collect data in real time and store it as big data for process monitoring. If this big data is mined in a distributed computing system, the whole process can be improved. In this paper, we built a system for analyzing the causes of operation deviation, using big data collected from the deasphalting process at two different plants. By applying multivariate statistical analysis to the big data collected through the MES, we identified the main cause of operation deviation, and we present this analysis as a worked example. Regression analysis with the forward stepwise method yielded a regression equation that explains a 52% increase in performance compared with the existing model. The proposed method can replace the manual analysis currently used in petrochemical processes, which risks being subjective depending on the tester, with an objective analysis based on numbers and statistics.