• Title/Summary/Keyword: multivariate regression models


Empirical process optimization through response surface experiments and model building

  • PARK, SUNG H.
    • Journal of Korean Society for Quality Management / v.8 no.1 / pp.3-7 / 1980
  • In many industrial processes, there are more than two responses (e.g., yield, percent impurity) of interest, and it is desirable to determine the optimal levels of the factors (e.g., temperature, pressure) that influence the responses. Suppose the response relationships are approximated by second-order polynomial regression models. The problems considered in this paper are, first, how to select polynomial terms to fit the multivariate regression surfaces for a given set of data and, second, how to analyze the data to obtain an optimal operating condition for the factors. The proposed techniques were applied to empirical process optimization at a tire company in Korea, and this case is presented as an illustration.

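
The abstract above does not spell out how the optimal operating condition is computed. As a minimal sketch, assuming a single response and two coded factors, the stationary point of a fitted second-order model can be found by setting both partial derivatives to zero. The function name `stationary_point` and the coefficient names are illustrative assumptions, not the paper's notation, and the paper's multi-response procedure is not reproduced here.

```python
def stationary_point(b1, b2, b11, b22, b12):
    """Stationary point of the fitted surface
    y = b0 + b1*x1 + b2*x2 + b11*x1**2 + b22*x2**2 + b12*x1*x2.
    (Hypothetical helper; coefficient names are assumptions.)"""
    # Setting both partial derivatives to zero gives the linear system
    #   2*b11*x1 +   b12*x2 = -b1
    #     b12*x1 + 2*b22*x2 = -b2
    det = 4.0 * b11 * b22 - b12 ** 2
    if det == 0:
        raise ValueError("no unique stationary point")
    # Cramer's rule for the 2x2 system
    x1 = (b12 * b2 - 2.0 * b22 * b1) / det
    x2 = (b12 * b1 - 2.0 * b11 * b2) / det
    # Classify by the definiteness of the quadratic part
    if det > 0 and b11 < 0:
        kind = "maximum"
    elif det > 0 and b11 > 0:
        kind = "minimum"
    else:
        kind = "saddle"
    return x1, x2, kind


# Coefficients of y = -(x1 - 1)**2 - (x2 - 2)**2, expanded
print(stationary_point(b1=2.0, b2=4.0, b11=-1.0, b22=-1.0, b12=0.0))
```

With several responses, one stationary point per fitted surface would generally not coincide, which is why a joint analysis such as the one the paper proposes is needed.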

Restricted maximum likelihood estimation of a censored random effects panel regression model

  • Lee, Minah; Lee, Seung-Chun
    • Communications for Statistical Applications and Methods / v.26 no.4 / pp.371-383 / 2019
  • Panel data sets have been developed in various areas, and many recent studies have analyzed panel, or longitudinal, data sets. Maximum likelihood (ML) may be the most common statistical method for analyzing panel data models; however, inference based on the ML estimate will have an inflated Type I error because the ML method tends to give a downwardly biased estimate of variance components when the sample size is small. The underestimation can be severe when the data are incomplete. This paper proposes the restricted maximum likelihood (REML) method for a random effects panel data model with a censored dependent variable. The likelihood function of the model is complex in that it includes a multidimensional integral. Many authors have proposed integral approximation methods for computing the likelihood function; however, it is well known that these methods are inadequate for high-dimensional integrals in practice. This paper instead uses the moments of a truncated multivariate normal random vector to calculate the multidimensional integral. In addition, a proper asymptotic standard error of the REML estimate is given.

Design and performance evaluation of portable electronic nose systems for freshness evaluation of meats II - Performance analysis of electronic nose systems by prediction of total bacteria count of pork meats (육류 신선도 판별을 위한 휴대용 전자코 시스템 설계 및 성능 평가 II - 돈육의 미생물 총균수 예측을 통한 전자코 시스템 성능 검증)

  • Kim, Jae-Gone; Cho, Byoung-Kwan
    • Korean Journal of Agricultural Science / v.38 no.4 / pp.761-767 / 2011
  • The objective of this study was to predict the total bacteria count of pork meat using portable electronic nose systems developed through two prototype stages. Total bacteria counts were measured for pork meat stored at $4^{\circ}C$ for 21 days and compared with the signals of the electronic nose systems. Partial least squares (PLS), principal component regression (PCR), and multiple linear regression (MLR) models were developed to predict the total bacteria count. The coefficient of determination ($R_p^2$) and root mean square error of prediction (RMSEP) of the models were 0.789 and 0.784 log CFU/g with the 1st system for pork loin, 0.796 and 0.597 log CFU/g with the 2nd system for pork belly, and 0.661 and 0.576 log CFU/g with the 2nd system for pork loin, respectively. The results show that the developed electronic nose system has the potential to predict the total bacteria count of pork meat.
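
The two figures of merit quoted above can be computed from observed and predicted values on a prediction set. The sketch below uses the common definitions (RMSEP as the root of the mean squared prediction error, $R_p^2$ as one minus the ratio of residual to total sum of squares). This is an assumption: the abstract does not state the exact formulas, and some chemometrics papers report the squared correlation instead. The function names and the sample values are illustrative only.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction, in the units of the response."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up log CFU/g values, for illustration only (not data from the study)
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 4.5, 6.5, 6.5]
print(rmsep(observed, predicted), r_squared(observed, predicted))
```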

Analysis of Container Shipping Market Using Multivariate Time Series Models (다변량 시계열 모형을 이용한 컨테이너선 시장 분석)

  • Ko, Byoung-Wook; Kim, Dae-Jin
    • Journal of Korea Port Economic Association / v.35 no.3 / pp.61-72 / 2019
  • In order to enhance the competitiveness of the container shipping industry and promote its development, this study suggests a few strategies related to the dynamics of the container shipping market, based on empirical analyses using multivariate time series models. It uses vector autoregressive (VAR) and vector error correction (VEC) models as analytical methodologies and annual trade volumes, fleets, and freight rates as the dataset. According to the empirical results, we can infer that the most exogenous variable, the trade volume, exerted the highest influence on the overall dynamics of the container shipping market. Based on these results, this study suggests some implications for ship investment, freight rate forecasting, and the strategies of shipping firms. Concerning ship investment, since the exogenous trade volume variable contributes most to the uncertainty of freight rates, corporate finance can be considered more appropriate for container ship investment than project finance. Concerning freight rate forecasting, the VAR and VEC models use past information whereas the cointegrating regression model assumes future information; hence the former models are found to be better than the latter. Finally, concerning the strategies of shipping firms, this study recommends the use of cycle-linked repayment schemes and service contracts.

Chemical Oxygen Demand (COD) Model for the Assessment of Water Quality in the Han River, Korea (한강수질 평가를 위한 COD (화학적 산소 요구량) 모델 평가)

  • Kim, Jae Hyoun; Jo, Jinnam
    • Journal of Environmental Health Sciences / v.42 no.4 / pp.280-292 / 2016
  • Objectives: The objective of this study was to build COD regression models for the Han River and evaluate water quality. Methods: Water quality data sets for the dry season (as of January) during a four-year period (2012-2015) were collected from the database of the Han River automatic water quality monitoring stations. Statistical techniques, including combined genetic algorithm-multiple linear regression (GA-MLR), were used to build five-descriptor COD models. Multivariate statistical techniques such as principal component analysis (PCA) and cluster analysis (CA) are useful tools for extracting meaningful information. Results: The best COD models gave significantly high $r^2$ values (> 0.8) between 2012 and 2015. Total organic carbon (TOC) was a surrogate indicator for COD (as COD/TOC) with high reliability ($r^2=0.63$ in 2012, $r^2=0.75$ in 2013, $r^2=0.79$ in 2014, and $r^2=0.85$ in 2015). The COD/TOC ratios were calculated as 2.08 in 2012, 1.79 in 2013, 1.52 in 2014, and 1.45 in 2015, indicating that biodegradability in the water body of the Han River was being sustained, thereby further improving water quality. The BOD/COD ratio supported these findings. The cluster analysis revealed higher annual levels of microorganisms and phosphorus at stations along the Hangang-Seoul and Hantangang areas. Nevertheless, the overall water quality over the last four years showed an observable trend toward continuous improvement. These findings also suggest that non-point pollution control strategies should consider the influence of upstream and downstream areas to protect water quality in the Han River. Conclusion: This data analysis procedure provided an efficient and comprehensive tool for interpreting complex water quality data matrices. Results from a trend analysis provided important information about sources and parameters for Han River water quality management.

Study on Temporal Comparison Analysis of Factors to Affect Travel Time Budget: A Case for Seoul (통행시간예산에 미치는 요인의 시계열적 비교·분석 연구: 서울시를 사례로)

  • Lee, Hyangsook; Choo, Sangho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.19 no.6 / pp.180-191 / 2020
  • This study analyzes factors that affect average daily travel time budgets, using Time Use Survey data from 1999 to 2014 in Seoul. We first developed multivariate regression models for travel time for each year, considering demographic and socio-economic variables as well as non-home activity time. The model results showed that household and personal characteristics and non-home activities significantly affect travel time, and that their effects differ over time. In addition, we developed seemingly unrelated regression (SUR) models for time allocation between non-home activity and travel, considering their correlations, and compared the explanatory variables over time. Overall, demographic and socio-economic variables significantly affect travel time as well as non-home activity time.

Hybrid Learning Architectures for Advanced Data Mining: An Application to Binary Classification for Fraud Management (개선된 데이터마이닝을 위한 혼합 학습구조의 제시)

  • Kim, Steven H.; Shin, Sung-Woo
    • Journal of Information Technology Application / v.1 / pp.173-211 / 1999
  • The task of classification permeates all walks of life, from business and economics to science and public policy. In this context, nonlinear techniques from artificial intelligence have often proven to be more effective than the methods of classical statistics. The objective of knowledge discovery and data mining is to support decision making through the effective use of information. The automated approach to knowledge discovery is especially useful when dealing with large data sets or complex relationships. For many applications, automated software may find subtle patterns which escape the notice of manual analysis, or whose complexity exceeds the cognitive capabilities of humans. This paper explores the utility of a collaborative learning approach involving integrated models in the preprocessing and postprocessing stages. For instance, a genetic algorithm performs feature-weight optimization in a preprocessing module. Moreover, an inductive tree, artificial neural network (ANN), and k-nearest neighbor (kNN) techniques serve as postprocessing modules. More specifically, the postprocessors act as second-order classifiers which determine the best first-order classifier on a case-by-case basis. In addition to the second-order models, a voting scheme is investigated as a simple, but efficient, postprocessing model. The first-order models consist of statistical and machine learning models such as logistic regression (logit), multivariate discriminant analysis (MDA), ANN, and kNN. The genetic algorithm, inductive decision tree, and voting scheme act as kernel modules for collaborative learning. These ideas are explored against the background of a practical application relating to financial fraud management which exemplifies a binary classification problem.

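
The voting scheme mentioned in the abstract above can be pictured with a small sketch. The abstract does not specify the voting rule, so an unweighted majority vote over the 0/1 outputs of the first-order classifiers is assumed here. The function name `majority_vote` and the `tie_break` parameter are hypothetical.

```python
def majority_vote(predictions, tie_break=0):
    """Combine 0/1 predictions from several first-order classifiers
    by unweighted majority; `tie_break` is returned on a tie."""
    ones = sum(1 for p in predictions if p == 1)
    zeros = len(predictions) - ones
    if ones == zeros:
        return tie_break
    return 1 if ones > zeros else 0


# One case scored by the four first-order models named in the abstract
# (the votes themselves are invented for illustration)
votes = {"logit": 1, "MDA": 0, "ANN": 1, "kNN": 1}
print(majority_vote(list(votes.values())))
```

A second-order classifier, by contrast, would pick one first-order model per case instead of pooling all of them.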

Soft computing-based slope stability assessment: A comparative study

  • Kaveh, A.; Hamze-Ziabari, S.M.; Bakhshpoori, T.
    • Geomechanics and Engineering / v.14 no.3 / pp.257-269 / 2018
  • Analysis of slope stability failures, one of the complex natural hazards, is an important research issue in civil engineering. The present paper adopts and investigates four soft computing-based techniques for this problem: the Patient Rule-Induction Method (PRIM), the M5' algorithm, the Group Method of Data Handling (GMDH), and Multivariate Adaptive Regression Splines (MARS). A comprehensive database consisting of 168 case histories is used to calibrate and test the developed models. Six predictive variables, namely slope height, slope angle, bulk density, cohesion, angle of internal friction, and pore water pressure ratio, were considered to generate new models. The test results are used to compare the feasibility, effectiveness, and practicality of the techniques with each other and with other well-known methods available in the literature. Results show that all methods are not only feasible but also perform better than previously developed soft computing-based predictive models and tools. It is shown that the M5' and PRIM algorithms are the most effective and practical prediction models.

Hedging effectiveness of KOSPI200 index futures through VECM-CC-GARCH model (벡터오차수정모형과 다변량 GARCH 모형을 이용한 코스피200 선물의 헷지성과 분석)

  • Kwon, Dongan; Lee, Taewook
    • Journal of the Korean Data and Information Science Society / v.25 no.6 / pp.1449-1466 / 2014
  • In this paper, we consider a hedge portfolio based on futures of an underlying asset. A classical way to estimate the hedge ratio for a hedge portfolio of a spot asset and its futures is regression analysis. However, regression analysis cannot reflect the long-run equilibrium between spot and futures or the volatility clustering in the conditional variance of financial time series. To overcome these defects, we analyzed the KOSPI200 index and its futures using a VECM-CC-GARCH model and computed the hedge ratio from the estimated conditional covariance-variance matrix. In a real data analysis, we compared the regression and VECM-CC-GARCH models in terms of hedging effectiveness based on the variance, value at risk, and expected shortfall of the log-returns of the hedge portfolio. The empirical results show that the multivariate GARCH models significantly outperform regression analysis and improve hedging effectiveness in periods of high volatility.
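
For orientation, the classical regression hedge ratio referred to above equals the ratio of the spot-futures covariance to the futures variance, and variance-based hedging effectiveness is the share of spot variance removed by the hedge. The sketch below uses constant sample moments. This is a simplifying assumption: the paper's VECM-CC-GARCH approach replaces them with time-varying conditional estimates, which are not reproduced here. All function names are illustrative.

```python
def sample_cov(x, y):
    """Unbiased sample covariance of two equal-length return series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def hedge_ratio(spot, futures):
    """Minimum-variance hedge ratio: Cov(spot, futures) / Var(futures).
    This equals the OLS slope of spot returns on futures returns."""
    return sample_cov(spot, futures) / sample_cov(futures, futures)


def hedge_effectiveness(spot, futures, h):
    """Variance reduction: 1 - Var(spot - h * futures) / Var(spot)."""
    hedged = [s - h * f for s, f in zip(spot, futures)]
    return 1.0 - sample_cov(hedged, hedged) / sample_cov(spot, spot)


# Invented log-returns, for illustration only (not KOSPI200 data)
spot = [0.010, -0.020, 0.015, -0.005]
futures = [0.012, -0.018, 0.014, -0.007]
h = hedge_ratio(spot, futures)
print(h, hedge_effectiveness(spot, futures, h))
```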

Prognostic Factors of Hemifacial Spasm after Microvascular Decompression

  • Kim, Hong-Rae; Rhee, Deok-Joo; Kong, Doo-Sik; Park, Kwan
    • Journal of Korean Neurosurgical Society / v.45 no.6 / pp.336-340 / 2009
  • Objective: The factors that influence the prognosis of patients with hemifacial spasm (HFS) treated by microvascular decompression (MVD) have not been definitively established. We report a prospective study evaluating the prognostic factors in patients undergoing MVD for HFS. Methods: From January 2004 to September 2006, the authors prospectively studied a series of 293 patients who underwent MVD for HFS. We analyzed a number of independent variables to evaluate their predictive value for the prognosis of patients undergoing MVD. The patients were followed up at regular intervals and divided into cured and unsatisfactory groups based on symptom relief. Uni- and multivariate analyses were performed using logistic regression models. Results: A total of 273 of 293 (94.2%) patients achieved symptom relief within one year after the operation. Intraoperatively, indentation of the root exit zone was observed in 259 (88.5%) patients. Uni- and multivariate analyses revealed that the symptoms at 3 months postoperatively (p<0.001) and indentation of the root exit zone (p=0.036) were associated with good outcomes. Conclusion: The intraoperative finding of root exit zone indentation will help physicians determine the prognosis in patients with HFS. To predict the prognosis of HFS, a regular follow-up period of at least 3 months following MVD should be required.