• Title/Summary/Keyword: Multivariate Statistical Method


Bankruptcy Prediction using Support Vector Machines (Support Vector Machine을 이용한 기업부도예측)

  • Park, Jung-Min;Kim, Kyoung-Jae;Han, In-Goo
    • Asia pacific journal of information systems
    • /
    • v.15 no.2
    • /
    • pp.51-63
    • /
    • 2005
  • There has been substantial research into bankruptcy prediction. Until the early 1980s, most researchers applied statistical methods to the problem. Since the late 1980s, artificial intelligence (AI) has been employed in bankruptcy prediction, and many studies have shown that artificial neural networks (ANN) achieve better performance than traditional statistical methods. Despite this superior performance, ANN has problems such as overfitting and poor explanatory power. To overcome these limitations, this paper applies a relatively new machine learning technique, the support vector machine (SVM), to bankruptcy prediction. SVM is simple enough to be analyzed mathematically and achieves high performance in practical applications. The objective of this paper is to examine the feasibility of SVM in bankruptcy prediction by comparing it with ANN, logistic regression, and multivariate discriminant analysis. The experimental results show that SVM provides a promising alternative for bankruptcy prediction.
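
For orientation, the standard soft-margin SVM classifier can be written as the optimization problem below. This is the textbook formulation, not necessarily the exact variant or kernel used in the paper. The symbols are assumptions introduced here: $x_i$ is a vector of financial ratios for firm $i$, $y_i \in \{-1, +1\}$ labels it as bankrupt or non-bankrupt, and $C > 0$ is the penalty parameter.

$$
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n.
$$

A new firm $x$ is then classified by the sign of $w^{\top} x + b$. Because the problem is a convex quadratic program, it has a global optimum, which is one reason SVM is described as mathematically tractable.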

Identifying Multiple Leverage Points and Outliers in Multivariate Linear Models

  • Yoo, Jong-Young
    • Communications for Statistical Applications and Methods
    • /
    • v.7 no.3
    • /
    • pp.667-676
    • /
    • 2000
  • This paper focuses on the problem of detecting multiple leverage points and outliers in multivariate linear models. It is well known that the identification of these points is affected by masking and swamping effects. To identify them, Rousseeuw (1985) used the robust MVE (Minimum Volume Ellipsoid) estimators, which have a breakdown point of approximately 50%. Rousseeuw and van Zomeren (1990) suggested a robust distance based on MVE, but its computation is extremely difficult when the number of observations n is large. In this study, we propose a new algorithm to reduce the computational difficulty of MVE. The proposed method is powerful in identifying multiple leverage points and outliers and is also effective in reducing the computational difficulty of MVE.
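
As a reminder of what the robust distance looks like, the definition of Rousseeuw and van Zomeren (1990) replaces the sample mean and covariance in the Mahalanobis distance with the MVE estimates. The notation $T(X)$ and $C(X)$ for the MVE location and scatter estimates is assumed here for illustration.

$$
RD_i = \sqrt{\left( x_i - T(X) \right)^{\top} C(X)^{-1} \left( x_i - T(X) \right)}, \qquad i = 1, \dots, n.
$$

Observations are usually flagged when $RD_i$ exceeds the cutoff $\sqrt{\chi^2_{p,\,0.975}}$, where $p$ is the number of variables. Because $T(X)$ and $C(X)$ are fitted to the most concentrated half of the data, a cluster of outliers cannot inflate them, which is how the distance resists masking.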


Unmasking Multiple Outliers in Multivariate Data

  • Yoo Jong-Young
    • Communications for Statistical Applications and Methods
    • /
    • v.13 no.1
    • /
    • pp.29-38
    • /
    • 2006
  • We propose a procedure for detecting multiple outliers in multivariate data. Rousseeuw and van Zomeren (1990) suggested the robust distance $RD_i$, computed using the resampling algorithm. However, $RD_i$ relies on the assumption that X is in general position (X is said to be in general position when every subsample of size p+1 has rank p). From a practical point of view, this is clearly unrealistic. In this paper, we propose a computing method for approximating MVE that is not subject to these problems. The procedure is easy to compute and works well even if a subsample is a singular or nearly singular matrix.
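
The resampling algorithm mentioned in the abstract can be sketched as follows. This is a minimal illustration for two variables (p = 2) in plain Python, not the authors' proposed method: all function names are hypothetical, the small-sample correction factor is omitted, and singular subsamples are simply skipped, which is exactly the weakness the abstract points out.

```python
import math
import random

# Median of the chi-square distribution with 2 degrees of freedom.
CHI2_MEDIAN_2DF = -2.0 * math.log(0.5)


def mean_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_dist(point, center, cov):
    """Squared Mahalanobis-type distance using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def approx_mve(points, n_trials=500, seed=0):
    """Approximate the MVE location and scatter by random subsampling."""
    rng = random.Random(seed)
    n, p = len(points), 2
    h = (n + p + 1) // 2  # number of points the ellipse must cover
    best = None
    for _ in range(n_trials):
        subsample = rng.sample(points, p + 1)
        center, cov = mean_cov(subsample)
        det = cov[0] * cov[2] - cov[1] ** 2
        if det <= 1e-12:  # arbitrary threshold: subsample is (nearly) singular
            continue
        # Inflate the ellipse until it covers h points.
        m2 = sorted(sq_dist(q, center, cov) for q in points)[h - 1]
        volume = m2 * math.sqrt(det)  # proportional to the ellipse area
        if best is None or volume < best[0]:
            scaled = tuple(c * m2 / CHI2_MEDIAN_2DF for c in cov)
            best = (volume, center, scaled)
    if best is None:
        raise ValueError("every subsample was singular")
    return best[1], best[2]


def robust_distances(points, center, cov):
    """Robust distance of each point from the approximate MVE fit."""
    return [math.sqrt(sq_dist(q, center, cov)) for q in points]
```

With two variables, the usual cutoff $\sqrt{\chi^2_{2,\,0.975}}$ equals `math.sqrt(-2 * math.log(0.025))`, about 2.72. When many subsamples are collinear, the `continue` branch discards them, so the number of usable trials shrinks.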

A Review of the Statistical Analysis used in Clinical Articles Published on Journal of Korean Neurosurgical Society

  • Kang, Wee-Chang
    • Journal of Korean Neurosurgical Society
    • /
    • v.40 no.4
    • /
    • pp.304-308
    • /
    • 2006
  • Statistical analyses used in clinical articles published in the Journal of Korean Neurosurgical Society were identified, and the appropriateness of statistical aspects in reporting results was assessed. Forty-seven clinical articles published in the journal from February 2005 to February 2006 were selected for this study. The frequency of statistical analysis was as follows: descriptive statistics only, 24 [51.1%]; one type of statistical method, 10 [21.3%]; two or more methods, 13 [27.6%]. An assessment of statistical aspects was performed in 24 clinical articles reporting inferential statistics. Ten articles [41.7%] did not adequately describe or reference all statistical methods used. Six articles [25.0%] did not report the confidence level used as the criterion for statistical significance. In thirteen articles [54.2%], it seems more appropriate to implement multivariate analyses in addition to univariate analyses. We recommend that the journal's readers concentrate on improving their knowledge of basic statistics and that statistical review of submitted manuscripts be sought from professionals in the fields of biostatistics and epidemiology.

Multivariate Statistical Analysis Approach to Predict the Reactor Properties and the Product Quality of a Direct Esterification Reactor for PET Synthesis (다변량 통계분석법을 이용한 PET 중합공정 중 직접 에스테르화 반응기의 거동 및 생산제품 예측)

  • Kim Sung Young;Chung Chang Bock;Choi Soo Hyoung;Lee Bomsock
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.11 no.6
    • /
    • pp.550-557
    • /
    • 2005
  • Multivariate statistical analysis methods, namely multiple linear regression (MLR) and partial least squares (PLS), have been applied to predict the reactor properties and the product quality of a direct esterification reactor for polyethylene terephthalate (PET) synthesis. The two methods were applied to a data set comprising the flow rate of water vapor, the flow rate of EG vapor, the concentration of acid end groups of the product, and other operating conditions such as temperature, pressure, reaction time, and feed monomer mole ratio. Their regression and prediction abilities were also compared. The reliability of the predictions was assessed against actual plant data and against results from other mathematical models. This paper shows that the PLS approach can be used for reasonably accurate prediction of the product quality of a direct esterification reactor in the PET synthesis process.

A Test of the Multivariate Normality Based on Likelihood Functions (가능도 함수를 기초로 한 다변량 정규성 검정)

  • Yeo, In-Kwon
    • The Korean Journal of Applied Statistics
    • /
    • v.15 no.2
    • /
    • pp.223-232
    • /
    • 2002
  • The present paper develops a test of multivariate normality based on nonlinear transformations and the likelihood function. To check normality, we test the shape parameter that indexes the family of transformations. A score test and a parametric bootstrap test are used to evaluate the discrepancy between the data and a multivariate normal distribution. To compare the performance of our test with existing tests, a simulation study was carried out for several situations in which nuisance parameters have to be estimated. The results showed that the proposed method is superior to the existing methods.

An Alternating Approach of Maximum Likelihood Estimation for Mixture of Multivariate Skew t-Distribution (치우친 다변량 t-분포 혼합모형에 대한 최우추정)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.5
    • /
    • pp.819-831
    • /
    • 2014
  • The Exact-EM algorithm can conventionally fit a mixture of multivariate skew t-distributions. However, it suffers from the high computational cost of calculating the moments of the multivariate truncated t-distribution in the E-step. This paper proposes a new SPU-EM method that adopts the AECM algorithm principle proposed by Meng and van Dyk (1997) to circumvent the multi-dimensionality of the moments. This method offers a shorter execution time than the conventional Exact-EM algorithm. Some experiments are provided to show its effectiveness.

Statistical Matching Techniques Using the Robust Regression Model (로버스트 회귀모형을 이용한 자료결합방법)

  • Jhun, Myoung-Shic;Jung, Ji-Song;Park, Hye-Jin
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.6
    • /
    • pp.981-996
    • /
    • 2008
  • Statistical matching techniques aim to build a complete data file from different sources. Since the statistical matching method proposed by Rubin (1986) assumes multivariate normality of the data, applying it to data that violate this assumption can cause problems. This research proposes a statistical matching method that uses robust regression as an alternative to linear regression. Furthermore, we carry out a simulation study to compare the performance of the robust regression model and the linear regression model for statistical matching.

Estimating Quarterly GRDP Using Benchmarking Method (벤치마킹방법을 이용한 분기 GRDP의 추정)

  • Lee, Geung-Hee
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.1
    • /
    • pp.75-88
    • /
    • 2009
  • Gross Regional Domestic Product (GRDP) is regarded as essential information for understanding a regional economy. However, GRDP is hardly used for establishing regional economic plans and in related statistical research because it is published late and only yearly. Therefore, it is necessary to estimate quarterly GRDP to grasp the current regional economy faster. In this study, reference series are constructed by considering the comovement between GDP and GRDP for the same industry. Quarterly GRDP is estimated in the following two steps. First, preliminary quarterly GRDP is estimated using Chow-Lin's method based on the reference series to eliminate temporal discrepancies. Second, preliminary quarterly GRDP is adjusted using Denton's multivariate method to eliminate contemporaneous discrepancies.
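
The second step can be pictured with one common Denton-type formulation, given here only as a sketch. The abstract does not state the paper's exact objective or constraints, so all symbols are assumptions: $p_{j,t}$ is the preliminary quarterly value of series $j$ in quarter $t$, $x_{j,t}$ is the adjusted value, $A_{j,y}$ is the annual total of series $j$ in year $y$, and $z_t$ is the quarterly aggregate that the series must add up to.

$$
\min_{x}\ \sum_{j=1}^{J} \sum_{t=2}^{T} \left[ \left( x_{j,t} - p_{j,t} \right) - \left( x_{j,t-1} - p_{j,t-1} \right) \right]^2
$$

$$
\text{subject to} \quad \sum_{t \in y} x_{j,t} = A_{j,y} \ \ \text{for all } j, y, \qquad \sum_{j=1}^{J} x_{j,t} = z_t \ \ \text{for all } t.
$$

The objective keeps the quarter-to-quarter movement of the adjusted series close to that of the preliminary series. The first constraint preserves temporal consistency with the annual figures, and the second removes the contemporaneous discrepancy across series.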

A Trimmed Spatial Median Estimator Using Bootstrap Method (붓스트랩을 활용한 최적 절사공간중위수 추정량)

  • Lee, Dong-Hee;Jung, Byoung-Cheol
    • The Korean Journal of Applied Statistics
    • /
    • v.23 no.2
    • /
    • pp.375-382
    • /
    • 2010
  • In this study, we propose a robust estimator of the multivariate location parameter by means of a spatial median based on data trimming, which extends the trimmed mean of the univariate setup. The trimming quantity of this estimator is determined by the bootstrap method, and its covariance matrix is estimated using the double bootstrap method. This extends the work of Jhun et al. (1993) to the multivariate case. A Monte Carlo study shows that the proposed trimmed spatial median estimator is more efficient than the spatial median, while its covariance matrix estimate based on the double bootstrap overcomes the underestimation problem that occurs with the single bootstrap method.
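
A trimmed spatial median can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' estimator: the spatial median is computed with Weiszfeld's iteration, the trimming fraction `trim` is fixed by hand where the paper selects it by bootstrap, and trimming by Euclidean distance from an initial spatial median is an assumed rule. Function names are hypothetical.

```python
import math


def spatial_median(points, tol=1e-9, max_iter=1000):
    """Spatial (geometric) median via Weiszfeld's iteration."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    est = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(max_iter):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            d = math.dist(p, est)
            if d < 1e-12:
                continue  # simplification: skip a point sitting on the estimate
            w = 1.0 / d
            den += w
            for k in range(dim):
                num[k] += w * p[k]
        if den == 0.0:
            break  # all points coincide with the estimate
        new = [v / den for v in num]
        converged = math.dist(new, est) < tol
        est = new
        if converged:
            break
    return est


def trimmed_spatial_median(points, trim=0.1):
    """Drop the fraction `trim` of points farthest from the spatial median,
    then recompute the spatial median on the remaining points."""
    center = spatial_median(points)
    keep = len(points) - int(trim * len(points))
    kept = sorted(points, key=lambda p: math.dist(p, center))[:keep]
    return spatial_median(kept)
```

For example, with four points placed symmetrically around the origin, the origin itself, and one gross outlier at (100, 100), calling `trimmed_spatial_median(points, trim=0.2)` discards the outlier and returns an estimate at the origin.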