• Title/Summary/Keyword: Multivariate Statistical Analysis


Some Diagnostic Results in Discriminant Analysis

  • Bae, Whasoo;Hwang, Soonyoung
    • Journal of the Korean Statistical Society / v.30 no.1 / pp.139-151 / 2001
  • Although much work has been done on influence diagnostics, results for multivariate analysis are quite rare. One recent work, by Fung (1995), concerns single-case influence diagnostics in linear discriminant analysis. In this paper we extend Fung's results to multiple-case diagnostics, which are necessary in linear discriminant analysis for two reasons, among others: first, the masking effect cannot be detected by single-case diagnostics; second, discriminant analysis involves two populations, so influential cases can occur in one or both populations.


Application of Multivariate Statistical Analysis Technique in Landfill Investigation (매립물 특성 조사를 위한 다변량 통계분석 기법의 응용)

  • Kwon, Byung-Doo;Kim, Cha-Soup
    • Journal of the Korean earth science society / v.18 no.6 / pp.515-521 / 1997
  • To investigate the nature of the waste materials in the Nanjido Landfill, we conducted a multivariate statistical analysis of a geophysical data set comprising magnetic, gravity, LandSat TM thermal band and surface depression measurement data. Because these data sets respond differently to depth, we transformed the observed total-field magnetic data and the gravity data into residual reduced-to-pole (RTP) magnetic anomalies and three-dimensional density anomalies, respectively, and used only the information about the shallow upper part of the landfill in the subsequent processing. For the statistical analysis at the depression measurement points, the magnetic, density and LandSat values at these points were determined by interpolation. Because the multivariate statistical analysis technique uses a clustering algorithm to classify the data set, and dissimilarity between objects was measured by Euclidean distance, the data were standardized before the distance calculation to eliminate scaling effects caused by the different measurement units of each data set. A hierarchical grouping technique was used to construct the dendrogram. The optimum number of statistical groups (clusters), classified on the basis of geophysical and geotechnical characteristics, appeared to be six on the resulting dendrogram. The results of this study suggest that the dimension and nature of multicomponent waste landfills can be identified by applying the multivariate statistical analysis technique to integrated geophysical data sets.
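
The standardize, measure, and group steps described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only: the sample values are made up, and average linkage is an assumption, since the abstract does not say which hierarchical linkage rule was used.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so that no measurement unit dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering (linkage rule is an assumption).

    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean pairwise distance between the two candidate clusters.
                d = mean(euclidean(points[a], points[b])
                         for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical rows: (magnetic anomaly, density anomaly, thermal band value).
samples = [
    [120.0, 0.10, 30.0],
    [125.0, 0.12, 31.0],
    [400.0, 0.55, 45.0],
    [410.0, 0.50, 44.0],
]
groups = agglomerate(standardize(samples), n_clusters=2)
print(groups)
```

Without the standardization step, the magnetic column, which has by far the largest numeric range here, would dominate every distance.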


Simple Compromise Strategies in Multivariate Stratification

  • Park, Inho
    • Communications for Statistical Applications and Methods / v.20 no.2 / pp.97-105 / 2013
  • Stratification is a popular technique in survey practice, used (among other applications) to improve the accuracy of estimators. Its full potential benefit is gained when auxiliary variables related to the survey variables are used effectively in stratification. This paper focuses on the problem of stratum formation when multiple stratification variables are available. We first review a variance reduction strategy in the case of univariate stratification. We then discuss its use for multivariate situations in convenient and efficient ways using three methods: compromised measures of size, principal components analysis and a K-means clustering algorithm. We also consider three types of compromising factors applied to the data when using these three methods. Finally, we compare their efficiency using data from the MU281 Swedish municipality population.
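
As a rough illustration of the K-means option mentioned in this abstract, the sketch below forms strata by clustering units on two auxiliary variables. It is plain Lloyd's algorithm on made-up values; the seeding rule (first k units) and the data are assumptions, and the paper's compromising factors are not modelled.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm; the first k points seed the centres (an assumption)."""
    centres = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each unit joins the stratum with the nearest centre.
        labels = [min(range(k), key=lambda c: euclidean(p, centres[c]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centres.append(centres[c])  # keep an empty stratum's centre
        if new_centres == centres:
            break
        centres = new_centres
    return labels

# Hypothetical units described by two already standardized auxiliary variables.
units = [(0.1, 0.2), (5.0, 5.1), (0.2, 0.1), (5.2, 4.9), (0.0, 0.3), (4.8, 5.0)]
strata = kmeans(units, k=2)
print(strata)
```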

Applications of NMR spectroscopy based metabolomics: a review

  • Yoon, Dahye;Lee, Minji;Kim, Siwon;Kim, Suhkmann
    • Journal of the Korean Magnetic Resonance Society / v.17 no.1 / pp.1-10 / 2013
  • Metabolomics is the study of changes in metabolite levels and offers a terminal view of the biological system. Metabolites, the end products of metabolism, reflect responses to the external environment. Metabolomics therefore provides additional information for understanding metabolic pathways. These metabolites can be used as biomarkers that indicate disease or external stresses such as exposure to toxicants. Many kinds of biological samples are used in metabolomics, for example cells, tissues, and biofluids. NMR spectroscopy is one of the tools of metabolomics. NMR data are analyzed by multivariate statistical analysis and target profiling techniques. NMR-based metabolomics is a growing field in areas such as disease diagnosis, forensic science, and toxicity assessment.

Marginal Likelihoods for Bayesian Poisson Regression Models

  • Kim, Hyun-Joong;Balgobin Nandram;Kim, Seong-Jun;Choi, Il-Su;Ahn, Yun-Kee;Kim, Chul-Eung
    • Communications for Statistical Applications and Methods / v.11 no.2 / pp.381-397 / 2004
  • The marginal likelihood has become an important tool for model selection in Bayesian analysis because it can be used to rank the models. We discuss the marginal likelihood for Poisson regression models that are potentially useful in small area estimation. Computation in these models is intensive and it requires an implementation of Markov chain Monte Carlo (MCMC) methods. Using importance sampling and multivariate density estimation, we demonstrate a computation of the marginal likelihood through an output analysis from an MCMC sampler.
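
To make the idea of estimating a marginal likelihood by importance sampling concrete, here is a minimal sketch for a much simpler model than the one in the paper: Poisson counts with a single rate and a conjugate Gamma prior, so that the estimate can be checked against the closed form. The data, the prior parameters, and the hand-picked Gamma proposal are all assumptions; the paper instead builds its importance function from MCMC output using multivariate density estimation.

```python
import math
import random

def log_gamma_pdf(x, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)

def log_poisson_lik(ys, lam):
    """Log likelihood of independent Poisson counts ys with common rate lam."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in ys)

def log_marginal_is(ys, a, b, q_shape, q_rate, n_draws, rng):
    """Importance-sampling estimate of log m(y) with a Gamma proposal."""
    log_w = []
    for _ in range(n_draws):
        lam = rng.gammavariate(q_shape, 1.0 / q_rate)  # second argument is the scale
        log_w.append(log_poisson_lik(ys, lam)
                     + log_gamma_pdf(lam, a, b)
                     - log_gamma_pdf(lam, q_shape, q_rate))
    top = max(log_w)  # log-mean-exp for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in log_w) / n_draws)

def log_marginal_exact(ys, a, b):
    """Closed-form log marginal likelihood for the Gamma-Poisson model."""
    s, n = sum(ys), len(ys)
    return (a * math.log(b) - math.lgamma(a) + math.lgamma(a + s)
            - (a + s) * math.log(b + n)
            - sum(math.lgamma(y + 1) for y in ys))

counts = [3, 5, 4, 6, 2]          # made-up data
rng = random.Random(0)
estimate = log_marginal_is(counts, a=2.0, b=1.0, q_shape=20.0, q_rate=5.0,
                           n_draws=20000, rng=rng)
exact = log_marginal_exact(counts, a=2.0, b=1.0)
print(estimate, exact)
```

The proposal is deliberately chosen close to, but slightly wider than, the posterior, which keeps the importance weights bounded. Comparing such estimates across candidate models is how the marginal likelihood ranks them.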

Bankruptcy Prediction using Support Vector Machines (Support Vector Machine을 이용한 기업부도예측)

  • Park, Jung-Min;Kim, Kyoung-Jae;Han, In-Goo
    • Asia pacific journal of information systems / v.15 no.2 / pp.51-63 / 2005
  • There has been substantial research into bankruptcy prediction. Until the early 1980s, many researchers applied statistical methods to the problem. Since the late 1980s, artificial intelligence (AI) has been employed in bankruptcy prediction, and many studies have shown that artificial neural networks (ANN) achieve better performance than traditional statistical methods. However, despite their superior performance, ANNs have problems such as overfitting and poor explanatory power. To overcome these limitations, this paper applies a relatively new machine learning technique, the support vector machine (SVM), to bankruptcy prediction. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. The objective of this paper is to examine the feasibility of SVM in bankruptcy prediction by comparing it with ANN, logistic regression, and multivariate discriminant analysis. The experimental results show that SVM provides a promising alternative for bankruptcy prediction.

On-Line Condition Monitoring for Rotating Machinery Using Multivariate Statistical Analysis (다변량 통계 분석 방법을 이용한 회전기계 이상 온라인 감시)

  • Kim, Heung-Mook;Lim, Eun-Seop
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2000.06a / pp.1108-1113 / 2000
  • A condition monitoring methodology for rotating machinery is proposed based on multivariate statistical analysis. Condition monitoring systems (CMS) usually use vibration signal amplitudes such as acceleration RMS, peak and velocity RMS to detect machine faults, but this information is not sufficient for reliable monitoring. New parameters are therefore added: shape factor, crest factor, kurtosis and skewness as time-domain parameters, and the spectrum amplitudes of the rotating frequency, its $2^{nd}$ harmonic, the gear mesh frequency, etc. as frequency-domain parameters. These parameters are combined to represent the machine state using Hotelling's $T^2$ statistic. The proposed methodology was tested in the laboratory, and the on-line experiment showed that it offers reliable monitoring for rotating machinery.
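
The time-domain parameters and the Hotelling's $T^2$ statistic named in this abstract can be sketched as follows. This is a minimal illustration under common textbook definitions (crest factor as peak over RMS, shape factor as RMS over mean absolute value, non-excess kurtosis), which the abstract itself does not spell out; the test signal and the two-feature baseline are made up.

```python
import math
from statistics import mean

def time_features(x):
    """RMS, peak, crest factor, shape factor, skewness and kurtosis of a signal."""
    n = len(x)
    mu = mean(x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    mean_abs = sum(abs(v) for v in x) / n
    m2 = sum((v - mu) ** 2 for v in x) / n
    m3 = sum((v - mu) ** 3 for v in x) / n
    m4 = sum((v - mu) ** 4 for v in x) / n
    return {
        "rms": rms, "peak": peak,
        "crest": peak / rms, "shape": rms / mean_abs,
        "skewness": m3 / m2 ** 1.5, "kurtosis": m4 / m2 ** 2,
    }

def hotelling_t2(baseline, x):
    """T^2 = (x - mean)' S^{-1} (x - mean), with mean and S taken from baseline rows."""
    n, p = len(baseline), len(x)
    mu = [mean(col) for col in zip(*baseline)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in baseline) / (n - 1)
            for j in range(p)] for i in range(p)]
    d = [xi - mi for xi, mi in zip(x, mu)]
    # Solve cov * z = d by Gaussian elimination with partial pivoting.
    a = [row[:] + [di] for row, di in zip(cov, d)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(c + 1, p):
            f = a[r][c] / a[c][c]
            for k in range(c, p + 1):
                a[r][k] -= f * a[c][k]
    z = [0.0] * p
    for r in reversed(range(p)):
        z[r] = (a[r][p] - sum(a[r][k] * z[k] for k in range(r + 1, p))) / a[r][r]
    return sum(di * zi for di, zi in zip(d, z))

# One period of a unit sine wave as a stand-in for a vibration signal.
signal = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
feats = time_features(signal)

# Hypothetical two-feature vectors recorded while the machine was healthy.
baseline = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
t2 = hotelling_t2(baseline, [2.0, 0.0])
print(feats, t2)
```

In a monitoring setting, each row of the baseline would be a feature vector such as the one returned by `time_features`, and a large $T^2$ for a new vector would flag a departure from the healthy state.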


Fused inverse regression with multi-dimensional responses

  • Cho, Youyoung;Han, Hyoseon;Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.28 no.3 / pp.267-279 / 2021
  • Regression with multi-dimensional responses is quite common nowadays in the so-called big data era. In such regression, dimension reduction of the predictors is essential to relieve the curse of dimensionality caused by the high dimension of the responses. Sufficient dimension reduction provides effective tools for the reduction, but there are few sufficient dimension reduction methodologies for multivariate regression. To fill this gap, we propose two fused slice-based inverse regression methods. The proposed approaches are robust to the numbers of clusters or slices and improve the estimation results over existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods. Real data analysis confirms the practical usefulness of the proposed methods.

Detection of Hotspots on Multivariate Spatial Data

  • Moon, Sung-Ho
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1181-1190 / 2006
  • Statistical analysis of spatial data is important in many fields. Spatial data are taken at specific locations or within specific regions, and their relative positions are recorded. Lattice data are synoptic observations covering an entire spatial region, such as cancer rates for each county in a state. Until now, echelon analysis has been applied only to univariate spatial data, so it has not been possible to detect hotspots in multivariate spatial data. In this paper, we extend the spatial data to a time series structure, analyze them in time-space, and detect the hotspots. An echelon dendrogram is constructed by piling up the multivariate spatial data to form time-spatial data. We perform a structural analysis of the temporal spatial data.


EXPERIMENTAL ANALYSIS OF DRIVING PATTERNS AND FUEL ECONOMY FOR PASSENGER CARS IN SEOUL

  • Sa, J.-S.;Chung, N.-H.;Sunwoo, M.-H.
    • International Journal of Automotive Technology / v.4 no.2 / pp.101-108 / 2003
  • Many factors influence automotive fuel economy, such as average trip time per kilometer, average trip speed, and the number of vehicle stops. These factors depend on road conditions and the traffic environment. In this study, various driving data were measured and recorded during road tests in Seoul. The accumulated road test mileage is around 1,300 kilometers. The objective of the study is to identify the driving patterns of the Seoul metropolitan area and to analyze fuel economy based on these driving patterns. The driving data acquired through road tests were analyzed statistically to obtain the driving characteristics via modal analysis, speed analysis, and speed-acceleration analysis. Moreover, the driving data were analyzed by multivariate statistical techniques, including correlation analysis, principal component analysis, and multiple linear regression analysis, to obtain the relationships between the factors influencing fuel economy. The results show that the average speed is around 29.2 km/h and the average fuel economy is 10.23 km/L. Vehicle speed in the Seoul metropolitan area is slower, and stop-and-go operation is more frequent, than in the FTP-75 test mode, which is used for emission and fuel economy tests. The average trip time per kilometer is one of the most important factors in fuel consumption, and an increase in average speed is desirable for reducing emissions and fuel consumption.