• Title/Summary/Keyword: multivariate data analysis


Some Diagnostic Results in Discriminant Analysis

  • Bae, Whasoo;Hwang, Soonyoung
    • Journal of the Korean Statistical Society / v.30 no.1 / pp.139-151 / 2001
  • Although much work has been done on influence diagnostics, results for multivariate analysis are quite rare. One recent work, by Fung (1995), concerns single-case influence diagnostics in linear discriminant analysis. In this paper we extend Fung's results to multiple-case diagnostics, which are necessary in linear discriminant analysis for two reasons, among others: first, the masking effect cannot be detected by single-case diagnostics; second, discriminant analysis involves two populations, so influential cases can occur in one or both populations.


Predicting Unknown Composition of a Mixture Using Independent Component Analysis

  • Lee, Hye-Seon;Park, Hae-Sang;Jun, Chi-Hyuck
    • Korean Data and Information Science Society: Conference Proceedings (한국데이터정보과학회:학술대회논문집) / 2005.04a / pp.127-134 / 2005
  • A suitable, conceptually simple representation of data is essential in statistics and signal processing for subsequent analyses such as prediction, pattern recognition, and spatial analysis. Independent component analysis (ICA) is a statistical method for transforming observed high-dimensional multivariate data into statistically independent components. ICA has been applied increasingly across a wide range of spectral applications because it can extract the unknown components of a mixture from spectra. We focus on applying ICA to separate independent sources and to predict each composition using the extracted components. The theory of ICA is introduced and an application to metal surface spectral data is described, in which a subsequent analysis using the non-negative least squares method predicts the composition ratio of each sample. Furthermore, some simulation experiments are performed to demonstrate the performance of the proposed approach.

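
The composition-prediction step of the abstract above can be illustrated with a minimal sketch. It assumes the component spectra have already been extracted (for example by ICA) and that each sample is a non-negative mixture of just two components. The function name `nnls_two`, the two-component restriction, and the toy spectra are illustrative assumptions, not the authors' implementation.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def nnls_two(spectrum, comp_a, comp_b):
    """Non-negative least squares for two components by enumerating candidates.

    Returns (xa, xb) >= 0 minimizing ||spectrum - xa*comp_a - xb*comp_b||^2.
    Assumes both component spectra are non-zero.
    """
    aa, bb, ab = dot(comp_a, comp_a), dot(comp_b, comp_b), dot(comp_a, comp_b)
    ay, by = dot(comp_a, spectrum), dot(comp_b, spectrum)

    # Boundary candidates: one or both coefficients fixed at zero.
    candidates = [(0.0, 0.0), (max(ay / aa, 0.0), 0.0), (0.0, max(by / bb, 0.0))]

    # Interior candidate: the unconstrained solution, kept only if non-negative.
    det = aa * bb - ab * ab
    if abs(det) > 1e-12:
        xa = (bb * ay - ab * by) / det
        xb = (aa * by - ab * ay) / det
        if xa >= 0 and xb >= 0:
            candidates.append((xa, xb))

    def sse(coefs):
        # Sum of squared residuals for a candidate pair of coefficients.
        return sum((y - coefs[0] * a - coefs[1] * b) ** 2
                   for y, a, b in zip(spectrum, comp_a, comp_b))

    return min(candidates, key=sse)


# Hypothetical component spectra and a sample mixed as 0.3*A + 0.7*B.
comp_a = [1.0, 0.0, 2.0, 0.0]
comp_b = [0.0, 1.0, 1.0, 3.0]
sample = [0.3 * a + 0.7 * b for a, b in zip(comp_a, comp_b)]

xa, xb = nnls_two(sample, comp_a, comp_b)
ratio_a = xa / (xa + xb)  # estimated share of component A in the sample
```

Enumerating the boundary and interior candidates is only practical for two components; a general-purpose solver would be the natural choice for more.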

Detecting outliers in multivariate data and visualization-R scripts (다변량 자료에서 특이점 검출 및 시각화 - R 스크립트)

  • Kim, Sung-Soo
    • The Korean Journal of Applied Statistics / v.31 no.4 / pp.517-528 / 2018
  • We provide R scripts to detect and visualize outliers in multivariate data. Outlier detection uses three approaches: 1) robust Mahalanobis distance, 2) methods for high-dimensional data, and 3) density-based methods. We visualize the detected potential outliers using the following techniques: 1) multidimensional scaling (MDS) and minimal spanning tree (MST) with k-means clustering, 2) MDS with fviz_cluster, and 3) principal component analysis (PCA) with fviz_cluster. As real data, we use MLB pitching data from 2013 and 2014, including Ryu, Hyun-jin. The developed R scripts and the R package can be downloaded at "http://www.knou.ac.kr/~sskim/ddpoutlier.html".
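
The distance-based idea behind the first approach can be sketched in a few lines. The example below is an illustrative assumption, not the paper's R scripts: it uses the classical sample mean and covariance on hypothetical 2-D data, whereas a robust variant would replace these with robust estimates of location and scatter.

```python
def mean_and_cov_2d(rows):
    """Sample mean and covariance (n - 1 denominator) of 2-D points."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


# Hypothetical data: eight points near the line y = x plus one point off it.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1),
        (6.0, 6.0), (7.0, 6.9), (8.0, 8.2), (4.5, 9.0)]

mean, cov = mean_and_cov_2d(data)
d2 = [mahalanobis_sq(p, mean, cov) for p in data]

# Index of the point with the largest squared distance (the outlier candidate).
most_extreme = max(range(len(data)), key=lambda i: d2[i])
```

Because the off-line point also enters the mean and covariance here, it inflates the very estimates used to judge it, which is the usual motivation for robust versions of this distance.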

Big Data Analysis Using Principal Component Analysis (주성분 분석을 이용한 빅데이터 분석)

  • Lee, Seung-Joo
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.6 / pp.592-599 / 2015
  • In a big data environment, we need a new approach to data analysis, because the characteristics of big data, such as volume, variety, and velocity, make it possible to analyze the entire data set when making inferences about a population. Traditional statistical methods, however, focus on small data, that is, a random sample drawn from the population, so classical statistical analyses are not well suited to big data analysis. To address this problem, we propose an approach to efficient big data analysis. In this paper, we consider big data analysis using principal component analysis, a popular method in multivariate statistics. To verify the performance of our approach, we carry out diverse simulation studies.
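
For readers unfamiliar with principal component analysis, the following minimal sketch shows the computation on 2-D data using the closed-form eigen-decomposition of a 2x2 covariance matrix. The function name `pca_2d` and the toy data are illustrative assumptions; the abstract does not describe the authors' actual procedure at this level of detail.

```python
import math


def pca_2d(rows):
    """Eigenvalues (descending) and unit first principal axis of 2-D data."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)

    # Eigenvalues of [[sxx, sxy], [sxy, syy]] from its trace and determinant.
    trace = sxx + syy
    det = sxx * syy - sxy * sxy
    gap = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    lam1, lam2 = trace / 2.0 + gap, trace / 2.0 - gap

    # Eigenvector for lam1; fall back to a coordinate axis if sxy is zero.
    if abs(sxy) > 1e-12:
        vx, vy = sxy, lam1 - sxx
    elif sxx >= syy:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (lam1, lam2), (vx / norm, vy / norm)


# Hypothetical correlated data.
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0)]
eigenvalues, first_axis = pca_2d(data)

# Share of total variance captured by the first principal component.
explained = eigenvalues[0] / (eigenvalues[0] + eigenvalues[1])
```

The explained-variance share is what makes PCA attractive for large data: a few components can stand in for many correlated variables.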

Categorization of the Body Types and Their Characteristics of Obese Korean Men (한국 비만 남성의 체형 분류 및 특성 분석)

  • Nam, Jong-Yong;Park, Sung-Joon;Jung, Eui-S.
    • Journal of the Ergonomics Society of Korea / v.26 no.4 / pp.103-111 / 2007
  • The purpose of this study is to categorize and analyze the body shapes of obese Korean men, as needed for industrial design. Using the anthropometric data surveyed in the 5th Size Korea project, the study was conducted in four steps, mostly through multivariate statistical analysis. In the first step, the Broca, BMI, and WHR indices were used to define obesity and to select obese men from Korean adults and teens. Then 34 anthropometric variables presumed to be related to obesity were extracted through an expert survey. In the second step, a factor analysis was performed on those anthropometric variables, yielding the body factors that represent obesity. In the third step, a cluster analysis was performed on the results of the factor analysis, and ANOVA was conducted to identify the critical anthropometric variables for obesity. In the final step, we characterized the body types of obese men by cluster and age. The body types of obese men classified in the study are expected to be applied to product design for clothing, furniture, automobile packaging, etc.

Multivariate Meta-Analysis Methods of Comparing the Sensitivity and Specificity of Two Diagnostic Tests (두 진단검사의 비교에 대한 민감도와 특이도의 다변량 메타분석법)

  • Nam, Seon-Young;Song, Hae-Hiang
    • Communications for Statistical Applications and Methods / v.18 no.1 / pp.57-69 / 2011
  • Researchers are continuously trying to find innovative diagnostic tests, and published articles are accumulating at an enormous rate in many medical fields. Meta-analysis enables previously published study results to be reviewed and summarized; therefore, an objective assessment of diagnostic tests can be made with a meta-analysis of sensitivities and specificities. Applying two diagnostic tests to a well-defined group of diseased patients produces a pair of sensitivities, and applying the same tests to a group of non-diseased subjects produces a pair of specificities. The statistical tests in the meta-analysis need to account for the correlation between the results of two diagnostic tests applied to the same diseased and non-diseased subjects. The associations between two diagnostic test results are often found to be unequal for the diseased and non-diseased subjects. In this paper, multivariate meta-analytic methods are studied by taking into account the different associations between correlated variables. On the basis of Monte Carlo simulations, we evaluate the performance of the multivariate meta-analysis methods proposed in this paper.
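
How a single study yields a pair of sensitivities and a pair of specificities can be shown with a short sketch. The function name `paired_positive_rates` and all counts are hypothetical; the example only illustrates the bookkeeping for two tests applied to the same subjects, not the meta-analytic methods of the paper.

```python
def paired_positive_rates(both_pos, only_a_pos, only_b_pos, both_neg):
    """Positive rates of tests A and B from a paired 2x2 table of counts."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    rate_a = (both_pos + only_a_pos) / n
    rate_b = (both_pos + only_b_pos) / n
    return rate_a, rate_b


# Hypothetical counts: (both positive, only A positive, only B positive, both negative).
diseased = (40, 5, 10, 5)
non_diseased = (3, 7, 5, 85)

# Among diseased subjects, the positive rate of each test is its sensitivity.
sens_a, sens_b = paired_positive_rates(*diseased)

# Among non-diseased subjects, the positive rate is the false-positive rate,
# so specificity is its complement.
fpr_a, fpr_b = paired_positive_rates(*non_diseased)
spec_a, spec_b = 1.0 - fpr_a, 1.0 - fpr_b
```

The off-diagonal counts (only one test positive) carry the within-subject association between the two tests, and they can differ between the diseased and non-diseased tables.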

School Safety Education Factors Predicting Injury Prevalence Among Korean Adolescence (학교의 안전교육 관련 특성이 청소년의 사고발생 예측에 미치는 영향)

  • 이명선;박경옥
    • Korean Journal of Health Education and Promotion / v.21 no.2 / pp.147-165 / 2004
  • Injury is a leading cause of death in child and adolescent populations. In particular, more than 80% of unintentional injuries were related to risk-taking behaviors involved in diverse accidents around school and home. Therefore, educational approaches should be provided for child and adolescent populations, and schools are essential and appropriate sites for safety education. This study was conducted to identify injury prevalence and safety education at schools among middle and high school students in Korea. About 1,034 middle and high school students in 28 schools participated in a self-administered survey. The target schools were selected by stratified random sampling from schools in seven metropolitan cities in Korea. The questionnaires were delivered to the vice-principals by ground mail, and the vice-principals administered the survey. The questionnaire asked about safety education provided in schools, injury experience in the last year, the need for injury prevention classes in school, and demographics. All survey responses were entered into an SPSS worksheet. Multivariate analysis of variance (MANOVA) and descriptive discriminant analysis (DDA) were performed with SPSS software 11.1, with MANOVA conducted as a preliminary analysis for DDA. According to the MANOVA results, gender (male), grade (poor), living with both parents, and the display of injury prevention messages on school news boards differed significantly between the injured and uninjured student groups (p = .00). These four factors also had significant effects on students' injury experience in DDA, although their correlation with injury experience was weak overall, based on their canonical function coefficients. All structure coefficients of the four factors were greater than .30, which means the four factors have discriminant effects on injury prevalence. In descending order of size, the discriminant effects came from gender, grade, living with both parents, and safety message display on school news boards.

The Relationship between Blood Transfusion and Mortality in Trauma Patients (외상환자에서 수혈과 사망의 연관성)

  • Choi, Se Young;Lee, Jun Ho;Choi, Young Cheol
    • Journal of Trauma and Injury / v.21 no.2 / pp.108-114 / 2008
  • Purpose: Using a propensity analysis, a recent study reported that blood transfusion might not be an independent predictor of mortality in critically ill patients, which contradicted the results of earlier studies. This study aims to determine whether blood transfusion is an independent predictor of mortality in trauma patients. Methods: A total of three hundred fifty consecutive trauma patients who were admitted to our emergency center from January 2004 to October 2005 and who underwent an arterial blood gas analysis and a venous blood analysis were included in this study. Their medical records were collected prospectively and retrospectively. Using a multivariate logistic analysis, data on the total population and on the propensity-score-matched population were retrospectively analyzed for association with mortality. Results: Of the three hundred fifty patients, one hundred twenty-nine (36.9%) received a blood transfusion. These patients were older (mean age: 48 vs. 44 years; p=0.019) and had a higher mortality rate (27.9% vs. 7.7%; p<0.001). In the total population, the multivariate analysis revealed that the Glasgow coma scale score, systolic blood pressure, bicarbonate, the need for respiratory support, a past medical history of heart disease, the amount of blood transfused in 24 hours, and hemoglobin were associated with mortality. In thirty-seven pairs of patients matched by propensity score, potassium, the new injury severity score, the amount of blood transfused in 24 hours, and pulse rate were associated with mortality in the multivariate analysis. Therefore, blood transfusion was a significant independent predictor of mortality in trauma patients. Conclusion: Blood transfusion was revealed to be a significant independent predictor of mortality in the total population of trauma patients and in the propensity-score-matched population.
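
The pairing step behind a propensity-score-matched population can be sketched as follows. The abstract does not state which matching algorithm was used, so greedy 1:1 nearest-neighbor matching with a caliper is an assumption made here for illustration, as are the function name `match_pairs`, the caliper of 0.05, and the scores, which in practice would come from a fitted model of the probability of receiving a transfusion.

```python
def match_pairs(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated, control: dicts mapping a patient id to a propensity score.
    Returns a list of (treated_id, control_id) pairs; each control is used
    at most once, and pairs further apart than the caliper are not formed.
    """
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control patient by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores for transfused (T) and non-transfused (C) patients.
transfused = {"T1": 0.30, "T2": 0.62, "T3": 0.90}
not_transfused = {"C1": 0.28, "C2": 0.35, "C3": 0.60, "C4": 0.10}

pairs = match_pairs(transfused, not_transfused)
```

Patients without a close enough counterpart are dropped, which is why a matched population can be much smaller than the total one, as with the thirty-seven pairs out of three hundred fifty patients reported above.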