• Title/Abstract/Keywords: multivariate data analysis

1,402 search results (processing time: 0.031 seconds)

Some Diagnostic Results in Discriminant Analysis

  • Bae, Whasoo;Hwang, Soonyoung
    • Journal of the Korean Statistical Society / Vol. 30, No. 1 / pp. 139-151 / 2001
  • Although much work has been done on influence diagnostics, results for multivariate analysis are quite rare. One recent work, by Fung (1995), concerns single-case influence diagnostics in linear discriminant analysis. In this paper we extend Fung's results to multiple-case diagnostics, which are necessary in linear discriminant analysis for two reasons, among others: first, the masking effect cannot be detected by single-case diagnostics; second, discriminant analysis involves two populations, so influential cases can occur in one or both populations.

  • PDF

Predicting Unknown Composition of a Mixture Using Independent Component Analysis

  • Lee, Hye-Seon;Park, Hae-Sang;Jun, Chi-Hyuck
    • Korean Data and Information Science Society: Conference Proceedings / Korean Data and Information Science Society 2005 Spring Conference / pp. 127-134 / 2005
  • In statistics and signal processing, a suitable, conceptually simple representation of the data is essential for subsequent analysis such as prediction, pattern recognition, and spatial analysis. Independent component analysis (ICA) is a statistical method for transforming observed high-dimensional multivariate data into statistically independent components. ICA is increasingly applied across a wide range of spectral applications because it can extract the unknown components of a mixture from spectra. We focus on applying ICA to separate independent sources and to predict each composition using the extracted components. The theory of ICA is introduced and an application to metal surface spectral data is described, in which a subsequent analysis using the non-negative least squares method predicts the composition ratio of each sample. Furthermore, simulation experiments are performed to demonstrate the performance of the proposed approach.

  • PDF

Detecting outliers in multivariate data and visualization - R scripts

  • 김성수
    • The Korean Journal of Applied Statistics / Vol. 31, No. 4 / pp. 517-528 / 2018
  • This paper provides R scripts that detect outliers in multivariate data and link the detected outliers to visualization. The scripts detect outliers using three methods: 1) robust Mahalanobis distance, 2) a method for high-dimensional data, and 3) a density-based approach. To reveal the data structure while linking the outliers, three visualization methods are used: 1) multidimensional scaling (MDS) and a minimal spanning tree (MST) displayed together with K-means clustering, 2) MDS combined with fviz_cluster, and 3) principal component analysis (PCA) combined with fviz_cluster. As a case study, Major League Baseball (MLB) pitcher data from 2013 and 2014, when Hyun-Jin Ryu was actively playing, are used. The R scripts can be downloaded from "http://www.knou.ac.kr/~sskim/ddpoutlier.html", where the R scripts and the R package are available along with instructions for running them.
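The distance-based detection step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the author's R script: it uses the classical (non-robust) mean and covariance rather than a robust estimator, handles only two variables, and the function name `mahalanobis_outliers`, the `alpha` parameter, and the sample points are all assumptions made for illustration.

```python
import math


def mahalanobis_outliers(points, alpha=0.025):
    """Return (squared distance, is_outlier) for each 2-D point.

    Uses the classical sample mean and covariance (an assumption; the
    paper uses a robust variant) and a chi-square cutoff with 2 degrees
    of freedom.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n

    # Sample covariance matrix entries (denominator n - 1).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    # For 2 degrees of freedom the chi-square upper quantile has the
    # closed form -2 * ln(alpha).
    cutoff = -2.0 * math.log(alpha)

    results = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = [dx dy] S^{-1} [dx dy]^T, written out for a 2x2 matrix.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        results.append((d2, d2 > cutoff))
    return results


# Hypothetical data: a 4 x 4 grid of points plus one distant point.
points = [(x, y) for x in range(4) for y in range(4)] + [(10, 10)]
results = mahalanobis_outliers(points)
```

With this hypothetical data only the last point, (10, 10), exceeds the cutoff. Because a single extreme point inflates the classical covariance estimate, several outliers can mask one another, which is one reason to prefer a robust estimator in practice.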

Big Data Analysis Using Principal Component Analysis

  • 이승주
    • Journal of Korean Institute of Intelligent Systems / Vol. 25, No. 6 / pp. 592-599 / 2015
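  • The need for new methods of analyzing big data in a big data environment is growing. This is because the characteristics of big data, such as its size, variety, and loading speed, have made it possible to analyze the entire data set when making inferences about a population. Traditional statistical methods, however, focus on random samples drawn from a population. Consequently, existing statistical approaches are sometimes unsuitable for big data analysis. To address this problem, this paper proposes a new approach to big data analysis. In particular, we study a methodology for efficient big data analysis using principal component analysis, a representative multivariate statistical technique. Statistical simulations were carried out to evaluate the performance of the proposed method.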

Categorization of the Body Types and Their Characteristics of Obese Korean Men

  • 남종용;박성준;정의승
    • Journal of the Ergonomics Society of Korea / Vol. 26, No. 4 / pp. 103-111 / 2007
  • The purpose of this study is to categorize and analyze the body shapes of obese Korean men, as needed for industrial design. Using the anthropometric data surveyed in the 5th Size Korea project, the study was conducted in four steps, mostly through multivariate statistical analysis. In the first step, the Broca, BMI, and WHR indices were used to define obesity and to select obese men from Korean adults and teens. Then 34 anthropometric variables thought to be related to obesity were extracted through an expert survey. In the second step, a factor analysis was performed on those anthropometric variables, yielding the body factors related to the representation of obesity. In the third step, a cluster analysis was applied to the result of the factor analysis, and an ANOVA was conducted to obtain the critical anthropometric variables for obesity. In the final step, we identified the characteristics of the body types of obese men according to cluster and age. The body types of obese men classified in the study are expected to be applied to product design for clothing, furniture, automobile packaging, etc.

Multivariate Meta-Analysis Methods of Comparing the Sensitivity and Specificity of Two Diagnostic Tests

  • 남선영;송혜향
    • Communications for Statistical Applications and Methods / Vol. 18, No. 1 / pp. 57-69 / 2011
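  • Medical researchers are continually developing new diagnostic tests for diseases, and published studies comparing an existing diagnostic test with a new one continue to accumulate. Meta-analysis combines the results of many studies to reach an objective conclusion about which diagnostic test is more accurate. Each published study comparing two diagnostic tests applies both tests to individuals with the disease and to individuals without the disease, and compares the resulting pair of sensitivities and pair of specificities. Because both tests are applied to the same individuals, this paper presents meta-analysis methods for combining such studies that account for the association between the paired sensitivities and the association between the paired specificities. The efficiency of the meta-analysis test statistics is evaluated using data from a published example and a simulation study.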
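To see why the pairing matters within a single study, the following Python sketch compares the variance of the difference between two sensitivities when the pairing is taken into account and when it is ignored. It is a hedged illustration using the standard variance formula for a difference of paired proportions, not the test statistics proposed in the paper; the function name and all counts are hypothetical.

```python
def paired_proportion_difference(both_correct, only_a_correct,
                                 only_b_correct, both_wrong):
    """Compare two tests applied to the same individuals.

    For diseased individuals "correct" means a positive result, so the
    proportions are sensitivities; for non-diseased individuals
    "correct" means a negative result, giving specificities.
    """
    n = both_correct + only_a_correct + only_b_correct + both_wrong
    p_a = (both_correct + only_a_correct) / n
    p_b = (both_correct + only_b_correct) / n
    diff = p_a - p_b

    # Discordant-pair proportions drive the paired variance.
    p10 = only_a_correct / n
    p01 = only_b_correct / n
    var_paired = (p10 + p01 - (p10 - p01) ** 2) / n

    # Variance if the two proportions came from independent samples.
    var_independent = (p_a * (1 - p_a) + p_b * (1 - p_b)) / n
    return p_a, p_b, diff, var_paired, var_independent


# Hypothetical counts for 100 diseased individuals given tests A and B.
sens_a, sens_b, diff, var_paired, var_independent = (
    paired_proportion_difference(both_correct=75, only_a_correct=15,
                                 only_b_correct=5, both_wrong=5)
)
```

In this made-up example the two tests tend to agree on the same individuals, so the paired variance is smaller than the variance obtained by treating the sensitivities as independent. Ignoring that association would understate the precision of the comparison.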

School Safety Education Factors Predicting Injury Prevalence Among Korean Adolescence

  • 이명선;박경옥
    • Korean Journal of Health Education and Promotion / Vol. 21, No. 2 / pp. 147-165 / 2004
  • Injury is a leading cause of death among children and adolescents. In particular, more than 80% of unintentional injuries were related to risk-taking behaviors involved in diverse accidents around school and home. Educational approaches should therefore be provided for children and adolescents, and schools are essential and appropriate sites for safety education. This study was conducted to identify injury prevalence and safety education at schools among middle and high school students in Korea. About 1,034 middle and high school students in 28 schools participated in a self-administered survey. The target schools were selected by stratified random sampling from schools in seven metropolitan cities in Korea. The questionnaires were delivered to the vice-principals by ground mail, and the vice-principals administered the data collection. The questionnaire asked about safety education provided in schools, injury experience in the last year, the need for an injury prevention class in school, and demographics. All survey responses were entered into an SPSS worksheet. Multivariate analysis of variance (MANOVA) and descriptive discriminant analysis (DDA) were carried out with SPSS 11.1, with MANOVA conducted as a preliminary analysis for DDA. According to the MANOVA results, gender (male), grade (poor), living with both parents, and the display of injury prevention messages on school news boards differed significantly between the injured student group and the uninjured student group (p = .00). These four factors also had significant effects on students' injury experience in DDA, although the correlation of the four factors with injury experience was weak overall, based on their canonical function coefficients. All structure coefficients of the four factors were greater than .30, which means the four factors have discriminant effects on injury prevalence. In decreasing order of discriminant effect, the factors were gender, grade, living with both parents, and safety message display on school news boards.

The Relationship between Blood Transfusion and Mortality in Trauma Patients

  • 최세영;이준호;최영철
    • Journal of Trauma and Injury / Vol. 21, No. 2 / pp. 108-114 / 2008
  • Purpose: Using a propensity analysis, a recent study reported that blood transfusion might not be an independent predictor of mortality in critically ill patients, which contradicted the results of earlier studies. This study aims to determine whether blood transfusion is an independent predictor of mortality in trauma patients. Methods: A total of 350 consecutive trauma patients who were admitted to our emergency center from January 2004 to October 2005 and who underwent an arterial blood gas analysis and a venous blood analysis were included in this study. Their medical records were collected prospectively and retrospectively. Using a multivariate logistic analysis, data on the total population and on the propensity-score-matched population were retrospectively analyzed for association with mortality. Results: Of the 350 patients, 129 (36.9%) received a blood transfusion. These patients were older (mean age: 48 vs. 44 years; p=0.019) and had a higher mortality rate (27.9% vs. 7.7%; p<0.001). In the total population, the multivariate analysis revealed that the Glasgow coma scale score, systolic blood pressure, bicarbonate, the need for respiratory support, past medical history of heart disease, the amount of blood transfused over 24 hours, and hemoglobin were associated with mortality. In 37 pairs of patients matched on propensity score, potassium, the new injury severity score, the amount of blood transfused over 24 hours, and pulse rate were associated with mortality in the multivariate analysis. Therefore, blood transfusion was a significant independent predictor of mortality in trauma patients. Conclusion: Blood transfusion was revealed to be a significant independent predictor of mortality in the total population of trauma patients and in the propensity-score-matched population.
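The abstract does not say how the propensity-score-matched pairs were formed. As one common possibility, the Python sketch below performs greedy 1:1 nearest-neighbor matching within a caliper. The function name `greedy_match`, the caliper value, the patient identifiers, and the scores are all hypothetical, and the propensity scores are assumed to have been estimated beforehand (for example by logistic regression).

```python
def greedy_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated, control: dicts mapping a patient id to a propensity score.
    Returns a list of (treated_id, control_id) pairs; each control is
    used at most once, and pairs farther apart than the caliper are
    not formed.
    """
    available = dict(control)  # controls not yet matched
    pairs = []
    # Process treated patients in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores: transfused patients (T*) and non-transfused (C*).
treated = {"T1": 0.30, "T2": 0.62, "T3": 0.91}
control = {"C1": 0.28, "C2": 0.33, "C3": 0.60, "C4": 0.10}
pairs = greedy_match(treated, control)
```

Here T3 has no control within the caliper and is left unmatched. Dropping such patients is how a matched analysis can end up with far fewer pairs than there are treated patients, as with the 37 pairs drawn from 129 transfused patients in this study.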