• Title/Summary/Keyword: multivariate data analysis

KCYP data analysis using Bayesian multivariate linear model (베이지안 다변량 선형 모형을 이용한 청소년 패널 데이터 분석)

  • Insun, Lee;Keunbaik, Lee
    • The Korean Journal of Applied Statistics / v.35 no.6 / pp.703-724 / 2022
  • Although longitudinal studies mainly produce multivariate longitudinal data, most existing statistical models analyze univariate longitudinal data and are therefore limited in explaining complex correlations properly. This paper describes various methods of modeling the covariance matrix to explain such correlations; among them, the modified Cholesky decomposition, the modified Cholesky block decomposition, and the hypersphere decomposition are reviewed. The Korean children and youth panel (KCYP) data are then analyzed using the Bayesian method. The KCYP data are multivariate longitudinal data with three response variables: school adaptation, academic achievement, and dependence on mobile phones. Several models that assume different correlation structures and innovation standard deviation structures are compared. In the most suitable model, all explanatory variables are significant for school adaptation and academic achievement, and only household income is insignificant when cell phone dependence is the response variable.
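
The modified Cholesky decomposition named in this abstract can be illustrated with a minimal sketch. It assumes the standard form T Σ Tᵀ = D, where T is unit lower triangular and D is diagonal (the innovation variances). The function names and the 3x3 covariance matrix are hypothetical and are not taken from the paper, which works with Bayesian models built on this kind of factorization rather than with a direct numerical routine.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def modified_cholesky(sigma):
    """Return (T, d) with T unit lower triangular and T * sigma * T^T = diag(d).

    sigma must be symmetric positive definite.
    """
    n = len(sigma)
    # Step 1: LDL^T factorization, sigma = L * diag(d) * L^T.
    L = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        L[j][j] = 1.0
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - s) / d[j]
    # Step 2: T = L^{-1} by forward substitution (L has a unit diagonal).
    T = [[0.0] * n for _ in range(n)]
    for c in range(n):
        for r in range(n):
            rhs = 1.0 if r == c else 0.0
            T[r][c] = rhs - sum(L[r][k] * T[k][c] for k in range(r))
    return T, d


# Hypothetical covariance matrix of three repeated measurements.
sigma = [[4.0, 2.0, 1.0],
         [2.0, 3.0, 1.0],
         [1.0, 1.0, 2.0]]
T, d = modified_cholesky(sigma)
```

In this form, the below-diagonal entries of T are the negatives of the coefficients obtained by regressing each measurement on its predecessors, and d holds the innovation variances, whose square roots are the innovation standard deviations mentioned in the abstract.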

A Study on the Principal Component Analysis of Anthropometric Data (인체계측치(人體計測値)의 주성분분석(主成分分析)에 관한 연구(硏究))

  • Lee, Sang-Do;Jeong, Jung-Hui;Kim, Geuk-Bae
    • Journal of the Ergonomics Society of Korea / v.2 no.1 / pp.3-11 / 1983
  • Anthropometric data are the most basic material in all related studies. Therefore, anthropometric data call for not only consideration of the state of variance but also more varied analysis. This study selected 13 body parts that properly show the overall characteristics of the human body, and anthropometric data were obtained through actual measurements of male and female workers engaged in production factories. To interpret the anthropometric data, principal component analysis, one of the multivariate analysis methods, was applied.

Multivariate confidence region using quantile vectors

  • Hong, Chong Sun;Kim, Hong Il
    • Communications for Statistical Applications and Methods / v.24 no.6 / pp.641-649 / 2017
  • Multivariate confidence regions have been defined using a chi-square distribution function under a normality assumption and are represented as ellipses and ellipsoids for bivariate and trivariate normal distributions, respectively. In this work, an alternative confidence region using multivariate quantile vectors is proposed that can be defined for the normal distribution as well as for any other distribution. Lower and upper bounds are obtained using quantile vectors, and the region between the two bounds is referred to as the quantile confidence region. Note that the upper and lower bounds of the bivariate and trivariate quantile confidence regions are represented as curves and surfaces, respectively. The quantile confidence region is obtained for various symmetric and asymmetric distribution functions, and its coverage rate is calculated and compared. We conclude that the quantile confidence region will be useful for the analysis of multivariate data, since it is found to have better coverage rates, even for asymmetric distributions.
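
For orientation, the classical chi-square region that this abstract takes as its starting point can be sketched as follows. This is not the quantile confidence region the authors propose. It assumes a bivariate normal distribution with known mean and covariance, and uses the fact that the chi-square distribution with 2 degrees of freedom has the closed-form quantile -2 ln(1 - p). All function names and numbers are illustrative.

```python
import math


def chi2_quantile_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    Its CDF is 1 - exp(-x / 2), so the quantile has a closed form.
    """
    return -2.0 * math.log(1.0 - p)


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from the mean."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Uses the explicit inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def in_confidence_ellipse(x, mean, cov, p=0.95):
    """True if x lies inside the 100p% chi-square ellipse."""
    return mahalanobis_sq(x, mean, cov) <= chi2_quantile_2df(p)


# Hypothetical standard bivariate normal.
mean = (0.0, 0.0)
cov = ((1.0, 0.0), (0.0, 1.0))
inside = in_confidence_ellipse((2.0, 0.0), mean, cov)   # distance^2 = 4
outside = in_confidence_ellipse((3.0, 0.0), mean, cov)  # distance^2 = 9
```

The ellipse is symmetric about the mean by construction, which is the limitation for asymmetric distributions that motivates the quantile-based alternative described above.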

Inverter-Based Solar Power Prediction Algorithm Using Artificial Neural Network Regression Model (인공 신경망 회귀 모델을 활용한 인버터 기반 태양광 발전량 예측 알고리즘)

  • Gun-Ha Park;Su-Chang Lim;Jong-Chan Kim
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.2 / pp.383-388 / 2024
  • This paper derives predicted power generation values from photovoltaic power generation data measured in Jeollanam-do, South Korea. Multivariate variables such as direct current, alternating current, and environmental data were measured at the inverter to determine the amount of power generation, and pre-processing was performed to ensure the stability and reliability of the measured values. In the correlation analysis, the partial autocorrelation function (PACF) was used to select, from the time series data, only the data highly correlated with power generation for prediction. Deep learning models were used to predict the amount of photovoltaic power generation, and the correlation analysis results for each variable were used to increase prediction accuracy. Learning with the refined data was more stable than learning with the raw data, and the solar power generation prediction algorithm was improved by using only the highly correlated variables identified by the correlation analysis.

Long-term Prognosis in Hepatocellular Carcinoma Patients after Hepatectomy

  • Zhou, Lei;Liu, Chang;Meng, Fan-Di;Qu, Kai;Tian, Feng;Tai, Ming-Hui;Wei, Ji-Chao;Wang, Rui-Tao
    • Asian Pacific Journal of Cancer Prevention / v.13 no.2 / pp.483-486 / 2012
  • Background: Hepatocellular carcinoma (HCC) is very common in China. Our aim in this report was to investigate clinical and pathological factors, based on data from the current decade, that could influence the prognosis of HCC patients after hepatectomy. Methods: All patients who underwent hepatectomy for HCC between 2002 and 2009 were followed up and reviewed retrospectively. Prognostic factors were studied by univariate and multivariate analysis, with Kaplan-Meier and Cox multivariate survival analyses. Results: Complete clinicopathologic and follow-up data were available for 114 patients. The estimated cumulative survival rates at 1, 3, and 5 yr were 84.6%, 60.2% and 51.8%, respectively. On univariate analysis, key prognostic factors were AFP level, GGT level, tumor size, number of tumors, portal vein invasion, liver cirrhosis status and TNM stage. In the multivariate analysis, tumor size, GGT level, liver cirrhosis status and portal vein invasion were significantly associated with patients' prognosis. Conclusion: Through follow-up of a relatively large cohort of Chinese patients, tumor size, GGT level, liver cirrhosis status and portal vein invasion were revealed as important factors for long-term survival after hepatectomy. Early tumor diagnosis and improvement of liver function before surgery are important ways to improve the prognosis.
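
Cumulative survival rates of the kind reported in this abstract are typically obtained with the Kaplan-Meier product-limit estimator. The sketch below shows the estimator on made-up follow-up times; the data, the function name, and the 1/0 event coding are assumptions for illustration and have nothing to do with the 114 patients in the study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times  : follow-up time of each subject
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up in years; 0 marks a censored observation.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored subjects contribute to the risk set up to their last follow-up but never reduce the survival estimate themselves, which is why the curve only steps down at observed event times.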

A study on the efficiency of multidimensional scaling using bootstrap method (붓스트랩을 이용한 다차원척도법의 효율성 연구)

  • Kim, Woo-Jong;Kang, Kee-Hoon
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.301-309 / 2009
  • Multidimensional scaling (MDS) is a multivariate statistical analysis technique that is often used in information visualization for exploring similarities or dissimilarities in data. In order to analyse and visualize data, MDS measures the dissimilarities between objects and uses them, or their mean if they are repeatedly measured. When outliers exist or when the variation of the data is too large, MDS can hardly yield reliable results. In this paper, we consider MDS based on the bootstrap method when the variation of the data is large. The standardized residual sum of squares is used to measure the goodness-of-fit of the model. A real data analysis is included to examine our approach.

An Analysis of Engine Failures Using Multivariate Data Analysis Method (다변량해석법을 이용한 기관고장분석)

  • 윤석훈
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.23 no.4 / pp.198-203 / 1987
  • The basis of all approaches to improving the reliability of marine engines lies in analyzing field data on their troubles and failures. This paper analyzes such data by principal component analysis, one of the multivariate data analysis methods. A total of 211 records were investigated over an observation period of 9 years. The analyzed factors are categorized into five groups: electric and automatic control equipment, auxiliary machinery, piping, refrigerators and air conditioners, and main engine. Failures in the main engine are discovered through definite evidence of disorder; by contrast, failures in auxiliary machinery, refrigerators, and air conditioners are discovered through the sensible judgement of the operators.

The Evaluation of Water Quality in the Mankyung River using Multivariate Analysis (다변량해석기법을 이용한 수계의 수질평가)

  • O, Yeon Chan;Lee, Nam Do;Kim, Jong Gu
    • Journal of Environmental Science International / v.13 no.3 / pp.233-244 / 2004
  • This study was conducted to evaluate water quality in the Mankyung River using multivariate analysis. The analysis data, surveyed in the Mankyung River from January 1996 to December 2002, were acquired by the Ministry of Environment. Twelve water quality parameters were determined in each survey. The results are summarized as follows. Water quality in the Mankyung River could be explained up to 74.90% by four factors: loading of organic matter and nutrients by the tributaries (43.28%), seasonal variation (10.40%), loading of pathogenic bacteria by domestic sewage of Gapcheon (12.41%), and internal metabolism in the river (8.81%). Cluster analysis by station yielded three groups with different water quality characteristics. In particular, the Iksan River showed water quality characteristics considerably different from those of the other stations. Monthly cluster analysis yielded three groups according to seasonal characteristics, and yearly cluster analysis also yielded three groups. For water quality management of the Mankyung River, it is necessary to control pollutant loadings from domestic sewage and livestock waste.

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data with a nonlinear relationship between the explanatory variables and the response variables. Furthermore, as the complexity of data increases, multiple response variables need to be analysed simultaneously with more sophisticated interpretations. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split variable selection algorithm is suggested for a multivariate regression tree model. This algorithm can select the split variable more accurately than the previous method without significant selection bias. We investigate the performance of the proposed method with both simulation and real data studies.
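
As background to this abstract, quantile regression is usually fitted by minimizing the check (pinball) loss ρ_τ(u) = u(τ - 1{u < 0}). The sketch below shows only that loss and the fact that its constant minimizer is a sample quantile. It is not the tree model or the split selection algorithm proposed in the paper, and the function names and data are hypothetical.

```python
def check_loss(u, tau):
    """Check (pinball) loss for residual u at quantile level tau."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_check_loss(q, ys, tau):
    """Sum of check losses when every response is predicted by the constant q."""
    return sum(check_loss(y - q, tau) for y in ys)


def best_constant(ys, tau):
    """Constant prediction minimizing the total check loss.

    A minimizer is always attained at one of the observed values,
    so a search over the data points is sufficient.
    """
    return min(sorted(ys), key=lambda q: total_check_loss(q, ys, tau))


# Hypothetical responses with one large outlier.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(ys, 0.5)
upper_fit = best_constant(ys, 0.9)
```

With tau = 0.5 the loss weights positive and negative residuals equally and the fit is the median, which is unaffected by the outlier; raising tau penalizes under-prediction more heavily and moves the fit toward the upper tail.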

The Relationship between Private Tutoring and Academic Achievement - An Application of a Multivariate Latent Growth Model -

  • Nam, Su-Jung
    • International Journal of Human Ecology / v.14 no.1 / pp.29-39 / 2013
  • This study examined how changes in time invested in private tutoring and changes in academic achievement influenced each other, using a multivariate latent growth model and data from the first to the third year of the KYPS. It identifies not only how changes in the private tutoring experience directly influenced changes in academic achievement, but also what kinds of changes in private tutoring and academic achievement emerged over time. The detailed results are as follows. First, for Korean language study, the higher the grade, the greater the amount of time invested in private tutoring. For English and mathematics, on the other hand, the higher the grade, the smaller the amount of time invested in private tutoring. Second, private tutoring and academic achievement were both in a linear relationship. Third, the time invested in private tutoring and academic achievement exerted a negative influence on each other over time.