• Title/Abstract/Keywords: multivariate data analysis

Short-term Wind Farm Power Forecasting Using Multivariate Analysis to Improve Wind Power Efficiency

  • 위영민
    • 조명전기설비학회논문지
    • /
    • Vol. 29, No. 7
    • /
    • pp.54-61
    • /
    • 2015
  • This paper presents a short-term wind farm power forecasting method using multivariate analysis and time series modeling. Using factor analysis, the proposed method constructs new independent variables from raw independent variables such as wind speed, ramp rate, and wind power. The newly created variables are then used in a time series model to forecast wind farm power. To demonstrate the improved accuracy, the proposed method is compared with the persistence model, which is commonly used as a reference in wind power forecasting, using data from Jeju Island. Case study results are presented to show the effectiveness of the proposed forecasting method.
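
The persistence reference mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the power values are hypothetical, and the use of mean absolute error as the comparison metric is an assumption, since the abstract does not name one.

```python
def persistence_forecast(series, horizon=1):
    """Persistence model: the forecast for step t + horizon is the value observed at step t."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return series[:-horizon]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def improvement_over_reference(model_error, reference_error):
    """Fraction by which a model's error is lower than the reference error."""
    return 1.0 - model_error / reference_error


# Hypothetical hourly wind farm output in MW (illustrative values, not from the paper).
power = [12.0, 14.0, 13.0, 17.0, 16.0, 11.0]
horizon = 1

reference_forecast = persistence_forecast(power, horizon)
observed = power[horizon:]
persistence_mae = mean_absolute_error(observed, reference_forecast)
print(f"persistence MAE: {persistence_mae:.2f} MW")
```

A candidate forecasting model would be scored on the same observed values, and `improvement_over_reference` then expresses its error relative to persistence.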

Multivariate Analysis of the Prognosis of 37 Chondrosarcoma Patients

  • Yang, Zheng-Ming;Tao, Hui-Min;Ye, Zhao-Ming;Li, Wei-Xu;Lin, Nong;Yang, Di-Sheng
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 13, No. 4
    • /
    • pp.1171-1176
    • /
    • 2012
  • Objective: The current study aimed to screen for possible factors that affect the prognosis of chondrosarcoma. Methods: Thirty-seven cases were selected and analyzed statistically. The patients received surgical treatment at our hospital between December 2005 and March 2008. All of them had complete follow-up data. Survival rates were calculated in univariate analysis using the Kaplan-Meier method and compared with the log-rank test. $\chi^2$ or Fisher exact tests were carried out for the enumeration data. The indexes that were significant in univariate analysis were then analyzed by multivariate analysis using the Cox regression model. Based on the literature, the factors of gender, age, disease course, tumor location, Enneking grade, surgical approach, distant metastasis and local recurrence were examined. Results: Univariate analysis showed significant differences in the patients' 3-year survival rate after surgery by Enneking grade, surgical approach and distant metastasis (P<0.001). No significant difference was found for gender, age, disease course, tumor location or local recurrence (P>0.05). Multivariate analysis showed that Enneking grade (P=0.007) and surgical approach (P=0.010) were independent factors affecting the prognosis of chondrosarcoma, but distant metastasis was not (P=0.942). Conclusion: Enneking grade, surgical approach and distant metastasis are risk factors for the prognosis of chondrosarcoma, among which the former two are independent factors.
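
As a rough illustration of the Kaplan-Meier step named in this abstract, the product-limit estimate can be computed as below. The follow-up times and event flags are hypothetical and are not the study's patient data; the function name and data layout are assumptions made for this sketch.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up in months (illustrative only).
months = [6, 12, 12, 20, 30, 36]
died = [1, 1, 0, 1, 0, 0]

curve = kaplan_meier(months, died)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.3f}")
```

Censored patients leave the risk set without changing the survival estimate, which is why only months with an observed death appear in the curve.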

Use of Multivariate Statistical Approaches for Decoding Chemical Evolution of Groundwater near Underground Storage Caverns

  • 이정훈
    • 한국지구과학회지
    • /
    • Vol. 35, No. 4
    • /
    • pp.225-236
    • /
    • 2014
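  • Multivariate statistical techniques have been widely used to analyze and interpret hydrogeochemical data. In this study, correspondence analysis and principal component analysis were used together to examine the characteristics of groundwater affected by anthropogenic activities. The objective of this study is to calculate the speciation of groundwater chemical components using WATEQ4F within the NETPATH program and to extract geochemical information from the results using multivariate statistical techniques. The study area is an LPG storage facility in Ulsan, in the southeastern part of the Korean Peninsula. Groundwater with a hyperalkaline composition, as observed at other storage facilities, was observed in the study area. Such high-pH groundwater, which results from anthropogenic influence, can affect the speciation of Al and induce the precipitation of carbonates. This study focused on how the hydrogeochemical characteristics and phases change under two anthropogenic factors that affect groundwater in the study area (cleaning activity and the influence of cement). A comparison of previous findings with the results of the two statistical analyses shows that principal component analysis and correspondence analysis based on geochemical information can serve as a basis for hydrogeochemical studies.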

A Study on Forest Land Classification Using Multivariate Statistical Methods: A Case Study at Mt. Kwanak

  • 정순오
    • 한국조경학회지
    • /
    • Vol. 13, No. 1
    • /
    • pp.43-66
    • /
    • 1985
  • Korea needs proper and rational public policies on the conservation and use of forest land and other natural resources because of the accelerating expansion of national land development in recent years. Unfortunately, there is no systematic planning system to support these needs. Generally, forest land use planning needs suitability analysis based on an efficient land classification system. The goal of this study was to classify forest land using multivariate statistical methods. A case study was carried out in the winter of 1983 on a mountainous area higher than 100 m above sea level located at Mt. Kwanak in Anyang-city, Kyung-gi-do (province). The study area covered 19.80 km$^2$ and was divided into 1,383 Operational Taxonomic Units (OTUs) by a 120 m $\times$ 120 m grid. Fourteen descriptors were identified and quantified for each OTU from existing national land data: elevation, slope, aspect, terrain form, geologic material, surface soil permeability, topsoil type, depth of the solum, soil acidity, forest cover type, stand size class, stand age class, stand density class, and simple forest soil capability class. For this study, a FORTRAN IV program was written for the input and output of map data, and the statistics packages SPSS and BMD were used to perform the multivariate statistical analysis. The fourteen variables were analyzed to investigate the characteristics of their frequency distributions and to estimate the correlation coefficients among them. Principal component analysis was executed to find the dimensions of forest land characteristics, and factor scores were used to select proper samples of OTUs throughout the study area. In order to develop the classes of forest land classification based on 102 surrogates, cluster and discriminant analyses of the principal descriptor variable matrix were undertaken. Results obtained through a series of multivariate statistical analyses were as follows: 1) Principal component analysis proved to be a useful tool for data selection and for identifying the principal descriptor variables, which represented the characteristics of forest land and facilitated the selection of samples.
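
The principal component step described in this abstract can be illustrated with a small sketch. It standardizes three hypothetical descriptors, builds their correlation matrix, and extracts the first component by power iteration. The descriptor values are invented for illustration, and power iteration is a stand-in technique chosen to keep the example dependency-free; the study itself used SPSS and BMD.

```python
import math


def standardize(rows):
    """Convert each column to zero mean and unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)] for i in range(p)]


def first_component(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix by power iteration."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v


# Hypothetical OTUs: (elevation in m, slope in degrees, solum depth in cm).
otus = [(120, 10, 80), (200, 18, 60), (310, 25, 45), (420, 33, 30), (150, 14, 70)]

z = standardize(otus)
eigenvalue, loadings = first_component(correlation_matrix(z))
explained = eigenvalue / len(loadings)
scores = [sum(a * b for a, b in zip(row, loadings)) for row in z]
print(f"first component explains {explained:.1%} of the variance")
```

The component scores in `scores` play the role the abstract assigns to factor scores: a single summary value per OTU that could guide sample selection.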

Complex sample design effects and inference for Korea National Health and Nutrition Examination Survey data

  • 정진은
    • Journal of Nutrition and Health
    • /
    • Vol. 45, No. 6
    • /
    • pp.600-612
    • /
    • 2012
  • Nutritional researchers worldwide are using large-scale sample survey methods to study nutritional health epidemiology and services utilization in general, non-clinical populations. This article reviews important statistical methods and software that apply to descriptive and multivariate analysis of data collected in sample surveys, such as national health and nutrition examination surveys. A comparative data analysis of the Korea National Health and Nutrition Examination Survey (KNHANES) is used to illustrate analytical procedures and design effects for survey estimates of population statistics, model parameters, and test statistics. The article focuses on the following points: approaches to analyzing complex sample survey data, appropriate software tools for performing these analyses, and the correct survey analysis methods needed to interpret survey data. The latest developments in software tools for the analysis of complex sample survey data are covered, and empirical examples are presented that illustrate the impact of survey sample design effects on the parameter estimates, test statistics, and significance probabilities (p values) for univariate and multivariate analyses.
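
To give a feel for what a design effect measures, the sketch below computes Kish's approximate design effect due to unequal weighting, $n \sum w_i^2 / (\sum w_i)^2$, and the corresponding effective sample size. This covers only the weighting component; the full design effect discussed in the article also reflects stratification and clustering, which require variance estimation with survey software. The weights are hypothetical.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect from unequal weights: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(weights):
    """Nominal sample size divided by the weighting design effect."""
    return len(weights) / kish_design_effect(weights)


# Hypothetical survey weights (illustrative only, not KNHANES weights).
weights = [1.0, 1.0, 2.0, 4.0]

deff = kish_design_effect(weights)
n_eff = effective_sample_size(weights)
print(f"design effect: {deff:.3f}, effective sample size: {n_eff:.2f}")
```

Equal weights give a design effect of 1, so any value above 1 indicates precision lost relative to a simple random sample of the same size.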

INVITED PAPER MULTIVARIATE ANALYSIS FOR THE CASE WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE

  • Fujikoshi, Yasunori
    • Journal of the Korean Statistical Society
    • /
    • Vol. 33, No. 1
    • /
    • pp.1-24
    • /
    • 2004
  • This paper is concerned with statistical methods for multivariate data when the number p of variables is large compared to the sample size n. Such data appear typically in the analysis of DNA microarrays, curve data, financial data, etc. However, there is little statistical theory for high dimensional data. On the other hand, there are some asymptotic results under the assumption that both n and p tend to $\infty$ in some ratio $p/n \rightarrow c$. The results suggest that the new asymptotic results are more useful and insightful than the classical large sample asymptotics. The main purpose of this paper is to review some asymptotic results for high dimensional statistics as well as classical statistics under a high dimensional asymptotic framework.

Multi-sensor data-based anomaly detection and diagnosis of a pumped storage hydropower plant

  • Sojin Shin;Cheolgyu Hyun;Seongpil Cho;Phill-Seung Lee
    • Structural Engineering and Mechanics
    • /
    • Vol. 88, No. 6
    • /
    • pp.569-581
    • /
    • 2023
  • This paper introduces a system to detect and diagnose anomalies in pumped storage hydropower plants. We collect data from various types of sensors, including those monitoring temperature, vibration, and power. The data are classified according to the operation modes (pump and turbine operation modes) and normalized to remove the influence of the external environment. To detect anomalies and diagnose their types, we adopt a multivariate normal distribution analysis by learning the distribution of the normal data. The feasibility of the proposed system is evaluated using actual monitoring data of a pumped storage hydropower plant. The proposed system can be used to implement condition monitoring systems for other plants through modifications.
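
One common way to turn a learned multivariate normal distribution into an anomaly detector is to threshold the squared Mahalanobis distance. The two-sensor sketch below follows that idea; it is an assumption about the general approach, not the paper's system, and the sensor readings, the 99% level, and the function names are all hypothetical.

```python
import math


def fit_gaussian_2d(samples):
    """Estimate the mean and sample covariance of two-dimensional normal-operation data."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_squared(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form inverse of a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


# For 2 degrees of freedom the chi-square quantile has the closed form -2 * ln(1 - level).
THRESHOLD_99 = -2.0 * math.log(1.0 - 0.99)

# Hypothetical normalized readings under normal operation: (temperature, vibration).
normal_data = [(50.0, 1.0), (52.0, 1.1), (48.0, 0.9), (51.0, 1.2), (49.0, 0.8)]
mean, cov = fit_gaussian_2d(normal_data)


def is_anomaly(point):
    """Flag a reading whose squared distance exceeds the 99% chi-square threshold."""
    return mahalanobis_squared(point, mean, cov) > THRESHOLD_99


print(is_anomaly((50.0, 1.0)), is_anomaly((50.0, 2.0)))
```

In keeping with the abstract, a separate distribution would be fitted for each operation mode (pump and turbine) after normalization.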

Multivariate Statistical Analysis Approach to Predict the Reactor Properties and the Product Quality of a Direct Esterification Reactor for PET Synthesis

  • 김성영;정창복;최수형;이범석;이범석
    • 제어로봇시스템학회논문지
    • /
    • Vol. 11, No. 6
    • /
    • pp.550-557
    • /
    • 2005
  • Two multivariate statistical analysis methods, multiple linear regression (MLR) and partial least squares (PLS), have been applied to predict the reactor properties and the product quality of a direct esterification reactor for polyethylene terephthalate (PET) synthesis. The methods were applied to a data set that includes the flow rate of water vapor, the flow rate of EG vapor, the concentration of acid end groups of the product, and other operating conditions such as temperature, pressure, reaction time and feed monomer mole ratio. Their regression and prediction abilities have also been compared. The reliability of the prediction results is critically compared with actual plant data and with results from other mathematical models. This paper shows that the PLS approach can be used for reasonably accurate prediction of the product quality of a direct esterification reactor in the PET synthesis process.

A Study on the Estimation of Coefficients K and n Using Multivariate Data Analysis

  • 백용진;최재성;배동명;김경진
    • 한국소음진동공학회논문집
    • /
    • Vol. 13, No. 8
    • /
    • pp.583-590
    • /
    • 2003
  • For the preliminary estimation of the vibration level of the ground next to a dwelling, a multivariate statistical analysis was performed on experimental data acquired from a variety of construction sites, and a new model for estimating the values of K and n, which can be applied in the diagnosis of damage, was proposed. The results may be summarized as follows. First, $K_{95}$ and n showed a high correlation at $P \leq 0.05$. In particular, the correlation coefficients with $W_{max}$ and S were higher for $K_{95}$ than for n, indicating that $K_{95}$ is generally associated with source conditions. Second, factor analysis identified two major sources in each fraction. These sources accounted for at least 73% of the variance of $K_{95}$. Third, a multiple regression model for estimating $K_{95}$ was developed from Fac1, which depends on the source conditions, and Fac2, which depends on the transmission conditions. The value of n can be determined from its correlation with $K_{95}$.
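
The abstract does not state the attenuation law behind K and n. Assuming the power-law form often used for ground vibration, $V = K \cdot SD^{-n}$ with $SD$ a scaled distance, the two coefficients can be recovered by a straight-line fit in log-log space, as sketched below. The formula, the synthetic data, and the function name are all assumptions for illustration; the sketch fits the mean line and does not reproduce the paper's $K_{95}$ or its factor-based regression.

```python
import math


def fit_power_law(scaled_distances, velocities):
    """Least-squares fit of log10(V) = log10(K) - n * log10(SD); returns (K, n)."""
    xs = [math.log10(s) for s in scaled_distances]
    ys = [math.log10(v) for v in velocities]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return 10.0 ** intercept, -slope


# Synthetic, noise-free data generated from assumed values K = 100 and n = 1.5.
scaled_distance = [10.0, 20.0, 40.0, 80.0]
velocity = [100.0 * s ** -1.5 for s in scaled_distance]

k_fit, n_fit = fit_power_law(scaled_distance, velocity)
print(f"K = {k_fit:.2f}, n = {n_fit:.2f}")
```

With measured data the points scatter around the line, and an upper-bound coefficient such as a 95% level would be derived from that scatter rather than from the mean fit shown here.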

A multivariate latent class profile analysis for longitudinal data with a latent group variable

  • Lee, Jung Wun;Chung, Hwan
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 27, No. 1
    • /
    • pp.15-35
    • /
    • 2020
  • In behavioral research, significant attention has been paid to the stage-sequential process for multiple latent class variables. We explore the stage-sequential process of multiple latent class variables using multivariate latent class profile analysis (MLCPA). A latent profile variable, representing the stage-sequential process in MLCPA, is formed by a set of repeatedly measured categorical response variables. This paper proposes an extended MLCPA that explains the association between the latent profile variable and the latent group variable in the form of a two-dimensional contingency table. We applied the extended MLCPA to the National Longitudinal Survey on Youth 1997 (NLSY97) data to investigate the association between the developmental progression of depression and substance use behaviors among adolescents who experienced authoritarian parental styles in their youth.