• Title/Abstract/Keywords: Principal Component Factor

367 search results, processing time 0.022 seconds

Independent Component Biplot

  • 이수진;최용석
    • 응용통계연구 / Vol. 27, No. 1 / pp. 31-41 / 2014
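  • A biplot is an exploratory method that displays the rows and columns of a two-way data matrix simultaneously in a single plot, which makes the results of complex multivariate analyses easier to grasp. In particular, the principal component factor biplot (PCFB) is a visual tool for exploring the interdependence structure among variables through factor analysis. When there is prior information that the latent variables underlying the data are independent and follow non-Gaussian distributions, independent component analysis (ICA), proposed by Jutten and Herault (1991), is used. In that case, applying factor analysis based on the principal component method may lead to a misinterpretation of the relationships among the original variables. This paper therefore proposes the independent component biplot (ICB), a visual tool that applies independent component analysis to examine the relationships among the original variables geometrically when such prior information is available.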
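
As a rough illustration of how a biplot places the rows and columns of a two-way data matrix in one plot, the sketch below computes rank-2 row and column markers from a singular value decomposition. It shows the classical SVD-based biplot only, not the ICB proposed in the paper; the function name `biplot_coordinates`, the `alpha` scaling parameter, and the toy data are assumptions made for this example.

```python
import numpy as np

def biplot_coordinates(X, alpha=1.0):
    """Return the centered data plus rank-2 row and column markers (classical SVD biplot)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    G = U[:, :2] * s[:2] ** alpha                # row (observation) markers
    H = Vt[:2].T * s[:2] ** (1.0 - alpha)        # column (variable) markers
    return Xc, G, H

# Toy data (hypothetical): 6 observations of 3 variables driven by 2 latent directions
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 3))

Xc, G, H = biplot_coordinates(X)
# Inner products of the markers approximate the centered data: Xc ≈ G @ H.T
approx_error = np.abs(Xc - G @ H.T).max()
```

Plotting the rows of `G` as points and the rows of `H` as arrows from the origin gives the usual biplot display. Because the toy data have rank 2, the rank-2 approximation is exact up to rounding here; real data would generally leave a residual.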

Comparison of hydrochemical informations of groundwater obtained from two different underground storage systems

  • Lee, Jeonghoon;Kim, Jun-Mo;Chang, Ho-Wan
    • 한국지하수토양환경학회: Conference Proceedings / 한국지하수토양환경학회 2002 General Meeting and Spring Conference / pp. 110-113 / 2002
  • Principal component analysis (PCA), a statistics-based technique, was applied to chemical data from two underground storage systems containing LPG to assess the usefulness of the technique at the initial stage (Pyeongtaek) or middle stage (Ulsan) of hydrochemical studies. In the first case, both natural and anthropogenic contamination characterize the regional groundwater: saline water buffered by Namyang Lake acts as a natural factor, whereas cement grouting acts as an artificial factor. In the second study area, contamination due to the operation of the LPG caverns, such as disinfection activity and the cement grouting effect, deteriorates groundwater quality. This study indicates that principal component analysis would be particularly useful for summarizing large data sets for the purposes of subsurface characterization, assessing their vulnerability to contamination, and protecting recharge zones.


A dimensional reduction method in cluster analysis for multidimensional data: principal component analysis and factor analysis comparison

  • 홍준호;오민지;조용빈;이경희;조완섭
    • 한국빅데이터학회지 / Vol. 5, No. 2 / pp. 135-143 / 2020
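  • This paper proposes a preprocessing method and a dimension reduction method for market basket analysis, in which the variables are strongly associated, when segmenting consumer types in agri-food consumer panel data. Cluster analysis is a widely used technique for dividing the observations in multivariate data into several clusters. However, when several variables are correlated, cluster analysis after dimension reduction can be more effective. This paper clusters food consumption data surveyed from 1,987 households using the K-means method. Seventeen variables were selected for clustering, and principal component analysis and factor analysis were compared as dimension reduction methods for addressing the multicollinearity among the 17 variables and for forming the clusters. In this study, both principal component analysis and factor analysis reduced the data to two dimensions. Principal component analysis yielded three clusters, but the cluster characteristics of the consumption patterns of interest did not emerge clearly, whereas factor analysis revealed the features of the consumption patterns that the analyst wanted to see.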
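
The workflow described in this abstract (reduce correlated variables to two dimensions, then run K-means on the reduced scores) can be sketched roughly as follows. This covers only the principal component branch; the factor analysis branch is not shown. The helper names `pca_scores` and `kmeans`, the farthest-point initialization, and the synthetic two-group data are all assumptions for illustration and are unrelated to the consumer panel data used in the paper.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project standardized data onto the leading eigenvectors of its correlation matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues first
    return Z @ eigvecs[:, order]

def kmeans(points, k, n_iter=100):
    """Plain Lloyd iterations with a deterministic farthest-point start (an assumption)."""
    centers = [points[0]]
    for _ in range(1, k):
        nearest = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[nearest.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels, centers

# Synthetic data (hypothetical): two groups of 20 observations on 4 correlated variables
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 4)),
               rng.normal(10.0, 1.0, size=(20, 4))])

scores = pca_scores(X, n_components=2)    # 4 variables reduced to 2 dimensions
labels, centers = kmeans(scores, k=2)     # cluster in the reduced space
```

In practice the number of clusters and the number of retained dimensions would be chosen from the data; both are fixed here only to keep the sketch short.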

Evaluation of the Geum River by Multivariate Analysis: Principal Component Analysis and Factor Analysis

  • 김미아;이재관;조경덕
    • 한국물환경학회지 / Vol. 23, No. 1 / pp. 161-168 / 2007
  • The main aim of this work is to evaluate the water quality of the Geum River using pollution data obtained by monitoring during the period 2001-2005. Complex data matrices of 19 (all monitoring stations) × 13 (parameters), 60 (months) × 13 (parameters), and 20 (seasons) × 13 (parameters) were treated with multivariate techniques such as factor analysis/principal component analysis (FA/PCA). For the 19 × 13 matrix, FA/PCA identified two factors: a pollutant loading factor (BOD, COD, pH, Cond, T-N, T-P, $NH_3$-N, $NO_3$-N, $PO_4$-P, Chl-a) and a seasonal factor (water temp, SS). For the 60 × 13 and 20 × 13 matrices, it identified three factors: a pollutant loading factor (BOD, COD, Cond, T-N, T-P, $NH_3$-N, $NO_3$-N, $PO_4$-P), a seasonal factor (water temp, SS), and a metabolic factor (Chl-a, pH). The pollutant loading factor is the main influence in the Geum River, accounting for 71.16% (all sampling stations), 52.75% (month), and 56.57% (season) at the main water quality stations. The results show that, in the analysis by station, the pollutant loading factor has an influence at the Gongju 1, 2, Buyeo 1, 2, Gangkyeong, and Yeongi stations; in the analysis by month, it has an influence in every month (Gongju 1 and Cheongwon stations) and in April, May, July, and August (Buyeo 1). By season, the pollutant loading factor has an influence in winter at the main sampling stations (Gongju 1, Buyeo 1), whereas Cheongwon shows no seasonal influence. This study demonstrates the necessity and usefulness of multivariate statistical techniques for evaluating and interpreting large, complex data sets with a view to obtaining better information for the effective management of water sources.

International Inflation Synchronization and Implications

  • CHON, SORA
    • KDI Journal of Economic Policy / Vol. 42, No. 2 / pp. 57-84 / 2020
  • This study analyzes global inflation synchronization and derives policy implications for the Korean economy. Unlike previous studies that assume a single global inflation factor, this study investigates if inflation in Korea can be explained further by other global inflation factors. Our principal component analysis provides three principal components for global inflation that are linked to the Korean inflation rate - the first component is closely related to OECD inflation, and the second and third components reflect China's inflation. This study empirically demonstrates via in-sample fitting and out-of-sample forecasting that the three principal components of global inflation play a significant role in explaining and predicting Korean inflation in the short-term, while their role is limited in the mid-term. Domestic macroeconomic variables are found to be more important for the mid-term movements of the Korean inflation rate. The empirical results here suggest that the Bank of Korea should focus more on domestic economic conditions than on global inflation when implementing monetary policy because global factors are likely to be already reflected in domestic macro-variables in the mid-term.

Assessment of Water Quality using Multivariate Statistical Techniques: A Case Study of the Nakdong River Basin, Korea

  • Park, Seongmook;Kazama, Futaba;Lee, Shunhwa
    • Environmental Engineering Research / Vol. 19, No. 3 / pp. 197-203 / 2014
  • This study estimated the spatial and seasonal variation of water quality to understand the characteristics of the Nakdong River basin, Korea. Altogether, 11 parameters (discharge, water temperature, dissolved oxygen, 5-day biochemical oxygen demand, chemical oxygen demand, pH, suspended solids, electrical conductivity, total nitrogen, total phosphorus, and total organic carbon) at 22 different sites for the period 2003-2011 were analyzed using multivariate statistical techniques (cluster analysis, principal component analysis, and factor analysis). Hierarchical cluster analysis grouped the whole river basin into three zones, i.e., relatively less polluted (LP), medium polluted (MP), and highly polluted (HP), based on the similarity of water quality characteristics. The results of factor analysis/principal component analysis explained up to 83.0%, 81.7%, and 82.7% of the total variance in the water quality data of the LP, MP, and HP zones, respectively. The rotated components of PCA obtained from factor analysis indicate that the parameters responsible for water quality variations were mainly related to discharge and total pollution loads (non-point pollution source) in the LP, MP, and HP areas; organic and nutrient pollution in the LP and HP zones; and temperature, DO, and TN in the LP zone. This study demonstrates the usefulness of multivariate statistical techniques for the analysis and interpretation of multi-parameter, multi-location, and multi-year data sets.

Varietal Classification by Multivariate Analysis on Quantitative Traits in Pecan

  • Shin, Dong-Young;Nou, Ill-Sup
    • Plant Resources / Vol. 2, No. 2 / pp. 75-80 / 1999
  • Twenty-two varieties of pecan, including wild types, were classified based on 6 characters using principal component analysis score distance. The results are summarized as follows. The twenty-two varieties were classified into 5 groups based on PCA score distance. The five groups were distinctly characterized by many morphological characters. The first, the first three, and the first five principal components explained 51%, 95%, and 99% of the total variation, respectively. Varimax rotation of the factor loadings indicated that the first component was highly loaded with leaf characters and the second component with fruit characters, although fruit length was negatively loaded. The second, third, and fourth groups of cultivars were very similar in genetic parentage.
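
Since this abstract relies on varimax rotation of factor loadings, a minimal sketch of that rotation may help. It uses the commonly cited SVD-based iteration for the raw varimax criterion (no Kaiser normalization); the function name `varimax`, the stopping rule, and the example loadings are assumptions for illustration and are not taken from the pecan study.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix toward the varimax criterion."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)                       # start from the unrotated solution
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        # Gradient-like target of the varimax criterion for the current rotation
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p
        U, s, Vt = np.linalg.svd(L.T @ target)
        R = U @ Vt                      # nearest orthogonal matrix to the target direction
        current = s.sum()
        if previous != 0.0 and current < previous * (1.0 + tol):
            break
        previous = current
    return L @ R, R

# Hypothetical unrotated loadings: 4 variables on 2 components
example_loadings = np.array([[0.7, 0.5],
                             [0.6, 0.6],
                             [0.6, -0.5],
                             [0.7, -0.6]])

rotated_loadings, rotation = varimax(example_loadings)
```

Because the rotation matrix is orthogonal, each variable's communality (its row sum of squared loadings) is unchanged; only the way variance is distributed across components changes, which is what makes rotated loadings easier to label.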


Shrinkage Structure of Ridge Partial Least Squares Regression

  • Kim, Jong-Duk
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 2 / pp. 327-344 / 2007
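  • Ridge regression (RR), principal component regression (PCR), and partial least squares regression (PLS) are representative biased regression methods used for multicollinear data. These methods are called shrinkage regressions in the sense that the norm of the estimated coefficient vector is smaller than that of the ordinary least squares (OLS) estimator. Newer regression methods include ridge principal component regression (RPCR), which combines RR and PCR, and ridge partial least squares regression (RPLS), which combines RR and PLS; these are also shrinkage regressions. These estimators can be expressed as linear combinations of the eigenvectors of X'X, so one can study how much each is shrunk relative to OLS along each eigen-direction. This paper first expresses these estimators in terms of general shrinkage factors, uses this expression to derive a general formula for the MSE, and also derives the MSE formula for the PLS estimator. The shrinkage factors of RPLS are then derived in two different forms. For RPLS as well, substituting these shrinkage factors into the general MSE formula immediately gives its MSE. However, the shrinkage factors of PLS and RPLS are complicated nonlinear functions of y and hence are not deterministic, so the MSE of these estimators is only an approximate formula. Using this MSE to evaluate PLS or RPLS is therefore of limited value, and the performance of these regressions needs to be evaluated empirically. Using near-infrared spectroscopy data, a typical example of multicollinear data, the paper also examines how the derived shrinkage factor values change with the number of factors, as well as the overall shrinkage ratio. A good understanding of these shrinkage patterns is expected to help greatly in assessing the predictive ability and stability of the regression methods.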
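
The idea of writing an estimator as OLS components along the eigenvectors of X'X, each multiplied by a shrinkage factor, can be sketched for the two simplest cases. The code below uses the standard textbook factors, λ/(λ + k) for ridge regression and 1 or 0 for principal component regression; the PLS and RPLS factors derived in the paper are not reproduced. The function names, the ridge constant `k`, the component count `m`, and the random data are assumptions for illustration.

```python
import numpy as np

def shrinkage_estimate(X, y, factors):
    """Scale the OLS solution along each eigenvector of X'X by a shrinkage factor."""
    eigvals, V = np.linalg.eigh(X.T @ X)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, V = eigvals[order], V[:, order]
    ols_coords = (V.T @ X.T @ y) / eigvals       # OLS coefficients in the eigenbasis
    return V @ (factors(eigvals) * ols_coords)

k = 0.5   # ridge constant (hypothetical value)
m = 2     # number of retained principal components (hypothetical value)

def ols_factors(lam):
    return np.ones_like(lam)                     # no shrinkage

def ridge_factors(lam):
    return lam / (lam + k)                       # each factor lies strictly between 0 and 1

def pcr_factors(lam):
    return (np.arange(len(lam)) < m).astype(float)   # keep the first m directions, drop the rest

# Random toy data (hypothetical), 30 observations and 4 predictors
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = rng.normal(size=30)

beta_ols = shrinkage_estimate(X, y, ols_factors)
beta_rr = shrinkage_estimate(X, y, ridge_factors)
beta_pcr = shrinkage_estimate(X, y, pcr_factors)

# Cross-check: the ridge estimate computed directly from its normal equations
beta_rr_direct = np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)
```

Comparing `np.linalg.norm(beta_rr)` and `np.linalg.norm(beta_pcr)` with `np.linalg.norm(beta_ols)` shows the shrinkage property described in the abstract: neither norm exceeds the OLS norm.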


Assessment of water quality variations under non-rainy and rainy conditions by principal component analysis techniques in Lake Doam watershed, Korea

  • Bhattrai, Bal Dev;Kwak, Sungjin;Heo, Woomyung
    • Journal of Ecology and Environment / Vol. 38, No. 2 / pp. 145-156 / 2015
  • This study was based on water quality data of the Lake Doam watershed, monitored from 2010 to 2013 at eight different sites with multiple physicochemical parameters. The dataset was divided into two sub-datasets, namely, non-rainy and rainy. Principal component analysis (PCA) and factor analysis (FA) techniques were applied to evaluate seasonal correlations of water quality parameters and extract the most significant parameters influencing stream water quality. The first five principal components identified by PCA techniques explained greater than 80% of the total variance for both datasets. PCA and FA results indicated that total nitrogen, nitrate nitrogen, total phosphorus, and dissolved inorganic phosphorus were the most significant parameters under the non-rainy condition. This indicates that organic and inorganic pollutant loads in the streams can be related to discharges from point sources (domestic discharges) and non-point sources (agriculture, forest) of pollution. During the rainy period, turbidity, suspended solids, nitrate nitrogen, and dissolved inorganic phosphorus were identified as the most significant parameters. The physical parameters, suspended solids and turbidity, are related to soil erosion and runoff from the basin. Organic and inorganic pollutants during the rainy period can be linked to decayed matter, manure, and inorganic fertilizers used in farming. Thus, the results of this study suggest that principal component analysis techniques are useful for analysis and interpretation of data and identification of pollution factors, which are valuable for understanding seasonal variations in water quality for effective management.
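
Statements such as "the first five principal components explained greater than 80% of the total variance" come from the cumulative share of the eigenvalues. The sketch below shows that arithmetic; the function names, the 0.80 threshold default, and the eigenvalues are hypothetical and are not values from the Lake Doam study.

```python
from itertools import accumulate

def cumulative_variance_ratio(eigenvalues):
    """Cumulative share of total variance, with components sorted by decreasing eigenvalue."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [running / total for running in accumulate(ordered)]

def components_needed(eigenvalues, threshold=0.80):
    """Smallest number of leading components whose cumulative share reaches the threshold."""
    for count, share in enumerate(cumulative_variance_ratio(eigenvalues), start=1):
        if share >= threshold:
            return count
    return len(eigenvalues)

# Hypothetical eigenvalues of a 4-variable correlation matrix
example_eigenvalues = [4.0, 2.0, 1.0, 1.0]
shares = cumulative_variance_ratio(example_eigenvalues)   # [0.5, 0.75, 0.875, 1.0]
n_keep = components_needed(example_eigenvalues)           # 3
```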

Comparison of Dietary Externalization in Korea and Japan -by Principal Component Analysis-

  • 최현숙
    • 동아시아식생활학회지 / Vol. 16, No. 1 / pp. 23-28 / 2006
  • The purpose of this paper was to clarify the actual conditions of the 'Dietary externalization' that accompanied economic development in Korea and Japan, mainly by using economic and nutrition-related data. 'Modernization of food style' and other forms of modernization have taken place, among which 'Dietary externalization' in particular has recently drawn interest. This paper used econometric analysis to clarify whether there are differences between the two countries in terms of the modernization of food style and the dietary externalization trend. The trends of Dietary externalization in both Korea and Japan were studied using the Principal Component Analysis method. Food subgroups were investigated based on the annual report on the household income and expenditure survey of Korea and the annual report on the family income and expenditure survey of Japan. The statistical data from both countries were analyzed with the SAS program. The results are as follows: 1. In Korea, the ratio of carbohydrates in the total calorie intake is quite high and that of animal protein is rather low compared to those in Japan. 2. Traditional foods such as grains and vegetables are consumed much more in Korea than in Japan. 3. Principal Components 1 and 2 were extracted in both countries over the whole analysis period, which suggested the 'Dietary externalization'. 4. Principal Component 1 has positive factor loadings on all food items, including meals outside the home and processed food. In other words, the 'Dietary externalization' trend in Korea apparently has a simple pattern in which all externalization-related items are on the rise. 5. Principal Components 1 and 2, which indicate dietary externalization, were detected in Japan.
