• Title/Summary/Keyword: canonical correlation & principal component


Equivalence study of canonical correspondence analysis by weighted principal component analysis and canonical correspondence analysis by Gaussian response model (가중주성분분석을 활용한 정준대응분석과 가우시안 반응 모형에 의한 정준대응분석의 동일성 연구)

  • Jeong, Hyeong Chul
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.945-956 / 2021
  • In this study, we consider the algorithm of Legendre and Legendre (2012), which derives canonical correspondence analysis from weighted principal component analysis, and prove that the resulting analysis is exactly the same as Ter Braak's (1986) canonical correspondence analysis based on the Gaussian response model. Ter Braak's (1986) canonical correspondence analysis starts from a Gaussian response curve, which explains species abundance in ecology well, adopts the basic assumptions of the species packing model, and is then derived by combining a generalized linear model with canonical correlation analysis. The algorithm of Legendre and Legendre (2012), however, is computed in a manner quite similar to Benzécri's correspondence analysis and requires no such assumptions. Canonical correspondence analysis based on weighted principal component analysis therefore allows some flexibility in how the results are used. In conclusion, this study shows that the two methods, although they start from different models, yield the same site scores, species scores, and species-environment correlations.
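
The weighted-PCA route described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's own code: the function name `cca_weighted_pca`, the step ordering (chi-square contributions, weighted standardization of the environmental table, weighted regression, then PCA of the fitted table), and the small data tables are all hypothetical choices made here for illustration.

```python
import numpy as np

def cca_weighted_pca(Y, X):
    """Sketch of canonical correspondence analysis via weighted regression + PCA.

    Y: (sites x species) non-negative abundance table.
    X: (sites x variables) environmental table.
    Returns the constrained eigenvalues and the total inertia of Y.
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    P = Y / Y.sum()
    r = P.sum(axis=1)  # site (row) weights
    c = P.sum(axis=0)  # species (column) weights

    # Chi-square contributions, as in ordinary correspondence analysis.
    Q = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    # Standardize X using the site weights, then weight its rows by sqrt(r).
    Xc = X - r @ X
    sd = np.sqrt(r @ Xc**2)
    Xw = np.sqrt(r)[:, None] * (Xc / sd)

    # Weighted multiple regression of Q on X gives the fitted table.
    B, *_ = np.linalg.lstsq(Xw, Q, rcond=None)
    Q_fit = Xw @ B

    # PCA of the fitted table: squared singular values are the eigenvalues.
    s = np.linalg.svd(Q_fit, compute_uv=False)
    return s**2, float((Q**2).sum())

# Hypothetical data: 5 sites, 4 species, 2 environmental variables.
Y_demo = [[10, 0, 3, 1], [8, 2, 4, 0], [2, 6, 5, 1], [0, 9, 2, 4], [1, 7, 0, 8]]
X_demo = [[1.0, 5.2], [2.0, 4.8], [3.5, 3.0], [5.0, 2.1], [6.0, 2.5]]
eigvals, total_inertia = cca_weighted_pca(Y_demo, X_demo)
print(eigvals, total_inertia)
```

Because the fitted table is a projection of the chi-square contributions, the constrained eigenvalues in this sketch cannot sum to more than the total inertia, and at most as many of them are non-zero as there are environmental variables.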

Development of a Compound Classification Process for Improving the Correctness of Land Information Analysis in Satellite Imagery - Using Principal Component Analysis, Canonical Correlation Classification Algorithm and Multitemporal Imagery - (위성영상의 토지정보 분석정확도 향상을 위한 응용체계의 개발 - 다중시기 영상과 주성분분석 및 정준상관분류 알고리즘을 이용하여 -)

  • Park, Min-Ho
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.4D / pp.569-577 / 2008
  • The purpose of this study is to develop a compound classification process that mixes multitemporal data and combines a specific image enhancement technique with a specific image classification algorithm, in order to obtain more accurate land information from satellite imagery. Specifically, the study proposes a classification process that applies the canonical correlation classification technique after principal component analysis of the mixed multitemporal data. The result of the proposed process is compared with the canonical correlation classification results for single-date images, for multitemporal imagery, and for a mixed image obtained by principal component analysis of single-date images. The satellite images used are Landsat 5 TM images acquired on July 26, 1994 and September 1, 1996. Ground truth data for accuracy assessment were obtained from topographic maps and aerial photographs, and the entire study area was used for accuracy assessment. The proposed compound classification process improved classification accuracy by 8.2% over applying the canonical correlation classification technique to a single-date image, and it was especially effective in classifying mixed urban areas correctly. In conclusion, applying the canonical correlation classification technique after principal component analysis of multitemporal imagery is very useful for improving classification accuracy when extracting land cover information from Landsat TM images.

Canonical correlation analysis based fault diagnosis method for structural monitoring sensor networks

  • Huang, Hai-Bin;Yi, Ting-Hua;Li, Hong-Nan
    • Smart Structures and Systems / v.17 no.6 / pp.1031-1053 / 2016
  • The health condition of in-service civil infrastructure can be evaluated by employing structural health monitoring technology. A reliable health evaluation depends heavily on the quality of the data collected from the structural monitoring sensor network. Hence, the problem of sensor fault diagnosis has gained considerable attention in recent years. In this paper, an innovative sensor fault diagnosis method that focuses on the fault detection and isolation stages is proposed. The dynamic, or auto-regressive, characteristic is first used together with canonical correlation analysis to build a multivariable statistical model that measures the correlations between the currently collected structural responses and possible future ones. Two different fault detection statistics are then defined on the basis of this model to decide whether a fault or failure has occurred in the sensor network. After that, two corresponding fault isolation indices are derived through contribution analysis to identify the faulty sensor. Case studies using a benchmark structure developed for bridge health monitoring demonstrate the superiority of the newly proposed sensor fault diagnosis method over the traditional principal component analysis-based and dynamic principal component analysis-based methods.
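
The idea of correlating current and future responses can be illustrated with a small sketch. It is not the authors' model and does not reproduce their detection statistics or isolation indices; the helper names `canonical_correlations` and `past_future_windows`, the window length, and the simulated AR(1) signals are assumptions introduced here. The canonical correlations are computed with the standard QR-plus-SVD approach.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and Y (rows = samples)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Orthonormal bases of the two centered column spaces.
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    # Singular values of Qx^T Qy are the cosines of the principal angles,
    # i.e. the canonical correlations (assuming full column rank).
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def past_future_windows(series, lag):
    """Stack `lag` past samples against `lag` future samples of a multichannel series."""
    series = np.asarray(series, dtype=float)
    n = series.shape[0] - 2 * lag + 1
    past = np.hstack([series[i:i + n] for i in range(lag)])
    future = np.hstack([series[lag + i:lag + i + n] for i in range(lag)])
    return past, future

# Hypothetical two-channel sensor record simulated as independent AR(1) signals.
rng = np.random.default_rng(0)
signal = np.zeros((500, 2))
for t in range(1, 500):
    signal[t] = 0.9 * signal[t - 1] + rng.normal(size=2)

past, future = past_future_windows(signal, lag=2)
print(canonical_correlations(past, future))
```

For strongly auto-correlated signals the leading canonical correlation between the past and future windows is high, which is the property a monitoring model of this kind relies on.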

Exploring COVID-19 in mainland China during the lockdown of Wuhan via functional data analysis

  • Li, Xing;Zhang, Panpan;Feng, Qunqiang
    • Communications for Statistical Applications and Methods / v.29 no.1 / pp.103-125 / 2022
  • In this paper, we analyze the time series data of the case and death counts of COVID-19 that broke out in China in December, 2019. The study period is during the lockdown of Wuhan. We exploit functional data analysis methods to analyze the collected time series data. The analysis is divided into three parts. First, the functional principal component analysis is conducted to investigate the modes of variation. Second, we carry out the functional canonical correlation analysis to explore the relationship between confirmed and death cases. Finally, we utilize a clustering method based on the Expectation-Maximization (EM) algorithm to run the cluster analysis on the counts of confirmed cases, where the number of clusters is determined via a cross-validation approach. Besides, we compare the clustering results with some migration data available to the public.

An Analytical Study on the Stem-Growth by the Principal Component and Canonical Correlation Analyses (주성분(主成分) 및 정준상관분석(正準相關分析)에 의(依)한 수간성장(樹幹成長) 해석(解析)에 관(關)하여)

  • Lee, Kwang Nam
    • Journal of Korean Society of Forest Science / v.70 no.1 / pp.7-16 / 1985
  • To grasp the canonical correlations among the various growth factors of the stem, their related backgrounds, and the characteristics of the stem through synthetical dispersion analysis, principal component analysis and canonical correlation analysis were applied to Larix leptolepis as the optimum methods. The results are as follows.
    1) Correlations of varying strength were found among all factors (height ($x_1$), clear height ($x_2$), form height ($x_3$), breast height diameter (D.B.H.: $x_4$), mid diameter ($x_5$), crown diameter ($x_6$) and stem volume ($x_7$)) except the normal form factor ($x_8$). In particular, stem volume was highly correlated with D.B.H., height and mid diameter (cf. Table 1).
    2) (1) The canonical correlation coefficient and canonical variates between stem volume and the composite variate of the height growth factors ($x_1$, $x_2$ and $x_3$) are $\gamma_{u_1,v_1}=0.82980^{**}$, $\begin{cases}u_1=1.00000x_7\\ v_1=1.08323x_1-0.04299x_2-0.07080x_3\end{cases}$. (2) Those between stem volume and the composite variate of the diameter growth factors ($x_4$, $x_5$ and $x_6$) are $\gamma_{u_1,v_1}=0.98198^{**}$, $\begin{cases}u_1=1.00000x_7\\ v_1=0.86433x_4+0.11996x_5+0.02917x_6\end{cases}$. (3) Those between stem volume and the composite variate of all six height and diameter factors are $\gamma_{u_1,v_1}=0.98700^{**}$, $\begin{cases}u_1=1.00000x_7\\ v_1=0.12948x_1+0.00291x_2+0.03076x_3+0.76707x_4+0.09107x_5+0.02576x_6\end{cases}$. All three cases showed high canonical correlation. Height in case (1), D.B.H. in case (2), and D.B.H. and height in case (3) make the dominant contribution to the canonical correlation. The synthetical characteristics of each qualitative growth are largely affected by each factor; in case (3) in particular, the influence of D.B.H. is the most significant among the six factors (cf. Table 2).
    3) The canonical correlation coefficient and canonical variates between the composite variate of the height growth factors and that of the diameter growth factors are $\gamma_{u_1,v_1}=0.78556^{**}$, $\begin{cases}u_1=1.20569x_1-0.04444x_2-0.21696x_3\\ v_1=1.09571x_4-0.14076x_5+0.05285x_6\end{cases}$. As these results show, only height and D.B.H. contributed considerably to the canonical correlation. Thus, the synthetical characteristics of height growth were determined by height, and those of growth in thickness by D.B.H. (cf. Table 2).
    4) The synthetical characteristics (1st to 3rd principal components) derived from the eight growth factors of the stem, with a target cumulative proportion of 85%, are as follows. 1st principal component ($z_1$): $z_1=0.40192x_1+0.23693x_2+0.37047x_3+0.41745x_4+0.41629x_5+0.33454x_6+0.42798x_7+0.04923x_8$; 2nd principal component ($z_2$): $z_2=-0.09306x_1-0.34707x_2+0.08372x_3-0.03239x_4+0.11152x_5+0.00012x_6+0.02407x_7+0.92185x_8$; 3rd principal component ($z_3$): $z_3=0.19832x_1+0.68210x_2+0.35824x_3-0.22522x_4-0.20876x_5-0.42373x_6-0.15055x_7+0.26562x_8$. The first principal component ($z_1$), a "size factor", showed high information absorption power with a proportion of 63.26%, and its principal component score is determined by stem volume, D.B.H., mid diameter and height, which have considerably high factor loadings. The second principal component ($z_2$) is a "shape factor" that indicates the cubic similarity of the stem, and its score is formed under the dominant influence of the normal form factor. The third principal component ($z_3$) is a "shape factor" that shows the degree of thickness and length of the stem. These three principal components have satisfactory information absorption power, with a cumulative proportion of variance of 88.36% (cf. Table 3).
    5) Thus, principal component and canonical correlation analyses can be applied to forest measurement, the judgement of site quality, management diagnosis for forest management and the forest products industries, and other fields that require the assessment of synthetical characteristics.
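
As a quick consistency check on the three principal components reported in this abstract, the sketch below tests whether the loading vectors are (approximately) unit length and mutually orthogonal, as PCA loadings should be. The coefficients are copied from the abstract; the $x_7$ term of $z_1$ is printed without a sign in the source, and the positive sign used here is an assumption.

```python
import math

# Loadings for x_1..x_8 as reported in the abstract. The x_7 term of z_1 is
# printed without a sign in the source; "+" is an assumption made here.
z1 = [0.40192, 0.23693, 0.37047, 0.41745, 0.41629, 0.33454, 0.42798, 0.04923]
z2 = [-0.09306, -0.34707, 0.08372, -0.03239, 0.11152, 0.00012, 0.02407, 0.92185]
z3 = [0.19832, 0.68210, 0.35824, -0.22522, -0.20876, -0.42373, -0.15055, 0.26562]

def dot(a, b):
    """Inner product of two equal-length coefficient lists."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a coefficient list."""
    return math.sqrt(dot(a, a))

norms = [norm(z) for z in (z1, z2, z3)]
cross = [dot(z1, z2), dot(z1, z3), dot(z2, z3)]
print("lengths:", [round(n, 4) for n in norms])
print("inner products:", [round(c, 4) for c in cross])
```

Lengths close to 1 and inner products close to 0 indicate that the printed coefficients are internally consistent. Flipping the assumed sign of the $x_7$ term changes the inner products involving $z_1$, so the check also bears on that assumption.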


An Analytical Study on Stem Growth of Chamaecyparis obtusa (편백(扁栢)의 수간성장(樹幹成長)에 관(關)한 해석적(解析的) 연구(硏究))

  • An, Jong Man;Lee, Kwang Nam
    • Journal of Korean Society of Forest Science / v.77 no.4 / pp.429-444 / 1988
  • Considering the recent trend toward multiple-use development of forest trees, comprehensive information on young stands of Hinoki cypress is necessary for rational forest management. From this point of view, 83 sample trees were selected and felled from 23-year-old stands of Hinoki cypress at Changsung-gun, Chonnam-do. Various stem growth factors of the felled trees were measured, and canonical correlation analysis, principal component analysis and factor analysis were applied to investigate the stem growth characteristics and the relationships among stem growth factors, and to obtain latent and comprehensive information. The results are as follows. The canonical correlation coefficient between stem volume and the quality growth factors was 0.9877. The coefficients of the canonical variates showed that DBH among the diameter growth factors and height among the height growth factors had important effects on stem volume. From the analysis of the relationship between stem volume and the canonical variate that linearly combines DBH and height as one set, DBH had a greater influence on volume growth than height. In the principal component analysis of the 12 stem growth factors, the 1st and 2nd principal components were adopted to meet the target value of 85%; together they had a cumulative contribution rate of 88.10%. The 1st and 2nd principal components were interpreted as a "size factor" and a "shape factor", respectively. From the summed proportion of the retained principal components for each variate, more than 87% of the information of the variates other than crown diameter, clear length and form height was explained. Two common factors were set by the eigenvalues obtained from the SMC (squared multiple correlation) of the diagonal elements of the canonical matrix. There were 2 latent factors, $f_1$ and $f_2$. The former was interpreted as the nature of the diameter growth system. In the inherent phenomena of the 12 growth factors, the communalities, except those of clear length and crown diameter, had great explanatory power of 78.62-98.30%. The 83 sample trees could be classified into 5 stem types as follows: medium type within a radius of ${\pm}1$ standard deviation of the factor scores, uniformity type in diameter and height growth in the 1st quadrant, slim type in the 2nd quadrant, dwarfish type in the 3rd quadrant, and fall-holed type in the 4th quadrant.
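
The rule of retaining principal components until a target cumulative proportion (85% here) is reached can be sketched as below. The function name `components_for_target` and the eigenvalues are hypothetical; the abstract reports only the cumulative rate of 88.10% for two components, not the individual eigenvalues.

```python
def components_for_target(eigenvalues, target=0.85):
    """Return (k, cumulative proportion) for the smallest k reaching `target`."""
    total = sum(eigenvalues)
    running = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running / total >= target:
            return k, running / total
    return len(eigenvalues), 1.0

# Hypothetical eigenvalues for 12 standardized variables (they sum to 12).
demo_eigenvalues = [7.2, 3.4, 0.6, 0.3, 0.2, 0.1, 0.1, 0.05, 0.03, 0.01, 0.005, 0.005]
k, proportion = components_for_target(demo_eigenvalues, target=0.85)
print(k, round(proportion, 4))
```

With these illustrative values the first component alone falls short of the target, while the first two together exceed it, which mirrors the two-component choice described in the abstract.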


Relationship between Physical Environmental Factors and Biological Indices of A Mountain Valley Stream (Mt. Cheonggye) (산간계류(청계산)의 물리적 환경요인과 생물지수의 관계)

  • Minjeong Yeo;Dongsoo Kong
    • Journal of Korean Society on Water Environment / v.39 no.4 / pp.288-301 / 2023
  • This study aims to identify the benthic macroinvertebrate fauna inhabiting a mountain valley stream (Mt. Cheonggye) in Korea and the relationship between physical environmental factors and biological indices. Benthic macroinvertebrates were collected at five locations on August 24 and October 14, 2020, and were identified as 4 phyla, 7 classes, 16 orders, 42 families, and 72 species. Dominance ranged from 0.38 to 0.59, diversity from 2.81 to 3.75, richness from 3.25 to 4.63, evenness from 0.65 to 0.84, and %EPT (Ephemeroptera-Plecoptera-Trichoptera) richness from 42% to 73%. All sites were evaluated as being in very good status by most of the biological indices based on the tolerance of indicator organisms in Korea. Principal component analysis classified the biological indices into species-level indices and higher-category-level indices according to the taxonomic level of the indicator organisms considered in each index. Canonical correspondence analysis confirmed that current velocity was a major factor that increased species richness and separated the biological indices according to taxonomic category level. Water depth was a major factor related to the community indices: the deeper the water, the lower the diversity and evenness.

Missing Value Estimation and Sensor Fault Identification using Multivariate Statistical Analysis (다변량 통계 분석을 이용한 결측 데이터의 예측과 센서이상 확인)

  • Lee, Changkyu;Lee, In-Beum
    • Korean Chemical Engineering Research / v.45 no.1 / pp.87-92 / 2007
  • Recently, the development of process monitoring systems to detect and diagnose process abnormalities has been in the spotlight in process systems engineering. Normal data obtained from processes provide useful information on process characteristics for modeling, monitoring, and control. Since modern chemical and environmental processes have high dimensionality, strong correlation, severe dynamics and nonlinearity, it is not easy to analyze a process through a model-based approach. To overcome the limitations of the model-based approach, many system engineers and academic researchers have focused on statistical approaches combined with multivariate analysis such as principal component analysis (PCA), partial least squares (PLS), and so on. Several multivariate analysis methods have been modified for application to chemical processes with specific characteristics such as dynamics and nonlinearity. This paper discusses missing value estimation and sensor fault identification based on process variable reconstruction using dynamic PCA and canonical variate analysis.

An Analytical Study on the Growth Factors of Bamboo Culm by the Multivariate Analysis (다변량분석(多變量分析)에 의(依)한 죽간(竹稈)의 성장해석(成長解析)에 관(關)하여)

  • Lee, Kwang Nam;Cha, Gyung Soo
    • Journal of Korean Society of Forest Science / v.76 no.4 / pp.338-347 / 1987
  • This research was carried out to investigate the related phenomena, the latent structures and the synthetical characteristics of various growth factors of Phyllostachys bambusoides Sieb. et Zucc. growing at Damyang-gun, Chollanam-do, using multivariate analysis. 1. In terms of the synthetical characteristics in the canonical correlation between the height-growth factor group and the diameter-growth factor group, the former was determined by the culm height ($x_1$) and the latter by the diameter of the largest internode ($x_7$). For the correlation between the quantitative growth factor group and the qualitative growth factor group, the former was determined by the surface area ($x_{10}$) and the latter by the diameter of the largest internode ($x_7$). 2. The ten growth factors of the bamboo culm were reduced to two principal components, with a target cumulative proportion of 90%. The first principal component ($Z_1$), a "size factor", showed high correlation with all growth factors except eye-height diameter ($x_5$). The second principal component ($Z_2$), a "shape factor", showed high correlation only with $x_5$. 3. The bamboo culm and the latent phenomena among its growth factors could be described by two common factors showing high communality (94.16%). The ten growth factors can be grouped into two attribute factors: quantity and quality. 4. The bamboo culms can be classified into five types: total, volume, shape-quality, inferior and middle.


Development of Customer Satisfaction Model of Providing Traffic Information through VMS on the Freeway (교통정보 제공에 따른 이용자 만족도 모형 개발 - 고속도로상의 VMS 정보제공을 중심으로 -)

  • Kim, Jang Wook;Kim, Tae Hee;Lee, Soo Beom
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.5D / pp.597-607 / 2008
  • ATIS (Advanced Traffic Information System) provides drivers with valuable information such as travel time, traffic congestion, detours and traffic accidents, and is therefore in the spotlight. So far, however, studies on consumer satisfaction with the provision of traffic information have been incomplete. This study therefore ran a canonical discriminant analysis and a canonical correlation analysis by Quantification II theory, based on traffic information satisfaction image data gathered through questionnaires, and identified the factors that influence consumer satisfaction. It also identified the correlation between consumers' recognition and traffic information satisfaction by examining, through Quantification I theory, how recognition of traffic information satisfaction changes. Finally, the study identified the change in drivers' sensibility recognition by running a principal component analysis, and developed a traffic information satisfaction evaluation model that accounts for this change in recognition using a structural equation model.