• Title/Summary/Keyword: principal component regression analysis (주성분회귀분석)

Multi-currencies portfolio strategy using principal component analysis and logistic regression (주성분 분석과 로지스틱 회귀분석을 이용한 다국 통화포트폴리오 전략)

  • Shim, Kyung-Sik;Ahn, Jae-Joon;Oh, Kyong-Joo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.1
    • /
    • pp.151-159
    • /
    • 2012
  • This paper develops a multi-currency portfolio strategy for the foreign exchange market using principal component analysis (PCA) and logistic regression (LR). While there is a great deal of literature analyzing exchange markets, relatively little work addresses trading strategies in foreign exchange markets. The paper has two objectives. The first is to suggest a portfolio allocation method based on PCA. The second is to determine market timing, that is, the buy or sell decision, using LR. The results show that the proposed model is a useful trading strategy in the foreign exchange market and can give investors important investment information.

Suggestion of batter ability index in Korea baseball - focusing on the sabermetrics statistics WAR (한국프로야구에서 타자능력지수 제안 - 대체선수대비승수(WAR)을 중심으로)

  • Lee, Jea-Young;Kim, Hyeon-Gyu
    • The Korean Journal of Applied Statistics
    • /
    • v.29 no.7
    • /
    • pp.1271-1281
    • /
    • 2016
  • Wins above replacement (WAR) is one of the most widely used sabermetrics statistics for measuring the ability of a batter in baseball. Its great advantage is that it represents a player's offensive power, base running ability, and defensive ability as a single value. In this study, we propose a hitter ability index, built from sabermetrics statistics, that can replace WAR, based on Korea Baseball Record Data from the last three years (2013-2015). First, we calculated the batter ability index by the arithmetic mean method, the weighted average method, and principal component regression, and selected the method that had a high correlation with WAR.
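
The principal component regression step mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration rather than the authors' procedure: it uses only two predictors so that the first principal component has a closed form, and the on-base, slugging, and WAR numbers are invented for the example. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(values):
    """Center to mean 0 and scale to population standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))


def first_pc_scores(x1, x2):
    """Scores on the first principal component of two standardized variables.

    The 2x2 correlation matrix [[1, r], [r, 1]] has leading eigenvector
    (1, sign(r)) / sqrt(2), so no numerical eigen-solver is needed here.
    """
    z1, z2 = standardize(x1), standardize(x2)
    sign = 1.0 if pearson(x1, x2) >= 0 else -1.0
    return [(a + sign * b) / sqrt(2) for a, b in zip(z1, z2)]


def pcr_index(x1, x2, target):
    """Regress the target on the first PC score and return the fitted values."""
    scores = first_pc_scores(x1, x2)
    target_mean = mean(target)
    # Scores are centered, so the least-squares slope has this closed form.
    slope = sum(s * (t - target_mean) for s, t in zip(scores, target)) / sum(s * s for s in scores)
    return [target_mean + slope * s for s in scores]


# Invented illustrative data for five batters (not from the paper).
obp = [0.320, 0.350, 0.380, 0.410, 0.300]
slg = [0.400, 0.450, 0.520, 0.600, 0.380]
war = [1.0, 2.5, 4.0, 6.0, 0.5]

index = pcr_index(obp, slg, war)
print(round(pearson(index, war), 3))  # correlation between the index and WAR
```

Comparing this correlation with the one obtained from a plain or weighted average of the same statistics mirrors the selection criterion the abstract describes.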

A study on principal component analysis using penalty method (페널티 방법을 이용한 주성분분석 연구)

  • Park, Cheolyong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.4
    • /
    • pp.721-731
    • /
    • 2017
  • In this study, principal component analysis methods using the Lasso penalty are introduced. There are two popular methods that apply the Lasso penalty to principal component analysis. The first finds an optimal vector of linear combination as the regression coefficient vector obtained by regressing each principal component on the original data matrix with a Lasso penalty (an elastic net penalty in general). The second finds an optimal vector of linear combination by minimizing, with a Lasso penalty, the residual matrix obtained from approximating the original matrix by the singular value decomposition. We review these two methods in detail and show that they have an advantage especially when applied to data sets that have more variables than cases. The methods are also compared in an application to a real data set using R. More specifically, they are applied to the crime data in Ahamad (1967), which has more variables than cases.

Principal Components Logistic Regression based on Robust Estimation (로버스트추정에 바탕을 둔 주성분로지스틱회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook;Jang, Hea-Won
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.3
    • /
    • pp.531-539
    • /
    • 2009
  • Logistic regression is widely used as a data mining technique for customer relationship management. The maximum likelihood estimator has highly inflated variance when multicollinearity exists among the regressors, and it is not robust against outliers. We therefore propose robust principal components logistic regression to deal with both the multicollinearity and the outlier problem. A procedure based on the condition index is suggested for selecting principal components. When a condition index is larger than the cutoff value obtained from a model constructed on the basis of conjoint analysis, the corresponding principal component is removed from the logistic model. In addition, we employ a robust estimation algorithm that dampens the effect of outliers by applying appropriate weights and factors to the leverage points and vertical outliers identified by a V-mask type criterion. Monte Carlo simulation results indicate that the proposed procedure yields a higher rate of correct classification than the existing method.
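
The component-selection rule described in this abstract can be illustrated with a small sketch. It assumes the usual definition of the condition index of a component as the square root of the largest eigenvalue divided by that component's eigenvalue. The eigenvalues below are hypothetical, and the cutoff of 30 is a common rule of thumb used here as a stand-in, since the abstract does not give the conjoint-analysis cutoff.

```python
from math import sqrt


def condition_indices(eigenvalues):
    """Condition index of each component: sqrt(largest eigenvalue / own eigenvalue)."""
    largest = max(eigenvalues)
    return [sqrt(largest / ev) for ev in eigenvalues]


def components_to_keep(eigenvalues, cutoff):
    """Positions of components whose condition index does not exceed the cutoff."""
    return [i for i, ci in enumerate(condition_indices(eigenvalues)) if ci <= cutoff]


# Hypothetical eigenvalues of a 3-variable correlation matrix (they sum to 3).
eigenvalues = [2.5, 0.499, 0.001]

# Assumed cutoff: a rule-of-thumb value, NOT the cutoff derived in the paper.
ASSUMED_CUTOFF = 30.0

kept = components_to_keep(eigenvalues, ASSUMED_CUTOFF)
print(kept)  # the third component has a condition index near 50 and is dropped
```

Only the retained components would then enter the logistic model as regressors.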

Analysis of Air Temperature Factors Related to Difference of Fruit Characteristics According to Cultivating Areas of Persimmon (Diospyros kaki Thunb.) (감 재배지 간 과실 품질 차이에 관계한 기온요인 분석)

  • Kim, Ho-Cheol;Jeon, Kyung-Soo;Kim, Tae-Choon
    • Journal of Bio-Environment Control
    • /
    • v.17 no.2
    • /
    • pp.124-131
    • /
    • 2008
  • To identify the main air temperature factors related to differences in fruit characteristics among cultivating areas, the fruit and air temperature characteristics of eight cultivating areas of 'Fuyu' persimmon were analyzed by principal component analysis and multiple regression analysis. The first principal component extracted from 16 air temperature factors comprised annual mean temperature, mean temperature during October, annual mean minimum extreme temperature, mean temperature during the growing period, and so forth. The second principal component comprised mean temperature during May and June and so forth. The cumulative contribution was 91.4%. Five of the eight cultivating areas clearly differed in their main factors or in the direction of correlation. In the multiple regression analysis between the extracted main factors and fruit characteristics, fruit height was highly correlated with mean temperature during the growing period ($X_8$) and cumulative temperature ($X_6$), and the regression equation was $Y = 150.55 - 5.375X_8 + 0.014X_6$ ($r^2 = 0.843$). This regression equation was also affected by mean minimum temperature during the growing period, cumulative temperature, and mean temperature during August. Fruit diameter was negatively correlated with mean temperature during the growing period, while flesh browning rate and the Hunter a value of peel color were positively correlated with mean minimum temperature during the growing period and annual minimum air temperature, respectively.
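
As a worked example, the reported fruit-height equation can be evaluated directly. The coefficients come from the abstract; the input values below are hypothetical, and the abstract does not state the units of $Y$, $X_8$, or $X_6$, so the numbers only show how the terms combine.

```python
def predicted_fruit_height(mean_temp_growing, cumulative_temp):
    """Evaluate Y = 150.55 - 5.375*X8 + 0.014*X6 with the reported coefficients."""
    return 150.55 - 5.375 * mean_temp_growing + 0.014 * cumulative_temp


# Hypothetical inputs, chosen only for illustration (units not given in the abstract).
x8 = 20.0    # mean temperature during the growing period
x6 = 4000.0  # cumulative temperature

height = predicted_fruit_height(x8, x6)
print(round(height, 2))

# Holding X6 fixed, a one-unit rise in X8 lowers the prediction by 5.375.
change = predicted_fruit_height(x8 + 1.0, x6) - height
print(round(change, 3))
```

The opposite signs of the two coefficients mean that a warmer growing period lowers the predicted height unless cumulative temperature rises enough to offset it.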

Analysis on Correlation between AE Parameters and Stress Intensity Factor using Principal Component Regression and Artificial Neural Network (주성분 회귀분석 및 인공신경망을 이용한 AE변수와 응력확대계수와의 상관관계 해석)

  • Kim, Ki-Bok;Yoon, Dong-Jin;Jeong, Jung-Chae;Park, Phi-Iip;Lee, Seung-Seok
    • Journal of the Korean Society for Nondestructive Testing
    • /
    • v.21 no.1
    • /
    • pp.80-90
    • /
    • 2001
  • The aim of this study is to develop a methodology for identifying mechanical properties of a machine element, such as the stress intensity factor, from acoustic emission (AE) parameters. Considering the multivariate and nonlinear nature of AE parameters from fatigue cracks of machine elements, such as ringdown count, rise time, energy, event duration, and peak amplitude, principal component regression (PCR) and artificial neural network (ANN) models for estimating the stress intensity factor were developed and validated. The AE parameters were found to be very significant for estimating the stress intensity factor. Since the statistical values, including correlation coefficients, standard error of calibration, standard error of prediction, and bias, were stable, the PCR and ANN models for the stress intensity factor were very robust. The ANN model performed better than the PCR model on unknown stress intensity factor data.

Prediction of damages induced by Snow using Multiple-linear regression and Artificial Neural Network model (다중선형회귀 및 인공신경망 모형을 이용한 대설피해에 따른 피해액 예측에 관한 연구)

  • Kwon, Soon Ho;Lee, Eui Hoon;Chung, Gunhui;Kim, Joong Hoon
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2017.05a
    • /
    • pp.20-20
    • /
    • 2017
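  • Owing to climate change, natural disasters that cause casualties and property damage have been increasing steadily worldwide, and the scale of the resulting damage keeps growing. In Korea, damage from natural disasters over the 20 years from 1994 to 2013 totaled 12.3 trillion won. Of this, 85% was caused by rainfall and typhoons and about 13% by heavy snow, so although most damage comes from rainfall and typhoons, damage from heavy snow is also considerable. Research on predicting heavy snow damage, built on reliable data for accurate prediction, is therefore needed. In this study, to predict heavy snow damage costs, a total of 11 variables, drawn from snow depth data and meteorological observations at 63 weather stations in Korea together with socioeconomic data, were selected as input variables, and the stations were divided into three regions according to the area of the city in which each station is located. Principal component analysis was used to reduce the selected input variables to four principal components, and artificial neural network and multiple linear regression models were built to analyze the prediction error for each region. The adjusted coefficient of determination for heavy snow damage prediction was 22.8% to 48.2% with the artificial neural network model and 9.2% to 39.7% with the multiple linear regression model. The artificial neural network model therefore gave somewhat better results than the multiple regression model for predicting heavy snow damage from the selected input data. With supplemented data and more refined models, a more accurate heavy snow damage prediction function is expected to become possible.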

Development of Regression Models Resolving High-Dimensional Data and Multicollinearity Problem for Heavy Rain Damage Data (호우피해자료에서의 고차원 자료 및 다중공선성 문제를 해소한 회귀모형 개발)

  • Kim, Jeonghwan;Park, Jihyun;Choi, Changhyun;Kim, Hung Soo
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.38 no.6
    • /
    • pp.801-808
    • /
    • 2018
  • Fitting a linear regression model is stable under the assumptions that the sample size is sufficiently larger than the number of explanatory variables and that there is no serious multicollinearity among the explanatory variables. In this study, we investigated the difficulty of model fitting when these assumptions are violated by analyzing real heavy rain damage data, and we propose using a principal component regression model or a ridge regression model after integrating the data to overcome the difficulty. We evaluated the predictive performance of the proposed models on test data independent of the training data, and confirmed that the proposed methods predicted better than the linear regression model.
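
The effect of multicollinearity that motivates this abstract can be seen in a tiny two-predictor sketch. It is an illustration under assumed data, not the authors' model: the numbers are invented, the penalty value of 1.0 is arbitrary, and the function name is hypothetical. Setting the penalty to zero recovers ordinary least squares.

```python
def ridge_two_predictors(x1, x2, y, penalty):
    """Ridge slopes for two centered predictors: solve (X'X + penalty*I) b = X'y."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    # Entries of the 2x2 penalized cross-product matrix and the right-hand side.
    a11 = sum(u * u for u in c1) + penalty
    a22 = sum(v * v for v in c2) + penalty
    a12 = sum(u * v for u, v in zip(c1, c2))
    g1 = sum(u * t for u, t in zip(c1, cy))
    g2 = sum(v * t for v, t in zip(c2, cy))
    det = a11 * a22 - a12 * a12
    return (a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det


# Invented data: x2 is almost a copy of x1, and y is roughly x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.01, 1.98, 3.02, 3.99, 5.01]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols = ridge_two_predictors(x1, x2, y, penalty=0.0)    # unstable, opposite signs
ridge = ridge_two_predictors(x1, x2, y, penalty=1.0)  # both slopes close to 1
print(ols, ridge)
```

With nearly collinear predictors the unpenalized slopes are large and of opposite sign, while a modest penalty pulls both toward similar, moderate values. Principal component regression reaches a similar stabilization by discarding the low-variance direction instead of shrinking it.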

A Study on Patterning and Grading by the Impact of Traffic Culture Index (교통문화지수 영향요인에 의한 유형화와 영향정도에 관한 연구)

  • Jeong Cheal-Woo;Jung Hun-Young;Ko Sang-Sean
    • Journal of Navigation and Port Research
    • /
    • v.30 no.1 s.107
    • /
    • pp.35-43
    • /
    • 2006
  • This study suggests strategies for preventing traffic accidents using the impact factors of each cluster and the typical patterns of 81 cities, based on a statistical analysis of Traffic Culture Index (TCI) data developed through the partnership of the Traffic Safety Authority and the Green Traffic Movement Corporation in 2002 and 2003. Principal component analysis and cluster analysis of the impact factors and TCI yield 4 components and 4 clusters. In stepwise multiple regression analysis of the relationship between the impact factors and TCI, the $R^2$ values of the models are high for all clusters. Based on these results, we suggest concrete accident prevention strategies for each cluster, and note that future work should analyze how effective the invested facilities are in reducing traffic accidents.

Factors Contributing to Winning in Ice Hockey: Analysis of 2017 Ice Hockey World Championship (2017 International Ice Hockey Federation World Championship의 승리 결정요인 분석)

  • Lee, Jusung;Kim, Hyeyoung;Kim, Chaeeun;Pathak, Prabhat;Moon, Jeheon
    • 한국체육학회지인문사회과학편
    • /
    • v.57 no.4
    • /
    • pp.387-394
    • /
    • 2018
  • The purpose of this study is to provide information on strategy by identifying the main variables that determine the winning team, based on the records of all games of the 2017 IIHF World Championship top division. A total of 64 matches were analyzed. Six variables were analyzed: ratio of saves, shots on goal, penalties in minutes, time on power play, power play goals, and face-off wins. Logistic regression analysis (LRA), multiple regression analysis (MRA), and principal component analysis (PCA) were used to examine the relationship with winning and losing. In LRA, shots on goal (p<.001) and face-off wins (p<.001) had a significant positive relation to winning, whereas penalties in minutes (p<.01) and time on power play (p<.01) had a significant negative relation. In MRA, the calculated win percentage had a significant positive correlation with ratio of saves (p<.01) and face-off wins (p<.001), and a significant negative correlation with penalties in minutes (p<.001). In PCA, winning teams were characterized by penalty, attack, and defense factors, whereas losing teams were characterized only by attack and defense factors.