• Title/Summary/Keyword: 회귀분석 방법 (regression analysis methods)

Predicting Korea Pro-Baseball Rankings by Principal Component Regression Analysis (주성분회귀분석을 이용한 한국프로야구 순위)

  • Bae, Jae-Young;Lee, Jin-Mok;Lee, Jea-Young
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.367-379 / 2012
  • Predicting baseball rankings has long been a subject of interest for baseball fans. To predict these rankings from 2011 Korea Professional Baseball records, we present the arithmetic mean method, the weighted average method, principal component analysis, and principal component regression analysis. After comparing the standardized arithmetic mean, the weighted average based on correlation coefficients, and principal component analysis, we selected a principal component regression model as the final model. By running a regression on the variables reduced by principal component analysis, we propose rank prediction models for the pitcher part, the batter part, and the combined pitcher and batter part. The fitted regression models can estimate the 2011 pro-baseball rankings and are used to predict the rankings for 2012.
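
A minimal sketch of principal component regression may help make the abstract above concrete. It is not the authors' model: the data are synthetic, the response is assumed to be a generic team score, and the names `pcr_fit` and `pcr_predict` are hypothetical. The predictors are standardized, projected onto the leading principal components, and the response is then regressed on those component scores.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Fit principal component regression on standardized predictors."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Rows of Vt are the principal axes of the standardized data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    # Ordinary least squares of y on an intercept plus the component scores.
    design = np.column_stack([np.ones(len(y)), Z @ V])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {"mean": mean, "std": std, "V": V,
            "intercept": coef[0], "gamma": coef[1:]}

def pcr_predict(model, X):
    """Predict the response for new rows of X with a fitted PCR model."""
    Z = (X - model["mean"]) / model["std"]
    return model["intercept"] + (Z @ model["V"]) @ model["gamma"]

# Synthetic example (assumption): 8 teams, 3 strongly correlated statistics.
rng = np.random.default_rng(0)
latent = rng.normal(size=(8, 1))
X = latent @ np.array([[1.0, 0.8, -0.9]]) + 0.05 * rng.normal(size=(8, 3))
y = 3.0 * latent[:, 0] + 0.1 * rng.normal(size=8)

model = pcr_fit(X, y, n_components=1)
pred = pcr_predict(model, X)
# Rank 1 goes to the team with the highest predicted score.
predicted_rank = np.argsort(np.argsort(-pred)) + 1
print(predicted_rank)
```

Keeping only the leading components discards the near-collinear directions of the predictors; keeping all components reproduces ordinary least squares.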

Prediction of Daily Maximum Ozone Concentration using Multi-Regression (중회귀 모형을 이용한 일최고 오존 농도 예측성 검토에 관한 연구)

  • 김영은;조석연
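    • Proceedings of the Korea Air Pollution Research Association Conference / 1999.10a / pp.203-204 / 1999
  • Statistical prediction models of air quality are mainly used to predict ozone concentration. Statistical prediction methods include multiple regression models, neural network models, and fuzzy logic models. The multiple regression model is a conventional statistical analysis method that has long been widely used, whereas neural network models and fuzzy logic models were developed recently and their applicability is still under review. According to domestic and international studies, the ability to predict high ozone concentrations did not differ greatly among the methods. In Korea, multiple regression and neural network models have been applied, with reported correlation coefficients of about 0.6-0.7. (abridged)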

Prediction of Retention Time for PAH Molecule in HPLC (고속액체 크로마토그래피에서 PAH분자의 구조에 따른 용리시간 예측)

  • Kim, Young-Gu
    • Journal of the Korean Chemical Society / v.44 no.2 / pp.102-108 / 2000
  • Relative retention times (RRTs) of PAH molecules in HPLC are trained and then predicted on testing sets using multiple linear regression (MLR) and an artificial neural network (ANN). The main descriptors in the QSRR are the molecular connectivity indices ($^1X_v,\;^2X_v$), the length-to-breadth ratio (L/B), and the molecular dipole moment (D). L/B, which is related to the slot model, is a good descriptor in the ANN but not in MLR. The variances, which indicate the accuracy of the predicted retention times on the testing sets, are 0.0099 and 0.0114 for ANN and MLR, respectively. These results show that the ANN can exceed MLR in prediction accuracy.

Comparison of Customer Satisfaction Indices Using Different Methods of Weight Calculation (가중치 산출방법에 따른 고객만족도지수의 비교)

  • Lee, Sang-Jun;Kim, Yong-Tae;Kim, Seong-Yoon
    • Journal of Digital Convergence / v.11 no.12 / pp.201-211 / 2013
  • This study compares the Customer Satisfaction Index (CSI) and the weight of each dimension under various weight calculation methods and draws some implications. For this purpose, the methods of weight calculation were classified into the subjective method and the statistical method. The constant sum scale was used for the subjective method, and the statistical method was further divided into correlation analysis, principal component analysis, factor analysis, and structural equation modeling. The findings showed that the weights from the subjective method differ from those from the statistical method, while the rank order of the weights followed similar patterns across the analysis methods. In addition, the weight of each dimension deviated considerably across the methods, revealing differences in discrimination and stability among the dimensions. Lastly, the CSI was highest under the structural equation model, followed by regression analysis, correlation analysis, arithmetic mean, principal component analysis, constant sum scale, and factor analysis. The differences among the CSI values calculated by each method were statistically significant.

Settlement Prediction Accuracy Analysis of Weighted Nonlinear Regression Hyperbolic Method According to the Weighting Method (가중치 부여 방법에 따른 가중 비선형 회귀 쌍곡선법의 침하 예측 정확도 분석)

  • Kwak, Tae-Young;Woo, Sang-Inn;Hong, Seongho;Lee, Ju-Hyung;Baek, Sung-Ha
    • Journal of the Korean Geotechnical Society / v.39 no.4 / pp.45-54 / 2023
  • Settlement prediction during the design phase is primarily conducted using theoretical methods. Because of accuracy issues, however, measurement-based methods that predict future settlement from settlement data measured over time are primarily used during construction. Among these methods, the hyperbolic method is commonly used, but the existing hyperbolic method has accuracy issues and statistical limitations. A weighted nonlinear regression hyperbolic method has therefore been proposed. In this study, two weighting methods were applied to the weighted nonlinear regression hyperbolic method to compare and analyze the accuracy of settlement prediction. Measured settlement plate data from two sites located in Busan New Port were used. The settlement of the remaining sections was predicted by setting the regression analysis section to 30%, 50%, and 70% of the total data. Regardless of the weight assignment method, the settlement prediction based on the hyperbolic method showed a remarkable increase in accuracy as the regression analysis section increased. The weighted nonlinear regression hyperbolic method predicted settlement more accurately than the existing linear regression hyperbolic method. In particular, even with a smaller regression analysis section, the weighted nonlinear regression hyperbolic method showed higher settlement prediction performance than the existing linear regression hyperbolic method. It was thus confirmed that the weighted nonlinear regression hyperbolic method could predict settlement much faster and more accurately.
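
The two fitting approaches contrasted in the abstract above can be sketched as follows, assuming the common hyperbolic form $s(t) = t / (a + bt)$ with time measured from the start of the fitting section. The abstract does not give the two weighting schemes, so the weights here (proportional to elapsed time) are purely hypothetical, as are the function names and the noise-free synthetic data.

```python
import numpy as np

def fit_hyperbolic_linear(t, s):
    """Conventional hyperbolic method: straight-line fit of t/s = a + b*t."""
    b, a = np.polyfit(t, t / s, 1)  # polyfit returns the slope first
    return a, b

def fit_hyperbolic_weighted(t, s, w, n_iter=50):
    """Weighted nonlinear least squares for s = t / (a + b*t) by Gauss-Newton."""
    a, b = fit_hyperbolic_linear(t, s)  # linear fit as the starting point
    for _ in range(n_iter):
        denom = a + b * t
        resid = s - t / denom
        # Jacobian of the model t / (a + b*t) with respect to (a, b).
        J = np.column_stack([-t / denom**2, -t**2 / denom**2])
        step = np.linalg.solve(J.T @ (w[:, None] * J), J.T @ (w * resid))
        a, b = a + step[0], b + step[1]
        if np.max(np.abs(step)) < 1e-12:
            break
    return a, b

def predict_settlement(t, a, b):
    """Settlement predicted by the fitted hyperbola at times t."""
    return t / (a + b * t)

# Synthetic, noise-free record (assumption): true a = 2.0, b = 0.02.
t = np.arange(10.0, 310.0, 10.0)
s = t / (2.0 + 0.02 * t)

n_fit = len(t) // 2                  # regression section: first 50% of the data
t_fit, s_fit = t[:n_fit], s[:n_fit]
w = t_fit / t_fit.sum()              # hypothetical weights favoring later readings

a_w, b_w = fit_hyperbolic_weighted(t_fit, s_fit, w)
s_pred = predict_settlement(t[n_fit:], a_w, b_w)
final_settlement = 1.0 / b_w         # limit of s(t) as t grows
print(a_w, b_w, final_settlement)
```

The linear version minimizes errors in the transformed quantity t/s, whereas the nonlinear version minimizes weighted errors in the settlement itself. That difference is one plausible reading of the statistical limitations the abstract mentions.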

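Multicollinearity in Regression Analysis and a Data Transformation Technique That Preserves Variable Names (回歸分析에 있어서의 多共線性과 名稱을 保全시키는 資料變換 技法)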

  • 兪浣
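    • Journal of the Korean Statistical Society / v.8 no.2 / pp.109-116 / 1979
  • When a demand or supply model is built to study the substitution effect between two variables, the names of the variables involved become important. As alternatives for handling the multicollinearity problem that commonly arises with observed data, the ridge regression technique and the factor analytic technique are introduced using a linear regression line as an example, and the original data are transformed so that the resulting coefficients can be interpreted as OLS estimates. Because ridge regression and factor analysis generally cannot be used when the actual demand and supply model is nonlinear, these methods are presented as data transformation methods, so that ridge regression or factor analytic techniques can also be applied to the multicollinearity problem in nonlinear models.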
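
One standard way to see ridge regression as OLS on transformed data is row augmentation: append $\sqrt{\lambda}\,I$ to the design matrix and zeros to the response. The sketch below illustrates that well-known identity on synthetic collinear data. It is an assumption that this resembles the transformation in the paper, which the abstract does not spell out, and the function names are hypothetical.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Ridge estimate (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def ridge_via_augmented_ols(X, y, lam):
    """Same estimate obtained as plain OLS on an augmented data set."""
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])  # p extra pseudo-observations
    y_aug = np.concatenate([y, np.zeros(p)])          # with zero responses
    beta, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return beta

# Synthetic data (assumption): two almost collinear predictors.
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = x1 + 0.01 * rng.normal(size=50)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=50)

lam = 0.5
beta_ridge = ridge_closed_form(X, y, lam)
beta_aug = ridge_via_augmented_ols(X, y, lam)
print(beta_ridge, beta_aug)
```

Because the augmented problem is ordinary least squares, the columns keep their original variable names, and the same augmented data could in principle be passed to a nonlinear least-squares routine.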

Comparative Analysis of Determination of Method Location between Classes (클래스 간 메소드 위치 결정 방법의 비교)

  • Jung, Young-Ae;Park, Young-B.
    • The Journal of the Korea Contents Association / v.6 no.12 / pp.80-88 / 2006
  • In the object-oriented paradigm, various cohesion measurements have been studied that take into account the reference relations among the components of a class, such as attributes and methods. In addition, the refactoring field has studied a number of methods based on manual analysis, which relies on the developer's intuition and experience, and on automatic analysis. Automatic refactoring requires objective criteria that have been verified. In this paper, we propose a method that uses logistic regression and a neural network to analyze the relationship between six factors that consider reference relations and the location of a method among classes. Experimental results demonstrate that logistic regression predicts the results with up to 97% accuracy and the neural network with up to 90%. Hence, we conclude that the logistic regression based method is more effective for predicting method location. Moreover, prediction rates of 90% or more from both methods indicate that the six factors used for Move Method in refactoring are suitable as objective criteria.

Developing the high-risk drinking predictive model in Korea using the data mining technique (데이터마이닝 기법을 활용한 한국인의 고위험 음주 예측모형 개발 연구)

  • Park, Il-Su;Han, Jun-Tae
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1337-1348 / 2017
  • In this paper, we develop a predictive model of high-risk drinking in Korea using cross-sectional data from the Korea Community Health Survey (2014). We perform logistic regression analysis, decision tree analysis, and neural network analysis as data mining techniques. The logistic regression results showed that men in their forties had a high risk and that the risk for office workers and sales workers was high. In particular, current smokers had a higher risk of high-risk drinking. Among the three models, neural network analysis and logistic regression ranked highest in terms of AUROC (area under the receiver operating characteristic curve). The high-risk drinking predictive model developed in this study and the method for selecting the high-risk intensive drinking group can serve as a basis for providing more effective health care services, such as education to prevent hazardous drinking and the improvement of drinking programs.
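
For readers unfamiliar with the AUROC criterion used above, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the four-case example are illustrative assumptions, not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

# Illustrative scores (assumption): 1 = high-risk drinker, 0 = not.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 3 of the 4 positive-negative pairs are ordered correctly: 0.75
```

The double loop is quadratic in the sample size. For survey-scale data a rank-based formula or a library routine would be the practical choice.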

Drought risk assessment by monthly precipitation regression in multipurpose dams (다목적 댐의 월강우량 회귀분석에 의한 가뭄위험 평가)

  • Park, Chang Eon;Kim, Da Rae
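    • Proceedings of the Korea Water Resources Association Conference / 2016.05a / pp.263-263 / 2016
  • Various methods have been studied to assess drought risk due to climate change and other factors, and drought indices defined in terms of meteorological drought, biological drought, and so on have been developed. Ultimately, however, a drought is declared only when no more water can be supplied from the source, so assessing the drought risk of the water source facilities should take priority. To assess the drought risk of the Soyanggang Dam and the Chungju Dam, multipurpose dams that are effectively responsible for the domestic and industrial water supply of the Seoul metropolitan area, this study develops a method for predicting the reservoir storage rate at a specific time from monthly rainfall data. Predicting how the storage rate changes with monthly rainfall would require a precise water balance analysis of reservoir inflow and outflow. In practice, however, inflow is hard to predict accurately when another dam exists upstream, and releases, which are made as circumstances require, are almost impossible to predict properly. Storage rate predictions based on water balance analysis therefore inevitably carry some uncertainty, so we tested whether a consistent rule could be derived from a regression analysis between monthly rainfall and the storage rate that results from dam management practice. Daily storage rate data for 1984-2013 from the two dams revealed the management practice: because they are multipurpose dams, the storage rate is kept at about 25-35% at the end of June to prevent flood damage during heavy rain, and after heavy rain a certain amount of water is released as needed to prepare for the next event. The lowest storage rate of each dam occurred from late March to April, and no drought damage occurred as long as there was a certain amount of rainfall in April and May. To predict storage rate changes under projected rainfall patterns given this management practice, a regression analysis was performed between monthly rainfall data and the storage rate as of April 1, yielding a meaningful result that allows the April 1 storage rate to be predicted from the monthly rainfall from July of the previous year through March of the current year. With this result, the April 1 storage rate of each dam can be predicted from future monthly rainfall projected under climate change, and, analyzed together with the monthly rainfall in April and May, it is expected to be useful as one method for assessing drought risk.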

Comparison of Bias Correction Methods for the Rare Event Logistic Regression (희귀 사건 로지스틱 회귀분석을 위한 편의 수정 방법 비교 연구)

  • Kim, Hyungwoo;Ko, Taeseok;Park, No-Wook;Lee, Woojoo
    • The Korean Journal of Applied Statistics / v.27 no.2 / pp.277-290 / 2014
  • We analyzed binary landslide data from the Boeun area with logistic regression. Since landslides occurred in only 9 out of 5000 observations, the data can be regarded as rare-event data. The main issue in logistic regression with rare-event data is serious bias in the regression coefficient estimates. Two bias correction methods have previously been proposed, and we compared them quantitatively via simulation. Firth's (1993) approach outperformed the alternative and provided the most stable results for analyzing rare-event binary data.
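
A minimal sketch of Firth's (1993) penalized-likelihood logistic regression follows, written from the standard modified score $X^\top\{y - p + h\,(0.5 - p)\}$, where $h$ holds the diagonal of the weighted hat matrix. It is not the authors' code, the function name is hypothetical, and the example uses only the event counts quoted in the abstract (9 events in 5000 observations) with an intercept-only model. In that special case the Firth estimate of the event probability has the closed form $(k + 0.5)/(n + 1)$, which gives a convenient check.

```python
import numpy as np

def firth_logistic(X, y, n_iter=100, tol=1e-8):
    """Firth-penalized logistic regression fitted by Fisher scoring."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = prob * (1.0 - prob)
        info = X.T @ (w[:, None] * X)          # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Diagonal of the hat matrix W^{1/2} X (X'WX)^{-1} X' W^{1/2}.
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Firth's modified score adds h * (0.5 - prob) to the usual residual.
        score = X.T @ (y - prob + h * (0.5 - prob))
        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Intercept-only example using the counts from the abstract: 9 events in 5000.
n, k = 5000, 9
X = np.ones((n, 1))
y = np.zeros(n)
y[:k] = 1.0

beta = firth_logistic(X, y)
p_hat = 1.0 / (1.0 + np.exp(-beta[0]))
p_mle = k / n
print(p_mle, p_hat, (k + 0.5) / (n + 1))
```

The function accepts any full-rank design matrix. A production implementation would typically add step-halving or a cap on the step size, which this sketch omits.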