• Title/Summary/Keyword: 회귀분포 (regression distribution)

A comparison study of Bayesian high-dimensional linear regression models (베이지안 고차원 선형 회귀분석에서의 비교연구)

  • Shin, Ju-Won;Lee, Kyoungjae
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.491-505 / 2021
  • We consider linear regression models in high-dimensional settings (p ≫ n) and compare various classes of priors. The spike and slab prior is one of the most widely used priors for Bayesian regression models, but its model space is vast, resulting in poor performance in finite samples. As an alternative, various continuous shrinkage priors, including the horseshoe prior and its variants, have been proposed. Although each of these priors has been investigated separately, exhaustive comparative studies of their performance have rarely been conducted. In this study, we compare the spike and slab prior, the horseshoe prior, and its variants in various simulation settings. The performance of each method is evaluated in terms of regression coefficient estimation and variable selection. Finally, some remarks and suggestions are given based on comprehensive simulation studies.

A Comparative Study on the Genetic Algorithm and Regression Analysis in Urban Population Surface Modeling (도시인구분포모형 개발을 위한 GA모형과 회귀모형의 적합성 비교연구)

  • Choei, Nae-Young
    • Spatial Information Research / v.18 no.5 / pp.107-117 / 2010
  • Taking the East-Hwasung area as a case, this study first builds gridded population data from the raw data of the municipal population survey and then uses GIS tools to measure the major urban spatial variables thought to influence the composition of the regional population. For comparison, urban models based on the Genetic Algorithm (GA) technique and on the regression technique are constructed using the same input variables. The findings indicate that the GA model was better at distinguishing the effective variables among the pilot model variables, and that its coefficient estimates for the explanatory variables were as consistent and meaningful as those of the regression models. The results suggest that the GA technique could be a useful supplementary research tool for understanding urban phenomena.

Log-density Ratio with Two Predictors in a Logistic Regression Model (로지스틱 회귀모형에서 이변량 정규분포에 근거한 로그-밀도비)

  • Kahng, Myung Wook;Yoon, Jae Eun
    • The Korean Journal of Applied Statistics / v.26 no.1 / pp.141-149 / 2013
  • We present methods for studying the log-density ratio, which guides the selection of predictors and the form in which they enter the logistic regression model. Under bivariate normal distributional assumptions, we investigate the form of the log-density ratio as a function of two predictors. If the two covariance matrices are equal, then the cross-product and quadratic terms are not needed. If the variables are uncorrelated, we do not need the cross-product terms, but we still need the linear and quadratic terms. We also explore other conditions under which the cross-product and quadratic terms are not needed in the logistic regression model.
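
The statements about which terms are needed can be checked by writing out the log-density ratio. What follows is a sketch under the abstract's bivariate normal assumption; the symbols $f_k$, $\mu_k$, and $\Sigma_k$ (density, mean vector, and covariance matrix of $x = (x_1, x_2)^\top$ given $y = k$) are notation assumed here for illustration, not taken from the paper.

```latex
\begin{aligned}
\log\frac{f_1(x)}{f_0(x)}
  &= -\tfrac12 \log\frac{|\Sigma_1|}{|\Sigma_0|}
     - \tfrac12 (x-\mu_1)^\top \Sigma_1^{-1} (x-\mu_1)
     + \tfrac12 (x-\mu_0)^\top \Sigma_0^{-1} (x-\mu_0) \\
  &= c
     + \left(\Sigma_1^{-1}\mu_1 - \Sigma_0^{-1}\mu_0\right)^\top x
     - \tfrac12\, x^\top \left(\Sigma_1^{-1} - \Sigma_0^{-1}\right) x , \\
c &= -\tfrac12 \log\frac{|\Sigma_1|}{|\Sigma_0|}
     - \tfrac12\, \mu_1^\top \Sigma_1^{-1} \mu_1
     + \tfrac12\, \mu_0^\top \Sigma_0^{-1} \mu_0 .
\end{aligned}
```

When $\Sigma_1 = \Sigma_0$, the last term vanishes and only linear terms remain. When both covariance matrices are diagonal (uncorrelated predictors), $\Sigma_1^{-1} - \Sigma_0^{-1}$ is diagonal, so the $x_1 x_2$ term drops out while the $x_1^2$ and $x_2^2$ terms generally stay.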

Temporal distribution analysis of design rainfall by significance test of regression coefficients (회귀계수의 유의성 검정방법에 따른 설계강우량 시간분포 분석)

  • Park, Jin Hee;Lee, Jae Joon
    • Journal of Korea Water Resources Association / v.55 no.4 / pp.257-266 / 2022
  • Inundation damage is increasing every year due to localized heavy rain and more frequent rainfall exceeding the design frequency. Accordingly, the importance of hydraulic structures for flood control and defense is also increasing. Hydraulic structures are designed according to their purpose and performance, and the flood discharge is an important design factor. In Korea, however, design rainfall is used as the input for hydrological analysis in the design of hydraulic structures, because observation data are insufficient and unreliable. Accurate probability rainfall and its temporal distribution are important for estimating the design rainfall. In practice, the regression equation for the temporal distribution of design rainfall is calculated from the cumulative rainfall percentage of Huff's quartile method, and the 6th order polynomial regression equation, which shows high overall accuracy, is applied uniformly. In this study, an optimized regression equation for the temporal distribution is derived using variable selection methods, following the principle of parsimony in statistical modeling. The derived regression equation is verified through a significance test. The results indicate that it is most appropriate to derive the regression equation for the temporal distribution using the stepwise selection method, which combines the advantages of forward selection and backward elimination.

Small-sample asymptotic inference for the autoregressive coefficient (자기회귀계수에 대한 소표본 점근추론)

  • Na, Jong-Hwa;Kim, Jeong-Suk;Jang, Yeong-Mi
    • Proceedings of the Korean Statistical Society Conference / 2005.05a / pp.209-213 / 2005
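  • This paper studies approximate inference for the distribution functions of various estimators of the autoregressive coefficient in the first-order autoregressive model. The approximation, which builds on saddlepoint approximation results for quadratic forms, does not require deriving an approximate distribution for each type of estimator, and it is highly accurate both in small samples and in the regions of main interest for statistical inference. Simulations confirm that it outperforms several existing approximations, including the Edgeworth approximation.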
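
To illustrate why results for quadratic forms apply here, consider one common estimator as an example. The abstract does not say which estimators the paper treats, so the least squares estimator and the notation $y = (y_1, \dots, y_n)^\top$, $A$, $B$, and $r$ below are assumptions made for this sketch.

```latex
\hat\rho
  = \frac{\sum_{t=2}^{n} y_t\, y_{t-1}}{\sum_{t=2}^{n} y_{t-1}^{2}}
  = \frac{y^\top A\, y}{y^\top B\, y},
\qquad
A_{ij} = \tfrac12\,\mathbf{1}\{|i-j| = 1\},
\quad
B = \operatorname{diag}(1, \dots, 1, 0).
```

Because $y^\top B\, y > 0$ with probability one, the distribution function reduces to a statement about a single quadratic form:

```latex
P(\hat\rho \le r) = P\!\left( y^\top (A - rB)\, y \le 0 \right).
```

Any estimator that can be written as such a ratio can be handled the same way, which is consistent with the abstract's remark that no estimator-specific derivation is needed.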

Statistical significance test of polynomial regression equation for Huff's quartile method of design rainfall (설계강우량의 Huff 4분위 방법 다항회귀식에 대한 유의성 검정)

  • Park, Jinhee;Lee, Jaejoon;Lee, Sungho
    • Journal of Korea Water Resources Association / v.51 no.3 / pp.263-272 / 2018
  • For the design of hydraulic structures, the design flood discharge corresponding to a specific frequency is generally obtained from a design storm through the rainfall-runoff relationship. In the past, empirical equations such as the rational formula were used to calculate the peak flow rate. However, as the rainfall duration becomes longer, the computed outflow patterns differ from actual events, so the accuracy of the temporal distribution of the probability rainfall becomes important. In current practice, Huff's quartile method is used for the temporal distribution of rainfall, and the third quartile is generally adopted. The regression equation for Huff's quartile curve is a sixth order polynomial because of its high accuracy throughout the rainfall duration. In statistical modeling, however, the regression equation should be concise in accordance with the principle of parsimony, and the regression coefficients should be retained on the basis of statistical significance. Therefore, this study conducts a statistical significance test of the regression equation for the temporal distribution of Huff's quartile method, which is used as the temporal distribution method for design rainfall, at 69 rainfall observation stations under the jurisdiction of the Korea Meteorological Administration. The results indicate that the regression equation of Huff's quartile method need only be considered up to the 4th order polynomial, since those regression coefficients are significant at most of the 69 rainfall observation stations.
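
The kind of check described above can be sketched as follows: fit a polynomial of a given order by ordinary least squares and inspect the t-statistic of each coefficient. This is a minimal illustration, not the paper's procedure. The function name `poly_t_stats`, the synthetic cumulative curve, and the rough `|t| > 2` screening rule are all assumptions made for this example; a real analysis would use observed Huff curves and exact p-values.

```python
import numpy as np

def poly_t_stats(t, y, order):
    """OLS fit of y on 1, t, ..., t**order.

    Returns (coefficients, t-statistics), lowest order first.
    """
    X = np.vander(t, order + 1, increasing=True)  # columns t^0 ... t^order
    X_pinv = np.linalg.pinv(X)                    # (X'X)^{-1} X'
    beta = X_pinv @ y
    resid = y - X @ beta
    dof = len(y) - (order + 1)                    # residual degrees of freedom
    sigma2 = resid @ resid / dof                  # residual variance estimate
    # diag((X'X)^{-1}) equals the row-wise sum of squares of pinv(X)
    se = np.sqrt(sigma2 * np.sum(X_pinv**2, axis=1))
    return beta, beta / se

# Synthetic dimensionless cumulative curve (hypothetical data, not from the paper)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 41)
y = 3 * t**2 - 2 * t**3 + rng.normal(0.0, 0.005, t.size)

for order in (6, 4, 3):
    beta, tstat = poly_t_stats(t, y, order)
    # Rough screening rule (assumption): |t| > 2 is near the 5% level
    flagged = [k for k, v in enumerate(tstat) if abs(v) > 2.0]
    print(f"order {order}: terms with |t| > 2 -> {flagged}")
```

On data like these, high-order fits tend to show many coefficients with small t-statistics because the powers of t are strongly collinear, which is one motivation for preferring a lower-order equation.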

A new regression analysis method in network model (네트워크 모델을 이용한 새로운 회귀분석방법)

  • 김기복;인치호;김희석
    • Proceedings of the IEEK Conference / 2003.07a / pp.410-413 / 2003
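  • Networks such as society, cells, and the Internet appear to be woven according to certain laws rather than being merely random. However, it is not easy to check whether a complex network actually matches a given network model. Strictly speaking, in mathematical terms a random network follows a Poisson distribution. Under the Poisson distribution, every node has the same probability of being connected to other nodes; that is, the distribution is uniform. Consequently, nodes with very few or, conversely, very many links are extremely rare, and the distribution of the number of links is bell-shaped. Most nodes have a number of links that falls within this curve. In this paper, we use the Poisson distribution and regression analysis to show a method for expressing the relationship among one or more variables as a functional relationship and analyzing it, and we aim to predict future values by means of regression analysis.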
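
The bell-shaped link distribution of a random network can be seen in a small simulation. The sketch below is an illustration only and is not taken from the paper: it assumes the standard G(n, p) random graph, in which each pair of nodes is linked independently with probability p, and the names `random_graph_degrees` and `poisson_pmf` are hypothetical helpers. In that model a node's number of links is binomial, which is close to Poisson with mean (n - 1)p when n is large and p is small.

```python
import random
from collections import Counter
from math import exp, factorial

def random_graph_degrees(n, p, seed=0):
    """Degrees of a G(n, p) random graph: each pair is linked with probability p."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

def poisson_pmf(k, lam):
    """Poisson probability of exactly k links when the mean is lam."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 500, 0.01                     # illustrative sizes (assumption)
degrees = random_graph_degrees(n, p)
lam = (n - 1) * p                    # expected number of links per node
counts = Counter(degrees)

# Compare the observed share of nodes with k links to the Poisson probability
for k in range(max(degrees) + 1):
    print(k, round(counts.get(k, 0) / n, 4), round(poisson_pmf(k, lam), 4))
```

Nodes with far fewer or far more links than the mean are rare in this model, which matches the description in the abstract.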

Development of Return flow rate Prediction Algorithm with Data Variation based on LSTM (LSTM기반의 자료 변동성을 고려한 하천수 회귀수량 예측 알고리즘 개발연구)

  • Lee, Seung Yeon;Yoo, Hyung Ju;Lee, Seung Oh
    • Journal of Korean Society of Disaster and Security / v.15 no.2 / pp.45-56 / 2022
  • Countermeasures for water shortages during the dry season and drought periods have not considered the return flow rate in detail. In this study, the outflow of the STP was predicted with LSTM, a data-based machine learning model. As a first step, outflow, inflow, precipitation, and water elevation were used as input data, and the distribution of the variance was additionally considered to improve prediction accuracy. To account for the variability of the outflow data, the residual between the observed values and the distribution was assumed to take the form of a complex trigonometric function, and it was presented as the optimal distribution of the outflow together with the theoretical probability distribution. The error was found to be lower than in the case where the variance distribution was not considered. Therefore, the outflow prediction model constructed in this study is expected to serve as a basis for establishing an efficient river management system, since it allows more accurate prediction.

Variable Selection with Log-Density in Logistic Regression Model (로지스틱회귀모형에서 로그-밀도비를 이용한 변수의 선택)

  • Kahng, Myung-Wook;Shin, Eun-Young
    • Communications for Statistical Applications and Methods / v.19 no.1 / pp.1-11 / 2012
  • We present methods to study the log-density ratio of the conditional densities of the predictors given the response variable in the logistic regression model. This allows us to select which predictors are needed and how they should be included in the model. If the conditional distributions are skewed, they can be modeled as gamma distributions. A simulation study shows that the linear and log terms are required in general. If the conditional distributions of x | y for the two groups overlap significantly, we need both the linear and log terms; however, only the linear or the log term is needed in the model if they are well separated.
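
The appearance of linear and log terms follows directly from the gamma density. The sketch below assumes a shape-rate parameterization with hypothetical symbols $\alpha_k$ (shape) and $\beta_k$ (rate) for the distribution of $x$ given $y = k$; this notation is not taken from the paper.

```latex
\begin{aligned}
f_k(x) &= \frac{\beta_k^{\alpha_k}}{\Gamma(\alpha_k)}\, x^{\alpha_k - 1} e^{-\beta_k x},
  \qquad x > 0, \\
\log\frac{f_1(x)}{f_0(x)}
  &= c + (\alpha_1 - \alpha_0)\log x - (\beta_1 - \beta_0)\, x , \\
c &= \alpha_1 \log\beta_1 - \alpha_0 \log\beta_0
     - \log\Gamma(\alpha_1) + \log\Gamma(\alpha_0) .
\end{aligned}
```

Under this assumption the log term is needed whenever the shape parameters differ, and the linear term whenever the rate parameters differ.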

Unmanned Aerial Vehicles Images Based Tidal Flat Surface Sedimentary Facies Mapping Using Regression Kriging (회귀 크리깅을 이용한 무인기 영상 기반의 갯벌 표층 퇴적상 분포도 작성)

  • Geun-Ho Kwak;Keunyong Kim;Jingyo Lee;Joo-Hyung Ryu
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.537-549 / 2023
  • The distribution characteristics of tidal flat sediment components are essential data for coastal environment analysis and environmental impact assessment, so a reliable classification map of surface sedimentary facies is needed. This study evaluated the applicability of regression kriging for generating a classification map of the sedimentary facies of tidal flats. To this end, it investigated several factors in surface sedimentary facies classification: the amount of field survey data and remote sensing-based auxiliary data, the effect of the regression model on regression kriging, and a comparison with other prediction methods (univariate kriging and regression analysis). To evaluate the applicability of regression kriging, a case study using unmanned aerial vehicle (UAV) data was conducted on the Hwang-do tidal flat located at Anmyeon-do, Taean-gun, Korea. The case study showed that securing an appropriate amount of field survey data and using topographic elevation and channel density as auxiliary data were the most important factors in producing a reliable classification map of tidal flat surface sedimentary facies. In addition, regression kriging, which can capture detailed characteristics of the sediment distribution using ultra-high resolution UAV data, had the best prediction performance among the methods compared. These results are expected to serve as a guideline for producing classification maps of tidal flat surface sedimentary facies.