• Title/Abstract/Keywords: linear regression models

Search results: 937 items (processing time: 0.024 s)

Unified Approach to Coefficient of Determination $R^2$ Using Likelihood Distance

  • 허명회;이종한;정진환
    • 응용통계연구 / Vol. 4, No. 2 / pp.117-127 / 1991
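  • The coefficient of determination $R^2$ is a descriptive measure that is very widely used in practical regression analysis, but its definition has been disputed whenever the regression model is anything other than the standard linear regression model with an intercept term. Representative problems are the appropriate definition and use of the coefficient of determination in linear regression without an intercept, weighted linear regression, and robust linear regression. Several existing studies, for example Kvalseth (1985) and Willett and Singer (1988), have proposed variants of the coefficient of determination for each of these cases. However, these studies respond piecemeal, case by case, without a general principle, and they contain some errors, so they may confuse users of statistics who are not statistical experts. The general definition of the coefficient of determination proposed in this study is therefore expected to help clear up the confusion caused by the current proliferation of variants. This unified coefficient of determination is defined using the likelihood distance and has the advantage of applying consistently not only to linear regression models but also to nonlinear regression models and generalized linear models.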

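
The ambiguity that the abstract above describes for models without an intercept can be seen in a small sketch. It does not implement the paper's likelihood-distance definition; it only contrasts two common textbook variants (total sum of squares taken about the mean versus about zero). The data and the function names `fit_through_origin` and `r2_variants` are made up for illustration.

```python
def fit_through_origin(x, y):
    """Least-squares slope for the no-intercept model y = b * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def r2_variants(x, y):
    """Return (centered, uncentered) R^2 for the no-intercept fit."""
    b = fit_through_origin(x, y)
    sse = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    y_mean = sum(y) / len(y)
    sst_centered = sum((yi - y_mean) ** 2 for yi in y)  # about the mean
    sst_uncentered = sum(yi ** 2 for yi in y)           # about zero
    return 1 - sse / sst_centered, 1 - sse / sst_uncentered


# Hypothetical data: y is nearly constant, so a line through the origin fits poorly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.2, 9.8, 10.1, 9.9, 10.0]

r2_centered, r2_uncentered = r2_variants(x, y)
print(f"centered R^2:   {r2_centered:.3f}")
print(f"uncentered R^2: {r2_uncentered:.3f}")
```

On this data the centered variant is strongly negative while the uncentered variant lies between 0 and 1, so the two definitions tell very different stories about the same fit.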

Prediction of New Confirmed Cases of COVID-19 based on Multiple Linear Regression and Random Forest

  • 김준수;최병재
    • 대한임베디드공학회논문지 / Vol. 17, No. 4 / pp.249-255 / 2022
  • The COVID-19 virus appeared in 2019 and is extremely contagious. Because it is so infectious, it has a huge impact on people's mobility. In this paper, multiple linear regression and random forest models are used to predict the number of COVID-19 cases using COVID-19 infection status data (open data provided by the Ministry of Health and Welfare) and Google Mobility Data, which reports mobility across various categories. The data were divided into two sets. The first dataset combines the COVID-19 infection status data with all six variables of Google Mobility Data. The second dataset combines the COVID-19 infection status data with only two variables of Google Mobility Data: (1) retail stores and leisure facilities and (2) grocery stores and pharmacies. The models' performance was compared using the mean absolute error. We also performed a correlation analysis of the random forest model and the multiple linear regression model.
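
As a rough illustration of comparing regression models by mean absolute error, the sketch below fits multiple linear regression through the normal equations and scores a two-predictor model against a one-predictor model. It is not the authors' pipeline: the random forest is omitted, the numbers are invented stand-ins for mobility indices and case counts, and every function name is hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """OLS coefficients [intercept, b1, ..., bp] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented data: cases = 100 + 2 * index_a - 3 * index_b (no noise).
rows_two = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (1.0, 3.0), (3.0, 2.0)]
cases = [100.0, 102.0, 97.0, 101.0, 93.0, 100.0]
rows_one = [(a,) for a, _ in rows_two]  # drop the second predictor

coefs_two = fit_mlr(rows_two, cases)
coefs_one = fit_mlr(rows_one, cases)
mae_two = mean_absolute_error(cases, [predict(coefs_two, r) for r in rows_two])
mae_one = mean_absolute_error(cases, [predict(coefs_one, r) for r in rows_one])
print(f"MAE with two predictors: {mae_two:.4f}")
print(f"MAE with one predictor:  {mae_one:.4f}")
```

Because the invented response depends on both predictors, dropping one raises the error; with real data the comparison should be made on held-out observations rather than on the training set.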

Multiple Structural Change-Point Estimation in Linear Regression Models

  • Kim, Jae-Hee
    • Communications for Statistical Applications and Methods / Vol. 19, No. 3 / pp.423-432 / 2012
  • This paper is concerned with the detection of multiple change-points in linear regression models. The proposed procedure relies on the local estimation for global change-point estimation. We propose a multiple change-point estimator based on the local least squares estimators for the regression coefficients and the split measure when the number of change-points is unknown. Its statistical properties are shown and its performance is assessed by simulations and real data applications.
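
To make the idea of change-point estimation in a regression concrete, here is a much simpler sketch than the procedure in the abstract above: it assumes exactly one change-point and picks the split that minimizes the combined residual sum of squares of two separately fitted lines. The local least squares estimators, the split measure, and the handling of an unknown number of change-points are not reproduced. Data and names are hypothetical.

```python
def fit_line(x, y):
    """Return (intercept, slope, sse) of the least-squares line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sse


def estimate_change_point(x, y, min_segment=3):
    """Index k minimizing SSE(x[:k]) + SSE(x[k:]) over admissible splits."""
    best_k, best_sse = None, float("inf")
    for k in range(min_segment, len(x) - min_segment + 1):
        total = fit_line(x[:k], y[:k])[2] + fit_line(x[k:], y[k:])[2]
        if total < best_sse:
            best_k, best_sse = k, total
    return best_k, best_sse


# Hypothetical data: the regression line changes at x = 12.
x = [float(i) for i in range(20)]
y = []
for i, xi in enumerate(x):
    base = 1.0 + 0.5 * xi if xi < 12 else 20.0 - 1.0 * xi
    y.append(base + 0.02 * (-1) ** i)  # small deterministic perturbation

best_k, best_sse = estimate_change_point(x, y)
print(f"estimated change-point index: {best_k}, total SSE: {best_sse:.4f}")
```

The exhaustive search costs one pair of line fits per candidate split, which is fine for a single change-point but grows quickly when several change-points must be found jointly.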

An evaluation of CTDs risk factors of upper extremity using fuzzy linear regression

  • 이동춘;부진후
    • 산업경영시스템학회지 / Vol. 23, No. 55 / pp.33-42 / 2000
  • It is difficult to identify the factors that contribute to cumulative trauma disorders (CTDs) in real workplaces because these disorders develop over time through a combination of various risk factors. The purpose of this paper was to evaluate the relative level of CTD risk factors, such as task-related, anthropometric, joint deviation, and personal factors, using fuzzy linear regression models. The models are built for each category from survey data collected from telephone operators. The coefficients of the fuzzy models describe the relative level of each variable as a risk factor for CTDs.


Comparison of Different Multiple Linear Regression Models for Real-time Flood Stage Forecasting

  • 최승용;한건연;김병현
    • 대한토목학회논문집 / Vol. 32, No. 1B / pp.9-20 / 2012
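  • Recently, multiple linear regression models, one type of data-driven model, have been widely adopted for flood forecasting to overcome the shortcomings of conceptual, hydrological, and physically based models for stage prediction. The purpose of this study is to compare the flood forecasting performance of multiple linear regression models under different methods of estimating the regression coefficients and thereby to build an appropriate multiple regression flood forecasting model. To this end, the time scale of the independent variables was determined through autocorrelation analysis of the input data. Flood forecasting models were then built with three different coefficient estimation methods (ordinary least squares, weighted least squares, and stepwise selection) and applied to various flood events in the Jungnangcheon (중랑천) basin. Four statistical indices were used to evaluate the models: root mean square error, Nash-Sutcliffe efficiency coefficient, mean absolute error, and adjusted coefficient of determination. In the simulations, the multiple linear regression flood forecasting model using stepwise selection gave the most accurate predictions, and the model using ordinary least squares gave somewhat better predictions than the model using weighted least squares.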
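
The four evaluation indices named in the abstract above can be sketched in a few lines. The definitions below are the common textbook forms; the paper may use slightly different ones, and in particular the adjusted coefficient of determination here assumes $R^2 = 1 - SSE/SST$. The observed and simulated stage values are invented.

```python
import math


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / (variation of obs about its mean)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - sse / sst


def adjusted_r2(obs, sim, n_predictors):
    """Adjusted R^2, assuming R^2 = 1 - SSE/SST (same form as the NSE above)."""
    n = len(obs)
    r2 = nash_sutcliffe(obs, sim)
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


# Invented stage values in metres.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.1, 1.9, 3.2, 3.8, 5.0]

print(f"RMSE:        {rmse(observed, simulated):.4f}")
print(f"MAE:         {mae(observed, simulated):.4f}")
print(f"NSE:         {nash_sutcliffe(observed, simulated):.4f}")
print(f"adjusted R2: {adjusted_r2(observed, simulated, n_predictors=2):.4f}")
```

RMSE and MAE are in the units of the stage, while the efficiency coefficient and adjusted $R^2$ are dimensionless, which is why studies often report them side by side.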

Improvement of WRF forecast meteorological data by Model Output Statistics using linear, polynomial and scaling regression methods

  • Jabbari, Aida;Bae, Deg-Hyo
    • 한국수자원학회:학술대회논문집 / 한국수자원학회 2019 Conference / p.147 / 2019
  • Numerical Weather Prediction (NWP) models determine the future state of the weather by forcing current weather conditions into atmospheric models. NWP models approximate the physical dynamics mathematically with nonlinear differential equations; however, these approximations include uncertainties. The errors of NWP estimates can be related to the initial and boundary conditions and to model parameterization. Developments in meteorological forecast models have not resolved the issues related to these inevitable biases. In spite of efforts to incorporate all sources of uncertainty into the forecast, and regardless of the methodologies applied to generate forecast ensembles, forecasts are still subject to errors and systematic biases. Statistical post-processing increases the accuracy of the forecast data by decreasing the errors. Error prediction for NWP models, that is, updating the NWP model outputs through model output statistics (MOS), is one way to improve the model forecast. Regression methods (linear, polynomial, and scaling regression) are applied in the present study to improve real-time forecast skill. Such post-processing consists of two main steps: first, a regression is built between forecasts and measurements available during a training period; second, the regression is applied to new forecasts. In this study, the WRF real-time forecast data showed systematic biases relative to the observed data; the errors of the NWP model forecasts were reflected in the WRF model's underestimation of the meteorological data. The promising results will indicate that the post-processing techniques applied in this study improved the meteorological forecast data provided by the WRF model. A comparison of the various bias correction methods will show the strengths and weaknesses of each method.

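
The two-step post-processing described in the abstract above (build a regression on a training period, then apply it to new forecasts) can be sketched for the linear case only. The polynomial and scaling variants are not shown, the numbers are invented, and `fit_linear_correction` and `apply_correction` are hypothetical names rather than anything from the study.

```python
def fit_linear_correction(forecast, observed):
    """Step 1: regress observations on forecasts over the training period."""
    n = len(forecast)
    mf, mo = sum(forecast) / n, sum(observed) / n
    sxx = sum((f - mf) ** 2 for f in forecast)
    sxy = sum((f - mf) * (o - mo) for f, o in zip(forecast, observed))
    slope = sxy / sxx
    intercept = mo - slope * mf
    return intercept, slope


def apply_correction(params, forecast):
    """Step 2: apply the fitted regression to new raw forecasts."""
    intercept, slope = params
    return [intercept + slope * f for f in forecast]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented training period in which the raw forecast underestimates the observations.
train_forecast = [10.0, 12.0, 15.0, 18.0, 20.0]
train_observed = [12.6, 14.8, 18.6, 22.0, 24.5]

# Invented new forecasts with the observations that later became available.
new_forecast = [11.0, 16.0]
new_observed = [13.8, 19.6]

params = fit_linear_correction(train_forecast, train_observed)
corrected = apply_correction(params, new_forecast)

mae_raw = mean_absolute_error(new_observed, new_forecast)
mae_corrected = mean_absolute_error(new_observed, corrected)
print(f"MAE raw: {mae_raw:.3f}, MAE corrected: {mae_corrected:.3f}")
```

Keeping the fit and the application as separate functions mirrors the training/operational split: the coefficients are estimated once and then reused on each new forecast.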

Generating high resolution of daily mean temperature using statistical models

  • 윤상후
    • Journal of the Korean Data and Information Science Society / Vol. 27, No. 5 / pp.1215-1224 / 2016
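  • High-resolution gridded climate information is an important factor in explaining phenomena in diverse fields such as agriculture, tourism, ecology, and epidemiology. High-resolution climate information can be obtained through dynamical models and statistical models. Because statistical models are computationally cheaper than dynamical models, they are mainly used to generate climate data with high spatiotemporal resolution. In this study, daily mean temperatures were generated by statistical models based on daily mean temperature data observed in January from 2003 to 2012. The statistical models considered, all based on the linear model, were the general linear model, the generalized additive model, the spatial linear model, and the Bayesian spatial linear model. To evaluate predictive performance, daily mean temperatures observed at 60 surface weather stations were used as model-fitting data, and daily mean temperatures at 352 automatic weather stations were used for validation. In terms of mean squared error and correlation coefficient, the Bayesian spatial model showed relatively better predictive performance than the other models. Finally, a map of daily mean temperature on a $1\,\mathrm{km} \times 1\,\mathrm{km}$ grid was produced.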

Comparison of Regression Models for Estimating Ventilation Rate of Mechanically Ventilated Swine Farm

  • 조광곤;하태환;윤상후;장유나;정민웅
    • 한국농공학회논문집 / Vol. 62, No. 1 / pp.61-70 / 2020
  • To estimate the ventilation volume of mechanically ventilated swine farms, various regression models were applied, and their errors were compared to select the regression model that best simulates actual data. Linear regression, linear spline, polynomial regression (degrees 2 and 3), logistic curve, generalized additive model (GAM), and Gompertz curve were compared. Overfitting models were excluded even when their error rate was small. The evaluation criteria were root mean square error (RMSE) and mean absolute percentage error (MAPE). The evaluation results indicated that the degree-3 polynomial exhibited the lowest error rate; however, it overestimated in a certain section. The logistic curve was the most stable and was superior to all the other models. Among the ventilation volumes estimated by all of the models, that of the logistic curve was the smallest, excluding the models with a large error rate and the overestimating model.

Sequential Adaptation Algorithm Based on Transformation Space Model for Speech Recognition

  • 김동국;장준혁;김남수
    • 음성과학 / Vol. 11, No. 4 / pp.75-88 / 2004
  • In this paper, we propose a new approach to sequential linear regression adaptation of continuous density hidden Markov models (CDHMMs) based on a transformation space model (TSM). The proposed TSM, which characterizes the a priori knowledge of the training speakers associated with maximum likelihood linear regression (MLLR) matrix parameters, is effectively described in terms of latent variable models. The TSM provides various sources of information, such as the correlation information, the prior distribution, and the prior knowledge of the regression parameters, that are very useful for rapid adaptation. The quasi-Bayes (QB) estimation algorithm is formulated to incrementally update the hyperparameters of the TSM and the regression matrices simultaneously. Experimental results showed that the proposed TSM approach outperforms the conventional quasi-Bayes linear regression (QBLR) algorithm for a small amount of adaptation data.


Performance Evaluation of Linear Regression, Back-Propagation Neural Network, and Linear Hebbian Neural Network for Fitting Linear Function

  • 이문규;허해숙
    • 한국경영과학회지 / Vol. 20, No. 3 / pp.17-29 / 1995
  • Recently, neural network models have been employed as an alternative to regression analysis for point estimation or function fitting in various fields. Thus far, however, no theoretical or empirical guides seem to exist for selecting the tool most suitable for a specific function-fitting problem. In this paper, we evaluate the performance of three major function-fitting techniques: regression analysis and two neural network models, back-propagation and linear-Hebbian-learning neural networks. The functions to be fitted are simple linear ones of a single independent variable. The factors considered are the size of noise in both the dependent and independent variables, the portion of outliers, and the size of the data. Based on the computational results of this study, some guidelines are suggested for choosing the best technique for a specific problem.

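
As a small illustration of the kind of comparison in the abstract above, the sketch below fits the same noisy linear function of one variable twice: once with closed-form least squares and once with a single linear neuron trained by batch gradient descent on squared error (what back-propagation reduces to for one linear unit). It does not reproduce the paper's experimental design, includes neither outliers nor noise in the independent variable, and omits the linear Hebbian network. The data, learning rate, and epoch count are assumptions.

```python
import random


def ols_fit(x, y):
    """Closed-form least-squares (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def train_linear_neuron(x, y, lr=0.05, epochs=2000):
    """Single linear unit y_hat = w * x + b, trained by batch gradient descent on MSE."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        errors = [w * xi + b - yi for xi, yi in zip(x, y)]
        grad_w = 2.0 * sum(e * xi for e, xi in zip(errors, x)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return b, w


# Invented data: y = 2 + 3x with Gaussian noise in the dependent variable only.
random.seed(0)
x = [i / 10 for i in range(50)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 0.5) for xi in x]

ols_intercept, ols_slope = ols_fit(x, y)
nn_intercept, nn_slope = train_linear_neuron(x, y)
print(f"OLS:    intercept={ols_intercept:.4f}, slope={ols_slope:.4f}")
print(f"neuron: intercept={nn_intercept:.4f}, slope={nn_slope:.4f}")
```

With a suitable learning rate the neuron converges to the least-squares solution, so on clean linear data any difference between the two techniques comes from training settings rather than from the model itself.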