• Title/Summary/Keyword: mean-reverting process (평균회귀과정)

Time series analysis for Korean COVID-19 confirmed cases: HAR-TP-T model approach (한국 COVID-19 확진자 수에 대한 시계열 분석: HAR-TP-T 모형 접근법)

  • Yu, SeongMin;Hwang, Eunju
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.239-254 / 2021
  • This paper studies estimation and forecasting for the time series of Korean COVID-19 confirmed cases, using a heterogeneous autoregressive (HAR) model with two-piece t (TP-T) distributed errors. We consider HAR-TP-T time series models and propose a step-by-step method to estimate the HAR coefficients and the TP-T distribution parameters: ordinary least squares estimates the HAR coefficients, while maximum likelihood estimation (MLE) estimates the TP-T error parameters. A simulation study shows that the step-by-step method performs well. For the empirical analysis of the Korean COVID-19 confirmed cases, estimates in the HAR-TP-T models of order p = 2, 3, 4 are computed for several selected sets of lags, including the optimal lags chosen by minimizing the mean squared error of the models. The estimates from the proposed method and from MLE alone are compared under several criteria. The proposed step-by-step method outperforms MLE in two respects: the mean squared error of the HAR model and the mean squared difference between the TP-T residuals and their densities. Forecasting of the Korean COVID-19 confirmed cases is then discussed using the optimally selected HAR-TP-T model, for which the mean absolute percentage error of one-step-ahead out-of-sample forecasts is 0.0953%. We conclude that the proposed HAR-TP-T time series model with optimally selected lags, together with its step-by-step estimation, forecasts the Korean COVID-19 confirmed cases accurately.
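
The ordinary least squares step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the usual HAR form, in which each regressor is the average of the previous h observations for a chosen lag h, and it leaves out the TP-T maximum likelihood step entirely. All function names are hypothetical.

```python
def har_features(y, lags):
    """Build HAR regressors: an intercept plus, for each lag h,
    the mean of the h observations preceding time t."""
    start = max(lags)
    rows, targets = [], []
    for t in range(start, len(y)):
        rows.append([1.0] + [sum(y[t - h:t]) / h for h in lags])
        targets.append(y[t])
    return rows, targets


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, targets):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def fit_har(y, lags):
    """Return [intercept, coefficient for lags[0], coefficient for lags[1], ...]."""
    rows, targets = har_features(y, lags)
    return ols(rows, targets)
```

Calling `fit_har(series, [1, 7])` on a list of daily counts would return three coefficients. The lag set here is only an example; the abstract selects lags by minimizing the mean squared error.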

An improved method of NDVI correction through pattern-response low-peak detection on time series (시계열 패턴 반응형 Low-peak 탐지 기법을 통한 NDVI 보정방법 개선)

  • Lee, Kyeong-Sang;Han, Kyung-Soo
    • Korean Journal of Remote Sensing / v.30 no.4 / pp.505-510 / 2014
  • The Normalized Difference Vegetation Index (NDVI) is a major indicator for monitoring climate change and detecting vegetation coverage. Before NDVI is retrieved, the data are preprocessed with cloud masking and atmospheric correction. However, the preprocessed NDVI still contains abnormally low values (noise) in the long-term time series, caused by rainfall, snow, and incomplete cloud masking. The existing polynomial regression method suffers from overestimation and limited noise detectability. This study therefore proposes a simple moving average approach for correcting NDVI noise, using the SPOT/VEGETATION S10 product. The results of the moving average method were compared with those of the polynomial regression and showed that the moving average method corrects NDVI noise better.
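
One way to picture a moving average correction of low NDVI values is sketched below. The abstract does not give the window size or detection rule, so `half_window`, `drop`, and the function name are assumptions chosen for illustration: a value is treated as a low peak when it falls below the mean of its neighbours by more than `drop`, and it is then replaced by that mean.

```python
def correct_low_peaks(ndvi, half_window=1, drop=0.1):
    """Replace abnormally low values with the mean of neighbouring values.

    half_window and drop are illustrative assumptions, not values
    taken from the paper.
    """
    corrected = list(ndvi)
    for i in range(len(ndvi)):
        lo = max(0, i - half_window)
        hi = min(len(ndvi), i + half_window + 1)
        # Neighbours come from the original series and exclude the point itself.
        neighbours = [ndvi[j] for j in range(lo, hi) if j != i]
        if not neighbours:
            continue
        mean = sum(neighbours) / len(neighbours)
        if mean - ndvi[i] > drop:
            corrected[i] = mean
    return corrected
```

Only downward departures are corrected, since the noise described in the abstract lowers NDVI rather than raising it.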

A Study on Predictive Models based on the Machine Learning for Evaluating the Extent of Hazardous Zone of Explosive Gases (기계학습 기반의 가스폭발위험범위 예측모델에 관한 연구)

  • Jung, Yong Jae;Lee, Chang Jun
    • Korean Chemical Engineering Research / v.58 no.2 / pp.248-256 / 2020
  • In this study, machine-learning-based predictive models for evaluating the extent of the hazardous zone of explosive gases are developed. They can provide important guidelines for installing explosion-proof apparatus. To train the models, 1,200 data sets covering 12 combustible gases and their extents of hazardous zone are generated. The extent of the hazardous zone is the output variable, and 12 variables affecting it are the input variables. Multiple linear regression, principal component regression, and an artificial neural network are employed to train the predictive models. Their mean absolute percentage errors are 44.2%, 49.3%, and 5.7%, and their root mean square errors are 1.389 m, 1.602 m, and 0.203 m, respectively. The artificial neural network therefore shows the best performance, and this model can readily be used to evaluate the extent of the hazardous zone for explosive gases.
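
The two error measures quoted in this abstract have standard definitions, sketched here for reference. The function names are illustrative, and the code makes no claim about how the authors computed their figures.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the data."""
    squares = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squares) / len(squares))
```

MAPE is relative and unit-free, while RMSE keeps the unit of the output variable (metres in this abstract), which is why the two are reported side by side.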

Hydrometeorological Drivers of Particulate Matter Using Satellite and Reanalysis Data (인공위성 및 재분석 자료를 이용한 미세먼지 농도와 수문기상인자의 상관성 분석)

  • Lee, Seul Chan;Jeong, Jae Hwan;Choi, Min Ha
    • Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.100-100 / 2019
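  • As the number of days with high atmospheric particulate matter (PM) concentrations has risen sharply in recent years, research on reducing particulate matter has become active. Particulate matter is known to originate mainly from pollutant emissions caused by human activities such as vehicles and factories, and it is generated, transported, and dissipated under the influence of hydrometeorological factors such as solar radiation, soil moisture, rainfall, and wind speed. Korea currently operates point-based stations to observe PM concentrations, and concentrations in areas without a station are provided through interpolation techniques such as linear interpolation. However, because PM concentrations vary widely under the influence of diverse hydrometeorological factors, point-based data make it difficult to estimate the concentration in a given area. To estimate the spatial distribution of particulate matter, this study used MODerate resolution Imaging Spectroradiometer (MODIS) aerosol data and Global Land Data Assimilation System (GLDAS) hydrometeorological variables to analyze the correlation between PM concentration and the hydrometeorological factors considered to affect it. The correlation between particulate matter and each factor was analyzed to identify the highly correlated hydrometeorological factors, Bayesian Model Averaging (BMA) was used to build an optimal linear regression model, and the applicability of the model was verified by comparison with point data. Overall, the linear regression results based on hydrometeorological factors reflected the trend of changes in PM concentration, but because atmospheric characteristics by season and region were not considered, rapid concentration changes within each period were difficult to detect. If the patterns of hydrometeorological factors and PM concentration are analyzed more accurately on the basis of this work, the results are expected to be useful for PM monitoring and for building an accurate forecasting system.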

A Study on Impact of Topographic Characteristics and Land use and Transport Characteristics of Residential Area On the Average Trip Distance of the Senior Citizens: for Busan Metropolitan City (행정동별 주거지의 지형적 특성과 토지이용·교통특성이 고령자의 평균통행거리에 미치는 영향 분석 - 부산광역시를 대상으로 -)

  • Jung, Seungjin;Go, Seungwook;Lee, Seungil
    • Journal of the Korean Regional Science Association / v.38 no.4 / pp.3-17 / 2022
  • This study empirically analyzes the impacts of the topographic characteristics of residential areas and of land use and transportation characteristics on the average trip distance of senior citizens in Busan Metropolitan City. Multiple regression is conducted, and the conclusions and policy implications of the analysis are as follows. First, the average and standard deviation of the residential areas are significantly related to the average trip distance of senior citizens. Urban transportation policies therefore need to take account of the topographic characteristics of residential areas. Second, the average distance to the nearest subway station and the density of bus stops have positive and negative associations, respectively, with average trip distance. Mobility improvement policies for senior citizens should consider urban spatial structure and the different ways of accessing transportation facilities by mode. Third, mobility and residential environment improvement policies for senior citizens should take into account the different sociodemographic characteristics of each location. This shows that a mobility convenience policy for senior citizens is more necessary than any other policy in administrative dongs where traffic access is relatively low and the population of single senior citizens is concentrated.

Development of Multiple Regression Models for the Prediction of Daily Ammonia Nitrogen Concentrations (일별 암모니아성 질소(NH3-N)농도 예측을 위한 다중회귀모형 개발)

  • Chug, Se-Woong
    • Journal of Korea Water Resources Association / v.36 no.6 / pp.1047-1058 / 2003
  • Seasonal occurrence of high ammonia nitrogen ($NH_3$-N) concentrations has hampered the chemical treatment processes of a water plant that takes in water at the Buyeo site on the Geum River. It is therefore often necessary to quantify the effect of Daecheong Dam outflow on the mitigation of $NH_3$-N contamination. In this study, multiple regression models were developed for forecasting daily $NH_3$-N concentrations using 8 years of water quality and dam outflow data, and verified with another 2 years of data. During model development, the coefficients of determination ($R^2$) and model efficiency ($E_{m}$) were greater than 0.95. The verification results were also satisfactory, although those statistical indices were slightly reduced to 0.84∼0.94 and 0.77∼0.93, respectively. The validated model was applied to assess the effect of different amounts of dam outflow on the reduction of $NH_3$-N concentrations in 2002. The $NH_3$-N concentrations dropped by 0.332∼0.583 mg/L on average during January∼March as outflow increased from 5 to 50 cms, and the drop was most significant in February. The results of this research show that the multiple regression approach has potential for efficient cause-and-effect analysis between dam outflow and downstream water quality.
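
The two indices reported here can be illustrated with a short sketch. The abstract does not define them, so two assumptions are made: $R^2$ is taken as the squared Pearson correlation between observed and simulated values, and $E_{m}$ is taken as the Nash-Sutcliffe form commonly used in hydrology (one minus the ratio of squared error to observed variance). Function names are hypothetical.

```python
def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)


def model_efficiency(observed, simulated):
    """Assumed Nash-Sutcliffe form: 1 - sum of squared errors / observed variance sum."""
    mean_o = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - sse / sst
```

Under these definitions a constant bias leaves the correlation-based index at 1 but lowers the efficiency, which is one reason to report both.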

A comparison of imputation methods for the consecutive missing temperature data (연속적 결측이 존재하는 기온 자료에 대한 결측복원 기법의 비교)

  • Kim, Hee-Kyung;Kang, In-Kyeong;Lee, Jae-Won;Lee, Yung-Seop
    • The Korean Journal of Applied Statistics / v.29 no.3 / pp.549-557 / 2016
  • Consecutive missing values are likely to occur in long climate records because of system errors or defective equipment, and they are difficult to impute. These problems can be overcome by imputing the missing values with reference time series, which must consist of time series similar to the one that contains the missing values. We performed a simulation to compare three imputation methods (the adjusted normal ratio method, the regression method, and the IDW method) for completing the missing values of a time series. A comparison of the three methods for daily mean temperatures at 14 climatological stations indicated that the IDW method was better than the others at south seaside stations, while the regression method was better than the others at most other stations.
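
IDW here reads as inverse distance weighting, in which a missing value is estimated from reference stations weighted by the inverse of their distance. The sketch below shows the basic formula for a single missing day. The exponent `power=2.0` and the function name are assumptions, since the abstract does not state the settings used.

```python
def idw_impute(reference_values, distances, power=2.0):
    """Estimate a missing value as a distance-weighted mean of reference stations.

    reference_values: same-day observations at the reference stations.
    distances: positive distances from each reference station to the target.
    power: weighting exponent (assumed; 2 is a common default).
    """
    weights = [1.0 / d ** power for d in distances]
    weighted_sum = sum(w * v for w, v in zip(weights, reference_values))
    return weighted_sum / sum(weights)
```

For a run of consecutive missing days, the same function would be applied day by day using the reference stations' values for each date.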

An Improved Frequency Modeling Corresponding to the Location of the Anjok of the Gayageum (가야금 안족의 위치에 따른 개선된 주파수 모델링)

  • Kwon, Sundeok;Cho, Sangjin
    • The Journal of the Acoustical Society of Korea / v.33 no.2 / pp.146-151 / 2014
  • This paper analyzes the previous Anjok model of the Gayageum and describes a method to improve its frequency modeling. In the previous work, the relation between the fundamental frequency and the Anjok's location on the body is assumed to be an exponential function, and these frequencies are integrated by a first-order leaky integrator. A parameter of the formula for calculating the fundamental frequency is then obtained by applying linear regression to the integrated frequencies. This model shows an average absolute deviation of 2.5 Hz and a maximum error of 7.75 Hz for the low fundamental frequencies. To overcome this problem, this paper proposes grouping the Anjok locations according to the rate of error increase and applying linear regression to each group. To find the optimal parameter, the root mean square error (RMSE) between the measured and calculated fundamental frequencies is used. The proposed model shows a substantial reduction in errors, by up to a factor of three.

Development of Urban Flood Warning System Using Regression Analysis (회귀분석에 의한 도시홍수 예보시스템의 개발)

  • Lee, BeumHee
    • KSCE Journal of Civil and Environmental Engineering Research / v.30 no.4B / pp.347-359 / 2010
  • A simple web-based flood forecasting system using data from stage and rainfall monitoring stations was developed to address the problem that real-time forecasting models are unreliable because they must assume the duration and intensity of future rainfall. The regression model in this research can forecast water levels up to 2 hours ahead using data from stage and rainfall monitoring stations in the Daejeon area. Real-time stage and rainfall data were taken from the websites of the Geum River Flood Control Office and the Han River Flood Control Office and processed in MS-Excel 2007. The system produced stable forecasts, with a maximum standard deviation of 5 cm and means of 1~4 cm, and most of the improved coefficients of determination were over 0.95. The results also indicate that more research on the stationarity of the watershed and on time-series approaches is necessary.

Detecting Space-Time Clusters in Linear Point Data (선형 점자료에 있어서의 시.공 복합 군집의 탐색)

  • 홍상기
    • Journal of the Korean Geographical Society / v.33 no.2 / pp.325-338 / 1998
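  • This study discusses methods for testing whether clusters, specifically space-time clusters, exist in linear point data when space and time are considered together, and statistically tests for the presence of clusters by analyzing actual data on the distribution of traffic accident locations. The statistical analysis confirmed the following. First, in both Knox's contingency table method and the generalized regression method using Mantel's reciprocal transformation, the choice of critical distance and critical time interval affects the results. Second, to overcome this arbitrariness, the tests were repeated over various critical distances and critical time intervals (or additive constants), and for some combinations of critical values evidence was found to reject the null hypothesis that time and space are independent. Third, the critical distance and critical time interval best suited to identifying space-time clusters are 7000 m in space and 14 or 21 days in time. Finally, the statistical analysis revealed duplicate accident records in the data, confirming the value of space-time cluster testing as a tool for exploratory data analysis.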
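
The idea behind Knox's method can be sketched for points along a line. The statistic counts event pairs that are close in both space and time, given a critical distance and a critical time interval. The significance step below uses random permutation of the event times, a common choice that is assumed here and is not necessarily the procedure used in the paper. Function names and default settings are hypothetical.

```python
import random


def knox_statistic(positions, times, crit_dist, crit_time):
    """Count pairs of events within crit_dist along the line AND within crit_time."""
    count = 0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            near_in_space = abs(positions[i] - positions[j]) <= crit_dist
            near_in_time = abs(times[i] - times[j]) <= crit_time
            if near_in_space and near_in_time:
                count += 1
    return count


def knox_p_value(positions, times, crit_dist, crit_time, n_perm=999, seed=0):
    """Permutation p-value under the null that space and time are independent."""
    rng = random.Random(seed)
    observed = knox_statistic(positions, times, crit_dist, crit_time)
    shuffled = list(times)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between where and when
        if knox_statistic(positions, shuffled, crit_dist, crit_time) >= observed:
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_perm + 1)
```

Repeating the call over a grid of critical values, for example distances up to 7000 m and intervals of 14 or 21 days, mirrors the sensitivity analysis the abstract describes.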
