• Title/Summary/Keyword: Regression estimator

Calibration by Median Regression

  • Jinsan Yang;Lee, Seung-Ho
    • Journal of the Korean Statistical Society / v.28 no.2 / pp.265-277 / 1999
  • Classical and inverse estimation are two well-known methods for statistical calibration problems. When there are outliers, both methods have large MSEs and cannot estimate the input value correctly. We suggest median calibration estimation based on the LD-statistics. To investigate robustness, the influence function of the median calibration estimator is calculated and compared with those of other methods. When there are outliers in the response variables, the influence function is found to be bounded. In simulation studies, the MSEs of the calibration methods are compared. The estimated inputs as well as the performance of the influence functions are calculated.
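To make the outlier sensitivity described in this abstract concrete, the following is a minimal sketch of the two standard approaches it names: classical calibration (regress the response on the input, then invert the fitted line) and inverse calibration (regress the input on the response directly). The data and all function names are hypothetical, and the paper's median calibration estimator based on LD-statistics is not reproduced here.

```python
def fit_line(u, v):
    """Least-squares fit v = intercept + slope * u; returns (intercept, slope)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    slope = s_uv / s_uu
    return mean_v - slope * mean_u, slope


def classical_calibration(x, y, y0):
    """Regress y on x, then invert the fitted line at the new response y0."""
    intercept, slope = fit_line(x, y)
    return (y0 - intercept) / slope


def inverse_calibration(x, y, y0):
    """Regress x on y directly and predict the input at y0."""
    intercept, slope = fit_line(y, x)
    return intercept + slope * y0


# Hypothetical data: exactly y = 1 + 2x, so a response of 7 corresponds to x = 3.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_clean = [3.0, 5.0, 7.0, 9.0, 11.0]
y_outlier = [3.0, 5.0, 7.0, 9.0, 31.0]  # same data with one outlying response

for label, y in (("clean", y_clean), ("outlier", y_outlier)):
    print(label, classical_calibration(x, y, 7.0), inverse_calibration(x, y, 7.0))
```

With the clean data both estimators recover the true input; a single outlying response pulls both least-squares fits away from it, which is the behaviour that motivates a robust alternative.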

Note on Working Correlation in the GEE of Longitudinal Counts Data

  • Jeong, Kwang-Mo
    • Communications for Statistical Applications and Methods / v.18 no.6 / pp.751-759 / 2011
  • The method of generalized estimating equations (GEE) is widely used in the analysis of correlated datasets that consist of repeatedly observed responses within subjects. GEE uses quasi-likelihood equations to find the parameter estimates without assuming a specific distribution for the correlated responses. In this paper we study the importance of specifying the working correlation structure appropriately when fitting GEE to correlated count data. We investigate the empirical coverages of confidence intervals for the regression coefficients under four kinds of working correlations, one of which must be specified by the user. The confidence intervals are computed based on asymptotic normality and the sandwich variance estimator.

Non-convex penalized estimation for the AR process

  • Na, Okyoung;Kwon, Sunghoon
    • Communications for Statistical Applications and Methods / v.25 no.5 / pp.453-470 / 2018
  • We study how to distinguish the parameters of the sparse autoregressive (AR) process from zero using non-convex penalized estimation. A class of non-convex penalties is considered that includes the smoothly clipped absolute deviation and minimax concave penalties as special examples. We prove that the penalized estimators achieve standard theoretical properties, such as the weak and strong oracle properties, that have been proved in the sparse linear regression framework. The results hold when the maximal order of the AR process increases to infinity and the minimal size of the true non-zero parameters decreases toward zero as the sample size increases. Further, we construct a practical method to select tuning parameters using the generalized information criterion, of which the minimizer asymptotically recovers the best theoretical non-penalized estimator of the sparse AR process. Simulation studies are given to confirm the theoretical results.
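The two penalties named in this abstract have well-known closed forms, sketched below for a single coefficient. The function names and the default shape parameters (`a=3.7` for SCAD, `gamma=3.0` for MCP) are assumptions chosen for illustration, not values taken from the paper, and the paper's estimation procedure for the AR process is not reproduced.

```python
def scad_penalty(t, lam, a=3.7):
    """Smoothly clipped absolute deviation penalty for one coefficient t.

    Assumes lam > 0 and a > 2; a = 3.7 is a commonly used default.
    """
    t = abs(t)
    if t <= lam:
        return lam * t                      # behaves like the lasso near zero
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2          # constant: large coefficients are not shrunk further


def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty for one coefficient t; assumes lam > 0 and gamma > 1."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2            # constant beyond gamma * lam


for value in (0.5, 2.0, 10.0):
    print(value, scad_penalty(value, lam=1.0), mcp_penalty(value, lam=1.0))
```

Both penalties grow like the lasso penalty for small coefficients and flatten out for large ones, which is what allows large true coefficients to be estimated with little bias.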

Control of Single Propeller Pendulum with Supervised Machine Learning Algorithm

  • Tengis, Tserendondog;Batmunkh, Amar
    • International journal of advanced smart convergence / v.7 no.3 / pp.15-22 / 2018
  • Nowadays multiple control methods are used in robot control systems. A model, predictor or error estimator is often used as a feedback controller to control a robot. Meanwhile, robots increasingly rely on algorithms capable of acquiring knowledge independently from raw data. This paper presents experimental results of real-time machine learning control that does not require explicit knowledge about the plant. The controller can be applied to a broad range of tasks with different dynamic characteristics. We tested our controller on the balancing problem of a single propeller pendulum. Experimental results show that using a supervised machine learning algorithm in a single propeller pendulum allows stable swing at a given angle.

Development of Nth Highest Hourly Traffic Volume Forecasting Models (고속국도에서의 연평균일교통량에 따른 N번째 고순위 시간교통량 추정모형 개발에 관한 연구)

  • Oh, Ju-Sam
    • International Journal of Highway Engineering / v.9 no.3 / pp.13-20 / 2007
  • To calculate the number of lanes, it is essential to obtain the 30th or 100th highest design hourly volume. The design hourly volume is obtained by multiplying AADT by the design hour factor. In this paper, we developed regression models for estimating the 30th and 100th highest hourly volumes, using an AADT of 50,000 as the criterion, based on data obtained from 34 monitoring sites on highways. Comparing the proposed and conventional models using MAPE, the proposed model for the 30th highest design hourly volume reduced the estimation error by 11.83% relative to conventional methods for AADT below 50,000 and by 22.17% for AADT above 50,000. Moreover, the proposed model for the 100th highest design hourly volume reduced the estimation error by 8.16% relative to conventional methods for AADT below 50,000 and by 15.25% for AADT above 50,000.
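The two quantities this abstract relies on can be sketched in a few lines: the design hourly volume as AADT multiplied by a design hour factor, and MAPE as the comparison metric. The function names and the numbers in the usage lines are hypothetical and are not taken from the paper's data.

```python
def design_hourly_volume(aadt, design_hour_factor):
    """Design hourly volume = AADT x design hour factor (often written K)."""
    return aadt * design_hour_factor


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical site: AADT of 40,000 vehicles/day and a design hour factor of 0.1.
print(design_hourly_volume(40000, 0.1))

# Hypothetical observed vs. modelled hourly volumes at two sites.
print(mape([100.0, 200.0], [110.0, 180.0]))
```

A lower MAPE for one model than another on the same sites is the kind of comparison the abstract reports for the proposed and conventional models.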

Overview of estimating the average treatment effect using dimension reduction methods (차원축소 방법을 이용한 평균처리효과 추정에 대한 개요)

  • Mijeong Kim
    • The Korean Journal of Applied Statistics / v.36 no.4 / pp.323-335 / 2023
  • In causal analysis of high-dimensional data, it is important to reduce the dimension of the covariates and transform them appropriately to control for confounders that affect treatment and potential outcomes. The augmented inverse probability weighting (AIPW) method is mainly used for estimation of the average treatment effect (ATE). The AIPW estimator is obtained from an estimated propensity score and outcome model. The ATE estimator can be inconsistent or have large asymptotic variance when the propensity score and outcome model are estimated by parametric methods that include all covariates, especially for high-dimensional data. For this reason, ATE estimation using an appropriate dimension reduction method and a semiparametric model for high-dimensional data is attracting attention. A semiparametric method or a sparse sufficient dimension reduction method can be used for dimension reduction when estimating the propensity score and outcome model. Recently, another method has been proposed that does not use the propensity score or outcome regression. After reducing the dimension of the covariates, ATE estimation can be performed using matching. Among the studies on ATE estimation methods for high-dimensional data, four recently proposed studies are introduced, and how to interpret the estimated ATE is discussed.
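As an illustration of how the AIPW estimator in this abstract combines a propensity score with an outcome model, here is a minimal sketch of the standard AIPW formula. It assumes the fitted propensity scores and outcome-model predictions are already available as plain lists; how they are estimated (including any dimension reduction) is outside this sketch, and all names and numbers are hypothetical.

```python
def aipw_ate(treatment, outcome, propensity, mu1, mu0):
    """Standard AIPW estimate of the average treatment effect.

    treatment  : 0/1 treatment indicators
    outcome    : observed outcomes
    propensity : estimated P(treatment = 1 | covariates), strictly in (0, 1)
    mu1, mu0   : outcome-model predictions under treatment and under control
    """
    total = 0.0
    for t, y, e, m1, m0 in zip(treatment, outcome, propensity, mu1, mu0):
        # Outcome-model contrast plus inverse-probability-weighted residual corrections.
        total += (m1 - m0) + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    return total / len(treatment)


# Hypothetical toy inputs in which the outcome model fits the observed outcomes exactly.
t = [1, 0, 1, 0]
y = [3.0, 1.0, 5.0, 2.0]
e = [0.5, 0.5, 0.5, 0.5]
mu1 = [3.0, 2.0, 5.0, 3.0]
mu0 = [2.0, 1.0, 4.0, 2.0]
print(aipw_ate(t, y, e, mu1, mu0))
```

When the outcome-model predictions are all zero the expression reduces to plain inverse probability weighting, and when the outcome model fits exactly the weighted residual terms vanish.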

Generalization of modified systematic sampling and regression estimation for population with a linear trend (선형추세를 갖는 모집단에 대한 변형계통표집의 일반화와 회귀추정법)

  • Kim, Hyuk-Joo;Kim, Jeong-Hyeon
    • Journal of the Korean Data and Information Science Society / v.20 no.6 / pp.1103-1118 / 2009
  • When we wish to estimate the mean or total of a finite population, the numbering of the population units is of importance. In this paper, we propose two methods for estimating the mean or total of a population having a linear trend, for the case when the reciprocal of the sampling fraction is an even number and the sample size is an odd number. The first method involves drawing a sample by a generalization of Singh et al.'s (1968) modified systematic sampling, and using interpolation in determining the estimator. The second method involves selecting a sample by modified systematic sampling, and estimating the population parameters by the regression estimation method. Under the criterion of the expected mean square error based on Cochran's (1946) infinite superpopulation model, the proposed methods are compared with existing methods. We also compare the two proposed methods with each other.
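The idea of combining a systematic sample with regression estimation can be sketched as follows. This uses ordinary linear systematic sampling and the textbook regression estimator with the unit label as the auxiliary variable; it is not the modified systematic sampling scheme or the interpolation estimator proposed in the paper, and the population and function names are hypothetical.

```python
def systematic_sample(population, k, start):
    """Take every k-th unit (1-based labels) beginning with unit `start`."""
    labels = list(range(start, len(population) + 1, k))
    return labels, [population[i - 1] for i in labels]


def regression_estimate_of_mean(labels, values, population_size):
    """Regression estimator of the population mean, using unit labels as the auxiliary variable."""
    n = len(labels)
    mean_x = sum(labels) / n
    mean_y = sum(values) / n
    s_xx = sum((x - mean_x) ** 2 for x in labels)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(labels, values))
    slope = s_xy / s_xx
    population_mean_x = (population_size + 1) / 2  # mean of the labels 1..N
    return mean_y + slope * (population_mean_x - mean_x)


# Hypothetical population with an exact linear trend; N = 20, k = 4 (even), n = 5 (odd).
population = [2.0 + 3.0 * i for i in range(1, 21)]
labels, values = systematic_sample(population, k=4, start=1)

print(sum(values) / len(values))                                     # plain sample mean
print(regression_estimate_of_mean(labels, values, len(population)))  # trend-adjusted estimate
print(sum(population) / len(population))                             # true population mean
```

For a population with an exact linear trend, the plain sample mean depends on the random start, while the regression adjustment removes that dependence.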

Applications of Gaussian Process Regression to Groundwater Quality Data (가우시안 프로세스 회귀분석을 이용한 지하수 수질자료의 해석)

  • Koo, Min-Ho;Park, Eungyu;Jeong, Jina;Lee, Heonmin;Kim, Hyo Geon;Kwon, Mijin;Kim, Yongsung;Nam, Sungwoo;Ko, Jun Young;Choi, Jung Hoon;Kim, Deog-Geun;Jo, Si-Beom
    • Journal of Soil and Groundwater Environment / v.21 no.6 / pp.67-79 / 2016
  • Gaussian process regression (GPR) is proposed as a tool for long-term groundwater quality prediction. The major advantage of GPR is that both the prediction and the prediction-related uncertainty are provided simultaneously. To demonstrate the applicability of the proposed tool, GPR and a conventional non-parametric trend analysis tool are comparatively applied to synthetic examples. The application shows that GPR performs better than the conventional method, especially when the groundwater quality data show a typical non-linear trend. The GPR model is further applied to long-term groundwater quality prediction based on data from two domestically operated groundwater monitoring stations. These applications show that the model can make reasonable predictions for the majority of the linear trend cases, with a few exceptions for severely non-Gaussian data. Furthermore, for data showing a non-linear trend, GPR with a second-order mean function is successfully applied.

Estimation of Duck House Litter Evaporation Rate Using Machine Learning (기계학습을 활용한 오리사 바닥재 수분 발생량 분석)

  • Kim, Dain;Lee, In-bok;Yeo, Uk-hyeon;Lee, Sang-yeon;Park, Sejun;Decano, Cristina;Kim, Jun-gyu;Choi, Young-bae;Cho, Jeong-hwa;Jeong, Hyo-hyeog;Kang, Solmoe
    • Journal of The Korean Society of Agricultural Engineers / v.63 no.6 / pp.77-88 / 2021
  • The duck industry has grown rapidly in recent years. Nevertheless, research on improving the duck house environment is still insufficient. Moisture generation from duck house litter is an important factor because it may cause severe illness and low productivity. However, measuring it is difficult because the measurement can be disturbed by animal excrement and other factors. Therefore, it has to be calculated from environmental data around the duck house litter. To bypass these procedures, we built several machine learning regression models that forecast moisture generation of the litter from measured environmental data (air temperature, relative humidity, wind velocity and water content). Five models (Multi Linear Regression, k-Nearest Neighbors, Support Vector Regression, Random Forest and Deep Neural Network) were selected for regression. Using R-Square, RMSE and MAE as evaluation metrics, the most accurate configuration was determined according to the variables for each machine learning model. In addition, to address the small amount of data acquired through lab experiments, bootstrapping, a technique used in statistics, was applied. As a result, the most accurate model selected was Random Forest, with n-estimator set to 200 and the original data bootstrapped nine times.
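The evaluation metrics and the bootstrapping step named in this abstract can be sketched with the standard library alone. The abstract does not specify how the data were bootstrapped "nine times", so the reading used here (nine rounds of resampling with replacement, pooled into one enlarged dataset) is an assumption, as are the function names and the toy numbers. The regression models themselves are not reproduced.

```python
import math
import random


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def bootstrap_resample(rows, n_rounds, seed=0):
    """Assumed reading: pool n_rounds resamples, each drawn with replacement from rows."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_rounds):
        pooled.extend(rng.choices(rows, k=len(rows)))
    return pooled


observed = [1.0, 2.0, 3.0]    # hypothetical measured values
forecast = [2.0, 2.0, 4.0]    # hypothetical model predictions
print(rmse(observed, forecast), mae(observed, forecast), r_squared(observed, forecast))
print(len(bootstrap_resample([10, 20, 30, 40], n_rounds=9)))
```

Resampling with replacement does not add new information, so metrics computed on bootstrapped data are best read alongside results on held-out original measurements.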