• Title/Summary/Keyword: conditional quantile (조건부 분위수)

Divide and conquer kernel quantile regression for massive dataset (대용량 자료의 분석을 위한 분할정복 커널 분위수 회귀모형)

  • Bang, Sungwan;Kim, Jaeoh
    • The Korean Journal of Applied Statistics / v.33 no.5 / pp.569-578 / 2020
  • By estimating conditional quantile functions of the response, quantile regression (QR) can provide comprehensive information about the relationship between the response and the predictors. In addition, kernel quantile regression (KQR) estimates a nonlinear conditional quantile function in a reproducing kernel Hilbert space generated by a positive definite kernel function. However, it is infeasible to apply KQR to massive data because of the limits of a computer's primary memory. We propose a divide and conquer based KQR (DC-KQR) method to overcome this limitation. The proposed DC-KQR divides the entire data set into a few subsets, applies KQR to each subset, and derives a final estimator by aggregating the results from all subsets. Simulation studies are presented to demonstrate the satisfactory performance of the proposed method.
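
The divide, fit, and aggregate steps described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' DC-KQR: the per-subset KQR fit is replaced by a hypothetical intercept-only quantile estimator (`fit_constant_quantile`), and the aggregation is assumed to be a simple average of the subset predictions. The names `dc_fit` and `sample_quantile` are also illustrative.

```python
import math
import random


def sample_quantile(values, tau):
    """Empirical tau-quantile: the ceil(tau * n)-th order statistic."""
    ordered = sorted(values)
    k = math.ceil(tau * len(ordered) - 1e-9) - 1  # small tolerance against float error
    return ordered[max(0, min(len(ordered) - 1, k))]


def fit_constant_quantile(subset, tau):
    """Stand-in for a per-subset KQR fit (assumption): ignores x entirely."""
    q = sample_quantile([y for _, y in subset], tau)
    return lambda x: q


def dc_fit(data, n_subsets, tau, fit=fit_constant_quantile, seed=0):
    """Divide the data, fit each subset separately, aggregate by averaging."""
    rows = list(data)
    random.Random(seed).shuffle(rows)                         # random partition
    subsets = [rows[i::n_subsets] for i in range(n_subsets)]  # divide
    models = [fit(subset, tau) for subset in subsets]         # conquer

    def predict(x):
        # Aggregate: average of the subset-level predictions (assumed rule).
        return sum(model(x) for model in models) / len(models)

    return predict


# Toy data: y takes the values 1..1000, so the full-data median is 500.
data = [(float(i), float(i)) for i in range(1, 1001)]
predict = dc_fit(data, n_subsets=5, tau=0.5)
dc_median = predict(0.0)
full_median = sample_quantile([y for _, y in data], 0.5)
```

Only one subset has to be held by the fitting routine at a time, which is the memory argument made in the abstract; any other per-subset estimator can be passed through the `fit` argument.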

Nonparametric estimation of conditional quantile with censored data (조건부 분위수의 중도절단을 고려한 비모수적 추정)

  • Kim, Eun-Young;Choi, Hyemi
    • Journal of the Korean Data and Information Science Society / v.24 no.2 / pp.211-222 / 2013
  • We consider the problem of nonparametrically estimating the conditional quantile function from censored data and propose new estimators. They are based, respectively, on the local logistic regression technique of Lee et al. (2006) and the "double-kernel" technique of Yu and Jones (1998), each modified to handle random censoring. We compare them with two existing estimators based on local linear fits using the check function approach. The comparison is carried out through a simulation study.
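
For readers unfamiliar with the check function approach mentioned in this abstract, its standard uncensored form is given below. The symbols ($\rho_\tau$, kernel $K$, bandwidth $h$) are notation chosen here, not taken from the paper, and the adjustments for random censoring are not shown.

```latex
\rho_\tau(u) = u\,\bigl(\tau - \mathbb{1}\{u < 0\}\bigr), \qquad 0 < \tau < 1,

Q_\tau(Y \mid X = x) = \arg\min_{q}\; \mathbb{E}\bigl[\rho_\tau(Y - q) \mid X = x\bigr],

(\hat a, \hat b) = \arg\min_{a,\,b}\; \sum_{i=1}^{n}
  \rho_\tau\bigl(Y_i - a - b\,(X_i - x)\bigr)\, K\!\left(\frac{X_i - x}{h}\right),
\qquad \hat Q_\tau(x) = \hat a .
```

The last line is the local linear fit: a kernel-weighted check loss is minimised around the point $x$, and the fitted intercept is the quantile estimate at $x$.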

A comparison study of multiple linear quantile regression using non-crossing constraints (비교차 제약식을 이용한 다중 선형 분위수 회귀모형에 관한 비교연구)

  • Bang, Sungwan;Shin, Seung Jun
    • The Korean Journal of Applied Statistics / v.29 no.5 / pp.773-786 / 2016
  • Multiple quantile regression, which simultaneously estimates several conditional quantiles of the response given covariates, can provide comprehensive information about the relationship between the response and the covariates. Some quantile estimates can cross if the conditional quantiles are estimated separately; however, this violates the definition of a quantile. To tackle this issue, multiple quantile regression methods with non-crossing constraints have been developed. In this paper, we carry out a comparison study of several popular methods for non-crossing multiple linear quantile regression to provide practical guidance on their application.
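
The crossing problem described in this abstract can be made concrete with a small check. The sketch below does not implement any of the constrained methods compared in the paper; it only detects where separately fitted linear quantile functions cross. The coefficient values and the names `linear_quantile` and `find_crossings` are made up for illustration.

```python
def linear_quantile(coef, x):
    """Evaluate intercept + slopes . x for one fitted quantile level."""
    intercept, *slopes = coef
    return intercept + sum(b * xi for b, xi in zip(slopes, x))


def find_crossings(coefs_by_tau, points):
    """Return (x, lower_tau, upper_tau) wherever the lower quantile exceeds the upper."""
    taus = sorted(coefs_by_tau)
    crossings = []
    for x in points:
        for lo, hi in zip(taus, taus[1:]):
            if linear_quantile(coefs_by_tau[lo], x) > linear_quantile(coefs_by_tau[hi], x):
                crossings.append((x, lo, hi))
    return crossings


# Hypothetical separate fits: q_0.25(x) = x and q_0.75(x) = 1 + 0.5 x.
# The two lines intersect at x = 2, so the estimates cross for x > 2.
coefs = {0.25: (0.0, 1.0), 0.75: (1.0, 0.5)}
points = [(0.0,), (1.0,), (3.0,)]
crossings = find_crossings(coefs, points)
```

Here `crossings` contains only the point `(3.0,)`, where the estimated 0.25 quantile (3.0) exceeds the estimated 0.75 quantile (2.5). Non-crossing constraints are designed to rule out exactly this situation over the region of interest.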

Stepwise Estimation for Multiple Non-Crossing Quantile Regression using Kernel Constraints (커널 제약식을 이용한 다중 비교차 분위수 함수의 순차적 추정법)

  • Bang, Sungwan;Jhun, Myoungshic;Cho, HyungJun
    • The Korean Journal of Applied Statistics / v.26 no.6 / pp.915-922 / 2013
  • Quantile regression can estimate multiple conditional quantile functions of the response and, as a result, provides comprehensive information about the relationship between the response and the predictors. However, when several conditional quantile functions are estimated separately, two or more estimated quantile functions may cross or overlap and consequently violate the basic properties of quantiles. In this paper, we propose a new stepwise method to estimate multiple non-crossing quantile functions using constraints on the kernel coefficients. A simulation study is presented to demonstrate the satisfactory performance of the proposed method.

Penalized quantile regression tree (벌점화 분위수 회귀나무모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1361-1371 / 2016
  • Quantile regression provides a variety of useful statistical information for examining how covariates influence the conditional quantile functions of a response variable. However, traditional quantile regression (which assumes a linear model) is not appropriate when the relationship between the response and the covariates is nonlinear. It is also necessary to conduct variable selection for high-dimensional data or strongly correlated covariates. In this paper, we propose a penalized quantile regression tree model. The split rule of the proposed method is based on residual analysis, which yields negligible bias in selecting the split variable at a reasonable computational cost. A simulation study and a real data analysis are presented to demonstrate the satisfactory performance and usefulness of the proposed method.

Variable selection with quantile regression tree (분위수 회귀나무를 이용한 변수선택 방법 연구)

  • Chang, Youngjae
    • The Korean Journal of Applied Statistics / v.29 no.6 / pp.1095-1106 / 2016
  • The quantile regression method proposed by Koenker and Bassett (1978) focuses on the conditional quantiles of the response given the independent variables, and analyzes the relationship between the response variable and the independent variables at a given quantile. Because the quantile regression coefficients are estimated by linear programming, model fitting can become difficult for large data sets. Therefore, dimension reduction (or variable selection) could be a good solution for quantile regression on large data sets. In this paper, regression tree methods are applied to variable selection for quantile regression. Real data on Korea Baseball Organization (KBO) players are analyzed following the variable selection approach based on the regression tree. The analysis shows that a few important variables are selected, and these are also meaningful for the given quantiles of the players' salary data.
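
The linear programming formulation that this abstract refers to is usually written as below. The notation ($y$, $X$, $\beta$, slack vectors $u$ and $v$) is chosen here for illustration and is not quoted from the paper.

```latex
\min_{\beta,\,u,\,v}\;\; \tau\,\mathbf{1}^{\top} u + (1-\tau)\,\mathbf{1}^{\top} v
\qquad \text{subject to} \qquad
y - X\beta = u - v, \quad u \ge 0, \quad v \ge 0 .
```

With $n$ observations and $p$ coefficients, the program has $p + 2n$ variables and $n$ equality constraints, so its size grows with the number of observations as well as with the number of covariates.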

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data with a nonlinear relationship between the explanatory variables and the response variables. Furthermore, as the complexity of the data increases, it becomes necessary to analyse multiple response variables simultaneously and to interpret them in a more sophisticated way. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split variable selection algorithm is suggested for the multivariate regression tree model. This algorithm can select the split variable more accurately than the previous method, without significant selection bias. We investigate the performance of the proposed method with both simulation and real data studies.

A Study on the Determinants of Land Price in a New Town (신도시 택지개발사업지역에서 토지가격 결정요인에 관한 연구)

  • Jeong, Tae Yun
    • Korea Real Estate Review / v.28 no.1 / pp.79-90 / 2018
  • The purpose of this study was to identify the pricing factors of residential land in new towns by estimating a pricing model for residential land. For this purpose, hedonic equations for each quantile of the conditional distribution of land prices were estimated using quantile regression methods and the sale price data of Jangyu New Town in Gimhae. The quantile regression method adopted here models the relation between a set of explanatory variables and each quantile of land price. As a result, differences in the effects of the land characteristics across price quantiles were confirmed. The number of years elapsed since the completion of land construction enters the model as a quadratic term because its impact may give rise to a non-linear price pattern. Age appears to decrease the price for a certain number of years after construction and to increase it afterward. In the quantile regression estimates, land age has a statistically significant impact on land price at conventional significance levels, and the turning point appears to come earlier for the lower quantiles than for the higher quantiles. The positive effect of using land for combined commercial and residential purposes was found to be the largest. Land adjoining more than two roads is preferred, because the amount of sunshine improves in this case. Square-shaped land appears to be preferred to irregularly shaped land, because square land is favorable for development. The variables for land used for commercial and residential purposes have a greater impact on low-priced residential land. This is because such land tends to be used mostly for rental housing and has different characteristics from residential houses. Residential land prices have different characteristics depending on the price level, and this needs to be considered in evaluating collateral value and in drafting real estate policy.
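
The turning point of the quadratic age effect mentioned in this abstract follows from a short calculation. The specification below is an assumed generic form, not the paper's exact model: $P$ denotes the land price (or its logarithm), $A$ the land age in years, $Z$ the remaining characteristics, and the coefficients may differ by quantile $\tau$.

```latex
Q_\tau(P \mid A, Z) = \beta_0(\tau) + \beta_1(\tau)\,A + \beta_2(\tau)\,A^{2} + Z^{\top}\gamma(\tau),

\frac{\partial Q_\tau}{\partial A} = \beta_1(\tau) + 2\,\beta_2(\tau)\,A = 0
\;\;\Longrightarrow\;\;
A^{*}(\tau) = -\,\frac{\beta_1(\tau)}{2\,\beta_2(\tau)} .
```

A price that first falls and then rises with age corresponds to $\beta_1(\tau) < 0$ and $\beta_2(\tau) > 0$. Because the coefficients are estimated separately at each quantile, $A^{*}(\tau)$ can differ across quantiles, which is how the turning point can arrive at a different age for low-priced and high-priced land.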

Multivariate conditional tail expectations (다변량 조건부 꼬리 기대값)

  • Hong, C.S.;Kim, T.W.
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1201-1212 / 2016
  • Value at Risk (VaR) is a popular market risk management measure among financial companies; however, it cannot explain the amount of loss incurred when a specific investment fails. Conditional Tail Expectation (CTE) is an alternative risk measure defined as the conditional expectation of losses exceeding VaR. In real financial markets, multivariate loss rates are transformed into a univariate distribution in order to obtain and estimate the CTE of a portfolio. We propose multivariate CTEs using multivariate quantile vectors. A relationship among multivariate CTEs is also derived by extending univariate CTEs. Multivariate CTEs are obtained from bivariate and trivariate normal distributions, and the relationships among them are explored. We then discuss the extension to higher dimensions and illustrate some examples. Multivariate CTEs (using the variance-covariance matrix and the multivariate quantile vector) are found to have smaller values than CTEs transformed to the univariate case. Therefore, it can be concluded that the proposed multivariate CTEs provide smaller estimates that represent less risk than the others, and that a more aggressive investment using this CTE is also possible when a diversified investment strategy includes many companies in a portfolio.
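
The univariate definitions underlying this abstract, VaR as a quantile of the loss distribution and CTE as the mean loss beyond it, can be illustrated with empirical estimates. This is a minimal sketch of the univariate case only; it does not implement the multivariate CTEs proposed in the paper. The function names and the toy loss values are made up for illustration.

```python
import math


def value_at_risk(losses, alpha):
    """Empirical VaR: the ceil(alpha * n)-th smallest loss."""
    ordered = sorted(losses)
    k = math.ceil(alpha * len(ordered) - 1e-9) - 1  # small tolerance against float error
    return ordered[max(0, min(len(ordered) - 1, k))]


def conditional_tail_expectation(losses, alpha):
    """Empirical CTE: mean of the losses strictly exceeding VaR at level alpha."""
    var = value_at_risk(losses, alpha)
    tail = [x for x in losses if x > var]
    if not tail:  # no loss exceeds VaR; fall back to VaR itself
        return var
    return sum(tail) / len(tail)


# Toy losses 1, 2, ..., 10 at level alpha = 0.8.
losses = [float(i) for i in range(1, 11)]
var_80 = value_at_risk(losses, 0.8)                 # 8.0
cte_80 = conditional_tail_expectation(losses, 0.8)  # mean of 9 and 10 = 9.5
```

The example shows why CTE is at least as large as VaR at the same level: it averages only the losses beyond the VaR threshold and therefore reflects how severe a loss is once the threshold is breached.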

Intergenerational economic mobility in Korea using a quantile regression analysis (한국의 세대 간 경제적 이동성 - 분위수회귀분석을 중심으로 -)

  • Richey, Jeremiah;Jeong, Kiho
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.715-725 / 2014
  • This study uses a quantile regression analysis to investigate intergenerational economic mobility in Korea. The analysis is based on data from the 1st through 11th waves of the Korean Labor and Income Panel Study (KLIPS), conducted from 1998 to 2008. The household nature of the data allows us to link parents' incomes to children's incomes at different points in time. Using quantile regression instead of mean regression reveals that the effect of fathers' earnings differs across the conditional distribution of sons' earnings, being larger at the upper quantiles than at the lower quantiles. After controlling for the effect of sons' college education by including a dummy variable for the degree, however, the pattern among the quantile effects of fathers' earnings is no longer clear. Instead, a new pattern emerges in which education has a much larger effect at the upper quantiles than at the lower ones. Using nonparametric estimates of conditional density curves based on the quantile regression results, we derive some interesting features in graphical form that are not obvious in the numerical analysis.