• Title/Summary/Keyword: Quantile Methods (분위수 방법)


Stepwise Estimation for Multiple Non-Crossing Quantile Regression using Kernel Constraints (커널 제약식을 이용한 다중 비교차 분위수 함수의 순차적 추정법)

  • Bang, Sungwan; Jhun, Myoungshic; Cho, HyungJun
    • The Korean Journal of Applied Statistics, v.26 no.6, pp.915-922, 2013
  • Quantile regression can estimate multiple conditional quantile functions of the response and, as a result, provides comprehensive information about the relationship between the response and the predictors. However, when several conditional quantile functions are estimated separately, two or more of the estimated quantile functions may cross or overlap, violating the basic properties of quantiles. In this paper, we propose a new stepwise method to estimate multiple non-crossing quantile functions using constraints on the kernel coefficients. A simulation study is presented to demonstrate the satisfactory performance of the proposed method.
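Quantile regression rests on the check (pinball) loss, and fitting each quantile level separately, as the abstract notes, is what allows estimated curves to cross. A minimal Python sketch (not the authors' kernel-constrained method) showing that minimizing the empirical check loss over a constant recovers the sample quantile:

```python
import numpy as np

def check_loss(u, tau):
    # Pinball/check loss: rho_tau(u) = u * (tau - 1{u < 0})
    return u * (tau - (u < 0))

rng = np.random.default_rng(0)
y = rng.normal(size=1000)

def empirical_quantile(y, tau):
    # Minimizing the average check loss over a constant c yields the
    # sample tau-quantile; with covariates, doing this separately per
    # tau is what can produce crossing quantile functions.
    grid = np.linspace(-4, 4, 801)
    losses = [check_loss(y - c, tau).mean() for c in grid]
    return grid[int(np.argmin(losses))]

q10 = empirical_quantile(y, 0.1)
q90 = empirical_quantile(y, 0.9)
```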

Quantile causality from dollar exchange rate to international oil price (원유가격에 대한 환율의 인과관계 : 비모수 분위수검정 접근)

  • Jeong, Kiho
    • Journal of the Korean Data and Information Science Society, v.28 no.2, pp.361-369, 2017
  • This paper analyzes the causal relationship between the dollar exchange rate and the international oil price. Although a large literature on the relationship has accumulated, the results are not uniform but diversified. Based on the idea that such diversified results may reflect different causality under different economic conditions, we consider an approach that tests the causal relationship at each quantile. This approach differs from the mean-causality analysis widely employed in the existing literature. Monthly data from May 1987 to 2013 are used, with the Brent oil price and the Major Currencies Dollar Index (MCDI) as the variables. The test method is the nonparametric test for causality in quantiles suggested by Jeong et al. (2012). The results show that although the dollar exchange rate causes the oil price in mean, the causal relationship does not exist at most quantiles.
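The quantile-by-quantile perspective can be illustrated with linear quantile regression, which solves a linear program at each level τ so the slope on a lagged predictor can be inspected across quantiles. A sketch on simulated data (the paper itself uses a nonparametric test, not this parametric fit):

```python
import numpy as np
from scipy.optimize import linprog

def quantile_reg(X, y, tau):
    """Linear quantile regression via its LP formulation:
    min tau*sum(u+) + (1-tau)*sum(u-)  s.t.  y = X b + u+ - u-."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), np.full(n, tau), np.full(n, 1 - tau)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

# Toy "y_t on lagged x" regression examined at several quantile levels.
rng = np.random.default_rng(2)
x = rng.normal(size=300)
y = 1.0 + 0.5 * x + rng.normal(size=300)
X = np.column_stack([np.ones_like(x), x])
slopes = {tau: quantile_reg(X, y, tau)[1] for tau in (0.1, 0.5, 0.9)}
```

Here the true slope is 0.5 at every quantile; in a quantile-causality analysis one looks for levels where the dependence disappears.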

Model selection via Bayesian information criterion for divide-and-conquer penalized quantile regression (베이즈 정보 기준을 활용한 분할-정복 벌점화 분위수 회귀)

  • Kang, Jongkyeong; Han, Seokwon; Bang, Sungwan
    • The Korean Journal of Applied Statistics, v.35 no.2, pp.217-227, 2022
  • Quantile regression is widely used in many fields because it provides an efficient tool for examining complex information latent in variables. However, modern large-scale, high-dimensional data make it very difficult to estimate the quantile regression model because of limitations in computation time and storage space. Divide-and-conquer is a technique that divides the entire data set into several sub-datasets that are easy to handle and then reconstructs the full-data estimate using only the summary statistics of each sub-dataset. In this paper, we study a variable selection method based on the Bayesian information criterion, applying the divide-and-conquer technique to penalized quantile regression. When the number of sub-datasets is chosen properly, the proposed method is efficient in terms of computational speed and gives variable selection results consistent with those of the classical penalized quantile regression estimated on the entire data. The advantages of the proposed method are confirmed through simulations and real data analysis.
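The divide-and-conquer idea — per-block summaries combined into a full-data answer — can be sketched for a plain sample quantile (a toy stand-in for the paper's penalized quantile regression; note that simply averaging block quantiles is itself only an approximation):

```python
import numpy as np

def dc_quantile(y, tau, n_blocks):
    # Divide: split the data into blocks. Conquer: compute the tau-quantile
    # within each block. Combine: aggregate the per-block summaries
    # (a simple average here).
    blocks = np.array_split(y, n_blocks)
    return float(np.mean([np.quantile(b, tau) for b in blocks]))

rng = np.random.default_rng(1)
y = rng.normal(loc=3.0, size=100_000)
full = float(np.quantile(y, 0.25))      # full-data answer
approx = dc_quantile(y, 0.25, n_blocks=20)
```

Only the 20 block summaries are needed to form `approx`, which is the storage/computation saving the abstract describes.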

On principal component analysis for interval-valued data (구간형 자료의 주성분 분석에 관한 연구)

  • Choi, Soojin; Kang, Kee-Hoon
    • The Korean Journal of Applied Statistics, v.33 no.1, pp.61-74, 2020
  • Interval-valued data, one type of symbolic data, are observed in the form of intervals rather than single values, and each interval-valued observation carries an internal variation. Principal component analysis reduces the dimension of data by maximizing the variance of the data, so principal component analysis of interval-valued data should account for the variance between observations as well as the variation within the observed intervals. In this paper, three principal component analysis methods for interval-valued data are summarized. In addition, a new method using a truncated normal distribution is proposed in place of the uniform distribution in the conventional quantile method, because we believe there is more information near the center point of the interval. The methods are compared using simulations and a relevant data set from the OECD. For the quantile method, we draw a scatter plot of the principal components and then identify the position and distribution of the quantiles using the arrow-line representation method.
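The contrast between the two within-interval distributions can be sketched directly: represent an interval by quantile points of a uniform distribution (the conventional choice) versus a truncated normal centred at the midpoint. The standard deviation `width/4` below is an assumption for illustration, not a value from the paper:

```python
import numpy as np
from scipy.stats import truncnorm

def interval_quantile_points(lo, hi, probs, dist="uniform"):
    """Represent an interval [lo, hi] by within-interval quantile points.
    'uniform' is the conventional quantile method; 'truncnorm' uses a
    truncated normal centred at the midpoint (sd = width/4, an assumption),
    which pulls the points toward the centre of the interval."""
    probs = np.asarray(probs, dtype=float)
    if dist == "uniform":
        return lo + probs * (hi - lo)
    mid, sd = (lo + hi) / 2.0, (hi - lo) / 4.0
    a, b = (lo - mid) / sd, (hi - mid) / sd   # standardized bounds
    return truncnorm.ppf(probs, a, b, loc=mid, scale=sd)

u_pts = interval_quantile_points(0.0, 1.0, [0.25, 0.5, 0.75], "uniform")
t_pts = interval_quantile_points(0.0, 1.0, [0.25, 0.5, 0.75], "truncnorm")
```

The truncated-normal quartile points lie strictly closer to the midpoint than the uniform ones, reflecting the extra weight near the centre.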

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh; Cho, HyungJun; Bang, Sungwan
    • Journal of the Korean Data and Information Science Society, v.28 no.3, pp.533-545, 2017
  • Quantile regression models provide a variety of useful statistical information by estimating conditional quantile functions of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data with a nonlinear relationship between the explanatory variables and the response. Furthermore, as the complexity of the data increases, it becomes necessary to analyse multiple response variables simultaneously with more sophisticated interpretations. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split-variable selection algorithm is suggested for the multivariate regression tree model; it selects the split variable more accurately than the previous method, without significant selection bias. We investigate the performance of the proposed method with both simulation and real data studies.
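The core of any quantile regression tree is the split search: choose the cut-point whose two children minimize the total check loss around their own quantiles. A univariate, single-split sketch (a toy stump, not the paper's multivariate algorithm):

```python
import numpy as np

def check_loss_sum(u, tau):
    # Summed check loss for residuals u at quantile level tau.
    return float(np.sum(u * (tau - (u < 0))))

def best_split(x, y, tau=0.5):
    """One split of a quantile regression stump: pick the cut-point on x
    minimizing the total check loss around each child's tau-quantile."""
    best_loss, best_cut = np.inf, None
    for c in np.unique(x)[:-1]:
        left, right = y[x <= c], y[x > c]
        loss = (check_loss_sum(left - np.quantile(left, tau), tau)
                + check_loss_sum(right - np.quantile(right, tau), tau))
        if loss < best_loss:
            best_loss, best_cut = loss, c
    return best_cut

# Step function with a jump at x = 0.5 plus noise: the search should
# recover a cut-point near 0.5.
rng = np.random.default_rng(3)
x = rng.uniform(size=400)
y = 5.0 * (x > 0.5) + rng.normal(scale=0.5, size=400)
cut = best_split(x, y)
```

A full tree repeats this search recursively over all candidate split variables, which is where the selection-bias issue the abstract mentions arises.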

A new method for calculating quantiles of grouped data based on the frequency polygon (집단화된 통계자료의 도수다각형에 근거한 새로운 분위수 계산법)

  • Kim, Hyuk Joo
    • Journal of the Korean Data and Information Science Society, v.28 no.2, pp.383-393, 2017
  • When we deal with grouped statistical data, it is desirable to use a calculation method that gives values as close to the true value of a statistic as possible. In this paper, we suggest a new method to calculate the quantiles of grouped data. The main idea of the suggested method is to obtain the data values by partitioning the pentagons that correspond to the class intervals in the frequency polygon drawn from the histogram into parts of equal area. We compared this method with existing methods through simulations using datasets from introductory statistics textbooks. In the simulation study, we generated as many data values as are given in each class interval using the inverse transform method, based on the distribution whose shape is given by the frequency polygon. Using the sum of squared differences from the quantiles of the simulated data as a criterion, the suggested method performed better than existing methods for almost all quartiles and deciles.
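For reference, the classical histogram-based grouped-data quantile (one of the existing methods a frequency-polygon approach would be compared against) interpolates linearly within the class interval containing the target cumulative fraction:

```python
import numpy as np

def grouped_quantile(edges, freqs, p):
    """Classical grouped-data p-quantile: linear interpolation inside the
    class interval containing the p*n-th cumulative frequency. (The paper's
    frequency-polygon method partitions areas instead.)"""
    freqs = np.asarray(freqs, dtype=float)
    cum = np.cumsum(freqs)
    target = p * cum[-1]                       # position in cumulative count
    k = int(np.searchsorted(cum, target))      # class interval index
    prev = cum[k - 1] if k > 0 else 0.0
    lo, hi = edges[k], edges[k + 1]
    return lo + (target - prev) / freqs[k] * (hi - lo)

# Classes [0,10), [10,20), [20,30) with frequencies 2, 5, 3:
median = grouped_quantile([0, 10, 20, 30], [2, 5, 3], 0.5)
```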

Comparison of estimation methods for expectile regression (평률 회귀분석을 위한 추정 방법의 비교)

  • Kim, Jong Min; Kang, Kee-Hoon
    • The Korean Journal of Applied Statistics, v.31 no.3, pp.343-352, 2018
  • Quantile regression and expectile regression can be used to estimate trends in the extreme regions, as well as the average trend, of a response variable given explanatory variables. In this paper, we compare the performance of parametric and nonparametric methods for expectile regression. We introduce each estimation method and analyze them through various simulations and an application to real data. The nonparametric model showed better results when the model is complex and the relationship between the variables is difficult to deduce. Given the difficulty of assuming a parametric model in expectile regression, the use of nonparametric methods can be recommended.
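Expectiles are the asymmetric-least-squares analogue of quantiles: the τ-expectile minimizes a squared loss weighted by τ above the fit and 1-τ below it. A minimal sketch for a constant (no covariates) via iteratively reweighted least squares:

```python
import numpy as np

def expectile(y, tau, n_iter=100):
    """Sample tau-expectile via iteratively reweighted least squares:
    minimizes sum w_i (y_i - m)^2 with w_i = tau if y_i > m else 1 - tau."""
    m = float(np.mean(y))
    for _ in range(n_iter):
        w = np.where(y > m, tau, 1 - tau)
        m_new = float(np.sum(w * y) / np.sum(w))
        if abs(m_new - m) < 1e-12:
            break
        m = m_new
    return m

rng = np.random.default_rng(4)
y = rng.normal(size=2000)
e10, e50, e90 = expectile(y, 0.1), expectile(y, 0.5), expectile(y, 0.9)
```

At τ = 0.5 the weights are symmetric and the expectile reduces to the sample mean, just as the 0.5-quantile is the median.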

Selection of bandwidth for local linear composite quantile regression smoothing (국소 선형 복합 분위수 회귀에서의 평활계수 선택)

  • Jhun, Myoungshic; Kang, Jongkyeong; Bang, Sungwan
    • The Korean Journal of Applied Statistics, v.30 no.5, pp.733-745, 2017
  • Local composite quantile regression is a useful nonparametric regression method widely used for its high efficiency. Kernel-based data smoothing is typically used in the estimation process, with performance that relies largely on the smoothing parameter rather than on the kernel. The $L_2$-norm is generally used as the criterion for the performance of the regression function, and many studies have considered the selection of smoothing parameters that minimize the mean squared error (MSE) or the mean integrated squared error (MISE). In this paper, we explore the optimal selection of the smoothing parameter that determines the performance of nonparametric regression models using local linear composite quantile regression. As evaluation criteria for the choice of smoothing parameter, we use the mean absolute error (MAE) and the mean integrated absolute error (MIAE), which have not been researched extensively because of mathematical difficulties. We prove the uniqueness of the optimal smoothing parameter based on MAE and MIAE, and compare the optimal smoothing parameters based on the proposed criteria (MAE and MIAE) with those based on the existing criteria (MSE and MISE). The properties of the proposed method are investigated through simulation studies in various situations.
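The role of the smoothing parameter and an absolute-error selection criterion can be sketched with a plain local linear least-squares smoother (a simpler relative of local composite quantile regression) and leave-one-out MAE, the cross-validation analogue of the MAE criterion discussed above:

```python
import numpy as np

def local_linear(x0, x, y, h):
    # Gaussian-kernel local linear fit evaluated at x0; h is the bandwidth.
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    X = np.column_stack([np.ones_like(x), x - x0])
    A = X.T @ (w[:, None] * X)
    b = X.T @ (w * y)
    return np.linalg.solve(A, b)[0]   # intercept = fitted value at x0

def loo_mae(x, y, h):
    # Leave-one-out mean absolute prediction error for bandwidth h:
    # an L1 counterpart of the usual squared-error cross-validation.
    preds = [local_linear(x[i], np.delete(x, i), np.delete(y, i), h)
             for i in range(len(x))]
    return float(np.mean(np.abs(y - np.array(preds))))

rng = np.random.default_rng(5)
x = rng.uniform(size=100)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=100)
mae_good = loo_mae(x, y, 0.1)        # moderate bandwidth
mae_oversmoothed = loo_mae(x, y, 10.0)  # huge bandwidth: near-global line
```

An over-large bandwidth flattens the sine curve toward a global linear fit, so its leave-one-out MAE is markedly worse.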

Extreme Quantile Estimation of Losses in KRW/USD Exchange Rate (원/달러 환율 투자 손실률에 대한 극단분위수 추정)

  • Yun, Seok-Hoon
    • Communications for Statistical Applications and Methods, v.16 no.5, pp.803-812, 2009
  • The application of extreme value theory to financial data is a fairly recent innovation. The classical annual maximum method fits the generalized extreme value distribution to the annual maxima of a data series. An alternative modern method, the so-called threshold method, fits the generalized Pareto distribution to the excesses over a high threshold of the data series. A more substantial variant takes the point-process viewpoint of high-level exceedances: the exceedance times and excess values over a high threshold are viewed as a two-dimensional point process whose limiting form is a non-homogeneous Poisson process. In this paper, we apply the two-dimensional non-homogeneous Poisson process model to daily losses, that is, daily negative log-returns, in the KRW/USD exchange rate series collected from January 4th, 1982 until December 31st, 2008. The main question is how to estimate extreme quantiles of losses such as the 10-year or 50-year return level.
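The threshold method mentioned in the abstract can be sketched with scipy: fit a generalized Pareto distribution to the excesses over a high threshold, then invert the tail approximation to get an extreme quantile. The data here are simulated heavy-tailed losses, not the KRW/USD series, and the threshold choice is an illustrative assumption:

```python
import numpy as np
from scipy.stats import genpareto

def pot_quantile(losses, u, p):
    """Peaks-over-threshold extreme quantile: fit a GPD to excesses over
    threshold u, then x_p = u + (sigma/xi) * ((zeta_u/(1-p))**xi - 1),
    where zeta_u is the empirical exceedance probability P(loss > u)."""
    exc = losses[losses > u] - u
    xi, _, sigma = genpareto.fit(exc, floc=0.0)  # shape xi, scale sigma
    zeta = len(exc) / len(losses)
    return u + sigma / xi * ((zeta / (1 - p)) ** xi - 1.0)

rng = np.random.default_rng(6)
losses = rng.pareto(3.0, size=50_000)   # heavy tail, tail index xi = 1/3;
# the true 0.999-quantile is (1 - 0.999)**(-1/3) - 1 = 9.0
u = float(np.quantile(losses, 0.95))    # illustrative threshold choice
q999 = pot_quantile(losses, u, 0.999)
```

Return levels such as the 10-year level are obtained the same way, with `p` set from the return period and the number of observations per year.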

Outlier detection in time series data (시계열 자료에서의 특이치 발견)

  • Choi, Jeong In; Um, In Ok; Cho, Hyung Jun
    • The Korean Journal of Applied Statistics, v.29 no.5, pp.907-920, 2016
  • This study suggests an outlier detection algorithm for time series data that uses a quantile autoregressive model, and applies it to actual stock manipulation cases, comparing its performance to existing methods. Studies on outlier detection have traditionally been conducted mostly on general data, and those for time series data are insufficient; they have also been limited to parametric models, which are inconvenient because the analysis is complicated and takes a long time. Thus, we suggest a new outlier detection algorithm for time series data and compare it to existing algorithms through various simulations. In particular, an outlier detection algorithm for time series data can be useful in detecting stock manipulation: if a stock price that had followed a certain pattern departs from that flow and generates an outlier, it can be due to intentional intervention and manipulation. We examined how quickly the model can detect stock manipulation by applying it to actual cases.
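The basic detection idea — flag a point that falls outside a conditional quantile band implied by the recent past — can be sketched with a rolling empirical band (a crude stand-in for the paper's quantile autoregressive model; the window and band levels are illustrative assumptions):

```python
import numpy as np

def quantile_band_outliers(y, window=20, lo=0.05, hi=0.95):
    # Flag y[t] lying outside the [lo, hi] empirical quantile band of the
    # previous `window` observations -- a rolling stand-in for the
    # conditional band a quantile autoregressive model would provide.
    flags = np.zeros(len(y), dtype=bool)
    for t in range(window, len(y)):
        ql, qh = np.quantile(y[t - window:t], [lo, hi])
        flags[t] = bool(y[t] < ql or y[t] > qh)
    return flags

# AR(1) series with one injected spike, mimicking a manipulated price move.
rng = np.random.default_rng(7)
e = rng.normal(size=300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.5 * y[t - 1] + e[t]
y[150] += 10.0
flags = quantile_band_outliers(y)
```

The injected spike at t = 150 lands far outside the band built from the preceding window, so it is flagged immediately.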