• Title/Summary/Keyword: quantile

Quantile Regression-based regional frequency analysis techniques (Quantile-regression-based 지역빈도해석 기법)

  • Kang, Subin;Uranchimeg, Sumiya;Moon, Jangwon;Kwon, Hyun-Han
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.404-404 / 2022
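  • For efficient water resources management, accurate design rainfall (probable precipitation) must be estimated by accounting for the statistical characteristics of hydrological data through frequency analysis. At-site frequency analysis estimates design rainfall using only the data from a single site, so improving accuracy requires more data; however, the data available at each site are limited and variability differs greatly from site to site. Regional frequency analysis pools the data from hydrometeorologically homogeneous neighboring sites, which yields an integrated result for the region and makes the data more reliable. Frequency analysis is generally based on a probability distribution fitted to the data, but the results differ depending on how the distribution is selected. In this study, quantile regression (QR) was applied as the method for estimating design rainfall in regional frequency analysis. QR-based frequency analysis estimates design rainfall from the data themselves rather than from a probability distribution, which reduces the uncertainty that arises in conventional distribution-based frequency analysis. In addition, because the temporal variability of design rainfall is also taken into account, nonstationary frequency analysis is possible. Finally, the regional frequency analysis results introduced in this study were compared with, and verified against, conventional regional frequency analysis results.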

Prediction of sharp change of particulate matter in Seoul via quantile mapping

  • Jeongeun Lee;Seoncheol Park
    • Communications for Statistical Applications and Methods / v.30 no.3 / pp.259-272 / 2023
  • In this paper, we suggest a new method for the prediction of sharp changes in particulate matter (PM10) using quantile mapping. To predict the current PM10 density in Seoul, we consider PM10 and precipitation in Baengnyeong and Ganghwa monitoring stations observed a few hours before. For the PM10 distribution estimation, we use the extreme value mixture model, which is a combination of conventional probability distributions and the generalized Pareto distribution. Furthermore, we also consider a quantile generalized additive model (QGAM) for the relationship modeling between precipitation and PM10. To prove the validity of our proposed model, we conducted a simulation study and showed that the proposed method gives lower mean absolute differences. Real data analysis shows that the proposed method could give a more accurate prediction when there are sharp changes in PM10 in Seoul.

Robust extreme quantile estimation for Pareto-type tails through an exponential regression model

  • Richard Minkah;Tertius de Wet;Abhik Ghosh;Haitham M. Yousof
    • Communications for Statistical Applications and Methods / v.30 no.6 / pp.531-550 / 2023
  • The estimation of extreme quantiles is one of the main objectives of the statistics of extremes, which deals with the estimation of rare events. In this paper, a robust estimator of an extreme quantile of a heavy-tailed distribution is considered. The estimator is obtained by applying the minimum density power divergence criterion to an exponential regression model. The proposed estimator was compared with two existing estimators of extreme quantiles in a simulation study. The results show that the proposed estimator is stable with respect to the choice of the number of top order statistics and has lower bias and mean squared error than the existing extreme quantile estimators. Practical application of the proposed estimator is illustrated with data from the pedochemical and insurance industries.
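
For orientation, the role of "the number of top order statistics" can be sketched with the classical Hill and Weissman estimators. This is not the robust estimator proposed in the paper; it is a standard textbook baseline, and the function names, the synthetic Pareto sample, and the choice k = 100 are assumptions made for illustration.

```python
import math


def hill_estimator(sample, k):
    """Hill estimate of the extreme value index from the k largest values."""
    ordered = sorted(sample)
    threshold = ordered[-k - 1]  # the (k+1)-th largest observation
    top = ordered[-k:]           # the k largest observations
    return sum(math.log(x / threshold) for x in top) / k


def weissman_quantile(sample, k, p):
    """Weissman extrapolation of the value exceeded with small probability p."""
    n = len(sample)
    threshold = sorted(sample)[-k - 1]
    gamma = hill_estimator(sample, k)
    return threshold * (k / (n * p)) ** gamma


# Synthetic (hypothetical) sample: exact Pareto quantiles with index 0.5.
n = 1000
sample = [(1 - (i + 0.5) / n) ** -0.5 for i in range(n)]

gamma_hat = hill_estimator(sample, k=100)
q_hat = weissman_quantile(sample, k=100, p=0.001)
q_true = 0.001 ** -0.5  # exact Pareto quantile for exceedance probability p
```

Both estimates depend on k, which is why stability with respect to that choice is a selling point for an extreme quantile estimator.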

A Study on the User Satisfaction of Demand Response Transport(DRT) by Quantile Regression Analysis (분위회귀분석에 의한 수요응답형교통 이용자 만족도 분석)

  • Jang, Tae Youn;Han, Woo Jin;Kim, Jeong Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.15 no.3 / pp.118-128 / 2016
  • As rural areas have experienced population decline and aging, the service level of public transit has decreased. This study analyzes the factors affecting user satisfaction with demand response transport (DRT), an alternative to rural public transit, using quantile regression, which estimates the conditional median or other quantiles of the response variable. Jeonbuk Province tested DRT operations in Dongsang of Wanju County and Sannae of Jeongup City in 2015. In the basic statistical analysis, DRT user satisfaction in Wanju was higher than in Jeongup. The quantile regression analysis shows that the difference in satisfaction between the higher and lower quantiles is smaller in Wanju than in Jeongup. Also, Wanju is continuing with a second DRT test operation, as the satisfaction estimated by Ordinary Least Squares (OLS) is close to the higher satisfaction quantile.
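
The difference between OLS and quantile regression comes down to the loss being minimized. The sketch below fits only a constant (no covariates) under the pinball (check) loss and shows that the minimizer is a sample quantile, whereas least squares gives the mean. The scores and function names are hypothetical and are not taken from the study.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for quantile level tau in (0, 1)."""
    return residual * tau if residual >= 0 else residual * (tau - 1.0)


def best_constant(values, tau):
    """Return the sample value that minimizes total pinball loss at level tau."""
    def total_loss(candidate):
        return sum(pinball_loss(v - candidate, tau) for v in values)
    # The loss is piecewise linear and convex with kinks at the data points,
    # so a minimizer always exists among the observed values.
    return min(sorted(values), key=total_loss)


# Hypothetical satisfaction scores, for illustration only.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

median_fit = best_constant(scores, 0.5)   # conditional median analogue
upper_fit = best_constant(scores, 0.75)   # upper-quantile analogue
mean_fit = sum(scores) / len(scores)      # what least squares would return
```

With a single outlying score of 9, the mean is pulled above the median, while the quantile fits describe different parts of the distribution separately.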

Assessment of Frequency Analysis using Daily Rainfall Data of HadGEM3-RA Climate Model (HadGEM3-RA 기후모델 일강우자료를 이용한 빈도해석 성능 평가)

  • Kim, Sunghun;Kim, Hanbeen;Jung, Younghun;Heo, Jun-Haeng
    • Journal of Wetlands Research / v.21 no.spc / pp.51-60 / 2019
  • In this study, we performed at-site frequency analysis (AFA) and regional frequency analysis (RFA) using observed and climate change scenario data, and compared the relative root mean squared error (RRMSE) of both approaches through Monte Carlo simulation. To evaluate the rainfall quantiles, daily rainfall data were extracted for 615 points in Korea from the HadGEM3-RA (12.5 km) climate model data, one of the regional climate model (RCM) datasets provided by the Korea Meteorological Administration (KMA). Quantile mapping (QM) and the inverse distance squared method (IDSM) were applied for bias correction and spatial disaggregation. The results show that RFA estimates rainfall quantiles more accurately than AFA, and RFA is expected to be a reasonable choice when estimating rainfall quantiles based on climate change scenarios.
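
Quantile mapping as a bias correction can be sketched in its simplest empirical form: a model value is replaced by the observed value that holds the same rank in the observed record. The samples and the function name below are hypothetical, and the study may well use a parametric or otherwise refined variant.

```python
from bisect import bisect_right


def quantile_map(model_value, model_sample, observed_sample):
    """Map a model value onto the observed distribution by matching ranks."""
    model_sorted = sorted(model_sample)
    observed_sorted = sorted(observed_sample)
    # Rank of the value in the model climate (number of model values <= it).
    rank = bisect_right(model_sorted, model_value)
    # Matching position in the observed sample, via integer ceiling division.
    position = -(-rank * len(observed_sorted) // len(model_sorted))
    index = min(len(observed_sorted) - 1, max(0, position - 1))
    return observed_sorted[index]


# Hypothetical daily rainfall (mm): the model is wet-biased by a factor of two.
model_rain = [2, 4, 6, 8, 10]
observed_rain = [1, 2, 3, 4, 5]

corrected = quantile_map(6, model_rain, observed_rain)
```

Values outside the model range are clamped to the smallest or largest observation, which is one reason extremes are usually handled with a fitted tail rather than a purely empirical mapping.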

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when the real data exhibit a nonlinear relationship between the explanatory variables and the response variables. Furthermore, as the complexity of the data increases, multiple response variables need to be analysed simultaneously and interpreted in a more sophisticated way. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split variable selection algorithm is suggested for a multivariate regression tree model. This algorithm selects the split variable more accurately than the previous method, without significant selection bias. We investigate the performance of the proposed method with both simulation and real data studies.

Robust estimation of geomagnetic transfer function (지자기 전달함수의 로버스트 추정)

  • Yang, Jun-Mo;O, Seok-Hun;Lee, Deok-Gi;Yun, Yong-Hun
    • Journal of the Korean Geophysical Society / v.5 no.2 / pp.131-142 / 2002
  • The geomagnetic transfer function is generally estimated by choosing the transfer function that minimizes the sum of squared differences between observed and predicted values. If the error structure follows a Gaussian distribution, standard least squares (LS) is a suitable estimator. However, for non-Gaussian error distributions, the LS estimate can be severely biased and distorted. In this paper, the Gaussian error assumption was tested with a Q-Q (quantile-quantile) plot, which provided information on the actual error structure. Robust estimation, such as the regression M-estimate, which does not allow a few bad points to dominate the estimate, was then applied to error structures with non-Gaussian distributions. The results indicate that robust estimation performs similarly to LS estimation for Gaussian error distributions, whereas it yields more reliable and smoother transfer function estimates than standard LS for non-Gaussian error distributions.
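
The Q-Q check described above pairs each sorted residual with the normal quantile at the same plotting position; points far from the diagonal signal non-Gaussian errors. The following is a minimal sketch with hypothetical residuals and an assumed plotting position of (i + 0.5)/n, not the authors' procedure.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(residuals):
    """Return (theoretical normal quantile, sample quantile) pairs."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Reference normal distribution with the sample's mean and spread.
    reference = NormalDist(mean(residuals), stdev(residuals))
    points = []
    for i, value in enumerate(ordered):
        p = (i + 0.5) / n  # plotting position, an assumed convention
        points.append((reference.inv_cdf(p), value))
    return points


# Hypothetical residuals with one gross outlier.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0, 15.0]
qq = normal_qq_points(residuals)
largest_theoretical, largest_sample = qq[-1]
```

Here the largest residual sits well above its theoretical normal quantile, the kind of heavy-tail departure that motivates a robust M-estimate over plain least squares.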

Divide and conquer kernel quantile regression for massive dataset (대용량 자료의 분석을 위한 분할정복 커널 분위수 회귀모형)

  • Bang, Sungwan;Kim, Jaeoh
    • The Korean Journal of Applied Statistics / v.33 no.5 / pp.569-578 / 2020
  • By estimating conditional quantile functions of the response, quantile regression (QR) can provide comprehensive information on the relationship between the response and the predictors. In addition, kernel quantile regression (KQR) estimates a nonlinear conditional quantile function in a reproducing kernel Hilbert space generated by a positive definite kernel function. However, it is infeasible to use KQR to analyse massive data because of the limitations of computer primary memory. We propose a divide-and-conquer-based KQR (DC-KQR) method to overcome this limitation. The proposed DC-KQR divides the entire data into a few subsets, applies KQR to each subset, and derives a final estimator by aggregating the results from all subsets. Simulation studies are presented to demonstrate the satisfactory performance of the proposed method.
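
The divide, estimate, and aggregate pattern can be sketched with a much simpler estimator than KQR. Below, the sample median stands in for the per-subset fit and plain averaging stands in for the aggregation step; both substitutions, the simulated data, and the function name are assumptions for illustration only.

```python
import random
from statistics import mean, median


def divide_and_conquer_median(data, n_subsets):
    """Estimate the median by averaging the medians of disjoint subsets."""
    # Interleaved split: each subset is a disjoint slice of the full data.
    subsets = [data[i::n_subsets] for i in range(n_subsets)]
    # Each subset is processed on its own, so only one needs to be in memory.
    local_estimates = [median(subset) for subset in subsets if subset]
    return mean(local_estimates)


# Simulated standard normal data (seeded for reproducibility).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(5000)]

dc_estimate = divide_and_conquer_median(data, n_subsets=10)
full_estimate = median(data)
```

The aggregated estimate stays close to the full-data estimate while each local fit touches only a tenth of the observations, which is the memory saving the approach is after.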

Model selection via Bayesian information criterion for divide-and-conquer penalized quantile regression (베이즈 정보 기준을 활용한 분할-정복 벌점화 분위수 회귀)

  • Kang, Jongkyeong;Han, Seokwon;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.35 no.2 / pp.217-227 / 2022
  • Quantile regression is widely used in many fields because it provides an efficient tool for examining complex information latent in variables. However, modern large-scale and high-dimensional data make it very difficult to estimate the quantile regression model because of limitations in computation time and storage space. Divide-and-conquer is a technique that divides the entire data into several sub-datasets that are easy to compute on, and then reconstructs the estimates for the entire data using only the summary statistics from each sub-dataset. In this paper, we study a variable selection method that uses the Bayesian information criterion by applying the divide-and-conquer technique to penalized quantile regression. When the number of sub-datasets is properly selected, the proposed method is efficient in terms of computational speed and yields variable selection results consistent with those of classical quantile regression estimates computed on the entire data. The advantages of the proposed method were confirmed through simulation and real data analysis.

Development of a method of the data generation with maintaining quantile of the sample data

  • Joohyung Lee;Young-Oh Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.244-244 / 2023
  • Both the frequency and the magnitude of hydrometeorological extreme events such as severe floods and droughts are increasing. To prevent damage from climatic disasters, hydrological models are often simulated under various meteorological conditions. In such simulations, synthetic data generated by time series models that maintain the key statistical characteristics of the sample data are widely used. However, while synthetic data can easily maintain both the average and the variance of the sample data, the quantiles are not maintained well. In this study, we propose a data generation method that maintains the quantiles of the sample data well. The equations of the existing maintenance of variance extension (MOVE) method are extended to maintain the quantiles rather than the average or the variance of the sample data. The equations are derived, and the coefficients are determined from the characteristics of the sample data that we aim to preserve. Monte Carlo simulation is used to assess the performance of the proposed data generation method. A time series (data length of 500) is taken as the sample data, and values are selected randomly from it to create a data set (data length of 30) for simulation. The length of the selected data set is then expanded from 30 to 500 using the proposed method. The differences in the average, the variance, and the quantiles between the sample data and the expanded data are evaluated with the relative root mean square error for each simulation. The simulation results show that each equation designed to maintain a given characteristic of the data performs well. Moreover, the expanded data preserve the quantiles of the sample data more precisely than data expanded with the conventional time series model.
