• Title/Abstract/Keywords: Quantile estimation

138 search results (processing time: 0.026 s)

New Normalization Methods using Support Vector Machine Regression Approach in cDNA Microarray Analysis

  • Sohn, In-Suk;Kim, Su-Jong;Hwang, Chang-Ha;Lee, Jae-Won
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, BIOINFO 2005
    • /
    • pp.51-56
    • /
    • 2005
  • Many sources of systematic variation in cDNA microarray experiments affect the measured gene expression levels, such as differences in labeling efficiency between the two fluorescent dyes. Print-tip lowess normalization is used when dye biases can depend on a spot's overall intensity and/or its spatial location within the array. However, print-tip lowess normalization performs poorly when the error variability of each gene is heterogeneous over intensity ranges. We propose new print-tip normalization methods based on support vector machine regression (SVMR) and support vector machine quantile regression (SVMQR). SVMQR is derived by applying the basic principle of the support vector machine (SVM) to the estimation of linear and nonlinear quantile regressions. We applied the proposed methods to earlier cDNA microarray data from apolipoprotein-AI-knockout (apoAI-KO) mice, diet-induced obese mice, and genistein-fed obese mice. Our statistical analysis shows that the proposed methods perform better than the existing print-tip lowess normalization method.


A Study on the Estimation of Extreme Quantile of Probability Distribution

  • 정진석;신홍준;안현준;허준행
    • Korea Water Resources Association: Conference Proceedings
    • /
    • Korea Water Resources Association 2017 Annual Conference
    • /
    • pp.399-400
    • /
    • 2017
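  • Extreme value distributions are widely used in the statistical analysis and frequency analysis of extreme events such as floods and droughts, and understanding these distributions requires a detailed analysis of their right-tail behavior. Accordingly, this study used Monte Carlo simulation to examine the statistical properties of the right tails of various extreme value distributions and their predictive ability. The distributions considered were the generalized extreme value (GEV), Gumbel, and generalized logistic distributions, which are widely used for estimating design quantiles in Korea, and the parameters were estimated by the method of probability weighted moments. The GEV distribution, commonly used in hydrologic frequency analysis, served as the parent distribution for the simulation. The skewness of records from Korea Meteorological Administration stations with at least 30 years of data was surveyed and assumed as the population skewness when generating the samples. Predictive ability was evaluated by plotting quantiles for return periods of 10 to 1000 years on GEV probability paper, using a GEV plotting position formula that accounts for the skewness coefficient, and by selecting the best distribution according to root mean square error, bias, mean relative difference, and mean absolute relative difference. To check the validity of this evaluation, the results were compared with those of the modified Anderson-Darling test, which is known to reflect the goodness of fit of extreme value distributions well.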
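As a rough illustration of what a "quantile for a return period" means here, the following minimal sketch evaluates the GEV quantile function at non-exceedance probability 1 - 1/T. It is not the authors' code. The function name and the parameter values are hypothetical, and the shape sign convention (positive shape means a heavy right tail) is an assumption; hydrology texts often use the opposite sign.

```python
import math

def gev_quantile(return_period, loc, scale, shape):
    """Quantile of a GEV distribution for a given return period T (in years).

    Assumed convention: shape > 0 gives a heavy right tail; shape == 0 is Gumbel.
    """
    if return_period <= 1:
        raise ValueError("return_period must be greater than 1")
    p = 1.0 - 1.0 / return_period   # annual non-exceedance probability
    y = -math.log(p)                # reduced variate, y > 0
    if abs(shape) < 1e-12:
        # Gumbel limit of the GEV quantile function
        return loc - scale * math.log(y)
    return loc + scale / shape * (y ** (-shape) - 1.0)

# Hypothetical parameters, chosen only to show the call pattern
levels = {T: gev_quantile(T, loc=100.0, scale=30.0, shape=0.1)
          for T in (10, 100, 1000)}
```

In a simulation like the one the abstract describes, such quantiles from fitted distributions would be compared with the quantiles of the known parent distribution using error measures such as RMSE and bias.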


Projection of Future Changes in Drought Characteristics in Korea Peninsula Using Effective Drought Index

  • 곽용석;조재필;정임국;김도우;장상민
    • Journal of the Korean Society of Climate Change
    • /
    • Vol. 9, No. 1
    • /
    • pp.31-45
    • /
    • 2018
  • This study projected drought properties (number of drought events, intensity, and duration) using the user-oriented, systematic procedures for downscaling climate change scenarios based on multiple global climate models (GCMs) provided by the AIMS (APCC Integrated Modeling Solution) program. The drought properties were defined and estimated with the Effective Drought Index (EDI). The best 10 of 29 GCMs were selected by evaluating their spatial and temporal reproducibility for five precipitation-related climate change indices. In addition, Simple Quantile Mapping (SQM) described the observed precipitation events much better as a downscaling technique than Spatial Disaggregation Quantile Delta Mapping (SDQDM). Even though the procedure was applied systematically, limitations remain in describing the observed spatial precipitation properties because spatial variability is offset in multi-model ensemble (MME) analysis. The results indicate that, further into the future, the duration and number of drought events will decrease while drought intensity will increase. Regionally, drought in the central regions of the Korean Peninsula is expected to be mitigated, while drought in the southern regions is expected to become more severe.

Is It Possible to Achieve IMO Carbon Emission Reduction Targets at the Current Pace of Technological Progress?

  • Choi, Gun-Woo;Yun, Heesung;Hwang, Soo-Jin
    • Journal of Korea Trade
    • /
    • Vol. 26, No. 1
    • /
    • pp.113-125
    • /
    • 2022
  • Purpose - The primary purpose of this study is to verify whether the target set out by the International Maritime Organization (IMO) for reducing carbon emissions from ships can be achieved by quantitatively analyzing the trends in technological advances of fuel oil consumption in the container shipping market. To achieve this purpose, several scenarios are designed considering various options such as eco-friendly fuels, low-speed operation, and the growth in ship size. Design/methodology - The vessel size and speed used in prior studies are utilized to estimate the fuel oil consumption of container ships and the pace of technological progress and Energy Efficiency Design Index (EEDI) regulations are added. A database of 5,260 container ships, as of 2019, is used for multiple linear regression and quantile regression analyses. Findings - The fuel oil consumption of vessels is predominantly affected by their speed, followed by their size, and the annual technological progress is estimated to be 0.57%. As the quantile increases, the influence of ship size and pace of technological progress increases, while the influence of speed and coefficient of EEDI variables decreases. Originality/value - The conservative estimation of carbon emission drawn by a quantitative analysis of the technological progress concerning the fuel efficiency of container vessels shows that it is not possible to achieve IMO targets. Therefore, innovative efforts beyond the current scope of technological progress are required.

Extreme Quantile Estimation of Losses in KRW/USD Exchange Rate

  • 윤석훈
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 5
    • /
    • pp.803-812
    • /
    • 2009
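  • Applying extreme value theory to financial data is one of the important modern statistical techniques in risk management. The annual maximum method traditionally used in extreme value analysis fits a generalized extreme value distribution to the annual maxima of a time series, whereas the threshold method, now widely used as an alternative, fits a generalized Pareto distribution to the excesses over a sufficiently high threshold. A more practical approach, however, is to interpret the exceedances over the threshold as a point process; that is, to regard the exceedance times and excess amounts as a two-dimensional point process that asymptotically follows a nonhomogeneous Poisson process. This paper applies this two-dimensional nonhomogeneous Poisson process model to daily exchange-rate investment loss rates, i.e., daily log losses, computed from the KRW/USD exchange rate series collected from January 4, 1982 to December 31, 2008. The main interest is how to estimate extreme quantiles, such as the level of a large loss that occurs about once in 10 or 50 years.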
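To make the threshold method mentioned in the abstract concrete, here is a minimal sketch of the standard return-level formula under a generalized Pareto model for threshold excesses. It illustrates only the simpler threshold approach, not the two-dimensional Poisson process model the paper applies. The function name, the argument names, and the numbers (about 250 trading days per year, a 5% exceedance rate) are hypothetical assumptions.

```python
import math

def gpd_return_level(n_obs, threshold, scale, shape, exceed_prob):
    """Level exceeded on average once every n_obs observations.

    Assumes excesses over `threshold` follow a generalized Pareto
    distribution with the given scale and shape, and that a fraction
    `exceed_prob` of all observations exceeds the threshold.
    """
    m = n_obs * exceed_prob  # expected number of exceedances in n_obs observations
    if m <= 1:
        raise ValueError("n_obs * exceed_prob must exceed 1")
    if abs(shape) < 1e-12:
        # Exponential-tail limit (shape -> 0)
        return threshold + scale * math.log(m)
    return threshold + scale / shape * (m ** shape - 1.0)

# Hypothetical inputs: 10 years of roughly 250 daily observations each
ten_year_level = gpd_return_level(n_obs=2500, threshold=0.0,
                                  scale=1.0, shape=0.5, exceed_prob=0.05)
```

A "once in 10 years" loss level corresponds to n_obs equal to the number of daily observations in 10 years, so the quantile being estimated lies far beyond most of the observed data.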

Analysis of Rainfall-Runoff Characteristics on Bias Correction Method of Climate Change Scenarios

  • 금동혁;박윤식;정영훈;신민환;류지철;박지형;양재의;임경재
    • Journal of Korean Society on Water Environment
    • /
    • Vol. 31, No. 3
    • /
    • pp.241-252
    • /
    • 2015
  • Runoff behavior was analyzed under five bias correction methods: Change Factor methods using past observed data and scenario-estimated data with an average annual calibration factor (CF_Y) or an average monthly calibration factor (CF_M); Quantile Mapping methods using past observed and estimated data with cumulative distribution functions for the entire estimated data period (QM_E) or for the dry and rainy seasons separately (QM_P); and an integrated method combining CF_M and QM_E (CQ). The peak flows from CF_M and QM_P were twice as large as the measured peak flow, and QM_P was judged to carry large uncertainty in monthly runoff estimation because its maximum precipitation differed greatly from that of the other methods. Compared with the other four methods, the CQ method yielded precipitation amount, distribution, and frequency closest to the observed data. The CQ method also produced rainfall-runoff behavior corresponding to the SRES A1B carbon dioxide emission scenario. Climate change scenarios with bias correction still contain uncertainty in generating accurate climate data. It is therefore necessary to consider the trend of observed precipitation and the characteristics of each bias correction method so that the generated precipitation can be used properly in establishing water resource management plans.
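The quantile mapping idea named in the abstract can be sketched as follows: a model value is converted to its empirical non-exceedance probability in the model's historical data and then replaced by the observed value at the same probability. This is a generic empirical version under simplifying assumptions (linear interpolation, clamping outside the historical range, at least two values per series), not the paper's QM_E or QM_P procedure. All names are hypothetical.

```python
from bisect import bisect_right

def empirical_cdf(sorted_values, x):
    """Interpolated non-exceedance probability of x, clamped to [0, 1]."""
    n = len(sorted_values)
    if x <= sorted_values[0]:
        return 0.0
    if x >= sorted_values[-1]:
        return 1.0
    i = bisect_right(sorted_values, x)   # sorted_values[i-1] <= x < sorted_values[i]
    lo, hi = sorted_values[i - 1], sorted_values[i]
    return (i - 1 + (x - lo) / (hi - lo)) / (n - 1)

def empirical_quantile(sorted_values, p):
    """Interpolated quantile of sorted data for p in [0, 1]."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def quantile_map(value, model_hist, observed):
    """Bias-correct one model value by matching empirical quantiles."""
    p = empirical_cdf(sorted(model_hist), value)
    return empirical_quantile(sorted(observed), p)
```

Because the mapping is built from the whole distribution, it corrects the spread and the extremes as well as the mean, which is also why its behavior at the maximum precipitation values deserves the scrutiny the abstract gives it.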

Family Gaps Across the Wages Distribution in Korea

  • 허수연
    • Korean Journal of Social Welfare Studies
    • /
    • Vol. 43, No. 2
    • /
    • pp.345-366
    • /
    • 2012
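  • This study aims to compare the size of the family gap, the wage gap between women with and without children, across income levels. Using 2008 survey data from the Korean Labor and Income Panel Study, we controlled for the conditional expectation governing women's labor force participation through Heckman's two-stage estimation and then used quantile regression to identify the effect of child-rearing at each income quantile. The results show a family gap, i.e., lower hourly wages for women raising one child than for women without children, at every income quantile except the low-income (10th percentile) and high-income (90th percentile) groups. A family gap was also found at all income levels for women raising two or more children. For both one child and two or more children, the family gap was largest at the 25th percentile. Based on these findings, we discuss expanding universal family policy to reduce the labor market inequality that women face due to child-rearing and the inequality among women.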
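Quantile regression, as used in the abstract above, estimates effects at a chosen quantile by minimizing the check (pinball) loss instead of squared error. The sketch below shows only that loss and the fact that its minimizer over a constant is a sample quantile; it is not the paper's model, and the function names and toy data are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean check loss at quantile level tau (0 < tau < 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        r = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau)
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total / len(y_true)

def best_constant_quantile(values, tau):
    """Data point that minimizes the pinball loss as a constant prediction."""
    return min(values,
               key=lambda c: pinball_loss(values, [c] * len(values), tau))

# Toy hourly wages, for illustration only
wages = [1.0, 2.0, 3.0, 4.0, 5.0]
median_wage = best_constant_quantile(wages, 0.5)
```

Replacing the constant with a linear function of covariates, such as the number of children, gives a quantile regression, so the estimated effect can differ between, say, the 25th and 90th percentiles.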

Robustness, Data Analysis, and Statistical Modeling: The First 50 Years and Beyond

  • Barrios, Erniel B.
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 22, No. 6
    • /
    • pp.543-556
    • /
    • 2015
  • We present a survey of contributions that defined the nature and extent of robust statistics over the last 50 years. Starting from the pioneering work of Tukey, Huber, and Hampel, which focused on robust estimation of location parameters, we present various generalizations of these estimation procedures that cover a wide variety of models and data analysis methods. Among these extensions, we present linear models, clustered and dependent observations, time series data, binary and discrete data, models for spatial data, nonparametric methods, and forward search methods for outliers. We also present the current interest in robust statistics and conclude with suggestions on possible future directions of this area for statistical science.

Transmuted new generalized Weibull distribution for lifetime modeling

  • Khan, Muhammad Shuaib;King, Robert;Hudson, Irene Lena
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 23, No. 5
    • /
    • pp.363-383
    • /
    • 2016
  • The Weibull family of lifetime distributions plays a fundamental role in reliability engineering and life testing problems. This paper investigates the potential usefulness of the transmuted new generalized Weibull (TNGW) distribution for modeling lifetime data. This distribution is an important competitive model that contains twenty-three lifetime distributions as special cases. The TNGW distribution can be obtained using the quadratic rank transmutation map (QRTM) technique. We derive the analytical shapes of the density and hazard functions for graphical illustration. In addition, we explore some mathematical properties of the TNGW model, including expressions for the quantile function, moments, entropies, mean deviation, Bonferroni and Lorenz curves, and the moments of order statistics. The method of maximum likelihood is used to estimate the model parameters. Finally, the applicability of the TNGW model is illustrated using data on nicotine in cigarettes.

Adaptive M-estimation in Regression Model

  • Han, Sang-Moon
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 10, No. 3
    • /
    • pp.859-871
    • /
    • 2003
  • In this paper we introduce some adaptive M-estimators that use selector statistics to estimate the slope of a regression model under symmetric and continuous underlying error distributions. These selector statistics are based on the residuals after a preliminary L$_1$ fit (least absolute estimator) and on the idea of Hogg (1983) and Hogg et al. (1988), who used averages of some order statistics to discriminate among underlying symmetric distributions in the location model. If we use L$_1$ as a preliminary fit to get residuals, we find that the asymptotic distribution of the sample quantiles of the residuals is slightly different from that of the sample quantiles in the location model. If we use functions of the sample quantiles of the residuals as selector statistics, we find suitable quantile points of the residuals by maximizing the asymptotic distance index to discriminate among the distributions under consideration. In a Monte Carlo study, this adaptive M-estimation method using selector statistics works well over a wide range of underlying error distributions.