• Title/Summary/Keyword: LOWESS regression

Robust Nonparametric Regression Method using Rank Transformation

    • Communications for Statistical Applications and Methods / v.7 no.2 / pp.574-574 / 2000
  • Consider the problem of estimating a regression function from data contaminated by a long-tailed error distribution. A linear smoother is a local weighted average of the responses, so it is not robust against outliers. The kernel M-smoother and lowess attain robustness by down-weighting outliers. However, both require iteration to compute the robustness weights and, as Wang and Scott (1994) pointed out, the need for iteration is not a desirable property. In this article, we propose a robust nonparametric regression method that does not require iteration. Robustness can be achieved not only by down-weighting outliers but also by transforming them. The rank transformation is a simple procedure in which the data are replaced by their corresponding ranks. Iman and Conover (1979) showed that the rank transformation is a robust and powerful procedure in linear regression. In this paper, we show that the rank transformation can also be applied to nonparametric regression to achieve robustness.

Robust Nonparametric Regression Method using Rank Transformation

  • Park, Dongryeon
    • Communications for Statistical Applications and Methods / v.7 no.2 / pp.575-583 / 2000
  • Consider the problem of estimating a regression function from data contaminated by a long-tailed error distribution. A linear smoother is a local weighted average of the responses, so it is not robust against outliers. The kernel M-smoother and lowess attain robustness by down-weighting outliers. However, both require iteration to compute the robustness weights and, as Wang and Scott (1994) pointed out, the need for iteration is not a desirable property. In this article, we propose a robust nonparametric regression method that does not require iteration. Robustness can be achieved not only by down-weighting outliers but also by transforming them. The rank transformation is a simple procedure in which the data are replaced by their corresponding ranks. Iman and Conover (1979) showed that the rank transformation is a robust and powerful procedure in linear regression. In this paper, we show that the rank transformation can also be applied to nonparametric regression to achieve robustness.
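
The idea of replacing responses by ranks before a single, non-iterative smoothing pass can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's estimator: `average_ranks` and `kernel_smooth` are hypothetical helper names, the Gaussian Nadaraya-Watson smoother and its bandwidth are arbitrary choices, and the data are synthetic.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kernel_smooth(x, y, x0, bandwidth):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel (one pass, no iteration)."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Synthetic data (assumption): a roughly linear trend with one gross outlier at x = 4.
x = [1, 2, 3, 4, 5, 6, 7]
y = [1.0, 2.1, 2.9, 50.0, 5.2, 6.1, 6.8]

r = average_ranks(y)                               # the outlier 50.0 becomes rank 7
raw_fit = kernel_smooth(x, y, 4, bandwidth=1.0)    # pulled far upward by the outlier
rank_fit = kernel_smooth(x, r, 4, bandwidth=1.0)   # bounded, because ranks are bounded
print(r, raw_fit, rank_fit)
```

The rank-scale fit would still have to be mapped back to the response scale; the abstract does not say how the article handles that step, so it is left out here.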

New Normalization Methods using Support Vector Machine Regression Approach in cDNA Microarray Analysis

  • Sohn, In-Suk;Kim, Su-Jong;Hwang, Chang-Ha;Lee, Jae-Won
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.51-56 / 2005
  • There are many sources of systematic variation in cDNA microarray experiments that affect measured gene expression levels, such as differences in labeling efficiency between the two fluorescent dyes. Print-tip lowess normalization is used when dye biases can depend on a spot's overall intensity and/or its spatial location within the array. However, print-tip lowess normalization performs poorly when the error variability for each gene is heterogeneous over intensity ranges. We propose new print-tip normalization methods based on support vector machine regression (SVMR) and support vector machine quantile regression (SVMQR). SVMQR is derived by applying the basic principle of the support vector machine (SVM) to the estimation of linear and nonlinear quantile regressions. We applied the proposed methods to previous cDNA microarray data from apolipoprotein-AI-knockout (apoAI-KO) mice, diet-induced obese mice, and genistein-fed obese mice. Our statistical analysis shows that the proposed methods perform better than the existing print-tip lowess normalization method.
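
For readers unfamiliar with the baseline being compared against, the following Python sketch shows the usual form of print-tip lowess normalization: compute M = log2(R/G) and A = 0.5 log2(RG) per spot, fit a local regression of M on A within each print-tip group, and subtract the fitted trend. It is a minimal sketch, not the paper's code and not its SVMR/SVMQR methods. The names `lowess_fit` and `print_tip_normalize`, the `frac` value, and the synthetic intensities are all assumptions, and the robustness iterations of full lowess are omitted.

```python
import math


def lowess_fit(x, y, frac=0.5):
    """Local linear fit with tricube weights at each x[i] (no robustness iterations)."""
    n = len(x)
    k = max(2, int(math.ceil(frac * n)))  # number of neighbours in each local window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def print_tip_normalize(red, green, tip, frac=0.5):
    """Subtract an intensity-dependent trend from M separately for each print-tip group."""
    m = [math.log2(r / g) for r, g in zip(red, green)]        # log-ratio
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]  # mean log-intensity
    normalized = list(m)
    for group in set(tip):
        idx = [i for i, t in enumerate(tip) if t == group]
        trend = lowess_fit([a[i] for i in idx], [m[i] for i in idx], frac)
        for i, fit in zip(idx, trend):
            normalized[i] = m[i] - fit
    return normalized


# Synthetic spots (assumption): each print-tip group has its own linear dye bias in A.
a_true = [8.0, 9.5, 11.0, 12.5, 8.5, 10.0, 11.5, 13.0]
tip = [1, 1, 1, 1, 2, 2, 2, 2]
m_true = [0.5 - 0.03 * a if t == 1 else -0.2 + 0.01 * a for a, t in zip(a_true, tip)]
red = [2 ** (a + m / 2) for a, m in zip(a_true, m_true)]
green = [2 ** (a - m / 2) for a, m in zip(a_true, m_true)]

normalized = print_tip_normalize(red, green, tip, frac=0.75)
print(max(abs(v) for v in normalized))  # pure bias is removed, so values are near zero
```

Because the synthetic bias is exactly linear within each group, the local linear fit reproduces it and the normalized log-ratios collapse to zero. Real arrays would leave the biological signal and noise behind.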

Characteristics of Inter-monthly Climatic Change Appeared in Long-term Seoul Rainfall (장기간의 서울지점 강우자료에 나타난 월간 기후변화 특성)

  • Hwang, Seok Hwan;Kim, Joong Hoon;Yoo, Chul Sang;Lee, Jung Ho
    • KSCE Journal of Civil and Environmental Engineering Research / v.30 no.1B / pp.1-11 / 2010
  • In this study, to analyze the long-term monthly change characteristics of the Chukwooki rainfall data set (CWK) and the modern rain gage rainfall data set (MRG), trend and variation tests were performed on each data set using five statistical test methods. Furthermore, the changing characteristics of rainfall were analyzed using two-dimensional LOWESS regression (or smoothing), which can account for both annual and inter-monthly time variation. The trend tests did not confirm that the data sets have significant trends. The two-dimensional LOWESS analysis of four rainfall characteristics showed that, since about A.D. 1980, both the amount of rainfall and the width of its inter-monthly variation have increased rapidly and persistently.

Epidemiological application of the cycle threshold value of RT-PCR for estimating infection period in cases of SARS-CoV-2

  • Soonjong Bae;Jong-Myon Bae
    • Journal of Medicine and Life Science / v.20 no.3 / pp.107-114 / 2023
  • Epidemiological control of coronavirus disease 2019 (COVID-19) requires estimating the infection period of confirmed cases and identifying potential cases. The present study, targeting confirmed cases for which the time of COVID-19 symptom onset was disclosed, aimed to investigate the relationship between the interval (in days) from symptom onset to testing and the cycle threshold (CT) values of real-time reverse transcription-polymerase chain reaction. Of the COVID-19 confirmed cases, those for which the date of suspected symptom onset was specifically disclosed in the epidemiological investigation were included in this study. The interval was defined as the number of days from symptom onset (as disclosed by the patient) to specimen collection for testing. A locally weighted regression smoothing (LOWESS) curve was applied, with the interval as the explanatory variable and CT values (CTR for the RdRp gene and CTE for the E gene) as outcome variables. After a non-linear relationship was found, a polynomial regression model was applied to estimate the 95% confidence intervals of CTR and CTE by interval. The application of LOWESS in 331 patients identified a U-shaped relationship between the CTR and CTE values and the number of interval days, and both CTR and CTE satisfied the quadratic model for interval days. Active application of these results to epidemiological investigations would minimize the chance of failing to identify individuals who have been in contact with COVID-19 confirmed cases, thereby reducing the potential transmission of the virus to local communities.
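
The second modelling step, fitting a quadratic once the smoother has suggested a U-shape, can be sketched as follows. This is a hedged illustration only: `fit_quadratic` is a hypothetical helper that solves the ordinary least-squares normal equations, the numbers are synthetic and are not the study's CT values, and the confidence-interval calculation is not reproduced.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    # Build the 3x3 system (X^T X) b = X^T y, each row augmented with its right-hand side.
    rows = []
    for p in range(3):
        row = [sum(xi ** (p + q) for xi in x) for q in range(3)]
        row.append(sum((xi ** p) * yi for xi, yi in zip(x, y)))
        rows.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [v - factor * pv for v, pv in zip(rows[r], rows[col])]
    return [rows[i][3] / rows[i][i] for i in range(3)]


# Synthetic example (assumption): an exactly quadratic, U-shaped response over days 0..10.
days = list(range(11))
values = [30 - 2 * d + 0.2 * d * d for d in days]

b0, b1, b2 = fit_quadratic(days, values)
turning_point = -b1 / (2 * b2)  # day at which the fitted curve reaches its minimum
print(b0, b1, b2, turning_point)
```

A positive `b2` corresponds to a U-shaped curve, and the turning point gives the interval at which the fitted value is lowest.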

A Study on the Change of Occurrence Characteristics of Daily Seoul Rainfall using Markov Chain (마코프 연쇄를 이용한 서울지점 일강우의 발생특성 변화 연구)

  • Hwang, Seok-Hwan;Kim, Joong-Hoon;Yoo, Chul-Sang;Jung, Sung-Won;Joo, Jin-Gul
    • Journal of Korea Water Resources Association / v.42 no.9 / pp.747-758 / 2009
  • In this study, the long-term variability of rainfall-occurrence characteristics is analyzed using rainfall data from Seoul, the longest such record in the world. First, the accuracy of the Chukwooki data set (CWK) is evaluated in terms of rainfall-occurrence probability by analyzing transition probabilities and occurrence characteristics based on a Markov chain. The long-term inter-monthly variability of the transition probabilities is then analyzed using two-dimensional LOWESS regression. The analysis of transition probabilities and occurrence characteristics shows that rainfall-occurrence characteristics differ between CWK and the modern rain gage data set (MRG) for the original rainfall data sets (M00). Regarding the characteristics of the rainfall series, rainfall occurrence probabilities have increased and the duration of each rainfall event is shorter than in the past. The analysis of the long-term inter-monthly variability of the transition probabilities shows that, for M20, the lengths of dry spells do not differ significantly between CWK and MRG, while the lengths of wet spells have decreased persistently since A.D. 1830. In particular, a significant decreasing trend in the length of wet spells appears in recent Septembers. Considering these results together with the recent increasing trend in rainfall, it is concluded that the frequency and intensity of rainfall have recently been increasing.
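
The transition probabilities mentioned in this abstract are, in the first-order two-state case, simple relative frequencies of dry-to-dry, dry-to-wet, wet-to-dry and wet-to-wet day pairs. The Python sketch below illustrates that calculation under stated assumptions: `transition_probabilities` is a hypothetical name, a day is treated as wet when rainfall exceeds 0 mm (the paper's own thresholds are not given here), and the rainfall values are synthetic.

```python
def transition_probabilities(wet):
    """Estimate first-order transition probabilities from a 0/1 (dry/wet) daily sequence."""
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for today, tomorrow in zip(wet, wet[1:]):
        counts[(today, tomorrow)] += 1
    probs = {}
    for a in (0, 1):
        total = counts[(a, 0)] + counts[(a, 1)]
        for b in (0, 1):
            probs[(a, b)] = counts[(a, b)] / total if total else float("nan")
    return probs


# Synthetic daily rainfall in mm (assumption, not Seoul data).
rain_mm = [0.0, 0.0, 3.2, 5.1, 0.0, 0.0, 0.0, 1.4, 0.0, 2.0, 7.5, 0.3]
wet = [1 if r > 0 else 0 for r in rain_mm]  # wet-day threshold of 0 mm is an assumption

p = transition_probabilities(wet)
# Under a first-order chain, wet-spell length is geometric with mean 1 / (1 - p11).
mean_wet_spell = 1 / (1 - p[(1, 1)])
print(p, mean_wet_spell)
```

For this sequence the dry-to-wet probability is 0.5 and the wet-to-wet probability is 0.6, which implies a mean wet-spell length of 2.5 days. Tracking such quantities month by month and year by year gives the surface that a two-dimensional smoother could then be applied to.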

Number of sampling leaves for reflectance measurement of Chinese cabbage and kale

  • Chung, Sun-Ok;Ngo, Viet-Duc;Kabir, Md. Shaha Nur;Hong, Soon-Jung;Park, Sang-Un;Kim, Sun-Ju;Park, Jong-Tae
    • Korean Journal of Agricultural Science / v.41 no.3 / pp.169-175 / 2014
  • The objective of this study was to investigate the effects of the pre-processing method and the number of sampling leaves on the stability of reflectance measurements for Chinese cabbage and kale leaves. Chinese cabbage and kale were transplanted and cultivated in a plant factory. Leaf samples of the kale and cabbage were collected 4 weeks after the seedlings were transplanted. Spectral data were collected with a UV/VIS/NIR spectrometer in the wavelength region from 190 to 1130 nm. All leaves (mature and young) were measured at 9 and 12 points in the upper area of the blade for kale and cabbage leaves, respectively. To reduce spectral noise, the raw spectral data were pre-processed by different methods: i) moving average, ii) Savitzky-Golay filter, iii) local regression using weighted linear least squares and a first-degree polynomial model (lowess), iv) local regression using weighted linear least squares and a second-degree polynomial model (loess), v) a robust version of 'lowess', and vi) a robust version of 'loess', each with 7, 11, and 15 smoothing points. The effects of the number of sampling leaves were investigated by the reflectance difference (RD) and cross-correlation (CC) methods. Results indicated that spectral data collected from 4 sampling leaves were adequate for both crops, in that measurement stability did not change much. Furthermore, the moving average method with 11 smoothing points was believed to provide reliable pre-processed data for further analysis.
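
Of the pre-processing methods listed, the moving average is the simplest to illustrate. The Python sketch below is an assumption-laden example rather than the study's procedure: `moving_average` is a hypothetical name, the window is centered, and the choice to shrink the window symmetrically near the ends of the spectrum is one common convention that the abstract does not specify.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks symmetrically near the ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number of points")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        # Use the largest symmetric half-width that stays inside the data.
        h = min(half, i, len(values) - 1 - i)
        segment = values[i - h:i + h + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Synthetic reflectance-like values (assumption): a flat signal with one noise spike.
spectrum = [0.0, 0.0, 9.0, 0.0, 0.0]
print(moving_average(spectrum, 3))  # [0.0, 3.0, 3.0, 3.0, 0.0]
```

With 7, 11, or 15 smoothing points the same function would be called with `window=7`, `window=11`, or `window=15`. Wider windows suppress more noise but also flatten narrow spectral features.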

Lowess and outlier analysis of biological oxygen demand on Nakdong main stream river (낙동강 본류 측정소들의 생물학적 산소요구량 수치에 대한 비모수적 회귀분석과 특이점분석)

  • Kim, Jong Tae
    • Journal of the Korean Data and Information Science Society / v.25 no.1 / pp.119-130 / 2014
  • This paper is based on the water information system of NIER, the National Institute of Environmental Research. We used monthly water quality data from January 2013 to August 2013, from measuring point A (nbA) to measuring point N (nbN) located along the main stream of the Nakdong river. A statistical water quality analysis of BOD (biological oxygen demand) was carried out in R by month, year, and measuring point. Based on the BOD measured at the Nakdong river's measuring points, we used exploratory data analysis and locally weighted scatter plot smoother (Lowess) trend analysis, a method of non-parametric regression analysis, to analyze the long-term water quality tendency and the distribution of water quality across points. We also identified the periods and measuring points at which outliers are abundant. As a result, compared with the BOD measured at nbM, located downstream in Busan, the BOD measured at nbG in Daegu and nbI in Changwon, both along the midstream, showed a severe level of water pollution.

Monitoring of Gene Regulations Using Average Rank in DNA Microarray: Implementation of R

  • Park, Chang-Soon
    • Journal of the Korean Data and Information Science Society / v.18 no.4 / pp.1005-1021 / 2007
  • Traditional procedures for DNA microarray data analysis preprocess and normalize the gene expression data and then analyze the normalized data using statistical tests. The drawbacks of the traditional methods are that genuine biological signal may be unintentionally eliminated together with artifacts, that the limited number of arrays per gene makes it difficult to apply statistical tests based on the normality assumption or nonparametric methods, and that genes are tested independently without considering the interrelationships among them. A novel method using the average rank in each array is proposed to eliminate these drawbacks. The average rank method monitors differentially regulated genes among genetically different groups, and the genes it selects are somewhat different from those selected by the traditional P-value method. Adding the genes selected by the average rank method to those from the traditional method will provide a better understanding of the genetic differences between groups.

Improved Trend Estimation of Non-monotonic Time Series Through Increased Homogeneity in Direction of Time-variation (시변동의 동질성 증가에 의한 비단조적 시계열자료의 경향성 탐지력 향상)

  • Oh, Kyoung-Doo;Park, Soo-Yun;Lee, Soon-Cheol;Jun, Byong-Ho;Ahn, Won-Sik
    • Journal of Korea Water Resources Association / v.38 no.8 s.157 / pp.617-629 / 2005
  • In this paper, we test the hypothesis that dividing a non-monotonic time series into monotonic parts improves trend estimation by increasing homogeneity in the direction of time-variation, using LOWESS smoothing and the seasonal Kendall test. Trend analysis of generated time series and of the water temperature, discharge, air temperature, and solar radiation of Lake Daechung shows that the hypothesis is supported by improved estimation of trends and slopes. In addition, the characteristics of the homogeneity variation of seasonal changes seem to be manifested more clearly as homogeneity in the direction of time-variation increases. This will help in understanding the effects of human intervention on natural processes and seems to warrant more in-depth study. The proposed method can be used in trend analysis to detect monotonic trends, and it is expected to improve understanding of long-term changes in the natural environment.
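
Why splitting a non-monotonic series can help is easy to see with the seasonal Kendall statistic itself, which sums the Mann-Kendall statistic S over seasons. The Python sketch below is an illustration under stated assumptions, not the paper's procedure: `mann_kendall_s` and `seasonal_kendall_s` are hypothetical names, the series is synthetic, the split point is chosen by hand rather than from a LOWESS curve, and the variance and significance calculations are omitted.

```python
def mann_kendall_s(series):
    """Mann-Kendall S: number of increasing pairs minus number of decreasing pairs."""
    s = 0
    for i in range(len(series) - 1):
        for j in range(i + 1, len(series)):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    return s


def seasonal_kendall_s(values, seasons_per_year):
    """Seasonal Kendall S: sum of Mann-Kendall S computed within each season."""
    return sum(
        mann_kendall_s(values[k::seasons_per_year]) for k in range(seasons_per_year)
    )


# Synthetic series (assumption): 2 seasons per year, level rises for 3 years then falls.
base = [10, 20]
levels = [0, 1, 2, 3, 2, 1, 0]
values = [b + level for level in levels for b in base]

whole = seasonal_kendall_s(values, 2)         # rise and fall cancel out
rising = seasonal_kendall_s(values[:8], 2)    # years 0..3 only
falling = seasonal_kendall_s(values[6:], 2)   # years 3..6 only
print(whole, rising, falling)  # 0 12 -12
```

Over the whole record the statistic is zero even though the series clearly changes, whereas each monotonic part shows a strong trend of opposite sign.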