• Title/Summary/Keyword: Generalized cross-validation

Variable selection in censored kernel regression

  • Choi, Kook-Lyeol; Shim, Jooyong
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.201-209 / 2013
  • In censored regression, it is often the case that some input variables are unimportant while others matter more than the rest. We propose a novel algorithm for selecting the important input variables in censored kernel regression. It is based on penalized regression with a weighted quadratic loss function for the censored data, where the weights are computed from the empirical survival function of the censoring variable. We employ weighted ANOVA decomposition kernels to choose the optimal subset of important input variables. Experimental results are presented to illustrate the performance of the proposed variable selection method.

The Family Relationship Scale : Re-validation ("가족관계척도" 활용을 위한 타당도 연구)

  • Yang, Ok-Kyung; Lee, Min-Young
    • Korean Journal of Social Welfare / v.54 / pp.5-33 / 2003
  • This study re-validates the Family Relationship Scale (FRS), which was developed to measure family relationships in social work practice. The FRS was developed and validated by Yang in 2001, and this study aims to re-validate it for more general use. The sample consisted of married males and females residing in Seoul. For face validity, a content analysis was performed, and the FRS was re-validated in the dimensions of Love & Caring, Acceptance, and Recognition, with positive affection, empathy, and autonomy and flexibility for each area. Internal reliability was .93, and internal consistency among the three dimensions was 93%. For empirical validity, construct validity, criterion validity, and discriminant validity were examined. Construct validity was assessed through factor analyses: the communality for the factor analysis was 54%, and the loading for each factor was over .45. Confirmatory factor analysis also confirmed the fit of the scale. For the predictive validity component of criterion validity, regression analysis showed that family stress scores decreased as family relationship scores increased, and discriminant analysis revealed that family stress was low in the group with high family relationship scores. Correlation analysis for concurrent validity showed positive and significant relationships with couple communication level (r=.54) and parent-child communication level (r=.64). Life satisfaction and mental health level also showed significantly positive correlations, supporting convergent validity. Physical health level showed a weak relationship with family relationship, providing evidence of discriminant validity. Discrimination was also supported by analysis of variance with demographics. Thus, cross-validation through these analyses with the married population confirmed the validity of the FRS. The results improve the validity generalization of the scale and support its generalized use as a sociometric scale in the field of social work practice.

Generating high resolution of daily mean temperature using statistical models (통계적모형을 통한 고해상도 일별 평균기온 산정)

  • Yoon, Sanghoo
    • Journal of the Korean Data and Information Science Society / v.27 no.5 / pp.1215-1224 / 2016
  • High-resolution gridded climate information is an important factor in explaining phenomena across a variety of research fields. Compared with dynamic simulation methods at regional scales, statistical linear interpolation models are computationally inexpensive and applicable to any climate data. In this paper, we consider four linear-based statistical interpolation models: the general linear model, the generalized additive model, the spatial linear regression model, and the Bayesian spatial linear regression model. The climate variable of interest is the daily mean temperature, whose spatial variability is explained using geographic terrain information: latitude, longitude, and elevation. The data were collected by weather stations in January from 2003 and 2012. In terms of RMSE and correlation coefficient, the Bayesian spatial linear regression model reflected the spatial pattern better than the other models.

Effect of Ambient Air Pollution on Years of Life Lost from Deaths due to Injury in Seoul, South Korea (대기오염물질이 손상으로 인한 손실수명연수에 미치는 영향: 서울특별시를 중심으로)

  • Sun-Woo Kang; Subin Jeong; Hyewon Lee
    • Journal of Environmental Health Sciences / v.49 no.3 / pp.149-158 / 2023
  • Background: Injury is one of the major health problems in South Korea. Few studies have evaluated both intentional and unintentional injury when investigating the association between exposure to air pollutants and injury. Objectives: We aimed to explore the association between short-term exposure to ambient air pollution and years of life lost (YLLs) due to injury. Methods: Data on daily YLLs for 2002~2019 were obtained from the Death Statistics Database of the Korean National Statistical Office. This study estimated short-term exposure to particulate matter with an aerodynamic diameter of <10 ㎛ (PM10), particulate matter with an aerodynamic diameter of <2.5 ㎛ (PM2.5), sulfur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO), and ozone (O3). This time series study was conducted using a generalized additive model (GAM) assuming a Gaussian distribution. We also evaluated the delayed effect of ambient air pollution by constructing a lag structure of up to seven days. The best-fitting lag was selected based on the smallest generalized cross-validation (GCV) value. To explore effect modification by intentionality of injury (i.e., intentional injury [self-harm, assault] and unintentional injury), we conducted stratified subgroup analyses. Additionally, we stratified unintentional injury by mechanism (traffic accident, fall, etc.). Results: During the study period, the average daily YLLs due to injury was 307.5 years. For intentional injury, YLLs due to self-harm and assault showed positive associations with air pollutants. For unintentional injury, YLLs due to fall, electric current, fire, and poisoning showed positive associations with air pollutants, whereas YLLs due to traffic accident, mechanical force, and drowning/submersion showed negative associations with air pollutants. Conclusions: Injury is recognized as preventable, and effective strategies to create a safe society are important. Therefore, we need to establish strategies to prevent injury and consider air pollutants in this regard.

Numerical study on the characteristics of the flow through injector orifice by multi-block computations (다중블럭계산에 의한 분사기 오리피스 유동특성 해석)

  • Kim, Yeong-Mok
    • Transactions of the Korean Society of Mechanical Engineers B / v.21 no.3 / pp.414-426 / 1997
  • Numerical computations were conducted to characterize the three-dimensional laminar flow through an injector orifice with an inclination angle of 30°. For this study, the incompressible Navier-Stokes equations in generalized curvilinear coordinates were solved, using a pseudocompressibility approach for the continuity equation. The computations were performed using the implicit, approximately factored finite difference scheme of Beam and Warming and multi-block grids with complete continuity at block interfaces. The multi-block computations were validated for the steady state by direct comparison of multi-block solutions with equivalent single-block ones, including a 2-D 180° TAD and a 3-D 90° pipe bend. For solution validation, the numerical solutions were also compared with flow field measurements for a tube with a sudden contraction. Computational results showed the nature of the complex flow fields within the inclined injector orifice, including strong pressure-driven secondary flows in the cross stream induced by the effect of streamline curvature. In addition, asymmetric secondary flows were induced in the Reynolds number range above the assumed laminar flow regime. However, turbulence calculations and grid dependency studies are needed for more accurate computations.

Quantile regression using asymmetric Laplace distribution (비대칭 라플라스 분포를 이용한 분위수 회귀)

  • Park, Hye-Jung
    • Journal of the Korean Data and Information Science Society / v.20 no.6 / pp.1093-1101 / 2009
  • Quantile regression has become a widely used technique for describing the distribution of a response variable given a set of explanatory variables. This paper proposes a novel model for quantile regression using a doubly penalized kernel machine with support vector machine iteratively reweighted least squares (SVM-IRWLS). For making inferences about the shape of a population distribution, the widely popular mean regression would be inadequate if the distribution is not approximately Gaussian. We present a likelihood-based approach to the estimation of the regression quantiles that uses the asymmetric Laplace density.
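
The link between quantile regression and the asymmetric Laplace density mentioned in this abstract can be written out in its standard textbook form. The notation below ($\tau$ for the quantile level, $\mu$ for location, $\sigma$ for scale, $\rho_\tau$ for the check loss, $\beta$ for regression coefficients) is assumed for illustration and is not taken from the paper.

```latex
\rho_\tau(u) = u\,\bigl(\tau - \mathbb{I}(u < 0)\bigr),
\qquad
f(y \mid \mu, \sigma, \tau)
  = \frac{\tau(1-\tau)}{\sigma}\,
    \exp\!\left\{-\rho_\tau\!\left(\frac{y-\mu}{\sigma}\right)\right\},
\qquad 0 < \tau < 1,\ \sigma > 0 .
```

Because $\rho_\tau(u/\sigma) = \rho_\tau(u)/\sigma$ for $\sigma > 0$, maximizing the likelihood $\prod_i f(y_i \mid x_i^\top\beta, \sigma, \tau)$ over $\beta$ with $\sigma$ fixed is equivalent to minimizing the usual quantile regression objective $\sum_i \rho_\tau(y_i - x_i^\top\beta)$.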

Self-Regularization Method for Image Restoration (영상 복원을 위한 자기 정규화 방법)

  • Yoo, Jae-Hung
    • The Journal of the Korea institute of electronic communication sciences / v.11 no.1 / pp.45-52 / 2016
  • This paper suggests a new method of finding the regularization parameter for image restoration problems. The Wiener filter requires prior information such as the power spectra of the original image and the noise. Constrained least squares restoration also requires knowledge of the noise level. When such prior information is not available, separate optimization functions for the Tikhonov regularization parameter, such as generalized cross-validation and the L-curve criterion, have been suggested in the literature. In this paper, a self-regularization method that connects the bias term of the augmented linear system with the smoothing term of Tikhonov regularization is introduced in the frequency domain and applied to image restoration problems. Experimental results show the effectiveness of the proposed method.
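
As background for the generalized cross-validation criterion named in this abstract, here is a minimal Python sketch of GCV for a generic Tikhonov (ridge) problem, min_x ||A x - y||^2 + lam ||x||^2. It is not the paper's self-regularization method and does not work in the frequency domain. The function name `gcv_scores`, the identity regularizer, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def gcv_scores(A, y, lambdas):
    """Return GCV(lam) = n * ||(I - H) y||^2 / trace(I - H)^2 for each lam > 0,

    where H = A (A^T A + lam I)^{-1} A^T is the Tikhonov hat matrix.
    """
    n = A.shape[0]
    # Thin SVD: H = U diag(f) U^T with filter factors f = s^2 / (s^2 + lam)
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    uty = U.T @ y
    # Squared norm of the part of y outside the column space of A (never fitted)
    outside = max(float(y @ y - uty @ uty), 0.0)
    scores = []
    for lam in lambdas:
        f = s**2 / (s**2 + lam)
        rss = np.sum(((1.0 - f) * uty) ** 2) + outside  # ||(I - H) y||^2
        trace_resid = n - np.sum(f)                     # trace(I - H)
        scores.append(n * rss / trace_resid**2)
    return np.array(scores)


if __name__ == "__main__":
    # Synthetic example (hypothetical data, for illustration only)
    rng = np.random.default_rng(0)
    A = rng.normal(size=(50, 10))
    y = A @ rng.normal(size=10) + 0.5 * rng.normal(size=50)
    lambdas = np.logspace(-4, 2, 25)
    scores = gcv_scores(A, y, lambdas)
    print("lambda with smallest GCV:", lambdas[np.argmin(scores)])
```

The regularization parameter is then taken as the candidate with the smallest GCV score. Using the SVD avoids forming the hat matrix for every candidate value.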

A study on semi-supervised kernel ridge regression estimation (준지도 커널능형회귀모형에 관한 연구)

  • Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / v.24 no.2 / pp.341-353 / 2013
  • In many practical machine learning and data mining applications, unlabeled data are inexpensive and easy to obtain. Semi-supervised learning tries to use such data to improve prediction performance. In this paper, a semi-supervised regression method, semi-supervised kernel ridge regression estimation, is proposed on the basis of the kernel ridge regression model. The proposed method does not require a pilot estimate of the labels of the unlabeled data. As a result, it offers advantages including fewer parameters, easy computation, and good generalization ability. Experiments show that the proposed method can effectively utilize unlabeled data to improve regression estimation.

A Unified Bayesian Tikhonov Regularization Method for Image Restoration (영상 복원을 위한 통합 베이즈 티코노프 정규화 방법)

  • Yoo, Jae-Hung
    • The Journal of the Korea institute of electronic communication sciences / v.11 no.11 / pp.1129-1134 / 2016
  • This paper suggests a new method of finding the regularization parameter for image restoration problems. When prior information is not available, separate optimization functions for the Tikhonov regularization parameter, such as generalized cross-validation and the L-curve criterion, have been suggested in the literature. In this paper, a unified Bayesian interpretation of Tikhonov regularization is introduced and applied to image restoration problems. The relationship between the Tikhonov regularization parameter and the Bayesian hyper-parameters is established. An update formula for the regularization parameter using both the maximum a posteriori (MAP) and evidence frameworks is suggested. Experimental results show the effectiveness of the proposed method.

Application of universal kriging for modeling a groundwater level distribution 2. Restricted maximum likelihood method (지하수위 분포 모델링을 위한 UNIVERSAL KRIGING의 응용 2. 제한적 최대 우도법)

  • 정상용
    • The Journal of Engineering Geology / v.3 no.1 / pp.51-61 / 1993
  • The restricted maximum likelihood (RML) method was used to determine the parameters of the generalized covariance, and universal kriging with RML was applied to estimate the groundwater level distribution of a nonstationary random function. Universal kriging with RML was compared with IRF-k using the weighted least squares method to assess their relative accuracy. Cross-validation shows that the two methods have nearly the same ability to estimate groundwater levels. Scattergrams of estimates versus true values and contour maps of groundwater levels also give nearly the same results. The reason the two methods produced the same results is thought to be the non-Gaussian distribution and the small number of sample data.
