• Title/Summary/Keyword: Rank regression

The Representation of Cancer Risk by Korean Health Journalism: Comparing the Crude Rates of 10 Cancers to the Amount of Cancer News in the Three Major Newspapers(1990-2010) (10대암 조발생률과 신문 보도량의 비교: 3대 일간지 보도(1990년~2010년)를 중심으로)

  • Ju, Youngkee;Jeong, Da-Eun;You, Myoungsoon
    • Korean Journal of Health Education and Promotion / v.30 no.5 / pp.201-210 / 2013
  • Objectives: The public relies on the news media to understand health risks. To examine the surveillance function of Korean health journalism, this study compared the rank order of the 10 most frequently diagnosed cancers with that of the 10 cancers most frequently covered by three major Korean newspapers. Methods: News stories published between 1999 and 2010 by the Chosun-Ilbo, Joong-Ang-Ilbo, and Dong-A-Ilbo were examined. Data on cancer incidence were taken from epidemiological data published by a governmental public health institution. To compare the crude rates with the amount of news coverage, rank-order correlation tests and regression analyses were employed. Results: The rank-order correlation coefficient decreased despite an increase in the overall number of cancer news stories. The correlation ceased to be significant after 2006. The largest differences in rank order between the crude rate and the amount of news coverage were observed for cancers of the breast, uterus, thyroid, and gallbladder/biliary tract. Finally, the three newspapers did not track the changes in stomach, lung, liver, and uterine cervix cancer: the crude-rate ranks of these four cancers were falling, signifying a reduction in their comparative danger. Conclusions: The news media's customization of news content and the negative bias in journalism are suggested as possible influences on the news media's inaccurate representation of cancer risk.
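The rank-order correlation test mentioned in this abstract can be illustrated with a minimal sketch of Spearman's coefficient. The figures below are invented for illustration and are not the study's data; the function names are hypothetical, and the abstract does not say which rank correlation statistic the authors used.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical crude rates and news-story counts for five cancers.
crude_rates = [50, 40, 30, 20, 10]
news_counts = [5, 3, 9, 1, 2]
rho = spearman_rho(crude_rates, news_counts)
print(round(rho, 3))
```

A coefficient near 1 would mean that coverage follows incidence closely; a falling coefficient over time corresponds to the weakening correspondence the abstract reports.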

An improvement of estimators for the multinormal mean vector with the known norm

  • Kim, Jaehyun;Baek, Hoh Yoo
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.435-442 / 2017
  • Consider the problem of estimating a $p \times 1$ mean vector $\theta$ ($p \geq 3$) under quadratic loss for a multivariate normal population. We find a James-Stein type estimator that shrinks towards the projection vectors when the underlying distribution is a variance mixture of normals. In this case, the norm $\lVert \theta - K\theta \rVert$ is known, where $K$ is a projection matrix with $\mathrm{rank}(K) = q$. The class of estimators of this type is general enough to include the class of estimators proposed by Marchand and Giri (1993). We derive this class and obtain the optimal estimator of this type. This research can also be applied to simple and multiple regression models in the case of $\mathrm{rank}(K) \geq 2$.
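To make the idea of shrinking towards a projection concrete, here is a minimal sketch of the classical James-Stein estimator that shrinks an observation $X$ towards $KX$. It assumes $X \sim N_p(\theta, I)$ with identity covariance and an unknown norm, so it is not the known-norm estimator derived in the paper; the function names and the example vector are hypothetical.

```python
def project(K, x):
    """Apply the projection matrix K (list of rows) to the vector x."""
    return [sum(k_ij * x_j for k_ij, x_j in zip(row, x)) for row in K]


def james_stein_toward_subspace(x, K, q):
    """Classical estimator KX + (1 - (p - q - 2) / ||X - KX||^2) (X - KX).

    Assumes X ~ N_p(theta, I) and that K is a projection matrix of rank q.
    """
    p = len(x)
    c = p - q - 2
    if c <= 0:
        raise ValueError("shrinkage requires p - q >= 3")
    kx = project(K, x)
    resid = [xi - ki for xi, ki in zip(x, kx)]
    norm2 = sum(r * r for r in resid)
    if norm2 == 0.0:
        return kx  # X already lies in the target subspace
    factor = 1.0 - c / norm2
    return [ki + factor * ri for ki, ri in zip(kx, resid)]


# Illustrative case: K projects onto the span of the all-ones vector (q = 1),
# so the estimator shrinks each coordinate towards the common mean.
p = 5
K = [[1.0 / p] * p for _ in range(p)]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
estimate = james_stein_toward_subspace(x, K, q=1)
print([round(v, 3) for v in estimate])
```

Here the residual $X - KX$ has squared norm 10 and the constant $p - q - 2$ equals 2, so the residual is scaled by 0.8 before being added back to the projection.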

Marginal Likelihoods for Bayesian Poisson Regression Models

  • Kim, Hyun-Joong;Balgobin Nandram;Kim, Seong-Jun;Choi, Il-Su;Ahn, Yun-Kee;Kim, Chul-Eung
    • Communications for Statistical Applications and Methods / v.11 no.2 / pp.381-397 / 2004
  • The marginal likelihood has become an important tool for model selection in Bayesian analysis because it can be used to rank the models. We discuss the marginal likelihood for Poisson regression models that are potentially useful in small area estimation. Computation in these models is intensive and it requires an implementation of Markov chain Monte Carlo (MCMC) methods. Using importance sampling and multivariate density estimation, we demonstrate a computation of the marginal likelihood through an output analysis from an MCMC sampler.
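As a rough illustration of estimating a marginal likelihood by importance sampling, the sketch below uses a single-rate Poisson model with a conjugate gamma prior, for which the exact answer is available as a check. This is a simplification: the paper builds its importance function from MCMC output with multivariate density estimation, whereas the proposal here is a fixed gamma density. The data, the prior parameters, and the function names are all assumptions.

```python
import math
import random


def log_gamma_pdf(x, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)


def log_poisson_lik(lam, ys):
    """Log likelihood of i.i.d. Poisson(lam) counts ys."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in ys)


def marginal_likelihood_is(ys, a, b, n_draws=20000, seed=0):
    """Importance-sampling estimate of m(y) = E_q[likelihood * prior / q]."""
    rng = random.Random(seed)
    n, s = len(ys), sum(ys)
    # Proposal q: same mean as the posterior Gamma(a + s, b + n), wider tails.
    q_shape, q_rate = (a + s) / 2, (b + n) / 2
    total = 0.0
    for _ in range(n_draws):
        lam = rng.gammavariate(q_shape, 1.0 / q_rate)  # second argument is a scale
        log_w = (log_poisson_lik(lam, ys) + log_gamma_pdf(lam, a, b)
                 - log_gamma_pdf(lam, q_shape, q_rate))
        total += math.exp(log_w)
    return total / n_draws


def marginal_likelihood_exact(ys, a, b):
    """Closed form for the conjugate gamma-Poisson model, used as a check."""
    n, s = len(ys), sum(ys)
    log_m = (a * math.log(b) - math.lgamma(a) + math.lgamma(a + s)
             - (a + s) * math.log(b + n)
             - sum(math.lgamma(y + 1) for y in ys))
    return math.exp(log_m)


counts = [2, 3, 1, 4]  # made-up data
estimate = marginal_likelihood_is(counts, a=2.0, b=1.0)
exact = marginal_likelihood_exact(counts, a=2.0, b=1.0)
print(estimate, exact)
```

Models can then be ranked by comparing such estimates, which is the use of the marginal likelihood the abstract describes.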

Unsupervised feature selection using orthogonal decomposition and low-rank approximation

  • Lim, Hyunki
    • Journal of the Korea Society of Computer and Information / v.27 no.5 / pp.77-84 / 2022
  • In this paper, we propose a novel unsupervised feature selection method. Conventional unsupervised feature selection methods define a virtual label and use a regression analysis that projects the given data onto this label. However, since virtual labels are generated from the data, they can be formed similarly in the space. Thus, in the conventional methods, features can be selected only in a restricted space. To solve this problem, features are selected using orthogonal projections and low-rank approximations: a virtual label is projected onto an orthogonal space, and the given data set is projected onto the same space. Through this process, effective features can be selected. In addition, the projection matrix is constrained to be low-rank so that more effective features can be selected in a low-dimensional space. To achieve these objectives, a cost function is designed and an efficient optimization method is proposed. Experimental results on six data sets demonstrate that the proposed method outperforms existing unsupervised feature selection methods in most cases.

A Comparison of Construction Cost Estimation Using Multiple Regression Analysis and Neural Network in Elementary School Project

  • Cho, Hong-Gyu;Kim, Kyong-Gon;Kim, Jang-Young;Kim, Gwang-Hee
    • Journal of the Korea Institute of Building Construction / v.13 no.1 / pp.66-74 / 2013
  • In the early stages of a construction project, the most important thing is to predict construction costs in a rational way. For this reason, many studies have estimated the construction costs of apartment housing and office buildings at an early stage using artificial intelligence, statistics, and similar methods. In this study, cost data held by a provincial Office of Education on elementary schools constructed from 2004 to 2007 were used to compare a multiple regression model with an artificial neural network model. A total of 96 historical cases were split into 76 cases for constructing the models and 20 cases for comparing the constructed regression model with the artificial neural network model. The analysis of predicted construction costs showed that the error rate of the artificial neural network model was lower than that of the multiple regression model.

Development of Evaluation Model for Black Spot Improvement Priorities by using Empirical Bayes Method (EB기법을 이용한 사고잦은 곳 개선사업 우선순위 판정기법 개발)

  • Jeong, Seong-Bong;Hwang, Bo-Hui;Seong, Nak-Mun;Lee, Seon-Ha
    • Journal of Korean Society of Transportation / v.27 no.3 / pp.81-90 / 2009
  • The safety management of a road network comprises four basic interrelated components: identification of sites (black spots) requiring safety investigation, diagnosis of safety problems, selection of feasible treatments for potential treatment candidates, and prioritization of treatments given limited budgets (Persaud, 2001). The process of identifying black spots is very important for the efficient investigation of sites. In this study, an accident prediction model for the EB method was developed using accident data and geometric conditions of black spots selected from four-leg signalized intersections in Incheon City over three years (2004-2006). In addition, by comparing the ranking obtained with the EB method to that obtained from accident counts, we show the problems of the existing method and the need to develop a rational prediction model. In terms of the total number of accidents, the counts predicted by both the existing non-linear regression model and the EB method show high goodness of fit, but the EB method, which considers both the accident counts by site and the total number of accidents, fits better than the non-linear Poisson model. Comparing the treatment ranks assigned by the two methods, the rank of most sites does not change, but the SeoHae intersection and a few other intersections change rank significantly. This shows that the technique proposed in the study can overcome the regression-to-the-mean (RTM) problem caused by using observed accident counts.
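The way an EB estimate can reorder sites relative to raw accident counts can be sketched as follows. The sketch assumes the common textbook form in which counts follow a negative binomial model with variance $\mu + \mu^2/\phi$, giving a weight $w = \phi / (\phi + \mu)$ on the model prediction; the abstract does not state the paper's exact formulation, and the site names, counts, predictions, and dispersion value below are invented.

```python
def eb_estimate(observed, predicted, dispersion):
    """Weighted average of the model prediction and the observed count.

    Assumes a negative binomial model with variance mu + mu**2 / dispersion,
    so the weight on the prediction is dispersion / (dispersion + mu).
    """
    weight = dispersion / (dispersion + predicted)
    return weight * predicted + (1.0 - weight) * observed


# Hypothetical sites: (observed accidents, accidents predicted by the model).
sites = {
    "site_A": (12, 4.0),
    "site_B": (10, 9.0),
    "site_C": (3, 5.0),
}
PHI = 2.0  # assumed dispersion parameter

eb = {name: eb_estimate(obs, mu, PHI) for name, (obs, mu) in sites.items()}

rank_by_count = sorted(sites, key=lambda s: -sites[s][0])
rank_by_eb = sorted(eb, key=lambda s: -eb[s])

print("by observed count:", rank_by_count)
print("by EB estimate:   ", rank_by_eb)
```

In this toy example site_A has the highest raw count, but much of that count is unexplained by its predicted level, so it is pulled further towards the prediction and drops below site_B. This is the kind of rank change the abstract attributes to correcting for regression to the mean.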

Relationship of earnings and credit rating before and after IFRS (IFRS 전후 이익조정과 신용평가등급의 관계)

  • An, Kyung-Su;Kim, Kwang-Yong
    • Journal of Digital Convergence / v.12 no.11 / pp.99-112 / 2014
  • This study examines the effect of real earnings management on credit ratings (RANK) and, in more detail, its effect on credit rating changes (decreases and increases). To test the hypotheses, a sample of 2,583 firm-years was analyzed, covering firms listed on the Korea Stock Exchange over the six years from 2008 to 2013. In the regression of the credit rating (RANK) on the measures of real earnings management, ACFO and ADE were positively (+) and AMC negatively (-) related to the credit rating; after IFRS adoption, ADE remained positively (+) and AMC negatively (-) related. When the credit rating (RANK) increased, ACFO showed a positive (+) relation significant at the 1% level, ADE a positive (+) but insignificant relation, and AMC a relation significant at the 10% level. When the credit rating fell, ACFO showed a negative (-) relation and AMC a positive relation, suggesting that firms whose ratings fall engage in real earnings management to reduce their cost of capital, even at the expense of future cash flows.

Efficiency Benchmarking of Hospitals Using DEA (DEA를 이용한 의료기관의 효율성 벤치마킹)

  • Seo, Su-Kyong;Kwon, Soon-Man
    • Korea Journal of Hospital Management / v.5 no.1 / pp.84-104 / 2000
  • This paper analyzes the technical efficiency of thirty-two hospitals in Korea using data envelopment analysis (DEA). DEA provides an efficiency measure for each hospital relative to the most efficient one. The amount and sources of inefficiency identified by DEA are useful for benchmarking to improve efficiency. The results of multiple regression analysis and the Wilcoxon rank sum test show that bed turnover, hospital size, and average length of stay are related to hospital efficiency.

Regression Trees with Unbiased Variable Selection (변수선택 편향이 없는 회귀나무를 만들기 위한 알고리즘)

  • 김진흠;김민호
    • The Korean Journal of Applied Statistics / v.17 no.3 / pp.459-473 / 2004
  • It is well known that the exhaustive search algorithm suggested by Breiman et al. (1984) tends to select variables with relatively many possible splits as the splitting variable. We propose an algorithm to overcome this variable selection bias and then construct unbiased regression trees based on it. The proposed algorithm runs in two steps: selecting a split variable, and then determining a binary split rule based on that variable. Simulation studies were performed to compare the proposed algorithm with the CART (Classification and Regression Trees) of Breiman et al. (1984) in terms of the degree of variable selection bias, variable selection power, and MSE (mean squared error). We also illustrate the proposed algorithm with real data sets.

A cautionary note on the use of Cook's distance

  • Kim, Myung Geun
    • Communications for Statistical Applications and Methods / v.24 no.3 / pp.317-324 / 2017
  • An influence measure known as Cook's distance has been used for judging the influence of each observation on the least squares estimate of the parameter vector. The distance does not reflect the distributional property of the change in the least squares estimator of the regression coefficients due to case deletions: the distribution has a covariance matrix of rank one and thus it has a support set determined by a line in the multidimensional Euclidean space. As a result, the use of Cook's distance may fail to correctly provide information about influential observations, and we study some reasons for the failure. Three illustrative examples will be provided, in which the use of Cook's distance fails to give the right information about influential observations or it provides the right information about the most influential observation. We will seek some reasons for the wrong or right provision of information.
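For readers unfamiliar with the measure under discussion, here is a minimal sketch of Cook's distance for simple linear regression, using the standard form $D_i = \frac{e_i^2}{p\,s^2} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}$ with $p = 2$ coefficients. The data are invented and the function name is hypothetical; the sketch only computes the distance and does not reproduce the paper's examples or its criticism.

```python
def cooks_distances(x, y):
    """Cook's distance for each case in a least squares fit of y on x.

    Fits y = intercept + slope * x, then combines each residual e_i with
    its leverage h_ii. Requires at least three points and non-constant x.
    """
    n = len(x)
    p = 2  # number of regression coefficients (intercept and slope)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance estimate
    leverage = [1.0 / n + (a - mx) ** 2 / sxx for a in x]
    return [e * e / (p * s2) * h / (1.0 - h) ** 2
            for e, h in zip(resid, leverage)]


# Made-up data: the last point has high leverage and lies off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 4.0]
distances = cooks_distances(x, y)
most_influential = max(range(len(distances)), key=lambda i: distances[i])
print(most_influential, [round(d, 3) for d in distances])
```

In this example the distance flags the last case. The abstract's point is that such a ranking is not always reliable, because the distance does not reflect the distribution of the change in the estimator under case deletion.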