• Title/Summary/Keyword: Statistics data

Zooming Statistics: Inference across scales

  • Hannig, Jan;Marron, J.S.;Riedi, R.H.
    • Journal of the Korean Statistical Society / v.30 no.2 / pp.327-345 / 2001
  • New statistical methods are needed to analyze data in a multi-scale way. Some multi-scale extensions of standard methods, including novel visualization using dynamic graphics, are proposed. These tools are used to explore non-standard structure in internet traffic data.

Jackknife Estimation for Mean in Exponential Model with Grouped and Censored Data

  • Kil Ho Cho;Yong Ku Kim;Seong Kwa Jeong
    • Communications for Statistical Applications and Methods / v.5 no.3 / pp.869-878 / 1998
  • In this paper, we propose some jackknife estimators for the mean in the exponential model with grouped and censored data. We also compare the proposed jackknife estimators with other approximate estimators in terms of mean squared error and bias.

Test for the Presence of Seasonality in Time Series Models

  • Lee, Sung-Duck
    • Journal of the Korean Data and Information Science Society / v.12 no.1 / pp.71-78 / 2001
  • Three test statistics are proposed for testing the presence of seasonality in multiplicative seasonal time series models. Further, their common limiting distribution is derived under some assumptions.

The Teaching of Statistics using Excel VBA

  • Choi, Hyun-Seok
    • Journal of the Korean Data and Information Science Society / v.17 no.3 / pp.811-820 / 2006
  • We introduce a program that enhances students' interest in and understanding of statistics. The program explains various statistical concepts and procedures by showing detailed calculation steps with graphs and simulations. It is built on the readily accessible Excel VBA.

A comparison of synthetic data approaches using utility and disclosure risk measures

  • Seongbin An;Trang Doan;Juhee Lee;Jiwoo Kim;Yong Jae Kim;Yunji Kim;Changwon Yoon;Sungkyu Jung;Dongha Kim;Sunghoon Kwon;Hang J Kim;Jeongyoun Ahn;Cheolwoo Park
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.141-166 / 2023
  • This paper investigates synthetic data generation methods and their evaluation measures. Demand is increasing for releasing various types of data to the public for different purposes. At the same time, there are unavoidable concerns about leaking critical or sensitive information. Many synthetic data generation methods have been proposed over the years to address these concerns, and some have been implemented in several countries, including Korea. The current study introduces and compares three representative synthetic data generation approaches: sequential regression, nonparametric Bayesian multiple imputation, and deep generative models. Several evaluation metrics that measure the utility and disclosure risk of synthetic data are also reviewed. We provide empirical comparisons of the three approaches with respect to various evaluation measures. The findings of this work will help practitioners better understand the advantages and disadvantages of these synthetic data methods.

Generating censored data from Cox proportional hazards models

  • Kim, Ji-Hyun;Kim, Bongseong
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.761-769 / 2018
  • Simulations are important for survival analyses that deal with censored data. Cox models are widely used in survival analyses; therefore, we investigate how to generate censored data that follow the Cox model. Bender et al. (Statistics in Medicine, 24, 1713-1723, 2005) provided a parametric method for generating survival times, but to simulate censored data we need to generate censoring times as well as survival times. In addition to the parametric method for generating censored data, a nonparametric method is also proposed and applied to a real data set.
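    As a rough illustration of the parametric idea, the sketch below draws survival times by inverse-transform sampling under a Cox model with a constant (exponential) baseline hazard, then adds independent censoring times. The function name, the single binary covariate, and the exponential censoring distribution are assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def simulate_cox_exponential(n, beta, baseline_rate, censor_rate, seed=0):
    """Simulate right-censored data from a Cox model (illustrative sketch).

    Assumptions (not from the abstract): one binary covariate x,
    a constant baseline hazard `baseline_rate`, and independent
    exponential censoring with rate `censor_rate`.
    Returns a list of (observed_time, event_indicator, x) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.choice([0, 1])
        # u lies in (0, 1], so log(u) is always defined.
        u = 1.0 - rng.random()
        # Inverse-transform step: the hazard is baseline_rate * exp(beta * x).
        t = -math.log(u) / (baseline_rate * math.exp(beta * x))
        # Independent censoring time (assumed exponential).
        c = rng.expovariate(censor_rate)
        # Observe the smaller of the two; event = 1 if the failure was seen.
        rows.append((min(t, c), int(t <= c), x))
    return rows


data = simulate_cox_exponential(n=20000, beta=math.log(2.0),
                                baseline_rate=1.0, censor_rate=0.5)
event_fraction = sum(event for _, event, _ in data) / len(data)
```

    With these settings the hazard is 1 for x = 0 and 2 for x = 1, so the expected share of uncensored observations is the average of 1/1.5 and 2/2.5, about 0.73. Raising `censor_rate` increases the censoring proportion.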

Investigations into Coarsening Continuous Variables

  • Jeong, Dong-Myeong;Kim, Jay-J.
    • The Korean Journal of Applied Statistics / v.23 no.2 / pp.325-333 / 2010
  • Protection against disclosure of survey respondents' identifiable and/or sensitive information is a prerequisite for statistical agencies that release microdata files from their sample surveys. Coarsening is one of the popular methods for protecting the confidentiality of the data. Grouped data can be released in the form of microdata or tabular data. Compared with releasing the data in tabular form only, making microdata available to the public with interval codes and their representative values greatly enhances the utility of the data. It allows researchers to compute covariances between variables, build statistical models, and run a variety of statistical tests on the data. It may be conjectured that the variance of the interval data is lower than that of the ungrouped data, in the sense that the coarsened data do not have the within-interval variance. This conjecture is investigated using the uniform and triangular distributions. Traditionally, the midpoint is used to represent all the values in an interval. This approach implicitly assumes that the data are uniformly distributed within each interval. However, this assumption may not hold, especially in the last interval of economic data. In this paper, we use three distributional assumptions (uniform, Pareto, and lognormal) in the last interval and either the midpoint or the median for the other intervals, applied to wage and food costs in Statistics Korea's 2006 Household Income and Expenditure Survey (HIES) data, and compare these approaches in terms of the first two moments.
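    The variance conjecture can be checked exactly in the simplest case. The sketch below assumes a Uniform(0, 1) variable coded into k equal-width intervals by midpoint; the function names are hypothetical, and the triangular case mentioned in the abstract is not covered.

```python
from fractions import Fraction


def midpoint_coded_variance(k):
    """Exact variance of Uniform(0, 1) data coded by midpoint into k equal intervals."""
    width = Fraction(1, k)
    midpoints = [width * i + width / 2 for i in range(k)]
    # Each interval has probability 1/k under the uniform distribution.
    mean = sum(midpoints) / k
    return sum((m - mean) ** 2 for m in midpoints) / k


def within_interval_variance(k):
    """Variance of a uniform variable inside one interval of width 1/k."""
    return Fraction(1, 12 * k * k)


ungrouped_variance = Fraction(1, 12)  # variance of Uniform(0, 1)
coarsened = midpoint_coded_variance(4)
gap = ungrouped_variance - coarsened
```

    For k = 4 the coarsened variance is 5/64, below the ungrouped value of 1/12, and the gap equals the within-interval variance 1/192. This is the law of total variance applied to the uniform case.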

Statistical Properties of News Coverage Data

  • Lim, Eunju;Hahn, Kyu S.;Lim, Johan;Kim, Myungsuk;Park, Jeongyeon;Yoon, Jihee
    • Communications for Statistical Applications and Methods / v.19 no.6 / pp.771-780 / 2012
  • In the current analysis, we examine news coverage data widely used in media studies. News coverage data are usually time series that capture the volume or the tone of the news media's coverage of a topic. We first describe the distributional properties of autoregressive conditionally heteroscedastic (ARCH) effects and compare two major American newspapers' coverage of U.S.-North Korea relations. Subsequently, we propose a change point detection model and apply it to the detection of major change points in the tone of American newspaper coverage of U.S.-North Korea relations.

Testing and Adjustment for Inhomogeneity Temperature Series Using the SNHT Method

  • Lee, Yung-Seop;Kim, Hee-Kyung;Lee, Jung-In;Lee, Jae-Won;Kim, Hee-Soo
    • The Korean Journal of Applied Statistics / v.25 no.6 / pp.977-985 / 2012
  • Data quality and climate forecasting performance deteriorate when long climate data series are contaminated by non-climatic factors such as station relocation or instrument replacement. For a trusted climate forecast, it is necessary to implement data quality control and test for inhomogeneous data. Before the inhomogeneity test, a reference series was created using the $d$ index, which measures the relationship between the temperature series of the candidate station and those of surrounding stations. In this study, an inhomogeneity test was performed for each season and climatological station on the daily mean, daily minimum, and daily maximum temperatures. After comparing two inhomogeneity tests, the traditional and the adjusted SNHT methods, we found that the adjusted SNHT method was slightly superior to the traditional one.