• Title/Summary/Keyword: Missing

Missing Pattern Analysis of the GOCI-I Optical Satellite Image Data

  • Jeon, Ho-Kun;Cho, Hong Yeon
    • Ocean and Polar Research / v.44 no.2 / pp.179-190 / 2022
  • Missing data in optical satellite images, caused by natural variations, have been a major barrier to observing the state of the sea surface. Although there have been many attempts to fill the observation gaps, little research has analyzed the ratio of missing grids to all sea grids or their seasonal patterns. This report introduces a method for quantifying the distribution of missing points and then shows that the missing points exhibit spatial correlation and seasonal trends. Temporal and spatial integration methods are compared to assess how effectively each reduces missing data; temporal integration performs better than spatial integration. Moran's I and the K-function, with statistical hypothesis testing, show that missing grids are clustered and non-randomly distributed under daily integration. A periodogram-based seasonality test of Moran's I shows dependence on full-year, half-year, and quarter-year periods. These results can be used to derive appropriate integration periods with permissible estimation errors.
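
As a rough illustration of the spatial statistic named in this abstract, the sketch below computes Moran's I on a small binary grid in which 1 marks a missing cell. The rook (4-neighbour) weights, the function name `morans_i`, and the toy grids are assumptions made for illustration; the abstract does not state which weight matrix the authors used, and no hypothesis test is included here.

```python
def morans_i(grid):
    """Moran's I for a 2-D grid, assuming rook (4-neighbour) contiguity weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[v - mean for v in row] for row in grid]
    denom = sum(d * d for row in dev for d in row)
    if denom == 0:
        raise ValueError("Moran's I is undefined for a constant grid")
    num = 0.0   # sum of w_ij * dev_i * dev_j over all ordered neighbour pairs
    w_sum = 0   # sum of all weights (each neighbour pair has weight 1)
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    num += dev[r][c] * dev[rr][cc]
                    w_sum += 1
    return (n / w_sum) * (num / denom)


# Hypothetical missing-data masks: 1 = missing cell, 0 = observed cell.
clustered = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
checkerboard = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

print(morans_i(clustered))     # positive: missing cells sit next to each other
print(morans_i(checkerboard))  # negative: missing cells alternate with observed ones
```

A positive value indicates that missing cells tend to neighbour other missing cells, which is the kind of clustering the abstract reports; judging significance would additionally need a permutation or analytical test.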

Imputation of Missing Data Based on Hot Deck Method Using K-nn (K-nn을 이용한 Hot Deck 기반의 결측치 대체)

  • Kwon, Soonchang
    • Journal of Information Technology Services / v.13 no.4 / pp.359-375 / 2014
  • Researchers cannot avoid missing data when collecting data, because some respondents, arbitrarily or non-arbitrarily, do not answer questions in studies and experiments. Missing data not only inflate and distort standard deviations but also make parameter estimation harder and reduce the reliability of research results. Although hot deck imputation is widely used, it has drawn little research interest because it handles missing data in ambiguous ways. Hot deck can be complemented by k-NN, a machine learning method that can form donor groups closest to the properties of the records with missing data. This study therefore imputes missing data with a hot deck method based on k-NN. The proposed method was compared with listwise deletion and with mean, mode, linear regression, and SVM imputation on nominal and ratio data types, and it yielded the values closest to the originals. Simulations with different numbers of neighbors and different distance measures were carried out, and k-NN achieved better performance. The study thus rediscovers hot deck imputation, which has failed to attract researchers' attention, and should help in selecting non-parametric methods that are less affected by the structure and causes of missing data.
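
The idea of combining hot deck with k-NN can be sketched as follows: for each record with a missing value, take the k complete records nearest to it as the donor pool, then copy an observed value from one donor. This is a minimal sketch, not the paper's procedure; the function name `knn_hot_deck`, the unscaled Euclidean distance, the random donor draw, and the toy records are all assumptions.

```python
import math
import random


def knn_hot_deck(records, target, features, k=3, seed=0):
    """Fill missing `target` values by copying from one of the k nearest donors."""
    rng = random.Random(seed)  # seeded so the donor draw is reproducible
    donors = [r for r in records if r[target] is not None]
    if not donors:
        raise ValueError("no complete donors available")
    completed = []
    for rec in records:
        if rec[target] is not None:
            completed.append(dict(rec))
            continue
        point = [rec[f] for f in features]
        # Donor pool: the k complete records closest in (unscaled) feature space.
        nearest = sorted(
            donors,
            key=lambda d: math.dist(point, [d[f] for f in features]),
        )[:k]
        filled = dict(rec)
        # Hot deck step: reuse an actually observed value from a donor.
        filled[target] = rng.choice(nearest)[target]
        completed.append(filled)
    return completed


# Hypothetical survey records; the last one has a missing nominal value.
records = [
    {"age": 25, "income": 30, "group": "A"},
    {"age": 27, "income": 32, "group": "A"},
    {"age": 26, "income": 31, "group": "A"},
    {"age": 60, "income": 80, "group": "B"},
    {"age": 62, "income": 85, "group": "B"},
    {"age": 26, "income": 30, "group": None},
]
result = knn_hot_deck(records, target="group", features=["age", "income"], k=3)
print(result[-1])
```

Because the imputed value is always copied from a real donor, the method works for nominal as well as ratio data; in practice the features would usually be standardized before computing distances.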

Comparison of missing data methods in clustered survival data using Bayesian adaptive B-Spline estimation

  • Yoo, Hanna;Lee, Jae Won
    • Communications for Statistical Applications and Methods / v.25 no.2 / pp.159-172 / 2018
  • In many epidemiological studies, missing values in the outcome arise from censoring. Such censoring is what distinguishes survival analysis from other analytical methods, and many methods deal with censored data. However, few studies have dealt with missing covariates in survival data, and they are rarer still when the data are clustered. In this paper, we conducted a simulation study to compare several missing data methods when the data have a clustered, multi-structured form with missing covariates. We modeled the unknown baseline hazard and frailty with a Bayesian B-spline to obtain smoother and more accurate estimates, and we used prior information to improve accuracy further. We assumed the missing mechanism to be MAR. We compared the performance of five missing data techniques through simulation studies. We also present results from a multi-center study of Korean IBD patients with Crohn's disease (Lee et al., Journal of the Korean Society of Coloproctology, 28, 188-194, 2012).

Pattern-Mixture Model of the Cox Proportional Hazards Model with Missing Binary Covariates (결측이 있는 이산형 공변량에 대한 Cox비례위험모형의 패턴-혼합 모델)

  • Youk, Tae-Mi;Song, Ju-Won
    • The Korean Journal of Applied Statistics / v.25 no.2 / pp.279-291 / 2012
  • When fitting a Cox proportional hazards model with missing covariates, it is inefficient to exclude observations with missing values from the analysis. Furthermore, if the missing-data mechanism is not Missing Completely At Random (MCAR), exclusion may lead to biased parameter estimates. Many approaches have been suggested for handling the Cox proportional hazards model when covariates are sometimes missing, but they are based on the selection model. This paper suggests an approach that handles the Cox proportional hazards model with missing covariates by using the pattern-mixture model (Little, 1993). The pattern-mixture model is expressed by the joint distribution of survival time and the missing-data mechanism. In the pattern-mixture model, many models can be considered by setting up various restrictions, and different results under various restrictions indicate the sensitivity of the model to missing covariates. A simulation study was conducted to show the sensitivity of parameter estimation under different restrictions in a pattern-mixture model. The proposed approach was also applied to mouse leukemia data.

Proposal for enhancement of managing missing cases: through analysis of newspaper articles (실종 대응체계 개선방안에 관한 연구: 언론기사분석을 중심으로)

  • Lee, Young-Lim;Lee, Kwon Cheol
    • Journal of Digital Convergence / v.18 no.4 / pp.91-100 / 2020
  • This article proposes improvements to the response system for missing person cases. Whereas related studies have examined response practice from the internal viewpoint of the police authorities, this study analyzes the concerns and criticism expressed through the mass media, outside the authorities. To this end, we analyzed newspaper articles on missing persons published during the past five years, using qualitative data analysis software. The analysis revealed that the community demands an immediate response to missing person cases, phased expertise, a systematic response, improvement of related policy, and active liaison with the community. In line with these needs, we propose advancing the risk assessment procedure, enlarging the dedicated missing person team, improving the profiling-input system, adding duties for the families of missing persons, and strengthening the control tower function.

MissingFound: An Assistant System for Finding Missing Companions via Mobile Crowdsourcing

  • Liu, Weiqing;Li, Jing;Zhou, Zhiqiang;He, Jiling
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.10 / pp.4766-4786 / 2016
  • Looking for missing companions who are out of touch in public places can be a long and painful process. With the help of mobile crowdsourcing, the missing person's location may be reported in a short time. In this paper, we propose MissingFound, an assistant system that applies mobile crowdsourcing to finding missing companions. Discovering valuable users who have a chance of seeing the missing person is the most important task of MissingFound, but it is also a major challenge given the requirements of saving battery and protecting users' location privacy. A customized metric is designed to measure the probability of seeing, based on users' movement traces represented by WiFi RSSI fingerprints. Since WiFi RSSI fingerprints provide no knowledge of users' physical locations, the computation of this probability is too complex for practical use. By parallelizing the original sequential algorithms under the MapReduce framework, the selection process can be completed within a few minutes for 10 thousand users with records of several days. Experimental evaluation with 23 volunteers shows that MissingFound can select potential witnesses in real settings and achieves high accuracy (76.75% on average). We believe that MissingFound can help not only with finding missing companions but also with other public services (e.g., controlling communicable diseases).

Verification of Improving a Clustering Algorithm for Microarray Data with Missing Values

  • Kim, Su-Young
    • The Korean Journal of Applied Statistics / v.24 no.2 / pp.315-321 / 2011
  • Gene expression microarray data often include multiple missing values. Most gene expression analyses (including gene clustering analysis), however, require a complete data matrix as input. In ordinary clustering methods, a single missing value forces one to abandon all the data for a gene even if the rest of the data for that gene is intact. The quality of the analysis may decrease seriously as the missing rate increases. On the other hand, imputing missing values may introduce artifacts that reduce the reliability of the analysis. To clarify this contradiction in microarray clustering analysis, this paper compared the accuracy of clustering with and without imputation over several microarray data sets with different missing rates. This paper also tested the clustering efficiency of several imputation methods, including our proposed algorithm. The results showed that, for imperfect microarray data, it is worthwhile to check the clustering result in this alternative way, without any imputed data.

Comparison of binary data imputation methods in clinical trials (임상시험에서 이분형 결측치 처리방법의 비교연구)

  • An, Koosung;Kim, Dongjae
    • The Korean Journal of Applied Statistics / v.29 no.3 / pp.539-547 / 2016
  • We discuss how to handle missing binary data in clinical trials. We describe the patterns in which missing data occur and introduce imputation methods for missing binary data, including a modified method. A simulation is performed for each method by modifying actual data, with the simulation conditions controlled by the response rate and the missing value rate. We list the simulation results for each method and discuss them at the end of the paper.

On statistical Computing via EM Algorithm in Logistic Linear Models Involving Non-ignorable Missing data

  • Jun, Yu-Na;Qian, Guoqi;Park, Jeong-Soo
    • Proceedings of the Korean Statistical Society Conference / 2005.11a / pp.181-186 / 2005
  • Data sets obtained from surveys or medical trials often include missing observations. When these data sets are analyzed, it is common to use only the complete cases. However, doing so can introduce large biases or inefficiency. In this paper, we consider a method for estimating parameters in logistic linear models involving a non-ignorable missing data mechanism. A binomial response and normal exploratory model for the missing data are used. We fit the model using the EM algorithm. The E-step uses the Metropolis-Hastings algorithm to generate a sample for the missing data together with a Monte Carlo technique, and the M-step uses Newton-Raphson to maximize the likelihood function. Asymptotic variances of the MLEs are derived, and the standard errors and parameter estimates are compared.

Comparison of Shape Variability in Principal Component Biplot with Missing Values

  • Shin, Sang-Min;Choi, Yong-Seok;Lee, Nae-Young
    • The Korean Journal of Applied Statistics / v.21 no.6 / pp.1109-1116 / 2008
  • Biplots are the multivariate analogue of scatter plots. They are useful for giving a graphical description of the data matrix, for detecting patterns, and for displaying results found by more formal methods of analysis. However, when some values in the data matrix are missing, most biplots are not directly applicable. In particular, we are interested in the shape variability of the principal component biplot, the most popular type of biplot, when values are missing. To study this, we estimate the missing data using the EM algorithm and mean imputation at various missing rates. Even after the missing values are estimated, the biplots of the incomplete data take different shapes depending on the imputation method and the missing rate. We therefore propose an RMS (root mean square) measure for quantifying and comparing the shape variability between the original biplots and the estimated biplots.