• Title/Summary/Keyword: Missing data


Bayesian Analysis for Categorical Data with Missing Traits Under a Multivariate Threshold Animal Model (다형질 Threshold 개체모형에서 Missing 기록을 포함한 이산형 자료에 대한 Bayesian 분석)

  • Lee, Deuk-Hwan
    • Journal of Animal Science and Technology
    • /
    • v.44 no.2
    • /
    • pp.151-164
    • /
    • 2002
  • Genetic variance and covariance components of linear traits and ordered categorical traits, which are usually observed as dichotomous or polychotomous outcomes, were estimated simultaneously in a multivariate threshold animal model, using arbitrary underlying liability scales and Bayesian inference via Gibbs sampling. The multivariate threshold animal model in this study allows any combination of missing traits while assuming correlation among the traits considered. Gibbs sampling, as a form of hierarchical Bayesian inference, was used to obtain reliable point estimates, taken to be the marginal posterior means of the parameters. The main point of this study is that the underlying values sampled for the categorical traits in the previous round of iteration, together with the observations on the continuous traits, can be used to sample the underlying values for categorical data and for missing continuous data in the current cycle (see appendix). This study also showed that the underlying variables for missing categorical data should be generated taking the correlated traits into account in order to satisfy the fully conditional posterior distributions of the parameters, although some papers (Wang et al., 1997; VanTassell et al., 1998) generated only the residual effects of missing traits in the same situation. In the present study, the Gibbs samplers used for fully Bayesian inference on the unknown parameters of interest provide a methodology that accommodates any combination of linear and categorical traits with missing observations. Moreover, two kinds of constraints that guarantee identifiability of the arbitrary underlying variables are shown, while preserving the fully conditional posterior distributions of those parameters. A numerical example, based on a simulation study, is provided for a threshold animal model that includes maternal and permanent environmental effects on an ordered categorical trait with multiple categories (calving ease), a binary trait (non-return rate), and a normally distributed trait (birth weight).
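
The idea of generating liabilities for missing traits from the correlated traits can be illustrated with the standard conditional distribution of a multivariate normal. The notation below (liabilities $l$, thresholds $t$, means $\boldsymbol{\mu}$, residual covariance $\mathbf{R}$, subscripts $o$ for observed and $m$ for missing traits) is assumed for illustration and is not taken from the paper; it is a sketch of the general principle, not the author's sampler.

```latex
% Threshold model: category k of trait j is observed for animal i
% when the underlying liability falls between two thresholds.
y_{ij} = k \iff t_{j,k-1} < l_{ij} \le t_{j,k}

% Partition the liabilities of animal i into observed (o) and missing (m) traits.
\mathbf{l}_i =
\begin{pmatrix} \mathbf{l}_{i,o} \\ \mathbf{l}_{i,m} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} \boldsymbol{\mu}_{i,o} \\ \boldsymbol{\mu}_{i,m} \end{pmatrix},
\begin{pmatrix} \mathbf{R}_{oo} & \mathbf{R}_{om} \\ \mathbf{R}_{mo} & \mathbf{R}_{mm} \end{pmatrix}
\right)

% Conditional distribution used to draw the missing liabilities:
% the mean is shifted by the observed traits through R_{mo} R_{oo}^{-1}.
\mathbf{l}_{i,m} \mid \mathbf{l}_{i,o} \sim N\!\left(
\boldsymbol{\mu}_{i,m} + \mathbf{R}_{mo}\mathbf{R}_{oo}^{-1}\left(\mathbf{l}_{i,o} - \boldsymbol{\mu}_{i,o}\right),\;
\mathbf{R}_{mm} - \mathbf{R}_{mo}\mathbf{R}_{oo}^{-1}\mathbf{R}_{om}
\right)
```

When $\mathbf{R}_{mo} = \mathbf{0}$ the conditional distribution reduces to the marginal one, which corresponds to ignoring the correlated traits.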

Structural Relationships Between Fear of Missing Out, SNS-addictive Tendencies, and Depression in Colleges (대학생의 소외에 대한 두려움, SNS 중독경향성과 우울의 구조적 관계에 관한 조사연구)

  • Jnag, Cheul;Kim, In-Seob
    • Journal of The Korean Society of Integrative Medicine
    • /
    • v.10 no.3
    • /
    • pp.151-159
    • /
    • 2022
  • Purpose: The purpose of this study was to investigate the structural relationships between fear of missing out, addictive tendencies toward social network services (SNSs), and depression in college students. Methods: The subjects were college students across Gyeongnam and Busan, to whom the purpose of the study was explained and who voluntarily agreed to participate. A survey was conducted with 302 participants over 31 days from March 7, 2022, and data from 299 responses were analyzed. Results: 1. Women felt a higher fear of missing out than men. 2. Women showed greater inability to control their use of SNSs, more SNS-related disorders in daily life, and greater immersion in and tolerance of SNSs when compared to men. 3. Women were more depressed than men. 4. Positive correlations were observed between the fear of missing out and SNS-addictive tendencies, between the fear of missing out and depression, and between SNS-addictive tendencies and depression. Conclusion: A comprehensive review of these findings suggests that women had overall higher levels of fear of missing out, SNS-addictive tendencies, and depression than men. Based on this, universities should provide gender-specific educational programs around these issues; this student cohort will ultimately work in healthcare, and this kind of awareness will be essential for treating patients. Considering that the current situation poses unusual challenges due to the COVID-19 pandemic, the study's results can serve as basic data for planning educational programs in the future. Over the coming years, comprehensive and continuous education and counselling relating to the fear of missing out, SNS addiction, and depression will be urgently required.

A Certification of Linear Programming Method for Estimating Missing Precipitation Values Ungauged (미계측 결측 강수자료 보완을 위한 선형계획법의 검정)

  • Yoo, Ju-Hwan
    • Journal of Korea Water Resources Association
    • /
    • v.43 no.3
    • /
    • pp.257-264
    • /
    • 2010
  • The amount and continuity of precipitation data used in a hydrological analysis can strongly influence the reliability of the analysis. Estimating data that are missing because of, for example, a breakdown of the rainfall recorder, or extending a short rainfall record, is therefore a fundamental step. In this study, a linear programming method, treated as a data-driven approach for estimating missing rainfall data, is compared with seven other widely used methods, and its superiority is verified. The data used are annual precipitation values for 17 years at the Cheolwon station, which also has an ungauged period of 15 years, and at its five surrounding stations. Using this verified method, the ungauged precipitation values at the Cheolwon station are estimated, and the areal averages of annual precipitation for 32 years over the Han River basin are calculated.

Missing Hydrological Data Estimation using Neural Network and Real Time Data Reconciliation (신경망을 이용한 결측 수문자료 추정 및 실시간 자료 보정)

  • Oh, Jae-Woo;Park, Jin-Hyeog;Kim, Young-Kuk
    • Journal of Korea Water Resources Association
    • /
    • v.41 no.10
    • /
    • pp.1059-1065
    • /
    • 2008
  • Rainfall data are the most basic input for analyzing hydrological phenomena and can be missing for various reasons. In this research, a neural network based model that estimates missing rainfall data was developed for 12 rainfall stations in the Soyang River basin to improve on existing methods. Neural networks have been shown to be useful in many applications that deal with complicated natural phenomena, and this approach gave better results than popular offline estimation methods such as the RDS (Reciprocal Distance Squared) method and the AMM (Arithmetic Mean Method). Additionally, we propose an automated data reconciliation system that includes a neural network learning processor and is capable of real-time reconciliation, so that reliable hydrological data can be transmitted online.
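
For orientation, the two baseline methods named in this abstract can be sketched in a few lines. This is a minimal illustration of the usual textbook definitions (a plain average for AMM, weights of 1/d² for RDS), not the authors' implementation; the function names, rainfall values, and distances are hypothetical.

```python
def arithmetic_mean_estimate(neighbor_values):
    """AMM: estimate the missing value as the plain average of the neighbors."""
    return sum(neighbor_values) / len(neighbor_values)


def reciprocal_distance_squared_estimate(neighbor_values, distances):
    """RDS: weight each neighboring station by 1 / d^2, d = distance to the target."""
    weights = [1.0 / (d * d) for d in distances]
    weighted_sum = sum(w * v for w, v in zip(weights, neighbor_values))
    return weighted_sum / sum(weights)


# Hypothetical rainfall (mm) at three neighboring stations and their
# distances (km) to the station with the missing record.
rain = [10.0, 20.0, 30.0]
dist = [1.0, 2.0, 2.0]

amm = arithmetic_mean_estimate(rain)                     # 20.0
rds = reciprocal_distance_squared_estimate(rain, dist)   # 15.0
```

RDS pulls the estimate toward the nearest station (10 mm here), whereas AMM treats all neighbors equally.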

Application of Artificial Neural Networks to the prediction of out-of-plane response of infill walls subjected to shake table

  • Onat, Onur;Gul, Muhammet
    • Smart Structures and Systems
    • /
    • v.21 no.4
    • /
    • pp.521-535
    • /
    • 2018
  • The main purpose of this paper is to predict missing absolute out-of-plane displacements and failure limits of infill walls by artificial neural network (ANN) models. For this purpose, two shake table experiments are performed. These experiments are conducted on a 1:1 scale one-bay one-story reinforced concrete frame (RCF) with an infill wall. One of the experimental models is composed of unreinforced brick model (URB) enclosures with an RCF, and the other is composed of an infill wall with bed joint reinforcement (BJR) enclosures with an RCF. An artificial earthquake load is applied with four acceleration levels to the URB model and with five acceleration levels to the BJR model. After a certain acceleration level, the accelerometers are detached from the wall to prevent damage to them. The removal of these instruments results in missing data. The missing absolute maximum out-of-plane displacements are predicted with ANN models. Failure of the infill wall in the out-of-plane direction is also predicted at the 0.79 g acceleration level. An accuracy of 99% is obtained for the available data. In addition, a benchmark analysis with multiple regression is performed. This study validates that the ANN-based procedure estimates missing experimental data more accurately than multiple regression models.

Different penalty methods for assessing interval from first to successful insemination in Japanese Black heifers

  • Setiaji, Asep;Oikawa, Takuro
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.32 no.9
    • /
    • pp.1349-1354
    • /
    • 2019
  • Objective: The objective of this study was to determine the best approach for handling missing records of first to successful insemination (FS) in Japanese Black heifers. Methods: Of the 2,367 records used, from heifers born between 2003 and 2015, 206 (8.7%), belonging to open heifers, were missing. Four penalty methods based on the number of inseminations were set as follows: C1, FS average according to the number of inseminations; C2, constant number of days, 359; C3, maximum number of FS days for each insemination; and C4, average of FS at the last insemination and FS of C2. C5 was generated by adding a constant number (21 d) to the highest number of FS days in each contemporary group. The bootstrap method was used to compare the 5 methods in terms of bias, mean squared error (MSE), and coefficient of correlation between the estimated breeding values (EBV) of non-censored data and censored data. Three percentages (5%, 10%, and 15%) were investigated using the random censoring scheme. The univariate animal model was used to conduct genetic analysis. Results: Heritability of FS in non-censored data was $0.012{\pm}0.016$, slightly lower than the average estimate from the five penalty methods. C1, C2, and C3 showed lower standard errors of estimated heritability but demonstrated inconsistent results for different percentages of missing records. C4 showed moderate standard errors that were more stable across all percentages of missing records, whereas C5 showed the highest standard errors compared with non-censored data. The MSE in C4 heritability was $0.633{\times}10^{-4}$, $0.879{\times}10^{-4}$, $0.876{\times}10^{-4}$ and $0.866{\times}10^{-4}$ for 5%, 8.7%, 10%, and 15%, respectively, of the missing records. Thus, C4 showed the lowest and the most stable MSE of heritability; the coefficient of correlation for EBV was 0.88, 0.93, and 0.90 for heifer, sire, and dam, respectively.
Conclusion: C4 demonstrated the highest positive correlation with the non-censored data set and was consistent across different percentages of missing records. We concluded that C4 was the best penalty method for missing records because of the stable values of the estimated parameters and the highest coefficient of correlation.
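
As a rough illustration of how a penalty method replaces missing FS records, the sketch below implements one possible reading of C2 (a flat 359 days) and C5 (the highest observed FS in the contemporary group plus 21 days). The record layout, field names, and example values are hypothetical, and the paper's exact rules may differ, for example in how C2 relates to the number of inseminations.

```python
def penalize_c2(records, constant_days=359):
    """Assumed reading of C2: every missing FS gets the same constant."""
    return [constant_days if r["fs"] is None else r["fs"] for r in records]


def penalize_c5(records, extra_days=21):
    """Assumed reading of C5: group maximum of observed FS plus a constant.

    Assumes every contemporary group has at least one observed record.
    """
    group_max = {}
    for r in records:
        if r["fs"] is not None:
            group_max[r["group"]] = max(group_max.get(r["group"], 0), r["fs"])
    return [
        group_max[r["group"]] + extra_days if r["fs"] is None else r["fs"]
        for r in records
    ]


# Hypothetical records: FS in days, None marks an open heifer (missing FS).
records = [
    {"group": "A", "fs": 0},
    {"group": "A", "fs": 42},
    {"group": "A", "fs": None},
    {"group": "B", "fs": 21},
    {"group": "B", "fs": None},
]

c2 = penalize_c2(records)  # [0, 42, 359, 21, 359]
c5 = penalize_c5(records)  # [0, 42, 63, 21, 42]
```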

Sensitivity analysis of missing mechanisms for the 19th Korean presidential election poll survey (19대 대선 여론조사에서 무응답 메카니즘의 민감도 분석)

  • Kim, Seongyong;Kwak, Dongho
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.1
    • /
    • pp.29-40
    • /
    • 2019
  • Categorical data with non-responses are frequently observed in election poll surveys and can be represented by incomplete contingency tables. To estimate the support rates of candidates, the missing mechanism should be identified in advance, because the estimates for non-responses change depending on the assumed missing mechanism. However, it has been shown that the missing mechanism cannot be identified from observed data. To overcome this problem, sensitivity analysis has been suggested. The previously proposed sensitivity analysis is applicable only to two-way incomplete contingency tables with binary variables. It is therefore inappropriate for election poll surveys, which usually consider more than two factors, such as region, gender, and age. In this paper, a sensitivity analysis suitable for a multi-dimensional incomplete contingency table is devised and applied to the 19th Korean presidential election poll survey data. As a result, the intervals of estimates from the sensitivity analysis include the actual results as well as the estimates from various missing mechanisms. In addition, the properties of the missing mechanism that produces the estimates nearest to the actual election results are investigated.

A Fast EM Algorithm for Gaussian Mixtures

  • Jung, Hye-Kyung;Seo, Byung-Tae
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.1
    • /
    • pp.157-168
    • /
    • 2012
  • The EM algorithm is the most important tool for obtaining the maximum likelihood estimator in finite mixture models because of its stability and simplicity. However, its convergence is often slow because the conventional EM algorithm is based on a large missing data space. Several techniques have been proposed in the literature to reduce the missing data space. In this paper, we review existing methods and propose a new EM algorithm for Gaussian mixtures, which reduces the missing data space while preserving the stability of the conventional EM algorithm. The performance of the proposed method is compared with that of other existing methods via simulation studies.
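
For readers unfamiliar with the baseline, the conventional EM algorithm for a one-dimensional Gaussian mixture can be sketched as follows; the unobserved component labels play the role of the missing data. This is the standard textbook algorithm, not the faster variant proposed in the paper, and the data, starting values, and variance floor are assumptions made for the example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(data, means, variances, weights, n_iter=50):
    """Conventional EM for a 1-D Gaussian mixture with len(means) components."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor (an assumption) to avoid collapse
    return means, variances, weights


# Hypothetical data: two well-separated clusters around 0.05 and 5.05.
data = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1]
means, variances, weights = em_gaussian_mixture(
    data, means=[0.0, 4.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
```

With clusters this well separated the estimates settle near the two cluster averages within a few iterations; slow convergence, the problem the paper addresses, shows up when components overlap.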

Cluster Analysis of Incomplete Microarray Data with Fuzzy Clustering

  • Kim, Dae-Won
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.3
    • /
    • pp.397-402
    • /
    • 2007
  • In this paper, we present a method for clustering incomplete microarray data using alternating optimization, in which prior imputation is not required. To reduce the influence of imputation in preprocessing, we take an alternating optimization approach to find better estimates during the iterative clustering process. This method improves the estimates of missing values by exploiting cluster information, such as cluster centroids, and all available non-missing values in each iteration. The clustering results of the proposed method are significantly more relevant to the biological gene annotations than those of other methods, indicating its effectiveness and potential for clustering incomplete gene expression data.

Study on Weather Data Interpolation of a Buoy Based on Machine Learning Techniques (기계 학습을 이용한 항로표지 기상 자료의 보간에 관한 연구)

  • Seong-Hun Jeong;Jun-Ik Ma;Seong-Hyun Jo;Gi-Ryun Lim;Jun-Woo Lee;Jun-Hee Han
    • Proceedings of the Korean Institute of Navigation and Port Research Conference
    • /
    • 2022.06a
    • /
    • pp.72-74
    • /
    • 2022
  • Advances in hardware technology allow several types of data to be collected from buoys. However, the collected data are difficult to use because of errors, including missing values and outliers, caused by mechanical faults and the meteorological environment. Therefore, in this study, the missing time points are added and linear interpolation is performed so that machine learning can be applied to the insufficient meteorological data. After the linear interpolation, XGBoost and a KNN regressor are used to forecast the erroneous data, and the suggested model is evaluated using real-world data from a buoy.
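
The interpolation step described here can be sketched as follows, assuming the missing time points have already been inserted as `None` on an equally spaced time grid. The function name and sample values are hypothetical, and the XGBoost and KNN stages of the study are not shown.

```python
def linear_interpolate(series):
    """Fill None gaps between observed values by linear interpolation.

    Assumes equally spaced time steps. Gaps before the first or after the
    last observed value are left as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical hourly air temperatures from a buoy, with three missing readings.
temps = [10.0, None, None, 16.0, None, 18.0]
result = linear_interpolate(temps)  # [10.0, 12.0, 14.0, 16.0, 17.0, 18.0]
```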
