• Title/Summary/Keyword: missing value

Search Result 308, Processing Time 0.027 seconds

Measurement of missing video frames in NPP control room monitoring system using Kalman filter

  • Mrityunjay Chaubey;Lalit Kumar Singh;Manjari Gupta
    • Nuclear Engineering and Technology
    • /
    • v.55 no.1
    • /
    • pp.37-44
    • /
    • 2023
  • Using the Kalman filtering technique, we propose a novel method for estimating the missing video frames to monitor the activities inside the control room of a nuclear power plant (NPP). The purpose of this study is to reinforce the existing security and safety procedures in the control room of an NPP. The NPP control room serves as the nervous system of the plant, with instrumentation and control systems used to monitor and control critical plant parameters. Because the safety and security of the NPP control room are critical, it must be monitored closely by security cameras in order to assess and reduce the onset of any incidents and accidents that could adversely impact the safety of the NPP. However, for a variety of technical and administrative reasons, continuous monitoring may be interrupted. Because of the interruption, one or more frames of the video may be distorted or missing, making it difficult to identify the activity during this time period. This could endanger overall safety. The demonstrated Kalman filter model estimates the value of the missing frame pixel-by-pixel using information from the frame that occurred in the video sequence before it and the frame that will occur in the video sequence after it. The results of the experiment provide evidence of the effectiveness of the algorithm.

Exploiting Patterns for Handling Incomplete Coevolving EEG Time Series

  • Thi, Ngoc Anh Nguyen;Yang, Hyung-Jeong;Kim, Sun-Hee
    • International Journal of Contents
    • /
    • v.9 no.4
    • /
    • pp.1-10
    • /
    • 2013
  • The electroencephalogram (EEG) time series is a measure of electrical activity received from multiple electrodes placed on the scalp of a human brain. It provides a direct measurement for characterizing the dynamic aspects of brain activities. These EEG signals are formed from a series of spatial and temporal data with multiple dimensions. Missing data could occur due to fault electrodes. These missing data can cause distortion, repudiation, and further, reduce the effectiveness of analyzing algorithms. Current methodologies for EEG analysis require a complete set of EEG data matrix as input. Therefore, an accurate and reliable imputation approach for missing values is necessary to avoid incomplete data sets for analyses and further improve the usage of performance techniques. This research proposes a new method to automatically recover random consecutive missing data from real world EEG data based on Linear Dynamical System. The proposed method aims to capture the optimal patterns based on two main characteristics in the coevolving EEG time series: namely, (i) dynamics via discovering temporal evolving behaviors, and (ii) correlations by identifying the relationships between multiple brain signals. From these exploits, the proposed method successfully identifies a few hidden variables and discovers their dynamics to impute missing values. The proposed method offers a robust and scalable approach with linear computation time over the size of sequences. A comparative study has been performed to assess the effectiveness of the proposed method against interpolation and missing values via Singular Value Decomposition (MSVD). The experimental simulations demonstrate that the proposed method provides better reconstruction performance up to 49% and 67% improvements over MSVD and interpolation approaches, respectively.

Symptom Pattern Classification using Neural Networks in the Ubiquitous Healthcare Environment with Missing Values (손실 값을 갖는 유비쿼터스 헬스케어 환경에서 신경망을 이용한 에이전트 기반 증상 패턴 분류)

  • Salvo, Michael Angelo G.;Lee, Jae-Wan;Lee, Mal-Rey
    • Journal of Internet Computing and Services
    • /
    • v.11 no.2
    • /
    • pp.129-142
    • /
    • 2010
  • The ubiquitous healthcare environment is one of the systems that benefit from wireless sensor network. But one of the challenges with wireless sensor network is its high loss rates when transmitting data. Data from the biosensors may not reach the base stations which can result in missing values. This paper proposes the Health Monitor Agent (HMA) to gather data from the base stations, predict missing values, classify symptom patterns into medical conditions, and take appropriate action in case of emergency. This agent is applied in the Ubiquitous Healthcare Environment and uses data from the biosensors and from the patient’s medical history as symptom patterns to recognize medical conditions. In the event of missing data, the HMA uses a predictive algorithm to fill missing values in the symptom patterns before classification. Simulation results show that the predictive algorithm using the HMA makes classification of the symptom patterns more accurate than other methods.

Different penalty methods for assessing interval from first to successful insemination in Japanese Black heifers

  • Setiaji, Asep;Oikawa, Takuro
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.32 no.9
    • /
    • pp.1349-1354
    • /
    • 2019
  • Objective: The objective of this study was to determine the best approach for handling missing records of first to successful insemination (FS) in Japanese Black heifers. Methods: Of a total of 2,367 records of heifers born between 2003 and 2015 used, 206 (8.7%) of open heifers were missing. Four penalty methods based on the number of inseminations were set as follows: C1, FS average according to the number of inseminations; C2, constant number of days, 359; C3, maximum number of FS days to each insemination; and C4, average of FS at the last insemination and FS of C2. C5 was generated by adding a constant number (21 d) to the highest number of FS days in each contemporary group. The bootstrap method was used to compare among the 5 methods in terms of bias, mean squared error (MSE) and coefficient of correlation between estimated breeding value (EBV) of non-censored data and censored data. Three percentages (5%, 10%, and 15%) were investigated using the random censoring scheme. The univariate animal model was used to conduct genetic analysis. Results: Heritability of FS in non-censored data was $0.012{\pm}0.016$, slightly lower than the average estimate from the five penalty methods. C1, C2, and C3 showed lower standard errors of estimated heritability but demonstrated inconsistent results for different percentages of missing records. C4 showed moderate standard errors but more stable ones for all percentages of the missing records, whereas C5 showed the highest standard errors compared with noncensored data. The MSE in C4 heritability was $0.633{\times}10^{-4}$, $0.879{\times}10^{-4}$, $0.876{\times}10^{-4}$ and $0.866{\times}10^{-4}$ for 5%, 8.7%, 10%, and 15%, respectively, of the missing records. Thus, C4 showed the lowest and the most stable MSE of heritability; the coefficient of correlation for EBV was 0.88; 0.93 and 0.90 for heifer, sire and dam, respectively. Conclusion: C4 demonstrated the highest positive correlation with the non-censored data set and was consistent within different percentages of the missing records. We concluded that C4 was the best penalty method for missing records due to the stable value of estimated parameters and the highest coefficient of correlation.

Cluster Analysis of Incomplete Microarray Data with Fuzzy Clustering

  • Kim, Dae-Won
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.3
    • /
    • pp.397-402
    • /
    • 2007
  • In this paper, we present a method for clustering incomplete Microarray data using alternating optimization in which a prior imputation method is not required. To reduce the influence of imputation in preprocessing, we take an alternative optimization approach to find better estimates during iterative clustering process. This method improves the estimates of missing values by exploiting the cluster Information such as cluster centroids and all available non-missing values in each iteration. The clustering results of the proposed method are more significantly relevant to the biological gene annotations than those of other methods, indicating its effectiveness and potential for clustering incomplete gene expression data.

A Naive Multiple Imputation Method for Ignorable Nonresponse

  • Lee, Seung-Chun
    • Communications for Statistical Applications and Methods
    • /
    • v.11 no.2
    • /
    • pp.399-411
    • /
    • 2004
  • A common method of handling nonresponse in sample survey is to delete the cases, which may result in a substantial loss of cases. Thus in certain situation, it is of interest to create a complete set of sample values. In this case, a popular approach is to impute the missing values in the sample by the mean or the median of responders. The difficulty with this method which just replaces each missing value with a single imputed value is that inferences based on the completed dataset underestimate the precision of the inferential procedure. Various suggestions have been made to overcome the difficulty but they might not be appropriate for public-use files where the user has only limited information for about the reasons for nonresponse. In this note, a multiple imputation method is considered to create complete dataset which might be used for all possible inferential procedures without misleading or underestimating the precision.

A Research on the Energy Data Analysis using Machine Learning (머신러닝 기법을 활용한 에너지 데이터 분석에 관한 연구)

  • Kim, Dongjoo;Kwon, Seongchul;Moon, Jonghui;Sim, Gido;Bae, Moonsung
    • KEPCO Journal on Electric Power and Energy
    • /
    • v.7 no.2
    • /
    • pp.301-307
    • /
    • 2021
  • After the spread of the data collection devices such as smart meters, energy data is increasingly collected in a variety of ways, and its importance continues to grow. However, due to technical or practical limitations, errors such as missing or outliers in the data occur during data collection process. Especially in the case of customer-related data, billing problems may occur, so energy companies are conducting various research to process such data. In addition, efforts are being made to create added value from data, which makes it difficult to provide such services unless reliability of data is guaranteed. In order to solve these challenges, this research analyzes prior research related to bad data processing specifically in the energy field, and propose new missing value processing methods to improve the reliability and field utilization of energy data.

A Sparse Data Preprocessing Using Support Vector Regression (Support Vector Regression을 이용한 희소 데이터의 전처리)

  • Jun, Sung-Hae;Park, Jung-Eun;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.14 no.6
    • /
    • pp.789-792
    • /
    • 2004
  • In various fields as web mining, bioinformatics, statistical data analysis, and so forth, very diversely missing values are found. These values make training data to be sparse. Largely, the missing values are replaced by predicted values using mean and mode. We can used the advanced missing value imputation methods as conditional mean, tree method, and Markov Chain Monte Carlo algorithm. But general imputation models have the property that their predictive accuracy is decreased according to increase the ratio of missing in training data. Moreover the number of available imputations is limited by increasing missing ratio. To settle this problem, we proposed statistical learning theory to preprocess for missing values. Our statistical learning theory is the support vector regression by Vapnik. The proposed method can be applied to sparsely training data. We verified the performance of our model using the data sets from UCI machine learning repository.

A study to improve the accuracy of the naive propensity score adjusted estimator using double post-stratification method (나이브 성향점수보정 추정량의 정확성 향상을 위한 이중 사후층화 방법 연구)

  • Leesu Yeo;Key-Il Shin
    • The Korean Journal of Applied Statistics
    • /
    • v.36 no.6
    • /
    • pp.547-559
    • /
    • 2023
  • Proper handling of nonresponse in sample survey improves the accuracy of the parameter estimation. Various studies have been conducted to properly handle MAR (missing at random) nonresponse or MCAR (missing completely at random) nonresponse. When nonresponse occurs, the PSA (propensity score adjusted) estimator is commonly used as a mean estimator. The PSA estimator is known to be unbiased when known sample weights and properly estimated response probabilities are used. However, for MNAR (missing not at random) nonresponse, which is affected by the value of the study variable, since it is very difficult to obtain accurate response probabilities, bias may occur in the PSA estimator. Chung and Shin (2017, 2022) proposed a post-stratification method to improve the accuracy of mean estimation when MNAR nonresponse occurs under a non-informative sample design. In this study, we propose a double post-stratification method to improve the accuracy of the naive PSA estimator for MNAR nonresponse under an informative sample design. In addition, we perform simulation studies to confirm the superiority of the proposed method.

On the Use of Sequential Adaptive Nearest Neighbors for Missing Value Imputation (순차 적응 최근접 이웃을 활용한 결측값 대치법)

  • Park, So-Hyun;Bang, Sung-Wan;Jhun, Myoung-Shic
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.6
    • /
    • pp.1249-1257
    • /
    • 2011
  • In this paper, we propose a Sequential Adaptive Nearest Neighbor(SANN) imputation method that combines the Adaptive Nearest Neighbor(ANN) method and the Sequential k-Nearest Neighbor(SKNN) method. When choosing the nearest neighbors of missing observations, the proposed SANN method takes the local feature of the missing observations into account as well as reutilizes the imputed observations in a sequential manner. By using a Monte Carlo study and a real data example, we demonstrate the characteristics of the SANN method and its potential performance.