• Title/Summary/Keyword: missing data

A mathematical model to recover missing monitoring data of foundation pit

  • Liu, Jiangang;Zhou, Dongdong;Liu, Kewen
    • Geomechanics and Engineering / v.9 no.3 / pp.275-286 / 2015
  • A new method, called Dynamic Mathematical Model - Parameter Interpolation, is presented to recover missing deformation data for the lateral walls of a foundation pit when monitoring is interrupted. The deformation of the lateral walls is mainly affected by the type of supporting structure and the constraint conditions. This paper therefore studies two kinds of variation law of deep horizontal displacement, for constrained and unconstrained lateral walls, and proposes two dynamic curve models, of normal distribution type and logarithmic type. The model parameters are interpolated to obtain the parameters for the missing data, and the missing monitoring data are then calculated from these parameters. Compared with the common average method for recovering missing data, this method gives results closer to the actual monitoring data in the upper 2/3 of the inclinometer tube, whereas the common average method is closer in the lower 1/3.
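
The parameter-interpolation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the three-parameter bell curve (`amplitude`, `peak_depth`, `spread`), the linear interpolation in time, and all function names are assumptions made only for illustration.

```python
import math


def interpolate_params(t, t0, p0, t1, p1):
    """Linearly interpolate curve parameters between two survey dates.

    Assumption: parameters vary linearly in time between the surveys
    at t0 and t1 that bracket the missing date t.
    """
    frac = (t - t0) / (t1 - t0)
    return tuple(a + frac * (b - a) for a, b in zip(p0, p1))


def normal_curve(depth, amplitude, peak_depth, spread):
    """Hypothetical normal-distribution-type displacement profile over depth."""
    return amplitude * math.exp(-((depth - peak_depth) ** 2) / (2.0 * spread ** 2))


# Parameters fitted on day 0 and day 10 (made-up numbers); day 5 was not surveyed.
params_day5 = interpolate_params(5, 0, (10.0, 6.0, 3.0), 10, (20.0, 8.0, 3.0))

# Reconstruct the missing displacement profile at a few depths (in metres).
profile_day5 = [normal_curve(z, *params_day5) for z in (0.0, 3.5, 7.0, 10.5)]
```

The missing day's profile is rebuilt from interpolated parameters rather than by averaging the neighbouring displacement readings directly, which is the contrast the abstract draws with the common average method.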

Method of Processing the Outliers and Missing Values of Field Data to Improve RAM Analysis Accuracy (RAM 분석 정확도 향상을 위한 야전운용 데이터의 이상값과 결측값 처리 방안)

  • Kim, In Seok;Jung, Won
    • Journal of Applied Reliability / v.17 no.3 / pp.264-271 / 2017
  • Purpose: Field operation data contain missing values or outliers arising from various causes in the data collection process, so caution is required when using RAM analysis results based on field operation data. The purpose of this study is to present a method that minimizes the RAM analysis error of field data and thereby improves accuracy. Methods: Statistical methods are presented for processing the outliers and missing values of field operation data; RAM is then analyzed, and the differences before and after applying the technique are discussed. Results: The availability is estimated to be 6.8 to 23.5% lower than before processing, and the processing of missing values and outliers is judged to greatly affect the RAM analysis result. Conclusion: RAM analysis of the OO weapon system was performed, and suggestions for improving RAM analysis were presented through a comparison of the new and current methods. Analyzing data without appropriate treatment of error values may lead to incorrect conclusions and, in turn, to inappropriate decisions and actions.

Exploiting Patterns for Handling Incomplete Coevolving EEG Time Series

  • Thi, Ngoc Anh Nguyen;Yang, Hyung-Jeong;Kim, Sun-Hee
    • International Journal of Contents / v.9 no.4 / pp.1-10 / 2013
  • The electroencephalogram (EEG) time series is a measure of electrical activity received from multiple electrodes placed on the human scalp. It provides a direct measurement for characterizing the dynamic aspects of brain activity. EEG signals form a series of spatial and temporal data with multiple dimensions. Missing data can occur because of faulty electrodes. These missing data can cause distortion and repudiation, and can further reduce the effectiveness of analysis algorithms. Current methodologies for EEG analysis require a complete EEG data matrix as input. An accurate and reliable imputation approach for missing values is therefore necessary to avoid incomplete data sets in analyses and to improve the performance of the analysis techniques. This research proposes a new method, based on a Linear Dynamical System, to automatically recover random consecutive missing data from real-world EEG data. The proposed method aims to capture the optimal patterns based on two main characteristics of coevolving EEG time series: (i) dynamics, by discovering temporally evolving behaviors, and (ii) correlations, by identifying the relationships between multiple brain signals. By exploiting these, the proposed method identifies a few hidden variables and discovers their dynamics in order to impute missing values. The proposed method offers a robust and scalable approach whose computation time is linear in the size of the sequences. A comparative study was performed to assess the effectiveness of the proposed method against interpolation and missing values via Singular Value Decomposition (MSVD). The experimental simulations demonstrate that the proposed method improves reconstruction performance by up to 49% and 67% over the MSVD and interpolation approaches, respectively.

A New Method for Imputation of Missing Genotype using Linkage Disequilibrium and Haplotype Information (결측치가 존재하는 유전형 자료에서의 연관불균형과 일배체형을 사용한 결측치 대치 방법)

  • Park Yun-Ju;Kim Young-Jin;Park Jung-Sun;Kim Kuchan;Koh Insong;Jung Ho-Youl
    • Journal of KIISE:Software and Applications / v.32 no.2 / pp.99-107 / 2005
  • In this paper, we propose new missing-value imputation methods, a linkage disequilibrium-based method and a haplotype-based method, which minimize the loss of information by estimating missing values based on the specific characteristics of Single Nucleotide Polymorphism (SNP) genotype data. A method for imputing data is needed to minimize the loss of information caused by experimental missing data. In general, missing-value imputation of biological data has used the major allele imputation method, but this approach is not optimal. It has high error rates in estimating missing values because it does not take the specific structure of genotype data into consideration. In this paper, we show the results of a comparative evaluation of our methods and the major allele imputation method for the estimation of missing values.
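
For context, the baseline that this abstract compares against can be sketched as below. This is a minimal illustration of major allele imputation under stated assumptions, not the authors' code: genotypes are two-character strings such as `"AG"`, a missing call is `None`, and a missing genotype is filled with the homozygous major allele. The function name is hypothetical.

```python
from collections import Counter


def impute_major_allele(genotypes):
    """Fill missing genotype calls at one SNP with the homozygous major allele.

    Assumptions: each observed genotype is a two-character string
    (e.g. "AG") and a missing call is represented by None.
    """
    # Count every allele that was actually observed at this SNP.
    allele_counts = Counter(a for g in genotypes if g is not None for a in g)
    if not allele_counts:
        return list(genotypes)  # nothing observed, so nothing to impute from
    major = allele_counts.most_common(1)[0][0]
    return [major * 2 if g is None else g for g in genotypes]


snp_column = ["AA", "AG", None, "AA", None, "GG"]
imputed = impute_major_allele(snp_column)
```

Because every missing call at a SNP receives the same value regardless of the neighbouring markers, the baseline ignores linkage disequilibrium and haplotype structure, which is the weakness the abstract points out.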

Imputation Procedures in Weibull Regression Analysis in the presence of missing values

  • Kim Soon-kwi;Jeong Bong-Bin
    • Proceedings of the Korean Statistical Society Conference / 2001.11a / pp.143-148 / 2001
  • A dataset with missing observations is often completed by using imputed values. In this paper, the performance and accuracy of complete case methods and four imputation procedures are evaluated when missing values exist only in the response variable of the Weibull regression model. Our simulation results show that, compared with the other imputation procedures, the hot deck and Weibull regression imputation procedures in particular compensate well for missing data. In addition, an illustrative real-data example is given.

A Comparative Study of Assessing Average Bioequivalence in $2{\times}2$ Crossover Design with Missing Observations

  • Park, Sang-Gue;Choi, Ji-Yun
    • Journal of the Korean Data and Information Science Society / v.17 no.1 / pp.245-257 / 2006
  • A modified Anderson and Hauck (1983) test for analyzing a two-sequence, two-period crossover design in bioequivalence trials is proposed for the case where some observations in the second period are missing. It is based on the maximum likelihood estimators of the average bioequivalence model and is designed to handle the missing at random (MAR) situation. The performance of the proposed test is compared with that of other tests using Monte Carlo simulations.

A comparison of imputation methods for the consecutive missing temperature data (연속적 결측이 존재하는 기온 자료에 대한 결측복원 기법의 비교)

  • Kim, Hee-Kyung;Kang, In-Kyeong;Lee, Jae-Won;Lee, Yung-Seop
    • The Korean Journal of Applied Statistics / v.29 no.3 / pp.549-557 / 2016
  • Consecutive missing values are likely to occur in long climate records because of system errors or defective equipment, and such values are difficult to impute. These problems can be overcome by imputing missing values with reference time series, which must be composed of time series similar to the one that contains the missing values. We performed a simulation to compare three imputation methods (the adjusted normal ratio method, the regression method, and the IDW method) for completing the missing values of a time series. A comparison of the three methods for daily mean temperatures at 14 climatological stations indicated that the IDW method was better than the others at south seaside stations. We also found that the regression method was better than the others at most stations (except the south seaside stations).
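
Of the three methods named in this abstract, inverse distance weighting (IDW) is the simplest to sketch. The example below is an illustration only, assuming planar station coordinates, a power parameter of 2, and reference-station readings for the same day; the function name, data layout, and numbers are hypothetical and not taken from the paper.

```python
import math


def idw_impute(target_xy, neighbours, power=2.0):
    """Estimate a missing value at target_xy from reference stations.

    neighbours: list of ((x, y), value) pairs observed on the same day.
    Each station is weighted by 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in neighbours:
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        if distance == 0.0:
            return value  # a co-located station needs no weighting
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one reference station is required")
    return weighted_sum / weight_total


# Two reference stations at 1 and 2 distance units from the target station.
estimate = idw_impute((0.0, 0.0), [((1.0, 0.0), 10.0), ((2.0, 0.0), 20.0)])
```

With a power of 2 the nearer station carries four times the weight of the farther one, so the estimate is pulled toward the closer reading.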

Association Rule Mining Algorithm and Analysis of Missing Values

  • Lee, Jae-Wan;Bobby D. Gerardo;Kim, Gui-Tae;Jeong, Jin-Seob
    • Journal of information and communication convergence engineering / v.1 no.3 / pp.150-156 / 2003
  • This paper explored the use of a data mining algorithm and a method for handling missing data, which generated enhanced association patterns in the data illustrated here. The evaluations showed that more association patterns were generated in the second analysis, which suggests more meaningful rules than in the first situation. They also showed that the model offers more precise and important association rules, which are more valuable when applied to business decision making. With the discovery of accurate association rules or business patterns, strategies could be efficiently planned and implemented to improve marketing schemes. This investigation gives rise to a number of interesting issues that could be explored further, such as the effect of outliers and missing data on detecting fraud and devious database entries.

Development of a Model Combining Covariance Matrices Derived from Spatial and Temporal Data to Estimate Missing Rainfall Data (공간 데이터와 시계열 데이터로부터 유도된 공분산행렬을 결합한 강수량 결측값 추정 모형)

  • Sung, Chan Yong
    • Journal of Environmental Science International / v.22 no.3 / pp.303-308 / 2013
  • This paper proposed a new method for estimating missing values in time series rainfall data. The proposed method integrated the two most widely used estimation methods, the general linear model (GLM) and ordinary kriging (OK), by taking a weighted average of the covariance matrices derived from each of the two methods. The proposed method was cross-validated using daily rainfall data from thirteen rain gauges in the Hyeong-san River basin. The goodness-of-fit of the proposed method was higher than those of GLM and OK, which can be attributed to the weighting algorithm, designed to minimize errors caused by violations of the assumptions of the two existing methods. This result suggests that the proposed method is more accurate in estimating missing values in time series rainfall data, especially in a region where the assumptions of the existing methods are not met, i.e., where rainfall varies by season and topography is heterogeneous.

HANDLING MISSING VALUES IN FUZZY c-MEANS

  • Miyamoto, Sadaaki;Takata, Osamu;Unayahara, Kazutaka
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1998.06a / pp.139-142 / 1998
  • Missing values in data for fuzzy c-means clustering are discussed. Two basic methods of fuzzy c-means, i.e., the standard fuzzy c-means and the entropy method, are considered, and three options for handling missing values are proposed: the first is to define a new distance between data with missing values, the second is to alter a weight in the new distance, and the third is to fill the missing values with appropriate numbers. Experimental results are shown.
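
One common way to define a distance between a data point with missing values and a cluster center is the partial distance, sketched below. The abstract does not give the authors' definition, so this is an assumption for illustration: missing coordinates are encoded as NaN and skipped, and the sum is rescaled by the ratio of total to observed dimensions. The function name is hypothetical.

```python
import math


def partial_sq_distance(point, center):
    """Squared Euclidean distance using only the observed coordinates of point.

    Missing coordinates are encoded as NaN and skipped; the sum is rescaled
    by (total dimensions / observed dimensions) so that distances stay
    comparable between complete and incomplete points.
    """
    observed = [(p, c) for p, c in zip(point, center) if not math.isnan(p)]
    if not observed:
        raise ValueError("point has no observed coordinates")
    partial_sum = sum((p - c) ** 2 for p, c in observed)
    return partial_sum * len(point) / len(observed)


incomplete = [1.0, float("nan"), 3.0]
center = [0.0, 0.0, 0.0]
distance = partial_sq_distance(incomplete, center)
```

The rescaling factor plays the role of a weight on the distance; changing that factor would correspond loosely to the second option listed in the abstract.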
