• Title/Summary/Keyword: missing data estimation method

Search Results: 88

Proposal of a Time-dependent Method for Determining the Forming Limit of Sheet Metal (판재의 성형한계 결정을 위한 시간의존적 방법의 제안)

  • Kim, S.G.; Kim, H.J.
    • Transactions of Materials Processing / v.27 no.2 / pp.115-122 / 2018
  • Most domestic and international standards on the forming limit diagram (FLD), including ISO 12004-2, use a 'position-dependent method,' which determines the forming limit from a strain distribution measured on the specimen after necking or fracture. However, the position-dependent method has inherent problems, such as asymmetry of the strain distribution, the need to estimate missing data near the fracture, the choice of test termination time, and deformation caused by the new stress equilibrium after fracture, which are blamed for sometimes causing significant lab-to-lab variation. The 'time-dependent method,' anticipated to become a new international standard for evaluating the forming limit, is expected to greatly mitigate these intrinsic disadvantages, because it identifies the forming limit just before the onset of necking from strain data measured continuously at short time intervals. In this study, we propose a new time-dependent method based on a Gaussian fit of the strain acceleration, with the introduction of a 'normalized correlation coefficient.' We show that this method determines the forming limit very stably and yields a higher value than previously studied position-dependent and time-dependent methods.
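The central idea of a time-dependent method, fitting a Gaussian to the strain-acceleration history and reading the necking onset from the fitted peak, can be sketched on synthetic data. This is a minimal illustration, not the authors' implementation: the sampling interval, noise level, and the use of `scipy.optimize.curve_fit` are all assumptions made here.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(t, a, t0, sigma, c):
    return a * np.exp(-((t - t0) ** 2) / (2 * sigma ** 2)) + c

# synthetic major-strain history, sampled at a short fixed interval as in
# continuous optical strain measurement
dt = 0.01
t = np.arange(0, 5, dt)
true_onset = 4.2                       # "true" necking onset time (assumed)
accel_true = gaussian(t, 8.0, true_onset, 0.08, 0.02)
rng = np.random.default_rng(0)
strain = np.cumsum(np.cumsum(accel_true) * dt) * dt   # integrate twice
strain += rng.normal(0, 1e-5, t.size)                 # measurement noise

# strain acceleration via repeated central differences
accel = np.gradient(np.gradient(strain, dt), dt)

# fit a Gaussian to the acceleration history; its center estimates the
# necking onset, and the strain there is taken as the forming limit
p0 = [accel.max(), t[np.argmax(accel)], 0.1, 0.0]
popt, _ = curve_fit(gaussian, t, accel, p0=p0)
onset = popt[1]
limit_strain = np.interp(onset, t, strain)
```

Because the fit uses the whole acceleration curve rather than a single noisy sample, the onset estimate is stable even when the pointwise second derivative is noisy.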

Effective Elimination of False Alarms by Variable Section Size in CFAR Algorithm (CFAR 적용시 섹션 크기 가변화를 이용한 오표적의 효율적 제거)

  • Roh, Ji-Eun; Choi, Beyung-Gwan; Lee, Hee-Young
    • Journal of the Korea Institute of Military Science and Technology / v.14 no.1 / pp.100-105 / 2011
  • Generally, because the signals received from a radar are very bulky, the data are divided into manageable blocks called sections, and the sections are distributed across several digital signal processors, where target detection algorithms are applied in parallel. The CFAR (Constant False Alarm Rate) algorithm, the most widely used target detection algorithm, can estimate accurate threshold values for deciding whether signals are targets or noise within the center of the section allocated to each processor. However, its estimation precision degrades at the section edges, where there are insufficient surrounding data to serve as references. This edge problem becomes serious when many sections must be processed, because it produces false alarms at almost every section edge. This paper describes the false alarm issues of MCA (Minimum Cell Average)-CFAR and proposes a false alarm elimination method that varies the section size. Real data received from a multi-function radar were used to evaluate the proposed method, and we show that it drastically decreases false alarms without missing real targets, improving detection performance.
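For readers unfamiliar with CFAR, a minimal cell-averaging variant (simpler than the MCA-CFAR of the paper) illustrates both the threshold estimation from reference cells and why edge cells are problematic; all window sizes and the scale factor are illustrative assumptions.

```python
import numpy as np

def ca_cfar(x, n_train=8, n_guard=2, alpha=8.0):
    """Cell-averaging CFAR: flag cells exceeding alpha times the local
    noise level estimated from training cells on both sides of the cell
    under test (CUT), with guard cells excluded."""
    n = x.size
    hits = np.zeros(n, dtype=bool)
    for i in range(n):
        lo = max(0, i - n_guard - n_train)
        hi = min(n, i + n_guard + n_train + 1)
        left = x[lo:max(0, i - n_guard)]
        right = x[min(n, i + n_guard + 1):hi]
        train = np.r_[left, right]
        if train.size < 2 * n_train:
            # section edge: the reference window is truncated, the regime
            # in which CFAR false alarms arise; here we simply skip it
            continue
        hits[i] = x[i] > alpha * train.mean()
    return hits

rng = np.random.default_rng(1)
noise = rng.exponential(1.0, 200)   # square-law detected noise samples
noise[60] += 40.0                   # injected target
detections = ca_cfar(noise)
```

In a real system the skipped edge cells still must be decided, which is where varying the section size (so each processor sees enough reference data past its nominal boundary) comes in.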

Additive hazards models for interval-censored semi-competing risks data with missing intermediate events (결측되었거나 구간중도절단된 중간사건을 가진 준경쟁적위험 자료에 대한 가산위험모형)

  • Kim, Jayoun; Kim, Jinheum
    • The Korean Journal of Applied Statistics / v.30 no.4 / pp.539-553 / 2017
  • We propose a multi-state model to analyze semi-competing risks data with interval-censored or missing intermediate events. The model extends the three-state illness-death model (healthy, diseased, and dead), with the 'diseased' state regarded as the intermediate event. Two further states are added to incorporate missing events caused by loss of follow-up before the end of the study: a lost-to-follow-up (LTF) state, and an unobservable state representing an intermediate event experienced after LTF. Given covariates, we employ the Lin and Ying additive hazards model with log-normal frailty and construct a conditional likelihood to estimate the transition intensities between states. The full likelihood is marginalized using adaptive importance sampling, and the optimal regression parameters are obtained through an iterative quasi-Newton algorithm. Simulation studies investigate the finite-sample performance of the proposed estimation method in terms of the empirical coverage probability of the true regression parameters. The method is also illustrated with a dataset adapted from Helmer et al. (2001).
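The illness-death backbone of this model is easy to simulate. The sketch below uses Lin-Ying additive transition intensities with constant baselines, one binary covariate, and no frailty or censoring; all of these simplifications, and every rate value, are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)

def additive_rate(lam0, beta, z):
    """Lin-Ying additive hazard lambda(t|Z) = lambda0 + beta*Z,
    with a constant baseline for this sketch."""
    return lam0 + beta * z

# illness-death states: 0 healthy -> 1 diseased (intermediate) -> 2 dead,
# with a direct 0 -> 2 transition competing against 0 -> 1
n = 20000
z = rng.binomial(1, 0.5, n)               # one binary covariate
r01 = additive_rate(0.10, 0.05, z)        # healthy  -> diseased
r02 = additive_rate(0.02, 0.01, z)        # healthy  -> dead
r12 = additive_rate(0.15, 0.05, z)        # diseased -> dead

# with constant intensities, sojourn times are exponential
t01 = rng.exponential(1 / r01)
t02 = rng.exponential(1 / r02)
ill = t01 < t02                           # which transition fires first
t_disease = np.where(ill, t01, np.inf)
t_death = np.where(ill, t01 + rng.exponential(1 / r12), t02)

frac_ill = ill.mean()   # P(subject passes through the intermediate state)
```

With these rates, the probability of passing through the diseased state is r01/(r01+r02) = 5/6 for both covariate groups, which the simulation reproduces. The paper's contribution lies in the harder inverse problem: recovering the intensities when the 0 -> 1 transition is interval-censored or lost after LTF.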

Model selection method for categorical data with non-response (무응답을 가지고 있는 범주형 자료에 대한 모형 선택 방법)

  • Yoon, Yong-Hwa; Choi, Bo-Seung
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.627-641 / 2012
  • We consider model estimation and model selection methods for multi-way contingency table data with non-response or missing values. We adopt a hierarchical Bayesian model to handle the boundary solution problem that can arise in maximum likelihood estimation under a non-ignorable non-response model, and we address model selection to find the best model for the data, using Bayes factors under the Bayesian approach. We applied the proposed method to a pre-election survey for the 2004 Korean National Assembly race. As a result, the non-ignorable non-response model was favored, and voting intention was the most suitable variable.
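A Bayes factor compares two models by the ratio of their marginal likelihoods. The toy below, a beta-binomial against a fixed-probability model for a single binary response, is far simpler than the paper's hierarchical non-response model, but shows the mechanics; the prior choices and data values are assumptions.

```python
from math import comb, lgamma, exp, log

def log_beta(a, b):
    # log of the Beta function via log-gamma
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal_beta_binom(y, n, a, b):
    """log m(y) for y ~ Binomial(n, p), p ~ Beta(a, b):
    log C(n,y) + log B(y+a, n-y+b) - log B(a,b)."""
    return log(comb(n, y)) + log_beta(y + a, n - y + b) - log_beta(a, b)

# two candidate models for 72 'yes' answers out of 100 respondents
y, n = 72, 100
log_m1 = log_marginal_beta_binom(y, n, 1, 1)                   # M1: p ~ Beta(1,1)
log_m2 = log(comb(n, y)) + y * log(0.5) + (n - y) * log(0.5)   # M2: p = 0.5
bayes_factor = exp(log_m1 - log_m2)     # BF > 1 favours M1
```

Under the uniform prior, the marginal likelihood has the closed form 1/(n+1), which makes the computation easy to check; in the paper's setting the marginals are not available in closed form and must be computed under the hierarchical model.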

An estimation method for non-response model using Monte-Carlo expectation-maximization algorithm (Monte-Carlo expectation-maximaization 방법을 이용한 무응답 모형 추정방법)

  • Choi, Boseung; You, Hyeon Sang; Yoon, Yong Hwa
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.587-598 / 2016
  • Non-response is one of the major issues in predicting an election outcome ahead of the election. A variety of non-response imputation methods may be employed, but the forecasting results tend to vary with the method chosen. In this study, to improve electoral forecasts, we examine a model-based non-response imputation method that applies the Monte Carlo Expectation Maximization (MCEM) algorithm introduced by Wei and Tanner (1990). The MCEM algorithm, based on maximum likelihood estimates (MLEs), is applied to solve the boundary solution problem under a non-ignorable non-response mechanism. We performed simulation studies to compare the estimation performance of the MCEM, maximum likelihood, and Bayesian estimation methods. The results show that MCEM is a reasonable candidate for non-response model estimation. We also applied the MCEM method to the exit poll data of the 2012 Korean presidential election and investigated prediction performance using the modified within-precinct error (MWPE) criterion (Bautista et al., 2007).
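The MCEM idea, replacing the analytic E-step of EM with a Monte Carlo average over draws of the missing data, can be shown on a toy problem. Here the model is a normal mean with values above a cutoff never recorded; this is a stand-in chosen because its MLE is known to satisfy the EM fixed point, not the categorical non-response model of the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# data from N(mu, 1); values above the cutoff are never recorded,
# but the number of missing observations is known
mu_true, cutoff = 2.0, 3.0
full = rng.normal(mu_true, 1.0, 5000)
obs = full[full <= cutoff]
n_missing = (full > cutoff).sum()

mu = obs.mean()                            # starting value
for _ in range(50):
    # Monte Carlo E-step: draw missing values from their conditional
    # distribution given mu (normal truncated to (cutoff, inf)),
    # implemented here by simple rejection
    draws = rng.normal(mu, 1.0, (n_missing, 200))
    draws = np.where(draws > cutoff, draws, np.nan)
    e_missing = np.nanmean(draws)          # MC estimate of E[y | y > cutoff]
    # M-step: closed-form update of mu given observed data and the
    # expected values of the missing data
    mu = (obs.sum() + n_missing * e_missing) / (obs.size + n_missing)
```

The iteration converges to (a Monte Carlo neighborhood of) the observed-data MLE; with the sample size above it recovers the true mean closely. Rejection sampling in the E-step is wasteful and would be replaced by proper truncated-normal draws in practice.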

Use of the Extended Kalman Filter for the Real-Time Quality Improvement of Runoff Data: 1. Algorithm Construction and Application to One Station (확장 칼만 필터를 이용한 유량자료의 실시간 품질향상: 1. 알고리즘 구축 및 단일지점에의 적용)

  • Yoo, Chul-Sang; Hwang, Jung-Ho; Kim, Jung-Ho
    • Journal of Korea Water Resources Association / v.45 no.7 / pp.697-711 / 2012
  • This study applied the extended Kalman filter, a data assimilation method, to the real-time quality improvement of runoff measurements. The state-space model consisted of a rainfall-runoff model and the runoff measurement. The quality improvement task was divided into two parts: suppressing abnormally high variation in dam inflow data, and amending missing or erroneous measurements. For each case a suitable extended Kalman filter model was proposed; the main difference between the two models is whether only the variation, or both the bias and the variation, is considered when estimating the covariance function. The models were applied to the Chungju Dam Basin, confirming that they effectively improve the quality of both the dam inflow data and runoff measurements containing missing and erroneous parts.
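The mechanics of using a filter to amend missing or erroneous flow records can be sketched with a scalar example. A random-walk state model stands in for the rainfall-runoff model here, so this is a plain linear Kalman filter rather than the extended filter of the paper; all noise variances and the gating threshold are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# synthetic "true" dam inflow and noisy measurements with a spike and a gap
t = np.arange(200)
truth = 50 + 20 * np.sin(t / 15)
meas = truth + rng.normal(0, 2.0, t.size)
meas[70] += 60.0                 # erroneous spike
meas[120:123] = np.nan           # missing records

q, r = 4.0, 4.0                  # process / measurement noise variances
x, p = meas[0], 10.0             # initial state estimate and variance
filtered = np.empty_like(meas)
for k in range(t.size):
    x_pred, p_pred = x, p + q    # time update (random-walk model)
    innov = meas[k] - x_pred
    s = p_pred + r               # innovation variance
    if np.isnan(meas[k]) or abs(innov) > 4 * np.sqrt(s):
        # missing or gated-out measurement: keep the model prediction,
        # which is exactly how gaps and spikes get amended
        x, p = x_pred, p_pred
    else:
        g = p_pred / s           # Kalman gain
        x, p = x_pred + g * innov, (1 - g) * p_pred
    filtered[k] = x
```

The innovation gate plays the role of the quality check: a measurement inconsistent with the model forecast is rejected and replaced by the forecast, while ordinary noise is merely smoothed.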

Efficiency and Robustness of Fully Adaptive Simulated Maximum Likelihood Method

  • Oh, Man-Suk; Kim, Dai-Gyoung
    • Communications for Statistical Applications and Methods / v.16 no.3 / pp.479-485 / 2009
  • When part of the data is unobserved, the marginal likelihood of the parameters given the observed data often involves an analytically intractable high-dimensional integral, making it hard to find the maximum likelihood estimate of the parameters. The simulated maximum likelihood (SML) method, which estimates the marginal likelihood via Monte Carlo importance sampling and optimizes the estimated marginal likelihood, has been used in many applications. A key issue in SML is finding a good proposal density from which the Monte Carlo samples are generated. The optimal proposal density is the conditional density of the unobserved data given the parameters and the observed data, and various attempts have been made to approximate it. Algorithms that adaptively improve the proposal density are widely used for their simplicity and efficiency. In this paper, we describe a fully adaptive algorithm that has been used by some practitioners but is not well recognized in the statistical literature, and we evaluate its estimation performance and robustness via a simulation study. The simulation shows an improvement of orders of magnitude in mean squared error compared with non-adaptive or partially adaptive SML methods. It is also shown that fully adaptive SML is robust in the sense that it is insensitive to the starting points of the optimization routine.
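An adaptively improved importance-sampling estimate of a marginal likelihood can be demonstrated on a toy latent-variable model whose marginal is known exactly. The model, the Gaussian proposal family, and moment-matching adaptation are assumptions of this sketch, not the specific algorithm evaluated in the paper.

```python
import numpy as np

rng = np.random.default_rng(5)

def log_norm_pdf(x, m, v):
    return -0.5 * np.log(2 * np.pi * v) - 0.5 * (x - m) ** 2 / v

def sml_marginal(y, theta, n=4000, adapt_steps=3):
    """Estimate L(theta) = int p(y|u,theta) p(u) du by importance sampling,
    adapting a Gaussian proposal toward the optimal density p(u | y, theta)."""
    pm, pv = 0.0, 1.0                          # initial proposal: the prior
    for _ in range(adapt_steps + 1):
        u = rng.normal(pm, np.sqrt(pv), n)
        logw = (log_norm_pdf(y, theta + u, 1.0)    # p(y | u, theta)
                + log_norm_pdf(u, 0.0, 1.0)        # prior p(u)
                - log_norm_pdf(u, pm, pv))         # proposal density
        w = np.exp(logw - logw.max())
        # adapt: refit the proposal to the weighted draws (moment matching)
        pm = np.sum(w * u) / w.sum()
        pv = np.sum(w * (u - pm) ** 2) / w.sum()
    return np.exp(logw).mean()                 # IS estimate of L(theta)

# latent model: u ~ N(0,1), y | u ~ N(theta + u, 1), so the exact marginal
# is y ~ N(theta, 2) and the estimate can be checked against it
y_obs, theta = 1.5, 0.5
est = sml_marginal(y_obs, theta)
exact = np.exp(log_norm_pdf(y_obs, theta, 2.0))
```

After a few adaptation rounds the proposal nearly matches the conditional density of the latent variable, the importance weights become almost constant, and the variance of the estimate collapses, which is the source of the large MSE gains reported.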

Deduction of Data Quality Control Strategy for High Density Rain Gauge Network in Seoul Area (서울시 고밀도 지상강우자료 품질관리방안 도출)

  • Yoon, Seongsim; Lee, Byongju; Choi, Youngjean
    • Journal of Korea Water Resources Association / v.48 no.4 / pp.245-255 / 2015
  • This study used the high-density network of integrated meteorological sensors operated by SK Planet, together with KMA weather stations, to estimate the quantitative precipitation field in the Seoul area. We introduced the SK Planet network and analyzed the quality of the data observed over three months, from 1 July to 30 September 2013. The quality analysis showed that most SK Planet stations produce observations similar to those of the existing KMA stations. We developed a real-time quality check and adjustment method to reduce the effect of missing and outlier values on hydrological applications, and confirmed that the method can correct them. Using this method, 190 stations (34 KMA and 156 SK Planet stations) with a missing ratio below 20% and the smallest outlier effect were selected for quantitative precipitation estimation. Moreover, we evaluated the reproducibility of the rainfall field from the high-density rain gauge network, which has a density of 3 km²/gauge. The spatial relative frequency of the rainfall field using the SK Planet and KMA stations is similar to that of the radar rainfall field, and the network fills the gaps in the KMA observation network. In particular, this research exploits the density of the network to estimate a rainfall field that can be considered a very good approximation of the true value.
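A real-time quality check and adjustment of gauge data typically combines a gap fill with a spatial-consistency test against neighboring stations. The sketch below is a strongly simplified version of that idea; the neighbor-mean adjustment, the 10 mm consistency threshold, and the synthetic rainfall series are all assumptions, not the paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(6)

# 10-minute rainfall (mm) at one target gauge and three nearby gauges
truth = rng.gamma(0.3, 2.0, 144)
target = truth.copy()
neighbors = truth + rng.normal(0, 0.2, (3, truth.size))
target[30] = 85.0            # physically implausible spike (outlier)
target[60:63] = np.nan       # transmission gap (missing values)

neighbor_mean = neighbors.mean(axis=0)
qc = target.copy()

# 1) adjustment for missing records: fill with the neighbor mean
miss = np.isnan(qc)
qc[miss] = neighbor_mean[miss]

# 2) spatial-consistency check: values far from the surrounding gauges
#    are treated as outliers and likewise replaced
bad = np.abs(qc - neighbor_mean) > 10.0
qc[bad] = neighbor_mean[bad]
```

In a dense network such as 3 km²/gauge, nearby stations are well correlated with the target, which is what makes this kind of neighbor-based correction effective.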

Study on the Method of Development of Road Flood Risk Index by Estimation of Real-time Rainfall Using the Coefficient of Correlation Weighting Method (상관계수가중치법을 적용한 실시간 강우량 추정에 따른 도로 침수위험지수 개발 방법에 대한 연구)

  • Kim, Eunmi; Rhee, Kyung Hyun; Kim, Chang Soo
    • Journal of Korea Multimedia Society / v.17 no.4 / pp.478-489 / 2014
  • Recently, flood damage from frequent localized downpours in cities has been increasing on account of abnormal climate phenomena and the growth of impermeable area caused by urbanization. This study focuses on flooding of roads, the basis of all means of transportation. To calculate the real-time accumulated rainfall on a road link, we use the Coefficient of Correlation Weighting (CCW) method, one of the revised methods for estimating missing rainfall, treating a road link as an unobserved rainfall site. CCW and the real-time accumulated rainfall obtained through the Internet are used to estimate the real-time rainfall on a road link. Together with the real-time accumulated rainfall, the flooding history, the rainfall range causing flooding of a road link, and the frequency probability precipitation for road design are used as factors to determine the flood risk index of roads. We simulated two past cases in Busan, on 7 July 2009 and 15 July 2012. As a result, all road links included in the roads actually flooded at those times received a high flood risk index.
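The idea of correlation-weighted estimation for an unobserved site can be sketched as follows: weight each surrounding gauge by its historical correlation with the target location and combine the current readings. The weighting by r/Σr, the synthetic historical series, and the gauge readings are assumptions here; the exact normalization used in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(7)

# historical hourly rainfall at the target site and three nearby gauges;
# a shared component makes the gauges correlated with the target
base = rng.gamma(0.5, 3.0, 500)
target_hist = base + rng.normal(0, 0.5, 500)
gauges_hist = base[None, :] + rng.normal(0, [[0.3], [0.6], [1.2]], (3, 500))

# correlation of each gauge with the target over the historical record
r = np.array([np.corrcoef(target_hist, g)[0, 1] for g in gauges_hist])
weights = r / r.sum()

# estimate the unobserved site's rainfall for a new time step as the
# correlation-weighted combination of the current gauge readings (mm)
current = np.array([4.1, 3.8, 5.0])
estimate = float(weights @ current)
```

Because the weights are positive and sum to one, the estimate is a convex combination of the nearby readings, with the best-correlated gauge contributing most.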

Probabilistic Modeling of Fish Growth in Smart Aquaculture Systems

  • Jongwon Kim; Eunbi Park; Sungyoon Cho; Kiwon Kwon; Young Myoung Ko
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.8 / pp.2259-2277 / 2023
  • We propose a probabilistic fish growth model for smart aquaculture systems equipped with IoT sensors that monitor the ecological environment. As IoT sensors spread through smart aquaculture systems, environmental data such as oxygen level and temperature are collected frequently and automatically. However, data on fish weight, tank allocation, and other factors are still collected less frequently and manually by human workers due to technological limitations. Unlike sensor data, human-collected data are hard to obtain and prone to poor quality because of missing values and reading errors. In a situation where different types of data are mixed, developing an effective fish growth model becomes challenging. This study explores the unique characteristics of such a combined environmental and weight dataset. To address them, we develop a preprocessing method and a probabilistic fish growth model using mixed data sampling (MIDAS) and overlapping mixtures of Gaussian processes (OMGP). We modify the OMGP to be applicable to prediction by setting a prior distribution that exploits the fact that the proportions of the fish groups do not change significantly as the fish grow. A numerical study on an eel dataset collected from a real smart aquaculture system reveals the promising performance of our model.