• Title/Summary/Keyword: Outliers


1D-CNN-LSTM Hybrid-Model-Based Pet Behavior Recognition through Wearable Sensor Data Augmentation

  • Hyungju Kim;Nammee Moon
    • Journal of Information Processing Systems / v.20 no.2 / pp.159-172 / 2024
  • The number of healthcare products available for pets has increased in recent times, which has prompted active research into wearable devices for pets. However, the data collected through such devices are limited by outliers and missing values owing to the anomalous and irregular characteristics of pets. Hence, we propose pet behavior recognition based on a hybrid one-dimensional convolutional neural network (CNN) and long short-term memory (LSTM) model using pet wearable devices. An Arduino-based pet wearable device was first fabricated to collect gyroscope and accelerometer values for behavior recognition. Then, data augmentation was performed after replacing any missing values and outliers via preprocessing. The behaviors were classified into five types. To prevent bias from specific actions in the data augmentation, the number of datasets was compared and balanced, and CNN-LSTM-based deep learning was performed. The five subdivided behaviors and overall performance were then evaluated, and the overall accuracy of behavior recognition was found to be about 88.76%.

Voronoi Diagram-based USBL Outlier Rejection for AUV Localization

  • Hyeonmin Sim;Hangil Joe
    • Journal of Ocean Engineering and Technology / v.38 no.3 / pp.115-123 / 2024
  • Ultra-short baseline (USBL) systems are essential for providing accurate positions of autonomous underwater vehicles (AUVs). However, their accuracy can be degraded by outliers caused by environmental conditions. A failure to address these outliers can significantly impact the reliability of underwater localization and navigation systems. This paper proposes a novel outlier rejection algorithm for AUV localization using Voronoi diagrams and query point calculation. The Voronoi diagram divides the data space into Voronoi cells centered on the USBL data, and the calculated query point determines whether the corresponding USBL measurement is an inlier. This study conducted experiments acquiring GPS and USBL data simultaneously and optimized the algorithm empirically based on the acquired data. In addition, the proposed method was applied to a sensor fusion algorithm to verify its effectiveness, resulting in improved pose estimates. The proposed method can be applied to various sensor fusion algorithms as a preprocessing step and could be used for outlier rejection with other 2D location sensors.

Combined Filtering Model Using Voting Rule and Median Absolute Deviation for Travel Time Estimation (통행시간 추정을 위한 Voting Rule과 중위절대편차법 기반의 복합 필터링 모형)

  • Jeong, Youngje;Park, Hyun Suk;Kim, Byung Hwa;Kim, Youngchan
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.12 no.6 / pp.10-21 / 2013
  • This study proposes a combined filtering model, based on the median absolute deviation (MAD) and a voting rule, to eliminate outlier travel-time data in transportation information systems. The model applies the MAD method as the first filtering step so that the data follow a normal distribution. The voting rule is then applied to eliminate the outliers that remain after MAD filtering. Under the voting rule, travel-time samples are judged to be outliers according to the difference between the sample travel time and the mean travel time, and whether an outlier is eliminated is decided by majority rule. In a case study of National Highway No. 3, the combined filtering model selectively eliminated only the outliers and improved the accuracy of the estimated travel time.
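
The MAD stage described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' model: the 0.6745 scaling constant and the 3.5 cutoff follow the common modified z-score convention and are assumptions here, the function name and sample travel times are invented, and the voting-rule stage is omitted because the abstract does not specify it.

```python
from statistics import median

def mad_filter(values, threshold=3.5):
    """Split values into (inliers, outliers) using a MAD-based modified z-score.

    threshold=3.5 and the 0.6745 constant are conventional choices,
    not values taken from the paper.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be scored, keep everything.
        return list(values), []
    inliers, outliers = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (outliers if score > threshold else inliers).append(v)
    return inliers, outliers

# Hypothetical link travel times in seconds; 180 and 15 are implausible.
travel_times = [62, 65, 61, 64, 63, 66, 180, 60, 64, 15]
inliers, outliers = mad_filter(travel_times)
print(outliers)  # [180, 15]
```

Because the median and MAD are barely moved by a few extreme samples, the two implausible values do not mask each other the way they could with a mean and standard deviation.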

Firework Plot as a Graphical Exploratory Data Analysis Tool to Evaluate the Impact of Outliers in a Mixture Experiment (혼합물 실험에서 특이값의 영향을 평가하기 위한 그래픽 탐색적 자료분석 도구로서의 불꽃그림)

  • Jang, Dae-Heung;Ahn, SoJin;Kim, Youngil
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.629-643 / 2014
  • It is common to check the validity of an assumed model with heavy use of diagnostic tools when conducting data analysis with regression techniques; however, outliers and influential data points often distort the regression output in an undesired manner. Jang and Anderson-Cook (2013) proposed a graphical method for exploratory analysis, called a firework plot, that visualizes the trace of the impact of possible outlying and/or influential data points on individual regression coefficients and on the overall residual sum of squares (SSE). They developed a 3-D plot as well as a pairwise plot for the appropriate measures of interest. In this paper, the approach is extended to demonstrate its strength; in addition, a more meaningful interpretation becomes possible by adding a measure not mentioned in their paper. The approach is applied to a mixture experiment because a detailed analysis of the sensitivity of statistical measures is required in a small experiment.

Outlier Detection of the Coastal Water Temperature Monitoring Data Using the Approximate and Detail Components (어림과 나머지 성분을 이용한 연안 수온자료의 이상자료 감지)

  • Cho, Hong-Yeon;Oh, Ji-Hee
    • Journal of the Korean Society for Marine Environment & Energy / v.15 no.2 / pp.156-162 / 2012
  • Outlier detection and treatment is required as the first step in the statistical analysis of monitoring data, because outliers occur frequently in coastal environmental monitoring projects. In this study, an outlier detection method using the approximate and detail (or residual) components of the raw data is suggested. The approximate and detail components of the data can be separated by various filtering and smoothing methods. Here, the decomposition is carried out by harmonic analysis and by a local regression curve. Then Grubbs' test and the modified z-score method, both widely used to detect outliers, are applied to the detail components of the water temperature data. A new data set is reconstructed after removing the outliers detected by these methods. The suggested process is shown to be successfully applied to outlier detection in the coastal water temperature monitoring data provided by the Real-time Information System for Aquaculture Environment, National Fisheries Research and Development Institute (NFRDI).
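
The decompose-then-test idea in this abstract can be illustrated with a small sketch. It is only an approximation under stated assumptions: a centered moving average stands in for the harmonic analysis and local regression used in the paper, only the modified z-score is applied (Grubbs' test is left out), and the function name, window size, cutoff, and synthetic temperature series are all hypothetical.

```python
import math
from statistics import mean, median

def flag_detail_outliers(series, half_window=2, threshold=3.5):
    """Return indices whose detail (raw minus smoothed) component is an outlier.

    The approximate component is a centered moving average (an assumption;
    the paper uses harmonic analysis and local regression). Only interior
    points with a full window are evaluated.
    """
    n = len(series)
    idx = range(half_window, n - half_window)
    approx = [mean(series[i - half_window:i + half_window + 1]) for i in idx]
    detail = [series[i] - a for i, a in zip(idx, approx)]
    med = median(detail)
    mad = median(abs(d - med) for d in detail)
    if mad == 0:
        return []
    # Modified z-score on the detail component (0.6745 and 3.5 are conventions).
    return [i for i, d in zip(idx, detail)
            if 0.6745 * abs(d - med) / mad > threshold]

# Synthetic water temperatures with a 24-step cycle and one injected spike.
temps = [15 + 10 * math.sin(2 * math.pi * t / 24) for t in range(48)]
temps[10] += 8.0
flagged = flag_detail_outliers(temps)
print(flagged)  # [10]
```

In this toy series the spike sits only slightly above the seasonal range, so the same score computed on the raw values would not exceed the cutoff; removing the approximate component first is what exposes it.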

A Study on Forest Fire Detection from MODIS Data Using Local Spatial Association Analysis (국지적 공간상관분석을 이용한 MODIS영상에서의 산불탐지에 관한 연구)

  • Byun, Young-Gi;Huh, Yong;Kim, Yong-Min;Yu, Ki-Yun
    • Journal of Korean Society for Geospatial Information Science / v.15 no.1 s.39 / pp.23-29 / 2007
  • Spatial outliers in remotely sensed imagery are observed quantities whose values are unusual compared to those of their neighboring pixels. Various methods have been developed in statistics and data mining to detect spatial outliers based on spatial autocorrelation. These methods may be applied to detecting forest fire pixels in MODIS imagery from NASA's AQUA satellite, because forest fire detection can be treated as finding spatial outliers using the spatial variation of brightness temperature. In this paper, we propose a new forest fire detection algorithm based on local spatial association analysis and test it to evaluate its applicability. The results were compared with the MODIS fire product provided by the NASA MODIS Science Team, which demonstrated the potential of the proposed algorithm for detecting fire pixels.


Principal Components Logistic Regression based on Robust Estimation (로버스트추정에 바탕을 둔 주성분로지스틱회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook;Jang, Hea-Won
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.531-539 / 2009
  • Logistic regression is widely used as a data mining technique for customer relationship management. The maximum likelihood estimator has highly inflated variance when multicollinearity exists among the regressors, and it is not robust against outliers. Thus we propose robust principal components logistic regression to deal with both the multicollinearity and the outlier problems. A procedure based on the condition index is suggested for the selection of principal components. When a condition index is larger than the cutoff value obtained from the model constructed on the basis of the conjoint analysis, the corresponding principal component is removed from the logistic model. In addition, we employ an algorithm for robust estimation, which strives to dampen the effect of outliers by applying appropriate weights and factors to the leverage points and vertical outliers identified by the V-mask type criterion. The Monte Carlo simulation results indicate that the proposed procedure yields a higher rate of correct classification than the existing method.

A Study on Outlier Adjustment for Multibeam Echosounder Data (다중빔 음향측심기 자료의 이상치 보정에 관한 연구)

  • Lee, Jung-Sook;Kim, Soo-Young;Lee, Yong-Kook;Shin, Dong-Wan;Jou, Hyeong-Tae;Kim, Han-Joon
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.6 no.1 / pp.35-39 / 2001
  • Multibeam echosounder data, collected to investigate seabed features and topography, are usually subject to outliers resulting from the ship's irregular movements and insufficient correction for pressure calibration to the positions of beams. We introduce a statistical method which adjusts the outliers using the ARMA (Autoregressive Moving Average) technique. Our method was applied to a set of real data acquired in the East Sea. In our approach, autocorrelation of the data is modeled by an AR(1) model. If an observation is substantially different from that obtained from the estimated AR(1) model, it is declared as an outlier and adjusted using the estimated AR(1) model. This procedure is repeated until no outlier is found. The result of processing shows that outliers that are far greater than signals in amplitude were successfully removed.
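
The fit, flag, adjust, and repeat loop described in this abstract might look roughly like the sketch below. Several details are assumptions, since the abstract does not give them: the AR(1) coefficient is estimated from the lag-1 autocorrelation, "substantially different" is taken to mean a one-step residual larger than `k` robust standard deviations, only the single worst point is adjusted per pass, and the depth profile is synthetic.

```python
import math
from statistics import mean, median

def fit_ar1(series):
    """Estimate the mean and AR(1) coefficient from the lag-1 autocorrelation."""
    mu = mean(series)
    dev = [x - mu for x in series]
    c0 = sum(d * d for d in dev)
    c1 = sum(a * b for a, b in zip(dev[1:], dev[:-1]))
    return mu, c1 / c0

def adjust_outliers_ar1(series, k=3.0, max_iter=50):
    """Replace outliers by their AR(1) one-step prediction until none remain.

    k and max_iter are hypothetical tuning values, not taken from the paper.
    """
    data = list(series)
    adjusted = []
    for _ in range(max_iter):
        mu, phi = fit_ar1(data)
        # pred[j] is the prediction for data[j + 1] given data[j].
        pred = [mu + phi * (data[t - 1] - mu) for t in range(1, len(data))]
        resid = [data[j + 1] - p for j, p in enumerate(pred)]
        center = median(resid)
        scale = 1.4826 * median(abs(r - center) for r in resid)  # robust sigma
        worst = max(range(len(resid)), key=lambda j: abs(resid[j]))
        if scale == 0 or abs(resid[worst]) <= k * scale:
            break  # no remaining observation is far from the model
        data[worst + 1] = pred[worst]
        adjusted.append(worst + 1)
    return data, adjusted

# Synthetic depth profile (metres) with one large spike at index 20.
depths = [100 + 5 * math.sin(0.3 * t) for t in range(60)]
depths[20] += 60.0
cleaned, adjusted = adjust_outliers_ar1(depths)
print(adjusted[0], round(cleaned[20], 1))
```

Adjusting one point per pass and refitting matters because a single spike also inflates the residual of the observation that follows it; once the spike is replaced, that secondary residual shrinks on the next pass.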


Robust Parameter Estimation using Fuzzy RANSAC (퍼지 RANSAC을 이용한 강건한 인수 예측)

  • Lee Joong-Jae;Jang Hyo-Jong;Kim Gye-Young;Choi Hyung-il
    • Journal of KIISE:Software and Applications / v.33 no.2 / pp.252-266 / 2006
  • Many problems in computer vision are based on mathematical models, and their optimal solutions can be found by estimating the parameters of each model. However, if the input data set contains outliers that are relatively larger than normal noise, they lead to incorrect results. RANSAC is a representative robust algorithm used to resolve this problem. One major problem with RANSAC is that it needs a priori knowledge of the distribution of the data (i.e., the percentage of outliers). To solve this problem, we propose the FRANSAC algorithm, which improves the rejection rate of outliers and the accuracy of solutions. This is performed by categorizing all data into a good sample set, a bad sample set, and a vague sample set using fuzzy classification at each iteration and sampling only from the good sample set. In the experimental results, we show the performance of the proposed algorithm when it is applied to linear regression and to the calculation of a homography.
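
For context, the baseline that this abstract improves on can be sketched as plain RANSAC line fitting. This is not the fuzzy variant (FRANSAC) proposed in the paper; the function names, tolerance, iteration budget, and data are all hypothetical. The helper `ransac_iterations` uses the standard iteration-count formula to show where the assumed inlier ratio enters, which is the prior knowledge the abstract refers to.

```python
import math
import random

def ransac_iterations(inlier_ratio, sample_size=2, confidence=0.99):
    """Iterations needed to draw one all-inlier sample with the given confidence."""
    p_good = inlier_ratio ** sample_size
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_good))

def least_squares_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return slope, my - slope * mx

def ransac_line(points, n_iter, tol=0.5, seed=0):
    """Plain RANSAC: keep the largest consensus set, then refit on it."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate line, skip this sample
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        consensus = [(x, y) for x, y in points
                     if abs(y - (slope * x + intercept)) <= tol]
        if len(consensus) > len(best):
            best = consensus
    if len(best) < 2:
        raise ValueError("no usable consensus set found")
    return least_squares_line(best), best

# 20 points on y = 2x + 1 plus 5 gross outliers (80% inliers).
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points += [(3.0, 40.0), (7.0, -20.0), (12.0, 60.0), (15.0, -5.0), (18.0, 90.0)]
print(ransac_iterations(0.8))  # 5
(slope, intercept), consensus = ransac_line(points, n_iter=200)
```

The demo uses a generous fixed budget of 200 iterations rather than the computed minimum, so the result does not hinge on a lucky seed. If the assumed inlier ratio is too optimistic, the computed count is too small, which is the weakness the abstract points out.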

Natural Background Level Analysis of Heavy Metal Concentration in Korean Coastal Sediments (한국 연안 퇴적물 내 중금속 원소의 자연적 배경농도 연구)

  • Lim, Dhong-Il;Choi, Jin-Yong;Jung, Hoi-Soo;Choi, Hyun-Woo;Kim, Young-Ok
    • Ocean and Polar Research / v.29 no.4 / pp.379-389 / 2007
  • This paper presents an attempt to determine natural background levels of heavy metals that could be used for assessing heavy metal contamination. For this study, a large archive dataset of heavy metal concentrations (Cu, Cr, Ni, Pb, Zn) for more than 900 surface sediment samples from various Korean coastal environments was newly compiled. These data were normalized to the aluminum concentration (a grain-size normalizer) to isolate natural factors from anthropogenic ones. The normalization was based on the hypothesis that heavy metal concentrations vary consistently with the concentration of aluminum unless these metals are of anthropogenic origin. The samples (outliers) suspected of receiving any anthropogenic input were therefore removed from the regression to ascertain the "background" relationship between the metals and aluminum. These outliers were identified using a model of predicted limits at 95%. The process of testing for normality (Kolmogorov-Smirnov test) and selecting outliers was iterated until a normal distribution was achieved. On the basis of the linear regression analysis of the large archive dataset, background levels applicable to heavy metal assessment of Korean coastal sediments were successfully developed for Cu, Cr, Ni, and Zn. As an example, we tested the applicability of this baseline level for metal pollution assessment of Masan Bay sediments.
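
The iterative regression screening in this abstract can be approximated by the sketch below. It simplifies the published procedure in ways that should be read as assumptions: the 95% prediction limits use the normal quantile 1.96 instead of a t quantile, the loop stops when no sample falls outside the limits rather than when a Kolmogorov-Smirnov test passes, and the aluminum and metal values are invented.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit; returns slope, intercept, mean of x, and Sxx."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx, mx, sxx

def background_regression(al, metal, z=1.96, max_iter=20):
    """Refit metal vs. aluminum, dropping samples outside ~95% prediction limits."""
    keep = list(range(len(al)))
    for _ in range(max_iter):
        n = len(keep)
        if n < 3:
            break
        xs = [al[i] for i in keep]
        ys = [metal[i] for i in keep]
        slope, intercept, mx, sxx = fit_line(xs, ys)
        resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
        s = math.sqrt(sum(r * r for r in resid) / (n - 2))
        inside = [i for i, x, r in zip(keep, xs, resid)
                  if abs(r) <= z * s * math.sqrt(1 + 1 / n + (x - mx) ** 2 / sxx)]
        if len(inside) == n:
            break  # every remaining sample lies within the limits
        keep = inside
    # Final fit on the retained (background) samples.
    slope, intercept, _, _ = fit_line([al[i] for i in keep],
                                      [metal[i] for i in keep])
    return slope, intercept, keep

# Invented data: metal = 3 * Al + 2 with small alternating scatter,
# plus three samples carrying a large extra (anthropogenic-like) load.
al = [float(i + 1) for i in range(30)]
metal = [3 * x + 2 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(al)]
for i in (10, 15, 20):
    metal[i] += 20.0
slope, intercept, keep = background_regression(al, metal)
removed = sorted(set(range(len(al))) - set(keep))
print(removed)  # [10, 15, 20]
```

The fitted line from the retained samples then serves as the background relationship, and a new sample lying well above it would be a candidate for contamination.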