• Title/Summary/Keyword: outlier identification

Identification of Regression Outliers Based on Clustering of LMS-residual Plots

  • Kim, Bu-Yong;Oh, Mi-Hyun
    • Communications for Statistical Applications and Methods / v.11 no.3 / pp.485-494 / 2004
  • An algorithm is proposed to identify multiple outliers in linear regression. It is based on the clustering of residuals from least median of squares (LMS) estimation. A cut-height criterion for the hierarchical cluster tree is suggested, which yields the optimal clustering of the regression outliers. The procedures are compared on classic and artificial data sets, and the proposed algorithm is shown to be superior to the one based on least squares estimation. In particular, the algorithm deals very well with the masking and swamping effects, while the other does not.
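
The idea in this abstract can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' algorithm: the LMS fit is approximated by trying the line through every pair of points, the hierarchical clustering is single linkage on absolute residuals (which in one dimension reduces to splitting the sorted residuals at the first large gap), and the function names, the data, and the `cut_height` value are hypothetical rather than the paper's cut-height criterion.

```python
from itertools import combinations
from statistics import median


def lms_line(xs, ys):
    """Approximate LMS fit for a simple linear model.

    Tries the line through every pair of points and keeps the one with the
    smallest median squared residual (an assumption made for brevity).
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        crit = median((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    return best[1], best[2]


def outliers_by_residual_clusters(xs, ys, cut_height):
    """Split sorted |LMS residuals| at the first gap larger than cut_height.

    Points above the gap are reported as outliers; the cluster holding the
    smallest residuals is treated as the clean data.
    """
    intercept, slope = lms_line(xs, ys)
    abs_res = [abs(y - intercept - slope * x) for x, y in zip(xs, ys)]
    order = sorted(range(len(xs)), key=abs_res.__getitem__)
    for pos in range(1, len(order)):
        if abs_res[order[pos]] - abs_res[order[pos - 1]] > cut_height:
            return sorted(order[pos:])
    return []


# Made-up data: y is roughly 2x + 1, except for the last two points.
xs = list(range(1, 11))
ys = [3.1, 4.9, 7.0, 9.1, 10.9, 13.0, 15.1, 16.9, 30.0, 32.0]
outliers = outliers_by_residual_clusters(xs, ys, cut_height=3.0)
print(outliers)
```

On this toy data the two shifted points (indices 8 and 9) end up in their own cluster. Because the LMS line ignores them, their residuals stay large, which is the property that helps against masking.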

Online condition assessment of high-speed trains based on Bayesian forecasting approach and time series analysis

  • Zhang, Lin-Hao;Wang, You-Wu;Ni, Yi-Qing;Lai, Siu-Kai
    • Smart Structures and Systems / v.21 no.5 / pp.705-713 / 2018
  • High-speed rail (HSR) has been in operation and development in many countries worldwide. The explosive growth of HSR has posed great challenges for operation safety and ride comfort. Among various technological demands on high-speed trains, vibration is an inevitable problem caused by rail/wheel imperfections, vehicle dynamics, and aerodynamic instability. Ride comfort is a key factor in evaluating the operational performance of high-speed trains. In this study, online monitoring data have been acquired from an in-service high-speed train for condition assessment. The measured dynamic response signals at the floor level of a train cabin are processed by the Sperling operator, in which the ride comfort index sequence is used to identify the train's operation condition. In addition, a novel technique that incorporates salient features of Bayesian inference and time series analysis is proposed for outlier detection and change detection. The Bayesian forecasting approach enables the prediction of conditional probabilities. By integrating the Bayesian forecasting approach with time series analysis, one-step forecasting probability density functions (PDFs) can be obtained before proceeding to the next observation. The change detection is conducted by comparing the current model and the alternative model (whose mean value is shifted by a prescribed offset) to determine which one fits the actual observation better. When the comparison indicates that the alternative model performs better, a potential change is detected. If the current observation is a potential outlier or change, the Bayes factor and the cumulative Bayes factor are derived for further identification. A significant change, if identified, implies that there is a great alteration in the train operation performance due to defects. In this study, two illustrative cases are provided to demonstrate the performance of the proposed method for condition assessment of high-speed trains.
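
The model comparison described in this abstract can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not taken from the paper: the one-step forecast is a normal PDF whose mean and variance are held fixed instead of being updated after each observation, the cumulative Bayes factor follows the common recursion `L_t = H_t * min(1, L_{t-1})`, and the names `monitor`, `offset`, and `threshold` and the sample values are hypothetical.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def bayes_factor(y, mean, var, offset):
    """Current model versus an alternative whose mean is shifted by offset.

    Values well below 1 favour the alternative (shifted) model.
    """
    return normal_pdf(y, mean, var) / normal_pdf(y, mean + offset, var)


def monitor(observations, mean, var, offset, threshold):
    """Return the indices where the single or cumulative Bayes factor drops
    below the threshold (assumed recursion, see the lead-in)."""
    flagged = []
    cumulative = 1.0
    for t, y in enumerate(observations):
        h = bayes_factor(y, mean, var, offset)
        cumulative = h * min(1.0, cumulative)
        if h < threshold or cumulative < threshold:
            flagged.append(t)
            cumulative = 1.0  # restart the accumulation after a flag
    return flagged


# Made-up ride comfort index sequence with a jump in the last value.
ride_index = [2.0, 2.1, 1.9, 2.0, 4.2]
flagged = monitor(ride_index, mean=2.0, var=0.04, offset=2.0, threshold=0.1)
print(flagged)
```

In a full implementation the forecast mean and variance would come from the time series model at each step. The sketch only shows how the two forecast PDFs are compared for one observation at a time.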

Implementation of Deep-sea UUV Precise Underwater Navigation based on Multiple Sensor Fusion (다중센서융합 기반의 심해무인잠수정 정밀수중항법 구현)

  • Kim, Ki-Hun;Choi, Hyun-Taek;Kim, Sea-Moon;Lee, Pan-Mook;Lee, Chong-Moo;Cho, Seong-Kwon
    • Journal of Ocean Engineering and Technology / v.24 no.3 / pp.46-51 / 2010
  • This paper describes the implementation of a precise underwater navigation solution using a multi-sensor fusion technique based on USBL, DVL, and IMU measurements. To implement this solution, three strategies are chosen. The first involves heading alignment angle identification to enhance the performance of a standalone dead-reckoning algorithm. In the second, the absolute position is found quickly to prevent the accumulation of integration error. The third is the introduction of an effective outlier rejection algorithm. The performance of the developed algorithm was verified with experimental data acquired by the deep-sea ROV, Hemire, in the East Sea during a survey of a methane gas seepage area at a depth of 1,500 m.

Algorithm for the Robust Estimation in Logistic Regression (로지스틱회귀모형의 로버스트 추정을 위한 알고리즘)

  • Kim, Bu-Yong;Kahng, Myung-Wook;Choi, Mi-Ae
    • The Korean Journal of Applied Statistics / v.20 no.3 / pp.551-559 / 2007
  • Maximum likelihood estimation is not robust against outliers in logistic regression. We therefore propose an algorithm for robust estimation, which identifies the bad leverage points and vertical outliers by a V-mask type criterion and then strives to dampen the effect of the outliers. Our main finding is that, by an appropriate selection of weights and factors, we can obtain logistic estimates with a high breakdown point. The proposed algorithm is evaluated by means of the correct classification rate on real-life and artificial data sets. The results indicate that the proposed algorithm is superior to maximum likelihood estimation in terms of classification.

Bayesian forecasting approach for structure response prediction and load effect separation of a revolving auditorium

  • Ma, Zhi;Yun, Chung-Bang;Shen, Yan-Bin;Yu, Feng;Wan, Hua-Ping;Luo, Yao-Zhi
    • Smart Structures and Systems / v.24 no.4 / pp.507-524 / 2019
  • A Bayesian dynamic linear model (BDLM) is presented for a data-driven analysis for response prediction and load effect separation of a revolving auditorium structure, where the main loads are self-weight and dead loads, temperature load, and audience load. Analyses are carried out based on the long-term monitoring data for static strains on several key members of the structure. Three improvements are introduced to the ordinary regression BDLM, which are a classificatory regression term to address the temporary audience load effect, improved inference for the variance of observation noise to be updated continuously, and component discount factors for effective load effect separation. The effects of those improvements are evaluated regarding the root mean square errors, standard deviations, and 95% confidence intervals of the predictions. Bayes factors are used for evaluating the probability distributions of the predictions, which are essential to structural condition assessments, such as outlier identification and reliability analysis. The performance of the present BDLM has been successfully verified based on the simulated data and the real data obtained from the structural health monitoring system installed on the revolving structure.

Wheel tread defect detection for high-speed trains using FBG-based online monitoring techniques

  • Liu, Xiao-Zhou;Ni, Yi-Qing
    • Smart Structures and Systems / v.21 no.5 / pp.687-694 / 2018
  • The problem of wheel tread defects has become a major challenge for the health management of high-speed rail as a wheel defect with small radius deviation may suffice to give rise to severe damage on both the train bogie components and the track structure when a train runs at high speeds. It is thus highly desirable to detect the defects soon after their occurrences and then conduct wheel turning for the defective wheelsets. Online wheel condition monitoring using wheel impact load detector (WILD) can be an effective solution, since it can assess the wheel condition and detect potential defects during train passage. This study aims to develop an FBG-based track-side wheel condition monitoring method for the detection of wheel tread defects. The track-side sensing system uses two FBG strain gauge arrays mounted on the rail foot, measuring the dynamic strains of the paired rails excited by passing wheelsets. Each FBG array has a length of about 3 m, slightly longer than the wheel circumference to ensure a full coverage for the detection of any potential defect on the tread. A defect detection algorithm is developed for using the online-monitored rail responses to identify the potential wheel tread defects. This algorithm consists of three steps: 1) strain data pre-processing by using a data smoothing technique to remove the trends; 2) diagnosis of novel responses by outlier analysis for the normalized data; and 3) local defect identification by a refined analysis on the novel responses extracted in Step 2. To verify the proposed method, a field test was conducted using a test train incorporating defective wheels. The train ran at different speeds on an instrumented track with the purpose of wheel condition monitoring. By using the proposed method to process the monitoring data, all the defects were identified and the results agreed well with those from the static inspection of the wheelsets in the depot. A comparison is also drawn for the detection accuracy under different running speeds of the test train, and the results show that the proposed method can achieve a satisfactory accuracy in wheel defect detection when the train runs at a speed higher than 30 kph. Some minor defects with a depth of 0.05 mm~0.06 mm are also successfully detected.
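
The three steps listed in this abstract can be sketched roughly as follows. The abstract does not name the smoothing technique, the outlier rule, or the refined analysis, so everything specific here is an assumption: a moving median removes the trend, a z-score rule on the detrended data stands in for the outlier analysis, and grouping consecutive flagged samples into runs stands in for local defect identification. The function names, window length, threshold, and signal are all hypothetical.

```python
from statistics import mean, median, pstdev


def detrend(signal, window=7):
    """Step 1 (assumed): subtract a centred moving median from each sample.

    The window is truncated at both ends of the signal.
    """
    half = window // 2
    out = []
    for i, value in enumerate(signal):
        neighbourhood = signal[max(0, i - half): i + half + 1]
        out.append(value - median(neighbourhood))
    return out


def flag_outliers(values, threshold=3.0):
    """Step 2 (assumed): flag samples more than threshold standard
    deviations away from the mean of the detrended data."""
    mu, sigma = mean(values), pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]


def group_runs(indices):
    """Step 3 (assumed): merge consecutive flagged indices into
    (start, end) segments, one per candidate local defect."""
    runs = []
    for i in indices:
        if runs and i == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], i)
        else:
            runs.append((i, i))
    return runs


# Made-up strain record: a slow linear trend plus a short impact response.
strain = [0.1 * i for i in range(40)]
strain[20] += 5.0
strain[21] += 5.0
defects = group_runs(flag_outliers(detrend(strain)))
print(defects)
```

A moving median is used here instead of a moving average because a short impact barely shifts the median, so the spike survives detrending almost intact.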

The Consideration on Calculation of Optimal Travel Speeds based on Analysis of AVI Data (AVI 수집 자료 분석에 근거한 최적 통행속도 산출에 관한 고찰)

  • Jeong, Yeon Tak;Jung, Hun Young
    • KSCE Journal of Civil and Environmental Engineering Research / v.35 no.3 / pp.625-637 / 2015
  • This study calculates optimal travel speeds from AVI data collected in uninterrupted traffic flow. The results are as follows. First, we examined the distribution of sectional travel times for each probe vehicle and compared the sectional travel speeds across probe vehicles. The distribution of sectional travel times showed that outliers should be removed. Second, in the non-congested section, type 1 (passenger automobiles) and type 2 (automobiles for passengers and freight) differed from type 4 (special automobiles), which indicates that type 4 vehicles should be excluded when calculating sectional travel speeds. Third, based on these results, the optimal outlier removal procedures were applied. The resulting MAPE was between 0.3% and 2.0% and the RMSE was between 0.3 and 2.3, so the calculated speeds were very close to the actual average traffic speed. In addition, the minimum sample size was satisfied at the 95% confidence level. The results are expected to serve as a useful basis for local governments building AVI systems. Future work should integrate AVI data with other data to provide more accurate traffic information.
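
For reference, the two error measures quoted in this abstract can be computed as in the short Python sketch below. It uses the standard definitions of MAPE and RMSE. The function names and the speed values are made up for illustration and are not data from the study.

```python
import math


def mape(actual, estimated):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - e) / a for a, e in zip(actual, estimated)) / len(actual)


def rmse(actual, estimated):
    """Root mean square error, in the same unit as the inputs."""
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual))


# Hypothetical sectional speeds in km/h (not taken from the paper).
actual_speeds = [100.0, 80.0]
estimated_speeds = [98.0, 82.0]
print(mape(actual_speeds, estimated_speeds), rmse(actual_speeds, estimated_speeds))
```

MAPE is scale-free, while RMSE carries the unit of the speeds, which is why the abstract reports the first as a percentage and the second as a plain number.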

A Novel of Data Clustering Architecture for Outlier Detection to Electric Power Data Analysis (전력데이터 분석에서 이상점 추출을 위한 데이터 클러스터링 아키텍처에 관한 연구)

  • Jung, Se Hoon;Shin, Chang Sun;Cho, Young Yun;Park, Jang Woo;Park, Myung Hye;Kim, Young Hyun;Lee, Seung Bae;Sim, Chun Bo
    • KIPS Transactions on Software and Data Engineering / v.6 no.10 / pp.465-472 / 2017
  • In the past, researchers mainly used supervised machine learning to analyze power data and investigated pattern identification through data mining. These older classification and analysis techniques have reached their limits now that the volume of electric power data has grown and real-time provision of data has become possible. This study therefore proposes a clustering architecture for analyzing large-scale electric power data. The proposed clustering process addresses the problems of the K-means algorithm, an unsupervised learning technique, and can automate the entire process from the collection of electric power data to their analysis. Power data were categorized and analyzed at three levels: the raw data level, the clustering level, and the user interface level. In addition, the authors identified K, the ideal number of clusters, based on principal component analysis and the normal distribution, and proposed a modified K-means algorithm that reduces the data that would be categorized as ideal points in order to increase the efficiency of clustering.

Automatic Identification of the Lumen Border in Intravascular Ultrasound Images (혈관 내 초음파 영상에서 내강 경계면 자동 분할)

  • Park, Jun-Oh;Ko, Byoung-Chul;Park, Hee-Jun;Nam, Jae-Yeal
    • The KIPS Transactions:PartB / v.19B no.3 / pp.201-208 / 2012
  • Accurately segmenting the lumen border in intravascular ultrasound (IVUS) images is very important for studying vascular wall architecture in the diagnosis of cardiovascular diseases. After each IVUS image is transformed into a polar-coordinate image, initial points are detected using the wavelet transform. The lumen border is then initialized as the set of important points, using a nonparametric probability density function and a smoothing function to remove outlier initial points caused by noise and artifacts. Finally, polynomial curve fitting is applied to the filtered important points to obtain the real lumen border. The proposed method was evaluated against a related method and produced more accurate lumen contour detection in most types of IVUS images.

Damage detection in beam-like structures using deflections obtained by modal flexibility matrices

  • Koo, Ki-Young;Lee, Jong-Jae;Yun, Chung-Bang;Kim, Jeong-Tae
    • Smart Structures and Systems / v.4 no.5 / pp.605-628 / 2008
  • In bridge structures, damage may induce an additional deflection which may naturally contain essential information about the damage. However, inverse mapping from the damage-induced deflection to the actual damage location and severity is generally complex, particularly for statically indeterminate systems. In this paper, a new load concept, called the positive-bending-inspection-load (PBIL), is proposed to construct a simple inverse mapping from the damage-induced deflection to the actual damage location. A PBIL for an inspection region is defined as a load or a system of loads which guarantees the bending moment to be positive in the inspection region. From the theoretical investigations, it was proven that the damage-induced chord-wise deflection (DI-CD) has the maximum value with the abrupt change in its slope at the damage location under a PBIL. Hence, a novel damage localization method is proposed based on the DI-CD under a PBIL. The procedure may be summarized as: (1) identification of the modal flexibility matrices from acceleration measurements, (2) design of a PBIL for an inspection region of interest in a structure, (3) calculation of the chord-wise deflections for the PBIL using the modal flexibility matrices, and (4) damage localization by finding the location with the maximum DI-CD with the abrupt change in its slope within the inspection region. Steps (2)-(4) can be repeated for several inspection regions to cover the whole structure complementarily. Numerical verification studies were carried out on a simply supported beam and a three-span continuous beam model. An experimental verification study was also carried out on a two-span continuous beam structure with a steel box-girder. It was found that the proposed method can identify the damage existence and damage location for small damage cases with narrow cuts at the bottom flange.