• Title/Abstract/Keywords: Outlier Detection Method

130 search results (processing time: 0.023 s)

OUTLIER DETECTION BASED ON A CHANGE OF LIKELIHOOD

  • Kim, Myung-Geun
    • Journal of applied mathematics & informatics / Vol. 26, No. 5_6 / pp. 1133-1138 / 2008
  • A general method of detecting outliers based on a change of likelihood by using the influence function is suggested. It can be applied to all kinds of distributions that are specified by parameters. For the multivariate normal case, specific computations are made to get the corresponding conditional influence function. A numerical example is provided for illustration.
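
One simple way to see what a "change of likelihood" can look like is a case-deletion sketch for a univariate normal sample. This is not the influence-function derivation described in the abstract; it is a swapped-in illustration (a likelihood displacement computed by deleting one observation at a time), and all function names are hypothetical.

```python
import math

def normal_loglik(data, mu, var):
    """Log-likelihood of `data` under a normal distribution N(mu, var)."""
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * var)
            - sum((x - mu) ** 2 for x in data) / (2 * var))

def mle(data):
    """Maximum-likelihood estimates (mean, variance) for a normal sample."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, var

def likelihood_displacement(data):
    """For each i, 2 * [l(theta_hat) - l(theta_hat without case i)],
    with both log-likelihoods evaluated on the full sample.
    Assumes the remaining points are never all identical (variance > 0)."""
    mu, var = mle(data)
    full = normal_loglik(data, mu, var)
    out = []
    for i in range(len(data)):
        rest = data[:i] + data[i + 1:]
        mu_i, var_i = mle(rest)
        out.append(2 * (full - normal_loglik(data, mu_i, var_i)))
    return out
```

An observation whose deletion shifts the estimates strongly produces a large displacement, so the largest values point at candidate outliers. Any cutoff for "large" would be an additional assumption.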


Robust tests for heteroscedasticity using outlier detection methods

  • 서한손;윤민
    • 응용통계연구 / Vol. 29, No. 3 / pp. 399-408 / 2016
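  • When heteroscedasticity is present in regression analysis, results from standard estimation procedures are no longer valid, so it is necessary to check for it. If outliers are present together with heteroscedasticity, the diagnosis of heteroscedasticity can be distorted. Existing methods for diagnosing heteroscedasticity in the presence of outliers either use robust statistics or remove the outliers. Several approaches have been proposed for detecting outliers in the heteroscedasticity problem. This study proposes a procedure that applies sequential outlier detection to existing heteroscedasticity tests so that outliers are excluded from the diagnosis. The proposed method is compared with existing tests in terms of power through simulations and examples.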

An Outlier Detection Algorithm and Data Integration Technique for Prediction of Hypertension

  • 홍고르출;김미혜;송미화
    • 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2023 Spring Conference / pp. 417-419 / 2023
  • Hypertension is one of the leading causes of mortality worldwide. In recent years, its incidence has increased dramatically, not only among the elderly but also among young people, and machine-learning methods are increasingly used to diagnose its causes. In this study, we improved hypertension prediction by removing multivariate outliers based on the Mahalanobis distance, using the KNHANES database of Korean national health data and the COVID-19 dataset from Kaggle. The study consists of two modules. The first, data preprocessing, merges the datasets and selects features with a decision-tree classifier. The second, predictive analysis, removes multivariate outliers from the experimental dataset using the Mahalanobis distance and then predicts hypertension. We compared the accuracy of each classification model. The best result was obtained by the proposed MAH_RF algorithm, with an accuracy of 82.66%. The proposed method can be used not only for hypertension but also for detecting other diseases such as stroke and cardiovascular disease.
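
The outlier-removal step can be sketched as follows. This is a minimal illustration restricted to two features so that the covariance matrix can be inverted in closed form; the function names and the chi-square cutoff probability `p` are assumptions, not details taken from the paper.

```python
import math

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] (denominator n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    out = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^T S^{-1} d, using the closed-form inverse of a 2x2 matrix.
        out.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return out

def remove_outliers_2d(points, p=0.975):
    """Keep points whose squared distance is within the chi-square cutoff.
    For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - p)."""
    cutoff = -2.0 * math.log(1.0 - p)
    d2 = mahalanobis_sq_2d(points)
    return [pt for pt, d in zip(points, d2) if d <= cutoff]
```

For example, `remove_outliers_2d([(1, 1), (1, -1), (-1, 1), (-1, -1)] * 4 + [(10, -10)])` drops the point `(10, -10)` and keeps the other sixteen. With more than two features the same idea needs a general matrix inverse and the matching chi-square quantile.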

Sequence-based 5-mers highly correlated to epigenetic modifications in genes interactions

  • Salimi, Dariush;Moeini, Ali;Masoudi-Nejad, Ali
    • Genes and Genomics / Vol. 40, No. 12 / pp. 1363-1371 / 2018
  • One of the main concerns in biology is extracting sophisticated features from DNA sequences to determine gene interactions, a problem that has received a great deal of attention from researchers. Epigenetic modifications and their patterns are widely recognized as dominant features affecting gene expression. However, the study of sequence-based features highly correlated with this key element has remained limited. The main objective of this research was to propose a new feature, highly correlated with epigenetic modifications, that is capable of classifying genes. In this paper, 34 genes in the PPAR signaling pathway associated with human muscle fat tissue were classified. Using different statistical outlier detection methods, we propose that 5-mers highly correlated with epigenetic modifications can correctly categorize genes involved in the same biological pathway or process. The 34 genes were classified by applying the proposed feature, 5-mers strongly associated with 17 different epigenetic modifications, and diverse statistical outlier detection methods were applied to identify the group of highly correlated genes. The results indicated that these 5-mers can appropriately identify correlated genes. In addition, the results agreed with GeneMania interaction information, which supports the suggested method. These findings imply that not only epigenetic modifications but also their highly correlated 5-mers can be used as supplementary data for reconstructing gene regulatory networks, as well as for other applications such as physical interaction and gene prioritization, indicating a form of data fusion in this analysis.

An RSS-Based Localization Method Utilizing Robust Statistics for Wireless Sensor Networks under Non-Gaussian Noise

  • 안태준;구인수
    • 한국인터넷방송통신학회논문지 / Vol. 11, No. 3 / pp. 23-30 / 2011
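  • In wireless sensor networks, accurate location information for each sensor node is essential for making efficient use of the information the nodes collect. Among the various techniques for estimating sensor node positions, the widely used received signal strength (RSS) technique can be implemented easily without additional hardware resources, but the collected samples vary with the channel environment and, in particular, may contain outliers. These outliers strongly affect statistical analysis of the collected samples and are a major cause of localization error. This paper therefore proposes a Gaussian filter algorithm that applies robust statistics in order to estimate position accurately from samples containing outliers. The proposed algorithm improves localization accuracy by removing outliers from the samples and excluding samples with low probability values. Simulation results confirm the improved localization accuracy and the robustness of the proposed method in a non-Gaussian environment with outlier-contaminated samples.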
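
As a rough illustration of removing outliers from RSS samples with robust statistics, the sketch below rejects samples far from the median, using the median absolute deviation (MAD) as the scale, and averages the rest. This is a generic median/MAD rule swapped in for illustration, not the paper's Gaussian filter algorithm; the function name and the rejection factor `k` are assumptions.

```python
from statistics import mean, median

def robust_rss_estimate(samples, k=3.0):
    """Average RSS samples (dBm) after discarding points far from the median.

    The scale is 1.4826 * MAD, which matches the standard deviation
    when the inlier noise is Gaussian.
    """
    med = median(samples)
    mad = median([abs(s - med) for s in samples])
    scale = 1.4826 * mad
    if scale == 0:
        # More than half of the samples are identical; the median is the estimate.
        return med
    kept = [s for s in samples if abs(s - med) <= k * scale]
    return mean(kept)
```

With the readings `[-60, -61, -59, -60, -62, -58, -90]`, the -90 dBm sample is rejected and the estimate is -60 dBm, whereas the plain mean of all seven readings would be pulled toward the outlier.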

Study on Outlier Analysis Considering the Spatial Distribution of Intelligent Compaction Measurement Values

  • 정택규;조진우;정충기;백성하
    • 한국지반공학회논문집 / Vol. 40, No. 4 / pp. 91-103 / 2024
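  • To address the high variability of intelligent compaction measurement values, which are obtained continuously over the entire construction area, this study proposes an outlier analysis technique that considers their spatial distribution. The technique first screens cases in which the CMV measured at a given location decreases despite an increase in the number of roller passes, and then identifies as outliers those values that differ greatly from the values measured within an effective radius of 1.5 m. Applying the technique to CMV data from a field test showed that it can exclude only the effects of changes in roller operating conditions, which are unrelated to compaction quality, while still accounting for the inherent heterogeneity of the ground. After outlier removal, the coefficient of variation of the CMV was 21.4 to 26.3%, which is higher than the value given in the relevant standard (20%). The technique should be refined by applying it to data from more field tests, and a reasonable criterion for the variability of intelligent compaction measurement values should be proposed.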
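
The two-stage screening described above can be sketched as follows. The 1.5 m radius comes from the abstract; the record layout, the use of the neighbourhood median, and the difference threshold `max_diff` are assumptions made only for illustration, since the abstract does not say how a "large difference" is quantified.

```python
import math
from statistics import median

def flag_cmv_outliers(points, radius=1.5, max_diff=10.0):
    """Return indices of CMV readings flagged as outliers.

    points: list of dicts with keys "x", "y" (metres), "cmv_prev" (previous
    pass) and "cmv_curr" (current pass). `max_diff` is a hypothetical cutoff.
    """
    flagged = []
    for i, p in enumerate(points):
        # Stage 1: candidate only if CMV dropped although a further pass was made.
        if p["cmv_curr"] >= p["cmv_prev"]:
            continue
        # Stage 2: compare with current-pass values measured within `radius`.
        neighbours = [
            q["cmv_curr"]
            for j, q in enumerate(points)
            if j != i and math.hypot(q["x"] - p["x"], q["y"] - p["y"]) <= radius
        ]
        if neighbours and abs(p["cmv_curr"] - median(neighbours)) > max_diff:
            flagged.append(i)
    return flagged
```

A reading that drops slightly but stays close to its neighbours is kept, which is one way to tolerate genuine ground heterogeneity while still rejecting isolated drops.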

Development of an Observational Settlement Analysis Method Using Outliers

  • 우철웅;장병욱
    • 한국농공학회지 / Vol. 45, No. 5 / pp. 140-150 / 2003
  • Observational methods such as Asaoka's method and the hyperbolic method are widely applied to settlement analysis using observed settlement. The most unreliable aspect of these methods arises from the subjective judgment of the initial non-linearity in the linear regression. The initial non-linearity is inevitable because of the settlement behaviour itself, so an objective method is essential for more reliable settlement analysis. It was found that the initial non-linear data are statistical outliers. New automated algorithms for the hyperbolic method and Asaoka's method were developed based on outlier detection: successive detection of outliers for Asaoka's method, and a search for a suitable hyperbolic range for the hyperbolic method. The applicability of the algorithms was verified through case studies.
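
For context, Asaoka's method fits a straight line s_i = b0 + b1 * s_(i-1) to settlements read at equal time intervals and takes the final settlement as b0 / (1 - b1). The sketch below shows only this plain least-squares fit; it does not reproduce the paper's successive outlier detection, and the function name is hypothetical.

```python
def asaoka_final_settlement(settlements):
    """Estimate final settlement from equally spaced settlement readings."""
    xs = settlements[:-1]  # s_(i-1)
    ys = settlements[1:]   # s_i
    n = len(xs)
    if n < 2:
        raise ValueError("need at least three settlement readings")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("settlement readings do not vary")
    # Ordinary least-squares slope and intercept of s_i against s_(i-1).
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b0 = my - b1 * mx
    if b1 >= 1:
        raise ValueError("fitted line does not converge to a final settlement")
    # The fitted line meets the 45-degree line s_i = s_(i-1) at b0 / (1 - b1).
    return b0 / (1.0 - b1)
```

Early readings that depart from the straight line bias both coefficients, which is why an objective rule for excluding them matters.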

An Effective Anomaly Detection Approach based on Hybrid Unsupervised Learning Technologies in NIDS

  • Kangseok Kim
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 2 / pp. 494-510 / 2024
  • Internet users are exposed to sophisticated cyberattacks that intrusion detection systems have difficulty detecting. Research is therefore increasing on intrusion detection methods that use artificial intelligence to detect novel cyberattacks, including unsupervised learning-based methods that learn only from normal data and detect abnormal behaviors by finding patterns. This study developed an anomaly-detection method based on unsupervised machine learning and deep learning for a network intrusion detection system (NIDS). We present a hybrid anomaly detection approach based on unsupervised learning techniques using the autoencoder (AE), Isolation Forest (IF), and Local Outlier Factor (LOF) algorithms. An oversampling approach that increased the detection rate was also examined. The hybrid approach, which combines deep learning algorithms with traditional machine learning algorithms, was highly effective in setting the thresholds for anomalies without subjective human judgment. It achieved precision and recall of 88.2% and 92.8%, respectively, when combining two AEs, IF, and LOF. Using an oversampling approach to learn more unknown normal data improved the detection accuracy further, giving precision and recall of 88.2% and 94.6%, respectively. Therefore, in NIDS the proposed approach provides high reliability for detecting cyberattacks.

Damage Detection in Bridges Using Modal Flexibility Matrices Under Temperature Variation

  • 구기영;이종재;윤정방
    • 한국전산구조공학회:학술대회논문집 / 한국전산구조공학회 2007 Annual Conference Proceedings / pp. 651-656 / 2007
  • Changes in measured structural responses induced by damage can be significantly smaller than those caused by environmental effects such as temperature and temperature gradients. It is therefore highly desirable to develop a methodology that distinguishes changes due to structural damage from those due to environmental variations. In this study, a novel method for extracting the damage-induced deflection under temperature variations is presented, using outlier analysis on the deflections obtained from the modal flexibility matrices. The main idea is that a temperature change in a bridge produces a global increase or decrease in deflections over the whole bridge, while structural damage may cause local variations in deflections near the damage locations. Hence, the correlation between the deflection measurements may show high abnormality near the damage locations. A series of laboratory tests was carried out on a bridge model with a steel box-girder for 14 days. It was found that damage existence assessment and localization can be carried out for a case with relatively small damage under temperature variations.


Adaptive Sensor/Heterogeneous Infrastructure Integrated Pedestrian Navigation Technology using Rényi Divergence-based Outlier Detection

  • 권재욱;조성윤;유재준;서성훈
    • Journal of Positioning, Navigation, and Timing / Vol. 13, No. 3 / pp. 289-299 / 2024
  • In the Pedestrian Dead Reckoning (PDR)/Global Positioning System (GPS)/Wi-Fi-integrated navigation system for indoor/outdoor continuous positioning of pedestrians, the process of detecting outliers in measurements is very important. When accurate location information from measurements is used, reliable correction data can be generated during the fusion filtering process. However, abnormal measurements may occur in certain situations, such as indoor/outdoor transitions, which can degrade filter performance and lead to significant errors in the estimated position. To address this issue, this paper proposes a method for detecting outliers in measurements based on Rényi Divergence (RD). When the deviation of the RD value is large, the measurements are considered outliers, and positioning is performed using only pure PDR. Based on experiments conducted with real data, it was confirmed that outliers were effectively detected for abnormal measurements, leading to an improvement in the performance of pedestrian navigation.
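
To make the quantity concrete, the sketch below evaluates the Rényi divergence of order alpha between two univariate Gaussian distributions using the standard closed form. Which distributions the paper compares, its choice of alpha, and its outlier threshold are not given in the abstract, so the Gaussian setting and the function name here are assumptions.

```python
import math

def renyi_divergence_gauss(mu0, sigma0, mu1, sigma1, alpha=0.5):
    """Renyi divergence D_alpha(N(mu0, sigma0^2) || N(mu1, sigma1^2)).

    Closed form: alpha * (mu1 - mu0)^2 / (2 * var_a)
                 + ln(sqrt(var_a) / (sigma0^(1 - alpha) * sigma1^alpha)) / (1 - alpha),
    where var_a = (1 - alpha) * sigma0^2 + alpha * sigma1^2.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    var_a = (1 - alpha) * sigma0 ** 2 + alpha * sigma1 ** 2
    if var_a <= 0:
        # Only possible for alpha > 1, where the divergence is infinite.
        raise ValueError("divergence is not finite for these parameters")
    quad = alpha * (mu1 - mu0) ** 2 / (2 * var_a)
    log_term = math.log(
        math.sqrt(var_a) / (sigma0 ** (1 - alpha) * sigma1 ** alpha)
    ) / (1 - alpha)
    return quad + log_term
```

The divergence is zero for identical distributions and grows as the means or variances move apart, so a measurement whose divergence value deviates strongly from recent values could be treated as an outlier, after which the filter would fall back to PDR alone.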